Data transform¶
The transform module is responsible for applying transformations to the extracted data.
SQL transformation¶
This transformation evaluates a SQL expression and stores the result in the column given by to_column. Example:
from pyiris.ingestion.transform import SqlTransformation
sql_transformation = SqlTransformation(
    name='divide',
    description='Unit price division',
    to_column="cost",
    sql_expression="price/unit",
)
transformed_dataframe = sql_transformation.transform(dataframe=extracted_dataframe)
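To illustrate what the expression in this example computes, here is a minimal sketch that evaluates the same price/unit expression with Python's built-in sqlite3 module instead of PyIris. The table name items and the sample rows are assumptions made for illustration only. PyIris works on dataframes, and its SQL dialect may behave differently from SQLite's (for example, in how integer division is handled).

```python
import sqlite3

# Hypothetical sample rows standing in for extracted_dataframe (price, unit).
rows = [(10.0, 4.0), (9.0, 3.0)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (price REAL, unit REAL)")
conn.executemany("INSERT INTO items VALUES (?, ?)", rows)

# Evaluate the expression and expose it as a new column named "cost",
# mirroring sql_expression="price/unit" and to_column="cost".
result = conn.execute(
    "SELECT price, unit, price/unit AS cost FROM items"
).fetchall()
conn.close()

print(result)  # [(10.0, 4.0, 2.5), (9.0, 3.0, 3.0)]
```

The original columns are kept and the computed value appears as an additional column, which is the behaviour the to_column parameter suggests.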
Hash transformation¶
This transformation hashes the values of the columns listed in from_columns. Example:
from pyiris.ingestion.transform import HashTransformation
hash_transformation = HashTransformation(
    name='Hash CPF',
    description='Hash CPF to comply with LGPD',
    from_columns=["cpf"],
)
transformed_dataframe = hash_transformation.transform(dataframe=extracted_dataframe)
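The effect of hashing a column can be sketched in plain Python with the standard hashlib module. This is not the PyIris implementation: the choice of SHA-256, the helper name hash_value, and the sample record are all assumptions, since this page does not state which algorithm HashTransformation uses.

```python
import hashlib


def hash_value(value: str) -> str:
    """Return the SHA-256 hex digest of a string (the algorithm is an assumption)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# Hypothetical records standing in for extracted_dataframe.
records = [{"cpf": "12345678900", "name": "Ana"}]

# Only the columns listed here are hashed; the others are left untouched.
from_columns = ["cpf"]

hashed = [
    {key: hash_value(val) if key in from_columns else val for key, val in row.items()}
    for row in records
]

print(hashed[0]["name"])      # the untouched column keeps its value
print(len(hashed[0]["cpf"]))  # a SHA-256 hex digest is 64 characters long
```

Because the same input always produces the same digest, hashed values can still be used for joins and grouping without exposing the original identifier.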
Custom transformation¶
This transformation lets the user customize the dataframe using the main custom methods provided by the library, such as divide. Example:
from pyiris.ingestion.transform.transformations.custom.custom import divide
from pyiris.ingestion.transform.transformations.custom_transformation import CustomTransformation
custom_transformation = CustomTransformation(
    name='divisao_preco_medio',
    description='Dividing two fictitious columns (valor_venda/quantidade) to generate column preco_medio',
    method=divide,
    to_column='preco_medio',
    column1='valor_venda',
    column2='quantidade',
)
transformed_dataframe = custom_transformation.transform(dataframe=extracted_dataframe)
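The pattern behind this example, a method plus the keyword arguments it needs, can be sketched without PyIris as follows. The functions divide_columns and apply_custom and the sample row are hypothetical stand-ins written for illustration. The real signature of the PyIris divide method is not documented on this page and may differ.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def divide_columns(row: Row, column1: str, column2: str) -> float:
    """Hypothetical stand-in for a custom method: column1 divided by column2."""
    return row[column1] / row[column2]


def apply_custom(rows: List[Row], method: Callable[..., Any],
                 to_column: str, **kwargs: Any) -> List[Row]:
    """Apply `method` to every row and store its result in `to_column`."""
    return [{**row, to_column: method(row, **kwargs)} for row in rows]


# Hypothetical sample data standing in for extracted_dataframe.
sales = [{"valor_venda": 100.0, "quantidade": 4}]

result = apply_custom(
    sales,
    method=divide_columns,
    to_column="preco_medio",
    column1="valor_venda",
    column2="quantidade",
)

print(result)  # [{'valor_venda': 100.0, 'quantidade': 4, 'preco_medio': 25.0}]
```

Passing the method as a parameter keeps the transformation generic: any callable that accepts the extra keyword arguments can be plugged in.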
Transform Service¶
The class pyiris.ingestion.transform.TransformService works as a service that runs a single transformation or several transformations in sequence. Example:
from pyiris.ingestion.transform import TransformService, HashTransformation, SqlTransformation
transform_service = TransformService(
    transformations=[
        SqlTransformation(
            name='divide',
            description='Filtering total prices bigger than R$100',
            to_column="test",
            sql_expression="total_price > 100",
        ),
        HashTransformation(
            name='Hash CPF',
            description='Hash CPF to comply with LGPD',
            from_columns=["cpf"],
        ),
    ]
)
transformed_dataframe = transform_service.handler(dataframe=extracted_dataframe)
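Running transformations in sequence means that each one receives the output of the previous one. A minimal sketch of that idea in plain Python is shown below. The names run_in_sequence, flag_expensive, and hash_cpf are hypothetical, SHA-256 is an assumed hash algorithm, and the sketch does not reflect how TransformService is actually implemented.

```python
import hashlib
from typing import Any, Callable, Dict, Iterable, List

Rows = List[Dict[str, Any]]
Transformation = Callable[[Rows], Rows]


def run_in_sequence(rows: Rows, transformations: Iterable[Transformation]) -> Rows:
    """Feed the output of each transformation into the next one, in order."""
    for transformation in transformations:
        rows = transformation(rows)
    return rows


def flag_expensive(rows: Rows) -> Rows:
    # Mirrors sql_expression="total_price > 100" written to the column "test".
    return [{**row, "test": row["total_price"] > 100} for row in rows]


def hash_cpf(rows: Rows) -> Rows:
    # SHA-256 is an assumption; the page does not name the hash algorithm.
    return [
        {**row, "cpf": hashlib.sha256(row["cpf"].encode("utf-8")).hexdigest()}
        for row in rows
    ]


# Hypothetical sample data standing in for extracted_dataframe.
data = [{"cpf": "12345678900", "total_price": 150.0}]

result = run_in_sequence(data, [flag_expensive, hash_cpf])

print(result[0]["test"])      # True, because 150.0 > 100
print(len(result[0]["cpf"]))  # 64, the length of a SHA-256 hex digest
```

A list with a single transformation works the same way, which matches the statement that the service can run only one transformation.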
For more information, see the docstrings in the Pyiris modules.