Correlation Matrix

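The Correlation Matrix transform computes the correlation between each pair of variables in the dataset, including the correlation of each variable with the target variable. During feature engineering, this helps you identify features that are highly correlated with the target or with each other. It applies only to numerical columns.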
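To make the idea concrete, here is a minimal sketch of what a correlation matrix contains. It uses only the Python standard library and assumes the Pearson coefficient; the page does not state which coefficient the transform uses, so this is an assumption. The sample columns and the helper names pearson and correlation_matrix are hypothetical and are not part of the product API.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant column
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return {col_a: {col_b: r}} computed over the numeric columns only."""
    numeric = {
        name: values
        for name, values in columns.items()
        if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values)
    }
    return {
        a: {b: pearson(xa, xb) for b, xb in numeric.items()}
        for a, xa in numeric.items()
    }


# Hypothetical sample data; "city" is non-numeric and is skipped.
data = {
    "age": [25, 32, 47, 51, 62],
    "income": [30, 42, 58, 61, 75],
    "discount": [50, 40, 30, 20, 10],
    "city": ["A", "B", "A", "C", "B"],
    "churned": [0, 0, 1, 1, 1],
}

matrix = correlation_matrix(data)

# Rank features by the strength of their correlation with the target.
target = "churned"
ranked = sorted(
    ((name, row[target]) for name, row in matrix.items() if name != target),
    key=lambda pair: abs(pair[1]),
    reverse=True,
)
for name, r in ranked:
    direction = "positive" if r > 0 else "negative"
    print(f"{name}: {r:.3f} ({direction})")
```

The sign of each value shows whether a feature moves with the target (positive) or against it (negative), and values close to 1 or -1 between two features suggest that they carry largely redundant information.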

Parameters

The table gives a brief description of each parameter in the Correlation Matrix transform.

Name:

By default, the transform name is populated. You can also add a custom name for the transform.

Raw Dataset:

The file name of the input dataset. Select an uploaded dataset from the drop-down list to check how each variable correlates with the target variable, either positively or negatively. (Required: True, Multiple: False)

The sample input for this transform looks as follows:

../../../_images/correlationmatrix_input.png

After you run the Correlation Matrix transform on the dataset, the dashboard appears as follows:

../../../_images/correlationmatrix_output.png

How to use it in Notebook

Use the following code snippet in the Jupyter Notebook editor to run the Correlation Matrix transform:

# Assumes Transform, project, correlation_matrix (the transform template), and
# dataset_w_one_hot_encoding were defined in earlier notebook cells.
transform = Transform()
transform.name = "correlation matrix"
transform.templateId = correlation_matrix.id
transform.variables = {
    "inputDataset": dataset_w_one_hot_encoding.name,
}

# Create a recipe on the input dataset and attach the transform to it.
recipe_corr_matrix = project.addRecipe([dataset_w_one_hot_encoding], name="correlation_matrix")
#recipe_corr_matrix.prepareForLocal(transform, contextId="correlation_matrix")
recipe_corr_matrix.addTransform(transform)
# TODO: Timing issue. Prior EDA may not have completed.
recipe_corr_matrix.run()