Count Unique Values
This transform counts the number of distinct elements along a specified axis.
tags: [“EDA”]
Parameters
The following list briefly describes each parameter of the Count Unique Values transform.
- Name:
By default, the transform name is populated. You can also add a custom name for the transform.
- Input Dataset:
The file name of the input dataset. You can select the dataset that was uploaded from the drop-down list to count the unique values. (Required: True, Multiple: False)
- Output Dataset:
The file name with which the output dataset is created. This file contains the count of distinct values along the selected axis. (Required: True, Multiple: False)
- axis:
The axis can be 0 or 1. 0 counts the distinct values in each column, and 1 counts the distinct values across the columns of each row. (Required: True, Multiple: False, Datatypes: [“LONG”], Options: [“CONSTANT”], Default_value: ‘0’, Constant_options: [0,1])
- dropna:
Whether null values are excluded from the count. Possible values:
0 - Null values are included when counting the distinct values.
1 - Null values are not included in the count.
(Required: True, Multiple: False, Datatypes: [“LONG”] , Options: [“CONSTANT”], Constant_options: [0,1])
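The axis and dropna parameters resemble the arguments of the pandas method DataFrame.nunique, and pandas is the only listed requirement. Assuming the transform behaves like that method (an assumption, not something this page confirms), the following sketch shows how the two settings change the counts. The sample data and variable names are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; not taken from the transform's own example.
df = pd.DataFrame({
    "make": ["audi", "bmw", "audi", None],
    "doors": [2, 4, 4, 4],
})

# axis=0, dropna=1: one count per column, nulls ignored.
per_column = df.nunique(axis=0, dropna=True)

# axis=0, dropna=0: one count per column, a null counts as its own value.
per_column_with_nulls = df.nunique(axis=0, dropna=False)

# axis=1, dropna=1: one count per row, taken across that row's columns.
per_row = df.nunique(axis=1, dropna=True)

print(per_column)
print(per_column_with_nulls)
print(per_row)
```

In this sketch the "make" column has two distinct values when nulls are ignored and three when they are counted, which is the difference the dropna setting controls.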
The sample input for this transform is shown below:
The output after running the Count Unique Values transform on the dataset is shown below:
How to use it in Notebook
Use the following code snippet in the Jupyter Notebook editor to run the Count Unique Values transform:
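# Look up the transform template by its display name.
template = TemplateV2.get_template_by('Count Unique Values')
# Create a recipe on the input datasets (assumed to be defined earlier in the notebook).
recipe_Count_Unique_Values = project.addRecipe([car_data, employee_data, temperature_data, only_numeric], name='Count Unique Values')
# Configure the transform from the template.
transform = Transform()
transform.templateId = template.id
transform.name = 'Count Unique Values'
transform.variables = {
    'input_dataset': 'car',
    'output_dataset': 'car_unique',
    'value_1': 0,
    'value_2': 0}
# Attach the transform to the recipe and run it.
recipe_Count_Unique_Values.add_transform(transform)
recipe_Count_Unique_Values.run()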
Requirements
pandas