Recipes

Recipes are the building blocks of a machine learning model. You run recipes on a dataset to perform a series of transformations, each of which performs a specific action. You can apply transformations at different stages of building a machine learning model on the uploaded dataset and run them as recipes. Running a recipe produces a transformed output, which can be a dataset or a chart. You can then run another recipe on this output until you attain the desired outcome.

Recipe Types

The platform supports three types of recipes:

  • Template - Allows you to apply pre-defined templates to perform data transformation operations.

  • AI-assisted - Allows you to use AI to generate the recipe code or write the recipe logic yourself in Python.

  • Rapid Model - Allows you to rapidly build models with a few clicks and no coding knowledge, without using either of the other two recipe types.

Template recipe

Ready-to-use system templates allow you to transform data without writing Python code in the UI. Using these standard templates, you can prepare and clean the data, add features, and split the data into training and testing sets to build models. Running each recipe transforms the data in the flow.

Hundreds of system templates are available by default. You can use them to transform the data and build simple to complex machine learning flows, and subsequently models. You can also create custom templates at the project and tenant levels from a Notebook and use them in your flows.

If you want to add a standard template to the flow, see Adding a standard transform or template within a recipe.

Adding a Template recipe

Use this procedure to add a transform within a template recipe.

  1. Select the project to which you want to upload a dataset. For more information, see Connectors.

../_images/recipeproject.png ../_images/datasetrecipe.png
  2. Click the dataset block to run various data transformations on this dataset and build an ML model.

  3. Use any of these options to add a transform within a template recipe:

  • Click the plus icon ico1 on the canvas page and then select Template.

  • Select the dataset block. This opens the side panel. Click the plus icon ico1 and select the Template recipe.

  • Select the dataset block to open the side panel, and then click View. This opens the dataset page. Click the plus icon ico1 and then select the Template recipe.

../_images/canvasviewrecipeaddition.png ../_images/pulloutwindowtoaddrecipe.png ../_images/datasettoaddrecipe.png

The page where you can add data Transformations is displayed.

../_images/transformations.png
  4. Click Transformations.

Note

If you want to run a transform on a dataset, you must click on the dataset and add the recipe.

The Transformations side panel is displayed.

../_images/transformpullout.png
  5. Search for the transforms or templates you want to add to the ML flow or data pipeline on the canvas:

A set of templates is available for each stage of the machine learning model. All templates associated with a particular stage are assigned a specific tag. Possible tags:

  • Data Cleaning

  • Data Preparation

  • Data Analysis

  • Beta

  6. Enter the name of the transform that you want to add to the data pipeline, or filter the transforms by tag from the list.

../_images/filtertemplates1.png
  7. Click the transform name to open the transform page.

  8. Specify the information in the respective fields of the selected transform. For more information, see Templates library.

In this example, we have selected the suffix transform to add a suffix to all the columns in a dataset.
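
Conceptually, the suffix transform behaves like the following pandas operation. This is a simplified, hypothetical sketch for illustration only, not the template's actual implementation; the sample column names and the suffix value are assumptions.

import pandas as pd

# Hypothetical input with two columns
df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [29, 34]})

# Append the suffix to every column name: Name -> Name_v1, Age -> Age_v1
df = df.add_suffix("_v1")
print(df.columns.tolist())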

../_images/addtransformnew1.png
  9. Click Add to add the transform to the data pipeline and close the transform window.

Note

You can also add multiple transforms at once, using the +New Transform option.

  10. (Optional) Click Test to test the transform and preview the output before running it in the data pipeline.

  11. Click the Run icon ico5 to run this recipe in the flow. Once the run is successful, an output dataset or a dashboard is generated.

../_images/runtransform1.png
  12. Click the back button to navigate to the canvas from the transforms screen.

../_images/transform1.png
  13. Click the output dataset block. This opens the pull-out window.

  14. Click View to view the dataset with the suffix added to each column.

../_images/outputtransform.png

On this View data page, you can:

  • Click the caret icon ico6 to export the output dataset to a csv file, using the Export option.

  • Click the caret icon ico6 to delete the generated output, using the Delete option.

  • Click the plus icon ico1 to perform the following:

    • Append a file to the source dataset, using the File option. This option is only enabled for the source dataset.

    • Add a template recipe, using the Template option.

    • Add an AI-Assisted recipe, using the AI-assisted option.

    • Add a Rapid Model recipe, using the Rapid Model option.

    • Add a segment to the source dataset, using the +Segment option. This option is enabled only for the source dataset.

Viewing and editing the recipe details

Use this procedure to view the recipe details and edit the type of transform used within a recipe.

To view the recipe details:

  1. Select the recipe block that you have added to the canvas. This opens the pull-out window.

  2. Click inside the recipe name to change the recipe name.

  3. View the recipe details on the pull-out window:

../_images/viewrecipe.png
Recipe type:

The type of recipe used. This is the tag assigned to the transform within this recipe.

Created:

The date and time at which the recipe was created.

Last modification:

The last date and time at which the recipe was modified.

Last build:

The last date and time at which the recipe run was performed.

Inputs:

The input dataset on which the transformation is applied and the recipe run was performed.

Outputs:

The output dataset generated after running the recipe.

Timeout:

The duration, expressed in hours, after which the recipe stops running. By default, the duration is set to 2 hours. You can change the duration based on the complexity of the recipe you are running in the flow. If the recipe runs longer than this duration, the run is terminated.

On this pull-out window, you can also:

  • View the recipe logs, using the Log icon ico10. This shows a detailed record of all successful and failed recipe runs. You can view the full logs by clicking the Logs option, which opens the logs page in a new browser tab. On this page, click View Full Log to view all logs.

  • View the recipe details, using the View option. This takes you to the respective recipe page.

  • Click the ellipses icon ico12 and select Delete to delete this recipe from the flow.

  • Run the recipe without navigating to the recipe page, using the Run option.

  4. Click View to review the details of the recipe. The recipe page is displayed.

  5. Click Transformations to view the list of transforms in the project, and on the Transforms tab, select the transform whose details you want to modify.

../_images/transformations1.gif
  6. Click UPDATE.

Exporting the output dataset

Use this procedure to export the output dataset to a csv file.

To export the output dataset:

  1. Select the dataset block, input or output, that you want to export to a csv file.

The pull-out window opens.

  2. Click the ellipses icon ico12.

../_images/exportellip.png
  3. Select the Export option to download the dataset file onto your local system.

Deleting a recipe

Use this procedure to delete a recipe block from the canvas.

To delete a recipe block:

  1. Select the recipe block that you want to delete from the canvas. This opens the pull-out window.

  2. Click the ellipses icon ico12 and select the Delete option to delete the recipe.

../_images/deleterecipe.png

You can also delete the recipe from the Transforms list page, using the delete icon available under the caret icon ico6. This page appears when you click View in the side panel of the recipe block.

  3. A dialog box warns that deleting the recipe also deletes the recipe block, output datasets, and associated recipes.

  4. Click Delete to permanently delete the recipe from the canvas view, or click Cancel to discard the action.

Running a specific recipe in the data pipeline

Use this procedure to run a particular recipe in the flow or data pipeline.

To run a recipe block:

  1. Select the recipe block that you want to run from the canvas. This opens the side panel.

  2. Click Run to run the recipe. The status of the recipe block changes to Running. Once the recipe run is successful, the status changes to Success.

../_images/runrecipe.png

You can also view output (dataset, model, or artifact) generated after running this recipe.

Exporting the output dataset to the connector

Use this procedure to save the output dataset to a configured connector, which can be a cloud storage solution or a database.

  1. Select the output dataset block that you want to save to the connector, on the canvas. This opens the side panel.

  2. Select the Data connector from the drop-down. You can only see the connectors you have configured in this tenant.

  3. Enter the destination folder name and the file name with which you want to save the file in this connector.

../_images/runrecipe12.png
  4. Click Save to save the destination details.

  5. Click Export to export the file to the connector.

You can delete the configured connector for this output dataset, using the delete icon ico90.

AI-assisted recipe

If no standard template performs the data transformation you need, you can use an AI-assisted recipe. The AI-assisted functionality enables business users to ask AI to generate code for a given prompt. After the code is generated, you can add it to the recipe and run the data pipeline to view the output in the form of datasets or charts.

Ask AI

When you click the dataset block on the canvas and select the AI-assisted option from the Add recipes drop-down, the code editor opens, where you can take the assistance of AI to generate a code snippet for the given prompt.

Select the dataset

You must select the dataset on which you want the data transformation to be applied and the code to be generated for the given prompt. You can also select the output type to generate: either a dataset or a chart.

Add the generated code to recipes in the data pipeline

Use the +Add to recipe option in the AI-assisted code editor to add the code generated by the AI to the flow or data pipeline. The Add to recipe option is available only after running the text prompt.

Save and Run the code recipe in the flow

Save the code and use the run option to run the code recipe to generate the output, which can be a dataset or a chart. You can continue building custom templates or code recipes using Ask AI.

Adding an AI-assisted recipe

Use this procedure to add an AI-assisted recipe to the data pipeline or ML flow using the integrated AI tool.

To add and run an AI-assisted recipe:

  1. Click the dataset block on the canvas to open the pull-out window.

Select the recipe

  2. Click the plus icon ico1 and select the AI-assisted recipe in the pull-out window.

../_images/coderecipe.gif

This opens the Ask AI tab where you can type the text prompt in the provided query box to generate the code recipe.

../_images/newcodeeditor.png

If you want to view the column names and data type of each column in the uploaded dataset, you can expand the datasets in the Inputs section on the left.

Note

  • Use the delete icon to delete the uploaded dataset.

  • Use the plus button to add multiple datasets to use in the code recipe. In the drop-down, you can only find the datasets that you have added onto your Project canvas. If there is only one dataset on the canvas, this button remains disabled.

Select the dataset to run the recipe

  3. By default, the input dataset is selected from the list. Here, the input dataset is Titanic.

Note

You can select a maximum of four datasets.

Enter the text prompt

  4. Enter the query in the provided query box. In this example, we have provided a query to concatenate two columns in the dataset and generate the code for it: “Concatenate the First_name and Last_name columns and generate a new column with name”.

Note

(Optional) From the ellipses icon, select Generate Query Limit to run the query only on the selected number of rows in the dataset. You have three options to select from:

  • Full data

  • 100k rows

  • 1 million rows

Select the output type you want to generate

  5. Manually turn ON the toggle for the output type you want to generate after running the data transform; otherwise, the platform auto-detects the output type based on the given prompt. Possible values:

    • Dataset (By default, the dataset toggle is turned ON)

    • Charts

Generate the code

  6. Click the generate icon ico3 to generate the code. The generate button is enabled only after you select the dataset and provide the query.

The AI consumes the text prompt and generates the related code for concatenating the two columns. You can see the output generated by the AI along with the dataset size (total columns and rows).
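
For illustration only, the code generated for this prompt might resemble the following pandas operation; the actual AI-generated code can differ, and the sample data here is hypothetical.

import pandas as pd

# Hypothetical sample of the input dataset
input_df = pd.DataFrame({"First_name": ["John", "Mary"], "Last_name": ["Smith", "Jones"]})

# Concatenate the two columns into a new 'name' column
output_df = input_df.copy()
output_df["name"] = output_df["First_name"] + " " + output_df["Last_name"]
print(output_df)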

../_images/generatedoutput.gif

Only 100 records of the output dataset are displayed. If you want to see fewer records, select the number of records you want to view from the drop-down list. You can also view the size of the dataset.

View the code

  7. Click the View Code icon ico311 to view the code generated by the AI for the given prompt. Viewing the code is optional. You can go back to the output using the View Output icon.

../_images/outputdatasetai.gif

Add the generated code to recipe

  8. Click + Recipe to add the template for concatenating two columns to the recipe in the data pipeline. If you want to remove it, click Remove from Recipe to remove it from the pipeline.

Note

You can use the dataset generated by this recipe as an input for the next prompt using the Query icon.

../_images/addtorecipe.png
  9. Provide a custom name for the recipe.

../_images/updateerecipename.png
  10. Click Add to Recipe. After adding, a green line indicates that this recipe has been added to the data pipeline.

../_images/changedrecipename.png

Test the recipe on the dataset

  11. Click Test and select a test option to test the recipe on the full dataset, 100k rows, or 1 million rows before saving and running this recipe in the data pipeline. Possible options:

  • Test (full data)

  • Test with 100k rows

  • Test with 1 Million rows

../_images/testcoderecipe1.png

You can see the test output dataset in a new tab as shown in the screenshot below:

../_images/testcoderecipe2.png

Save and run the recipe in the data pipeline

  12. Click Save to save the recipe.

  13. Click the Run icon ico5 to run the recipe and generate the output on the canvas. In this example, we want to generate a new column (name) by concatenating the First_name and Last_name columns.

../_images/runthecoderecipe.png

Check the canvas

  14. Go back to the canvas to view the output.

../_images/outputcoderecipe.png
  15. Click the dataset block on the canvas. This opens the pull-out window. The output dataset includes an extra column, ‘name’, resulting from the concatenation.

../_images/recipeoutput_1.png
  16. Click View to see the dataset generated after concatenating the two columns.

../_images/codrecipeoutput.png

Note

To check the recipe logs, click the recipe block on the canvas and, from the side panel, click the logs icon ico4. This gives you access to the record of successful and failed recipe runs.

In AI-assisted recipes, you can also use code and the default snippets to perform data transformations.

You can use the Code option in the AI-assisted recipe to write Python code and define logic for data transformation in the provided code editor. Subsequently, run this code recipe in the pipeline to transform the data and produce a dataset or a chart output.

Before running the code recipe in the data pipeline or flow, you can use the Test option to test the code and view the output. If the output is what you are expecting, you can run the custom code recipe in your flow.

If you want to add a code recipe to the flow, see Adding an AI-assisted recipe.

Adding code snippets to the data pipeline

Use this procedure to add code snippets to the data pipeline or flow. First, search the code snippets list for the syntax within which you must add your logic before running the data transform or recipe.

To add and run a code snippet:

  1. Click the dataset block on the canvas to open the pull-out window.

  2. Click the plus icon ico1 and select the AI-assisted recipe in the pull-out window. This takes you to the Ask AI tab.

  3. Click the ellipses icon and select Snippets. This displays the SNIPPETS button.

../_images/codesnippetsnew12.gif
  4. Click SNIPPETS.

../_images/codesnippetsnew2.png

This opens the search box where you can find the template you are looking for to clean and prepare the data.

  5. Search for the syntax within which the logic (code snippet) must be added. Click Copy to copy the syntax and paste it into the Code tab.

../_images/codesnippetsnew22.gif
  6. Select the template based on the data transformation you want to perform. For example, to replace a value in the dataset, search for replace value snippets.

../_images/codesnippets1.gif
  7. Click Copy corresponding to the code block you want to use.

  8. Click the Code tab and paste the copied code snippet into the coding workspace.

../_images/codeblocknew1.gif

Now, add the name of the dataset in which you want to replace the value, as well as the new value with which you want to replace the existing value.
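
For illustration, a filled-in replace-value snippet might look like the following sketch; the dataset, column, and values are hypothetical, and the snippet shipped with the platform may differ.

import pandas as pd

# Hypothetical input dataset with an 'Embarked' column
df = pd.DataFrame({"Embarked": ["Q", "S", "C", "Q"]})

# Replace the existing value 'Q' with the new value 'Queenstown'
df = df.replace({"Embarked": {"Q": "Queenstown"}})
print(df)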

  9. Click TEST to test the code and view how the output looks before you save and run this recipe in the data pipeline.

../_images/testcodesnippet.png
  10. Click Save to save the code recipe. This enables the run button.

  11. Click the Run icon ico5 to run the recipe and generate the output on the canvas.

Writing logic for the template from scratch

Use this procedure to write the data transformation template from scratch.

  1. Click the dataset block on the canvas to open the pull-out window.

  2. Click the plus icon ico1 and select the AI-assisted recipe in the pull-out window. This takes you to the Ask AI tab.

  3. Click the Code tab.

../_images/codesnippetrecipe.png
  4. Write the logic for the code recipe in the provided coding space using Python, and click Test to test the code you have written. A minimal example is shown after these steps.

  5. Click Save and then click the Run icon ico5 to run this transformation in the data pipeline.
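
The minimal sketch below shows the general shape of a code recipe, mirroring the examples in the following sections; the entity name, column, and output name are illustrative assumptions rather than fixed values.

def transform(entities, context):
    import pandas as pd

    input_df_1 = entities['titanic']  # read the input dataset; 'titanic' is an illustrative entity name

    # Example logic: drop a column (replace with your own transformation)
    output_df_1 = input_df_1.drop(columns=['Age'])

    return {
        'output_1': output_df_1,  # 'output_1' is the name of the output dataset to generate
    }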

Writing logic to generate Artifact

Use this procedure to write logic that generates an artifact from the Code tab and adds the generated artifact to the data pipeline.

  1. Click the dataset block on the canvas to open the pull-out window.

  2. Click the plus icon ico1 and select the AI-assisted recipe in the pull-out window. This takes you to the Ask AI tab.

  3. Click the Code tab and provide the following code to generate the artifact.

def transform(entities, context):
    from utils.notebookhelpers.helpers import Helpers
    from utils.dtos.templateOutput import ArtifactOutput

    input_df_1 = entities['output_1'] # this is for reading input dataset

    import pandas as pd
    import numpy as np
    output_df_1 = input_df_1.drop(['Age'], axis=1)

    artifactsDir = Helpers.getOrCreateArtifactsDir(context, artifactsId = "test-artifact")
    output_df_1.head(10).to_csv(artifactsDir + '/test.csv')

    return {
        'output_2': output_df_1, # output_2 is the name of the output to be generated. Change the name as per your requirements.
        "test-artifact": ArtifactOutput()
    }

Important

You can test the artifact code by using the Test option.

  4. Click Save and then click the Run icon to add the generated artifact to the data pipeline.

Projects/project_images/artifact11.png

Writing logic to generate a model

Use this procedure to write logic that generates an ML model from the Code tab. You can later use this model on similar datasets to make predictions.

  1. Click the dataset block on the canvas to open the pull-out window.

  2. Click the plus icon ico1 and select the AI-assisted recipe in the pull-out window. This takes you to the Ask AI tab.

  3. Click the Code tab and provide the following code to generate the model.

def transform(entities, context):
    from utils.notebookhelpers.helpers import Helpers
    from utils.dtos.templateOutput import ModelOutput
    from utils.dtos.rc_ml_model import RCMLModel

    input_df_1 = entities['output_3'] # this is for reading input dataset

    value_file = Helpers.getChildDir(context) + "/value.txt"
    with open(value_file, "w") as f:
        f.write("3")

    class TestModel(RCMLModel):
        def load(self, artifacts):
            value_file = artifacts['value']
            with open(value_file, "r") as f:
                self.value = int(f.read())

        def predict(self, model_input):
            x = float(model_input.values[0][0])
            output = x + self.value
            return output

    return {
        'test-model-code': ModelOutput(TestModel, {"value": value_file})
    }

  4. Click Save and then click the Run icon to generate the model that is trained with the dataset in the pipeline.

Writing logic to add global variables

Use this procedure to add global variables to store artifacts and models built on the source dataset in a project.

  1. Click the dataset block on the canvas to open the pull-out window.

  2. Click the plus icon ico1 and select the AI-assisted recipe in the pull-out window. This takes you to the Ask AI tab.

  3. Click the Code tab and provide the following code to add global variables.

def transform(entities, context):
    from utils.notebookhelpers.helpers import Helpers

    input_df_1 = entities['titanic'] # this is for reading input dataset

    import pandas as pd
    import numpy as np
    output_df_1 = input_df_1.drop(columns=['Sex'])

    print("value of global variable:")
    print(Helpers.get_global_var(context, "test-var"))

    return {
        'output_1': output_df_1, # output_1 is the name of the output to be generated. Change the name as per your requirements.
    }

  4. Click Save and then click the Run icon to add the global variables.

Snippets

Default snippets are available for data cleaning and data preparation. You can test a code snippet using the Test option before running the code recipe in the flow.

If you want to write logic for data transformations using Python from scratch, see Writing logic for the template from scratch.

Rapid model recipe

You can use the Rapid Model recipe to solve an ML problem that falls into one of the supported categories, such as classification, regression, or binary classification, by creating an ML model on a historical dataset.

Developing ML models using the Rapid Model recipe

Use this procedure to build simple ML models using the Rapid Model recipe type. This type of recipe eliminates the need to write a code template or use the predefined templates to perform data transformations. The platform performs all the data transformation steps automatically after you select the problem type and target column for the uploaded dataset.

  1. Click the dataset block on the canvas to open the pull-out window.

  2. Click the plus icon ico1 and select the Rapid Model recipe. This takes you to the recipe screen.

../_images/rapidmodelrecipeneww.png
  3. Select the dataset on which you want to perform the transformations and build the ML model. By default, the dataset is populated. However, if you want to run this recipe on another dataset, select it from the drop-down.

  4. Select the Problem Type. Problem types supported by the platform:

  • Binary Classification

  • Regression

  • MultiClass Classification

  • Timeseries Forecasting

  • Anomaly Detection

  • Clustering

  5. Select the target column on which you want to make predictions or build models by typing in the search box. This field is displayed only after you select the problem type.

../_images/rapidmodelrecipe.png
  6. Click Save and then click the Run icon ico5.

../_images/rapidmodelrun.gif

Note

  • The status is set to Running until the model is built.

  • To check the logs of this recipe model, click the Logs icon.

Once the run is successful, a link to open the canvas is displayed.

  7. Click OPEN CANVAS. If you want to go back to the dataset view and rerun the recipe with a different target column, click GO BACK TO DATASET VIEW.

../_images/rapidmodell.png

You can see the output dataset, chart, model, and artifact generated after running this recipe.

../_images/rapidmodel.png

Adding a DataApp for Binary classification problem type

Use this procedure to create a DataApp for the binary classification, regression, binary experimental, and multiclass classification problem types in the Rapid Model recipe.

Prerequisites:

You can create a prediction service for the model directly from the canvas by clicking the model block. This opens the side panel. Clicking the Prediction service button takes you to the prediction service page.

To create a DataApp for binary classification problem type:

  1. Select the project in which you want to create a DataApp. You can only create DataApps for the binary classification problem type in the Rapid Model recipe.

  2. Select DataApps from the project navigation menu. This opens the page to create a DataApp.

../_images/createapptemp.gif
  3. Click the plus icon ico1. The Create DataApp window is displayed.

../_images/dataAppdetails1.png
  4. Specify this information:

DataApp Name:

The name of the DataApp.

DataApp Description:

The description for the DataApp. This field is optional.

Recipe Name:

Select the recipe you want to run in the DataApp.

  5. Click Create to create the DataApp.

../_images/dataappcreated.png
  6. Click the DataApp to view the feature importance, model performance, what-if analysis, and predictions.

See also

  • glossary