Building a solution
This guide walks you through building a solution end to end on RapidCanvas using the notebook interface. Make sure you have the latest SDK installed before you begin.
Building a solution on RapidCanvas involves the following steps:
Import functions
Authenticate your client
Create a custom environment
Create a new project
Build a flow file for the project
Execute the project
Publish the project as a solution
Update solution documentation
In the next section, we will go through these steps using a sample project. Download the project files here: Reference Project
After unzipping, move the employee project folder to the root folder where you have installed your SDK.
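Before opening the notebook, it can help to confirm that the files this walkthrough refers to are where the later cells expect them. The sketch below is a minimal check, not part of the SDK. The file list is collected from the paths used later in this guide, and treating them as relative to the employee project folder is an assumption; adjust them to your own folder structure.

```python
from pathlib import Path

# Relative paths referenced later in this walkthrough. Treating them as
# relative to the employee project folder is an assumption.
EXPECTED_FILES = [
    "employee_flow.ipynb",
    "data/employee_promotion_case.csv",
    "data/Region_States.csv",
    "transforms/timediff.ipynb",
    "transforms/GeoMap.ipynb",
]


def missing_project_files(project_dir, expected=EXPECTED_FILES):
    """Return the expected relative paths that do not exist under project_dir."""
    root = Path(project_dir)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    # "employee" is a placeholder for wherever you moved the unzipped folder.
    missing = missing_project_files("employee")
    print("All files found" if not missing else f"Missing: {missing}")
```

An empty list means every file was found; anything else lists the paths to fix before you continue.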
Opening Jupyter Notebook
To open Jupyter Notebook, run the following command:
jupyter-notebook
In Jupyter Notebook, you should see the employee_flow.ipynb file. Opening it takes you through the following steps.
ℹ️ Please note that RapidCanvas only supports the default ipynb kernel in Jupyter Notebook.
Import functions
# Run this import cell before moving on to the next step
import sys
from utils.rc.client.requests import Requests
from utils.rc.client.auth import AuthClient
from utils.rc.dtos.project import Project
from utils.rc.dtos.dataset import Dataset
from utils.rc.dtos.recipe import Recipe
from utils.rc.dtos.transform import Transform
from utils.rc.dtos.template_v2 import TemplateV2, TemplateTransformV2
from utils.rc.dtos.solution import Solution
from utils.rc.dtos.env import Env, EnvType
from utils.rc.dtos.dataSource import DataSource, DataSourceType, GcpConfig
import json
import os
import pandas as pd
import logging
from utils.utils.log_util import LogUtil
LogUtil.set_basic_config(format='%(levelname)s:%(message)s', level=logging.INFO)
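If the import cell fails, the usual cause is that the notebook was started from a folder where the SDK packages are not on the Python path. The sketch below is one way to check this using only the standard library. The module names are taken from the import cell above; the helper name `is_importable` is made up for this example.

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can locate module_name on the current path."""
    try:
        # find_spec imports parent packages but does not execute the module itself.
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError, AttributeError):
        # Raised when a parent package is missing or is not a real package.
        return False


for name in ("utils.rc.client.auth", "utils.rc.dtos.project", "pandas"):
    print(f"{name}: {'ok' if is_importable(name) else 'NOT FOUND'}")
```

A `NOT FOUND` result for the `utils.rc` modules suggests restarting Jupyter from the folder where the SDK is installed.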
Authenticate your client
Authenticate your client using a token or your user credentials.
# Requests.setRootHost("https://test.dev.rapidcanvas.net/api/")
AuthClient.setToken()
Creating a custom environment
A custom environment lets you choose the infrastructure needed to execute your project. Here are the available custom environments and their usage guidelines.
You can create a new environment by executing the cell below:
env = Env.createEnv(
    name="new_custom_envqq",
    description="env for my projects",
    envType=EnvType.SMALL,  # pick one of the pre-defined configs
    requirements="jq==1.2.2 yq==3.0.2 plotly==5.8.0",  # additional packages to be installed for your custom env
    async_flag=True
)
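The `requirements` value in the cell above is a single string of space-separated `package==version` entries. Building that string from a dictionary makes typos easier to spot. The helper below is a sketch: `build_requirements` is a hypothetical name, and the assumption that the SDK expects exactly this space-separated, pinned format is inferred from the example above rather than from the SDK reference.

```python
import re

# Matches a simple pinned requirement such as "plotly==5.8.0".
_PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*==[A-Za-z0-9][A-Za-z0-9.+!_-]*$")


def build_requirements(pins):
    """Join {package: version} pairs into a space-separated 'pkg==ver' string."""
    parts = [f"{name}=={version}" for name, version in pins.items()]
    bad = [part for part in parts if not _PINNED.match(part)]
    if bad:
        raise ValueError(f"Malformed requirement(s): {bad}")
    return " ".join(parts)


requirements = build_requirements({"jq": "1.2.2", "yq": "3.0.2", "plotly": "5.8.0"})
print(requirements)  # jq==1.2.2 yq==3.0.2 plotly==5.8.0
```

The resulting string can then be passed as the `requirements` argument shown in the cell above.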
Creating a new project
Create a new project under your tenant.
# Create the project on the platform
project = Project.create(
    name='Employee',
    description='Employee_promotion',
    createEmpty=True,
    envId=env.id
)
project.id
This creates a new project named “Employee” under your tenant. You can verify it by logging in to the RapidCanvas UI here: RapidCanvas UI
Building a flow file for the project
Building a flow file for the project involves the following steps:
Upload your dataset
Create a new template or use an existing template provided by RapidCanvas for data modification
Create a transform from the template
Create a recipe
Add a transform or a list of transforms to your recipe
Run your recipe
Push output of your recipe to a new table
Uploading your dataset
Execute the cell below to create new tables and upload your datasets:
# This creates a dataset on RapidCanvas called "employee" and uploads the employee_promotion_case.csv file to it.
employee = project.addDataset(
    dataset_name='employee',
    dataset_description='Employee Promotion Dataset',
    dataset_file_path='data/employee_promotion_case.csv'  # path as per your folder structure in Jupyter
)
# This creates a dataset on RapidCanvas called "region" and uploads the Region_States.csv file to it.
region = project.addDataset(
    dataset_name='region',
    dataset_description='Region States Dataset',
    dataset_file_path='data/Region_States.csv'  # path as per your folder structure in Jupyter
)
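The transforms later in this walkthrough refer to columns by name, so a quick look at the CSV header before uploading can save a failed run. The sketch below reads only the header row using the standard library. The column list is an assumption drawn from the variable values used in the transform cells further down (`birth_date`, `start_date`, `lat`, `long`), and `missing_columns` is a hypothetical helper name.

```python
import csv
from pathlib import Path

# Column names the later transform cells appear to rely on (an assumption).
REQUIRED_EMPLOYEE_COLUMNS = ["birth_date", "start_date", "lat", "long"]


def missing_columns(csv_path, required):
    """Return the names in `required` that are absent from the CSV header row."""
    # utf-8-sig strips a byte-order mark if the file was exported with one.
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        header = next(csv.reader(handle), [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


if __name__ == "__main__":
    path = Path("data/employee_promotion_case.csv")
    if path.is_file():
        print("Missing columns:", missing_columns(path, REQUIRED_EMPLOYEE_COLUMNS))
    else:
        print(f"{path} not found; adjust the path to your folder structure")
```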
Uploading your dataset - Google Cloud
This step lets you create a custom data source. In this example, we connect to a Google Cloud Platform bucket that local data can be uploaded to and downloaded from.
# dataSource = DataSource.createDataSource(
# "gcp-custom-source",
# DataSourceType.GCP_STORAGE,
# {
# GcpConfig.BUCKET: "YOUR BUCKET NAME HERE", #Get in touch with your RapidCanvas POC for your GCP bucket name, service account and access key
# GcpConfig.ACCESS_KEY: "/Users/../../access_key.json"} #Local path to your access key
# )
# dataSource.id
You can use the following commands to upload data from your local machine to your RapidCanvas bucket in Google Cloud:
# from utils.notebookhelpers.gcs import GCSHelper
# gcs_helper = GCSHelper.init('/path/to/key.json', '<your-root-dir-name>')
# gcs_helper.list_files()
# gcs_helper.upload_file('/path/to/the/file/to/be/uploaded', '/relative/remote/dir/path') # if '/relative/remote/dir/path' is omitted, the file is uploaded to the root of the directory
# gcs_helper.download_file('/path/to/remote/file', '/path/to/dir/to/be/downloaded') # downloads a file from the Google Cloud bucket to a local directory
Upload your dataset to your Google Cloud bucket before executing this step.
# region_gcp = project.addDataset(
# dataset_name="region_gcp",
# dataset_description="region data from gcp",
# data_source_id=dataSource.id,
# data_source_options={GcpConfig.FILE_PATH: "region_states_gcp.csv"} #provide the file path as per your bucket
# )
# you can review a sample of data here
# region_gcp.getData()
Uploading your dataset - S3
#Update for S3
Template Usage
You can create a new template or use an existing template provided by RapidCanvas for data modification. Execute the cell below to use an existing time difference template:
time_diff_template = TemplateV2(
    name="time_diff", description="Calculate the time difference between two dates", project_id=project.id,
    source="CUSTOM", status="ACTIVE", tags=["UI", "Scalar"]
)
time_diff_template_transform = TemplateTransformV2(
    type="python", params=dict(notebookName="timediff.ipynb"))
time_diff_template.base_transforms = [time_diff_template_transform]
time_diff_template.publish("transforms/timediff.ipynb")
List existing templates
List the existing templates from the RapidCanvas library:
templates = TemplateV2.get_all()
TemplateV2.clean_view(templates)
To read more about RapidCanvas templates, refer to this section: Building a template
Create a transform from the template
A transform can be created from a template using the following:
calculate_age_transform = Transform()
calculate_age_transform.templateId = time_diff_template.id
calculate_age_transform.name = 'age'
calculate_age_transform.variables = {
    'inputDataset': 'employee',
    'start_date': 'birth_date',
    'end_date': 'start_date',
    'how': 'years',
    'outputcolumn': 'age',
    'outputDataset': 'employee_with_age'
}
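To get a feel for what the variables above ask for, here is a rough pandas equivalent of a "years between two date columns" calculation. This is only an illustrative sketch and not the code inside timediff.ipynb: the function name is hypothetical, and dividing by 365.25 days per year is an approximation chosen for this example.

```python
import pandas as pd


def add_years_between(df, start_col, end_col, output_col):
    """Return a copy of df with the elapsed years from start_col to end_col."""
    out = df.copy()
    start = pd.to_datetime(out[start_col])
    end = pd.to_datetime(out[end_col])
    # 365.25 days per year is an approximation chosen for this sketch.
    out[output_col] = (end - start).dt.days / 365.25
    return out


sample = pd.DataFrame(
    {
        "birth_date": ["1990-01-01", "1985-06-15"],
        "start_date": ["2020-01-01", "2010-06-15"],
    }
)
with_age = add_years_between(sample, "birth_date", "start_date", "age")
print(with_age)
```

Running the same calculation locally on a few rows is a simple way to sanity-check the `age` column once the recipe output is available.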
Create a recipe
To create your recipe execute the following:
calculate_age_recipe = project.addRecipe([employee], name='calculate_age_recipe')
Add a transform to your recipe
You can add a single transform or multiple transforms to your recipe.
calculate_age_recipe.add_transform(calculate_age_transform)
Run your recipe
To run your recipe, execute the following:
calculate_age_recipe.run()
Output dataset and review sample
To generate the output dataset and review a sample, execute the following:
employee_with_age = calculate_age_recipe.getChildrenDatasets()['employee_with_age']
employee_with_age.getData(5)
All of these changes are updated automatically in the RapidCanvas UI. To review the flow created in the project, click on your project name on the Dashboard page: RapidCanvas UI
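As a mental model only, the dataset, transform, and recipe steps above can be pictured as a small pipeline: a recipe holds named steps and passes the data through them one after another. The toy class below is not the RapidCanvas `Recipe` class and makes no claim about how the platform executes recipes. `ToyRecipe`, `add_step`, `add_age`, and `keep_adults` are hypothetical names, and running steps in the order they were added is an assumption of this sketch.

```python
class ToyRecipe:
    """Conceptual stand-in for a recipe; not the RapidCanvas Recipe class."""

    def __init__(self, name):
        self.name = name
        self._steps = []  # (step name, function) pairs in insertion order

    def add_step(self, step_name, func):
        self._steps.append((step_name, func))

    def run(self, rows):
        """Apply every step in order, feeding each result to the next step."""
        for _, func in self._steps:
            rows = func(rows)
        return rows


def add_age(rows):
    # Difference between the year parts of two ISO dates; ignores month and day.
    return [
        dict(row, age=int(row["start_date"][:4]) - int(row["birth_date"][:4]))
        for row in rows
    ]


def keep_adults(rows):
    # Depends on the "age" key, so it must run after add_age.
    return [row for row in rows if row["age"] >= 18]


recipe = ToyRecipe("calculate_age_recipe")
recipe.add_step("age", add_age)
recipe.add_step("adults_only", keep_adults)
result = recipe.run(
    [
        {"birth_date": "1990-01-01", "start_date": "2020-01-01"},
        {"birth_date": "2010-05-01", "start_date": "2020-01-01"},
    ]
)
print(result)
```

The second step only works because the first one has already added the `age` key, which is the reason a step that consumes a column has to come after the step that produces it.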
Template to build a visualisation
Here is another example of using templates in RapidCanvas.
Create a new template or use an existing template provided by RapidCanvas for data modification
geo_map_template = TemplateV2(
    name="GeolocationMap", description="Plot map based on geolocation", project_id=project.id,
    source="CUSTOM", status="ACTIVE", tags=["UI", "Visualization"]
)
geo_map_template_transform = TemplateTransformV2(
    type="python", params=dict(notebookName="GeoMap.ipynb"))
geo_map_template.base_transforms = [geo_map_template_transform]
geo_map_template.publish("transforms/GeoMap.ipynb")
Create a transform from the template
geo_map_transform = Transform()
geo_map_transform.templateId = geo_map_template.id
geo_map_transform.name = 'geomap_employee_location'
geo_map_transform.variables = {
    'GeoDataset': 'employee_with_age',
    'Lat': 'lat',
    'Long': 'long',
    'GeoChartName': 'employee_map_location'
}
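A map built from latitude and longitude columns is only as good as the coordinates in them, so it can be worth checking the values on a local copy of the data before running the recipe. The sketch below flags rows whose coordinates are missing, non-numeric, or outside the valid ranges. It assumes the columns are named `lat` and `long`, as in the variables above, and `invalid_coordinates` is a hypothetical helper, not part of the GeoMap template.

```python
import pandas as pd


def invalid_coordinates(df, lat_col="lat", long_col="long"):
    """Return rows whose latitude/longitude are missing, non-numeric, or out of range."""
    lat = pd.to_numeric(df[lat_col], errors="coerce")
    lon = pd.to_numeric(df[long_col], errors="coerce")
    # between() is False for NaN, so missing values are flagged as invalid too.
    ok = lat.between(-90, 90) & lon.between(-180, 180)
    return df[~ok]


sample = pd.DataFrame(
    {
        "lat": [40.7, 95.0, None],
        "long": [-74.0, 10.0, 20.0],
    }
)
bad_rows = invalid_coordinates(sample)
print(bad_rows)
```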
Create a recipe
geo_map_recipe = project.addRecipe([employee_with_age], name='geo_map_recipe')
Add a transform or a list of transforms to your recipe
geo_map_recipe.add_transform(geo_map_transform)
# geo_map_recipe.prepareForLocal(geo_map_transform, contextId='new_transform', template_id=geo_map_template.id, nb_name="GeoMap.ipynb")
Run your recipe
geo_map_recipe.run()
You can view the output dashboard on RapidCanvas UI in your project: RapidCanvas UI
Publishing the project as a solution
List of existing solutions
You can look at the list of available RapidCanvas solutions here:
solutions = Solution.get_all()
Solution.clean_view(solutions)
Publishing a new solution
An end-to-end project can be converted and published as a solution, which allows other users to consume it.
# solutions = Solution.create(name="Sample Employee Solution", sourceProjectId=project.id, description="Sample Solution built on Employee Project", tags=["Sample", "New Users"], isGlobal=False, icon="icon_url")
# Solutions are published locally to your tenant and are accessible to other users in your tenant
# A published solution can be used to create a new project
Your published solution is now accessible as part of the Solutions UI. You can review the solution details here: RapidCanvas UI
Update solution documentation
A published solution needs to be documented to inform users about the use case as well as the business impact.
Sample solution documentation
You can refer to the documentation of this sample project here: Sample Project Documentation
We recommend following the documentation structure used in the sample project documentation.
You can update the documentation of your solution in GitHub under https://github.com/../../projects/your_projects/project_name/doc/info.rst