Connect to external connectors

The RapidCanvas connectors module lets you connect to external data sources, import data into the platform, and make predictions on this data with the machine learning models you have built.

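List of connectors supported:

  • Google Cloud Storage

  • Amazon S3

  • Azure Blob

  • MongoDB

  • Snowflake

  • MySQL

  • Amazon Redshift

  • Fivetran connectors (for example, Google Drive)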

Importing data from Google Cloud Storage

Data stored in Google Cloud Storage (GCS) can be imported to the platform by creating a connection to GCS. The connection can only be created with a valid JSON access key, which is generated after you create a service account.

To import data from GCS:

  1. Click the menu icon and select Connectors. The Connectors page is displayed, showing the total number of connectors.

../_images/leftnavdatasources.gif

The Data Connectors screen is displayed.

  2. Click the plus icon at the top. You can also use the +New data connector button on the workspace to create a new connection. The latter option appears only when there are no data connectors in the tenant.

../_images/newdatasource_1.png
  3. Click the Google Cloud services tile.

../_images/googlecloudservices.png
  4. Click Create Connection. The Data connectors configuration page is displayed.

../_images/gcsdatasource.png
  5. Specify this information to configure the Google Cloud Storage Data connector and access files:

Name:

The name of the Data connector.

Bucket:

The name of the bucket in which folders or files are stored in GCS. The bucket name must be the same as the name with which the bucket was created in Google Cloud Storage.

Access key:

The valid JSON access key generated after creating a service account in GCS, to authenticate.

../_images/testdatasourcegcs.png
  6. Click the Test icon to check whether you can establish the connection to the Data connector successfully. Once the connection is established, you can see the files imported from the GCS bucket to the platform. The list of imported files is populated in table format. You can only view the file names; you cannot view or download the files.

../_images/filesimportedgcs.png
  7. Click Save to save the Data connector.

../_images/listofdatasources.png

This Data connector is added to the existing Data connectors on this tenant. You can select the files imported from GCS in the drop-down list of data connectors while uploading a dataset on the canvas.

../_images/dropdowndatasource.png
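Before you paste the access key into the Access key field, it can help to confirm that the file is a well-formed service account key. The following is a minimal sketch and is not part of RapidCanvas. It assumes the usual layout of a Google service account key file (fields such as type, project_id, private_key, and client_email). The function name check_service_account_key and the sample values are hypothetical.

```python
import json

# Fields that a Google service account key file normally contains.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(key_text: str) -> list[str]:
    """Return a list of problems found in the key text (empty if none)."""
    try:
        key = json.loads(key_text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level JSON value must be an object"]

    # Report every required field that is absent or empty.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    if key.get("type") not in (None, "service_account"):
        problems.append("type is not 'service_account'")
    return problems


# Placeholder key: the values are made up for illustration only.
sample = json.dumps({
    "type": "service_account",
    "project_id": "my-project",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
    "client_email": "loader@my-project.iam.gserviceaccount.com",
})
print(check_service_account_key(sample))   # []
print(check_service_account_key("{}"))     # four "missing field" messages
```

A check like this only catches formatting mistakes. Whether the key grants access to the bucket is still verified by the Test step.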

Importing data from Amazon S3

You can import data from Amazon S3 cloud storage to the RapidCanvas platform. For this, you must establish a connection with Amazon S3 by providing the bucket name, access key ID, and secret access key. Once the connection is established, you have access to the bucket from which you can import data to the platform.

To import data from Amazon S3:

  1. Click the menu icon and select Connectors. The Connectors page is displayed, showing the total number of connectors.

The Data connectors screen is displayed.

  2. Click the plus icon at the top. You can also use the +New data connector button on the workspace to create a new connection.

../_images/leftnavdatasources.gif
  3. Click the Amazon S3 tile.

../_images/S3datasourcecreate.png
  4. Click Create Connection. The Data connectors configuration page is displayed.

../_images/amazons3createconn.png
  5. Specify this information to configure the Amazon S3 Data connector and access the folders and files stored in the bucket:

Name:

The name of the Data connector.

Bucket:

The name of the bucket in which folders or files are stored in Amazon S3. The bucket name must be the same as the name with which the bucket was created in S3.

Access keyid:

The access key ID works like a username for connecting to the S3 bucket.

Access key secret:

The access key secret works like a password for connecting to the S3 bucket.

../_images/datasources3.png
  6. Click Test to check whether you can establish the connection to the Data connector successfully. Once the connection is established, you can see the files imported from the S3 bucket to the platform. The list of imported files is populated in table format.

  7. Click Save to save the Data connector. This Data connector is added to the existing Data connectors on this tenant.

../_images/listofdatasourcesnew.png
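Because the bucket name must match the name used in S3 exactly, a quick format check can catch typos such as uppercase letters or underscores before you test the connection. The sketch below is an illustration only. It implements a simplified subset of the Amazon S3 bucket naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; starting and ending with a letter or digit; no consecutive dots), and the function name looks_like_bucket_name is hypothetical.

```python
import re

# Simplified subset of the Amazon S3 bucket naming rules:
# 3-63 characters, lowercase letters, digits, dots and hyphens,
# starting and ending with a letter or digit.
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def looks_like_bucket_name(name: str) -> bool:
    """Return True if the name passes the simplified format check."""
    return _BUCKET_RE.fullmatch(name) is not None and ".." not in name


for candidate in ("sales-data-2024", "Sales_Data", "ab"):
    print(candidate, looks_like_bucket_name(candidate))
```

A name that passes this check is only plausibly formatted. The Test step still determines whether the bucket exists and whether the keys can access it.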

Importing data from Azure Blob

You can import data from Azure Blob to the RapidCanvas platform. For this, you must establish a connection with Azure Blob by providing the container name and connection string. After the request is authenticated and the connection is established, you have access to the resources in this container.

To import data from Azure Blob:

  1. Click the menu icon and select Connectors. The Connectors page is displayed, showing the total number of connectors.

../_images/leftnavdatasources.gif

The Data connectors screen is displayed.

  2. Click the plus icon at the top. You can also use the +New data connector button on the workspace to create a new connection.

../_images/newdatasource_1.png
  3. Click the Azure Blob tile.

../_images/azureblog.png
  4. Click Create Connection. The Data connectors configuration page is displayed.

../_images/datasourceblob.png
  5. Specify this information to configure the Azure Blob Data connector and access files stored in this Azure storage account:

Name:

The name of the Data connector.

Containername:

The name of the container in which the data is stored.

Connectstr:

The connection string has authorization details for the platform to access the data stored in the Azure storage account.

../_images/azureblogcreate.png
  6. Click Test to check whether you can establish the connection to the Data connector successfully. Once the connection is established, you can see the files imported from Azure Blob to the platform. The list of imported files is populated in table format.

  7. Click Save to save the Data connector. This Data connector is added to the existing Data connectors on this tenant.
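
An Azure storage connection string is a list of Key=Value pairs separated by semicolons. The sketch below shows one way to split such a string and check that the expected keys are present before you paste it into the Connectstr field. It is an illustration only: it assumes an account-key style connection string, the function name parse_connection_string is hypothetical, and the values are placeholders, not real credentials.

```python
def parse_connection_string(conn_str: str) -> dict[str, str]:
    """Split 'Key=Value;Key=Value' pairs into a dictionary."""
    parts = {}
    for segment in conn_str.split(";"):
        if not segment.strip():
            continue  # tolerate a trailing semicolon
        # Split on the first '=' only: account keys are base64 and may end with '='.
        key, sep, value = segment.partition("=")
        if not sep:
            raise ValueError(f"segment without '=': {segment!r}")
        parts[key.strip()] = value.strip()
    return parts


# Placeholder values for illustration.
example = (
    "DefaultEndpointsProtocol=https;AccountName=mystorageacct;"
    "AccountKey=abc123==;EndpointSuffix=core.windows.net"
)
settings = parse_connection_string(example)
missing = [k for k in ("AccountName", "AccountKey") if k not in settings]
print(settings["AccountName"], missing)   # mystorageacct []
```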

Importing data from MongoDB

You can import datasets from MongoDB to the RapidCanvas platform. For this, you must establish a connection to the MongoDB cluster by providing the connection string. After establishing the connection successfully, you can select the required data from the collection.

To import data from MongoDB:

  1. Click the menu icon and select Connectors. The Connectors page is displayed, showing the total number of connectors.

../_images/leftnavdatasources.gif

The Data connectors screen is displayed.

  2. Click the plus icon at the top. You can also use the +New data connector button on the workspace to create a new connection.

../_images/newdatasource_1.png
  3. Click the MongoDB tile.

../_images/mongodbconnect.png
  4. Click Create Connection. The Data connectors configuration page is displayed.

  5. Specify this information to configure the MongoDB Data connector and fetch data from this database:

../_images/datasourceconfi_1.png
Name:

The name of the data connector.

Connectstring:

The connection string has authorization details for the platform to access the data stored in the MongoDB cluster.

../_images/configmongo.png
  6. Click Test to check whether you can establish the connection to the Data connector successfully.

  7. Click Save to save the database details.

  8. Specify this information on the Data tab:

Database:

The name of the database where the data is stored.

Collection:

The collection in the database you want to access.

Jsonquery:

The JSON query you have to pass to fetch the information.

  9. Click RUN QUERY to run this query and fetch the data from the database.

../_images/runquerymongodb.png

Once the connection is established, you can see the data imported from MongoDB to the platform in table format.

  10. Click Save to save the Data connector. This Data connector is added to the existing Data connectors on this tenant.
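
The Jsonquery field described above takes a query written as JSON. The sketch below shows what such a query could look like and how to confirm that it is valid JSON before running it. It is an illustration only: the field names status and age are made up, and the function name parse_json_query is hypothetical. Note that strict JSON requires double-quoted keys and strings, unlike the relaxed syntax accepted by the MongoDB shell.

```python
import json


def parse_json_query(query_text: str) -> dict:
    """Parse the query text and make sure it is a JSON object."""
    query = json.loads(query_text)  # raises json.JSONDecodeError on malformed input
    if not isinstance(query, dict):
        raise ValueError("a MongoDB filter must be a JSON object")
    return query


# Hypothetical filter: active customers older than 30.
query = parse_json_query('{"status": "active", "age": {"$gt": 30}}')
print(sorted(query))   # ['age', 'status']
```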

Importing data from Snowflake

You can import datasets from Snowflake to the RapidCanvas platform. For this, you must establish a connection to Snowflake by providing the account details. After establishing the connection successfully, you can select the warehouse and database from which you want to fetch the datasets.

To import data from Snowflake:

  1. Click the menu icon and select Connectors. The Connectors page is displayed, showing the total number of connectors.

../_images/leftnavdatasources.gif

The Data connectors screen is displayed.

  2. Click the plus icon at the top. You can also use the +New data connector button on the workspace to create a new connection.

../_images/newdatasource_1.png
  3. Click the Snowflake tile.

../_images/snowflakedbconnect1.png
  4. Click Create Connection. The Data connectors configuration page is displayed.

../_images/createconnectsnowflake.png
  5. Specify this information to configure the Snowflake Data connector and access data stored in this Snowflake account:

Name:

The name of the Data connector.

User:

The name of the user account.

Password:

The password for the user account.

Account:

The name of the account.

../_images/runquerysnow1.png
  6. Click Test to check whether you can establish the connection to the Data connector successfully.

  7. Click Save to save the database details. The Data tab fields are enabled only after saving the connector details.

  8. Specify this information on the Data tab. The fields on this tab are enabled only after you establish a connection with the Data connector.

Warehouse:

The warehouse to which you have to connect.

Role:

The permission assigned to the user.

Database:

The name of the database to use.

Schema:

The schema to use within the database.

Jsonquery:

The JSON query you have to provide in the terminal to fetch the data you want.

  9. Click RUN QUERY to run this query and fetch the data from the database.

../_images/runquerysnow.png

Once the connection is established, you can see the data imported from Snowflake to the platform in table format.

Importing data from MySQL

You can import data from MySQL to the RapidCanvas platform. For this, you must establish a connection with the MySQL database by providing the connection string. After establishing the connection successfully, you can query the database to fetch the data.

To import data from MySQL:

  1. Click the menu icon and select Connectors. The Connectors page is displayed, showing the total number of connectors.

../_images/leftnavdatasources.gif

The Data connectors screen is displayed.

  2. Click the plus icon at the top. You can also use the +New data connector button on the workspace to create a new connection.

../_images/newdatasource_1.png
  3. Click the MySQL tile.

../_images/mysqldbconnect1.png
  4. Click Create Connection. The Data connectors configuration page is displayed.

../_images/createconnectmysql.png
  5. Specify this information to establish a connection with the MySQL database and fetch the data in table form by querying:

Name:

The name of the Data connector.

Connectstring:

The string to connect to the database.

../_images/connectordetailsmysql.png
  6. Click Test to check whether you can establish the connection to the Data connector successfully.

  7. Click Save to save the database details. You can only pass the query once the Data tab is enabled. This tab is enabled after you successfully test the connection.

  8. Specify this information on the Data tab.

  9. Provide the query in the terminal.

  10. Click RUN QUERY to run this query and fetch the data from the database. This button is enabled after you enter the query.

../_images/runquerysql.png

Once the connection is established, you can see the data imported from MySQL to the platform in table format.

../_images/resultsdatabasesql.png
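
This page does not state the exact format that the Connectstring field expects. Assuming a URL-style connect string (for example, mysql://user:password@host:port/database), the sketch below breaks the string into its parts so that you can spot a wrong host, port, or database name before testing the connection. The function name describe_connect_string, the host, and the credentials are hypothetical placeholders.

```python
from urllib.parse import urlsplit


def describe_connect_string(connect_string: str) -> dict:
    """Break a URL-style connect string into its parts, hiding the password."""
    parts = urlsplit(connect_string)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "password_set": parts.password is not None,  # never echo the secret itself
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/") or None,
    }


# Placeholder host and credentials for illustration.
info = describe_connect_string("mysql://report_user:s3cret@db.example.com:3306/sales")
print(info["host"], info["port"], info["database"])   # db.example.com 3306 sales
```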

Importing data from Amazon Redshift

You can import the data from Amazon Redshift to the RapidCanvas platform. For this, you must establish a connection with this database by providing the connect string. After establishing the connection successfully, you can query the database to fetch the data from the available tables and load the data to the platform.

To import data from Amazon Redshift:

  1. Click the menu icon and select Connectors. The Connectors page is displayed, showing the total number of connectors.

../_images/leftnavdatasources.gif

The Data connectors screen is displayed.

  2. Click the plus icon at the top. You can also use the +New data connector button on the workspace to create a new connection.

../_images/newdatasource_1.png
  3. Click the Amazon Redshift tile.

../_images/myredshiftconnect1.png
  4. Click Create Connection. The Data connectors configuration page is displayed.

../_images/createconnectredshift.png
  5. Specify this information to establish a connection with the Amazon Redshift database and fetch the data in table form by querying:

../_images/datasourceconfiredshift.png
Name:

The name of the Data connector.

Connectstring:

The string to connect to the database.

  6. Click Test to check whether you can establish the connection to the Data connector successfully.

  7. Click Save to save the database details. You can only pass the query once the Data tab is enabled. This tab is enabled after you successfully test the connection.

  8. Specify this information on the Data tab.

  9. Provide the query in the terminal.

  10. Click RUN QUERY to run this query and fetch the data from the database. This button is enabled after you enter the query.

../_images/runqueryredshift.png

Once the connection is established, you can see the data imported from Amazon Redshift to the platform in table format.

../_images/resultsdatabasesqlquery_1.png
  11. Navigate to the Datasets tab to view the projects that use the datasets fetched from this connector. In the Data connector as destination section, you can see the datasets exported to this connector after the project runs at the scheduled time.

  12. Click the Jobs tab to view the jobs configured.

Importing data from Fivetran connectors

Use this procedure to import data from Fivetran connectors. Fivetran provides 300+ connectors; this procedure uses one of them, Google Drive, as an example.

To import data from Google Drive:

  1. Click the menu icon and select Connectors. The Connectors page is displayed, showing the total number of connectors.

../_images/leftnavdatasources.gif

The Data connectors screen is displayed.

  2. Click the plus icon at the top. You can also use the +New data connector button on the workspace to create a new connection.

../_images/googledrive_data.png
  3. Select Google Drive. It is a Fivetran connector.

../_images/createconnectgd.png
  4. Enter the name of the connector.

  5. Click Create Connection. This takes you to the Fivetran page.

../_images/fivetrangoogledrive.png
  6. Click Continue. This opens the page where you can provide the Google Drive details.

  7. Click Copy corresponding to the Fivetran email field to copy this email.

  8. Navigate to your Google Drive, click the folder you want to share and sync to the platform, and then click the Share option. You must provide editor access to this folder for all.

../_images/sharethefolder.png
  9. Copy the folder URL where the files on the Google Drive are stored.

../_images/fivetrangoogledrive_1.png
  10. Click Save & Test to sync the datasets in Google Drive to the platform. You can see the datasets that are syncing.

Note

When the sync fails, you can use the manual sync option to restart the syncing process.

../_images/savetestgd.png

You can now view the datasets fetched from Google Drive on this Data connector in the platform.

../_images/dataconnectorgoogledrive.png

Importing dataset(s) to the canvas from the local system

Use this procedure to import a file from the local system to the canvas so that you can perform predictions on it and generate a modeling pipeline. You can upload a maximum of 25 files, and the maximum file size is 5 GB.
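
The limits above can be checked locally before you start an upload. The sketch below is an illustration only and is not part of the platform. It assumes that the 5 GB limit applies to each file and that a gigabyte means 1024**3 bytes; the function names check_upload and sizes_on_disk are hypothetical.

```python
import os

MAX_FILES = 25                 # file-count limit stated above
MAX_FILE_BYTES = 5 * 1024**3   # 5 GB, assuming binary gigabytes and a per-file limit


def check_upload(sizes_by_name: dict[str, int]) -> list[str]:
    """Return human-readable problems for a planned upload (empty if none)."""
    problems = []
    if len(sizes_by_name) > MAX_FILES:
        problems.append(f"{len(sizes_by_name)} files selected, limit is {MAX_FILES}")
    for name, size in sizes_by_name.items():
        if size > MAX_FILE_BYTES:
            problems.append(f"{name} is larger than 5 GB")
    return problems


def sizes_on_disk(paths: list[str]) -> dict[str, int]:
    """Collect sizes in bytes for real files on the local system."""
    return {path: os.path.getsize(path) for path in paths}


# Example with made-up sizes instead of real files.
print(check_upload({"small.csv": 10_000, "huge.csv": 6 * 1024**3}))
# ['huge.csv is larger than 5 GB']
```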

To import the file from the local system:

  1. Click the project to which you want to upload the file. The Canvas page is displayed.

  2. Click the plus icon and select Dataset to navigate to the Create New Data Set window.

You can also use the + NEW DATASET button. However, this option is displayed only when there are no datasets uploaded onto the canvas. The Create New Data Set window is displayed.

../_images/adddataset.png
  3. By default, the project name is populated in the Project field.

  4. Select the source from where you want to upload.

You can either upload the file from the local system or create a new connection and import files from external Data connectors. For more information, see Connect to data connectors.

After establishing the connection and importing the files, the imported files are populated in this drop-down list.

../_images/fileupload.png
  5. Select the Mode to upload the file. Possible options:

  • Single file import - Use this option to import only a single file onto the canvas.

  • Merge - Use this option to merge multiple files into one file. Ensure that the schema is the same in all the files.

  • Segregate - Use this option to upload multiple files together onto the canvas as separate files.

  6. Select Single file import.

  7. Click IMPORT FILES FROM LOCAL to browse and upload the file from your local system.

../_images/fileupload_new.png
  8. Click IMPORT. Once the file is imported, you can view the file name and file size.

You can perform these actions:

  • If you want to delete the uploaded file, click the delete icon corresponding to this file name.

  • If you want to rename the file, click the edit icon in the Dataset name.

  9. Click NEXT.

../_images/uploadsuccess.png
  10. Click FILE CONFIGURATION to expand and view the file configuration settings.

../_images/fileconfiguration.png
  11. Review the Separator and Encoding values. The platform auto-detects them when you upload a file in which a single column contains all the column names, separated by a specific separator.

Note

The separator option allows you to split all the values separated by a separator into different columns.

  12. Select the separator from the drop-down list if the platform failed to auto-detect it. Possible values:

  • Comma

  • Tab

  • Pipe

  • Colon

  • Semicolon

  13. Select the encoding option if it is not auto-detected by the platform. Possible encoding options:

  • UTF-8

  • UTF-16

  • ISO-8859-1 (Latin-1)

  • Windows-1252 (CP-1252 or ANSI)

  • ASCII

  14. Click APPLY to apply the separator and encoding options you have selected. These options are available only for CSV files.

You can now see the schema and sample data for the columns by clicking the SCHEMA and SAMPLE Data options respectively.

../_images/newschema.png
  15. View the detected data type of each column in the dataset and select a new data type if the detected data type is incorrect.

  16. Click CLOSE. Once the dataset is added, you are redirected to the Canvas view page where you can see the uploaded dataset block.
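
This page does not describe how the platform auto-detects the separator and encoding. To illustrate the idea, the sketch below guesses both values for a small CSV sample, using the separators and encodings listed above. It is a simple heuristic under stated assumptions, not the platform's method: it ignores quoted fields, and the function names guess_encoding and guess_separator are hypothetical.

```python
import codecs

# Separators offered in the drop-down list above.
SEPARATORS = {",": "Comma", "\t": "Tab", "|": "Pipe", ":": "Colon", ";": "Semicolon"}


def guess_encoding(raw: bytes) -> str:
    """Try encodings from strictest to most permissive."""
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    for name in ("ascii", "utf-8", "cp1252"):
        try:
            raw.decode(name)
            return name
        except UnicodeDecodeError:
            continue
    return "latin-1"  # ISO-8859-1 accepts every byte sequence


def guess_separator(text: str, sample_lines: int = 5):
    """Pick the separator that occurs the same nonzero number of times on each sampled line."""
    lines = [line for line in text.splitlines() if line][:sample_lines]
    best, best_count = None, 0
    for sep in SEPARATORS:
        counts = {line.count(sep) for line in lines}
        if len(counts) == 1:          # consistent across all sampled lines
            count = counts.pop()
            if count > best_count:    # prefer the most frequent consistent separator
                best, best_count = sep, count
    return best


raw = "id|name|city\n1|Ana|Lisbon\n2|Raj|Pune\n".encode("utf-8")
encoding = guess_encoding(raw)
separator = guess_separator(raw.decode(encoding))
print(encoding, SEPARATORS[separator])   # ascii Pipe
```

When a heuristic like this returns no separator or the wrong one, the manual selection in the steps above is the fallback.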

Viewing options in dataset pull-out window

Use this procedure to view all the actions you can perform through the pull-out window of a dataset.

To view dataset options:

  1. Click on the dataset block on the canvas. This opens the pull-out window.

  2. Perform any of these actions:

  • Click the plus icon to select the Template Recipe, AI-assisted Recipe, or Rapid Model Recipe option. For more information, see Recipes.

  • Click View to view the data in the file you have uploaded. For more information, see Viewing the dataset information.

  • Click the ellipsis icon and select Export to export the file to your local system in CSV format. For more information, see Exporting a dataset to the local system.

  • Click the ellipsis icon and select Delete to delete the dataset. For more information, see Deleting the uploaded dataset.

  3. View the summary of the dataset in the Summary section. You can create the summary in the Notebook.

  4. Review these details:

Created:

The date on which the file was uploaded.

Updated:

The date on which the file was last updated.

Total size:

The total file size.

Rows:

The total number of rows in the file.

Columns:

The total number of columns in the file.

Viewing the dataset information

Use this procedure to view the dataset information.

To view data:

  1. Select the dataset block that you have uploaded onto the canvas. This opens the pull-out window.

  2. Click View to open the dataset. The records in the dataset are displayed in tabular format.

  3. Identify the data type of each column in the dataset on the Schema tab.

Name:

The name of the column in the dataset.

Type:

The data type of a column.

  4. Review the segments associated with this dataset on the Segments tab.

Name:

The name with which segment is created.

Description:

The description for the segment.

Created:

The date and time at which the segment is created.

Rows:

The row limit for segmentation.

Actions:

You can use the Edit icon to edit the segment details and the delete icon to delete the segment linked to this dataset.

  5. On the Data Analysis tab, analyze the whole dataset to identify missing values, total variables (numeric, text, and categorical), total observations, and duplicate values. This information helps you clean up the data.

  6. Navigate to the Correlation tab to explore correlations and relationships within the data. The correlation heat map shows, as a color-coded matrix, how each variable in the dataset is correlated with every other variable.

  7. Navigate to the Alerts subtab of the Data Analysis tab to view the alerts and the tags assigned to them.
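
To make the Data Analysis and Correlation tabs more concrete, the sketch below computes the same kinds of figures for a tiny CSV sample: the number of observations, missing values per column, duplicate rows, and the Pearson correlation between two numeric columns. It is an illustration only and does not reproduce how the platform calculates these values; the function names profile and pearson and the sample data are hypothetical.

```python
import csv
import io
import math


def profile(csv_text: str) -> dict:
    """Count observations, missing cells per column, and duplicate rows."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    missing = {c: sum(1 for r in rows if not (r[c] or "").strip()) for c in columns}
    unique_rows = {tuple(r[c] for c in columns) for r in rows}
    return {
        "observations": len(rows),
        "missing": missing,
        "duplicates": len(rows) - len(unique_rows),
    }


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equally long numeric columns."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


data = "height,weight\n150,50\n160,60\n160,60\n170,\n"
print(profile(data))
# {'observations': 4, 'missing': {'height': 0, 'weight': 1}, 'duplicates': 1}
print(round(pearson([150, 160, 170], [50, 60, 70]), 3))   # 1.0
```

A correlation heat map is the matrix of such coefficients for every pair of numeric columns, with each value mapped to a color.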

You can perform the following actions on the dataset page:

  • Append a new file to this dataset by clicking the plus icon and selecting the File option. By default, the criterion is set to Append. For more information, see Adding a file.

  • Add a template recipe to this dataset, using the Template option. For more information, see Adding a template recipe.

  • Add an AI-Assisted recipe to this dataset, using the AI-assisted option. For more information, see Adding an AI assisted recipe.

  • Run the Rapid Model recipe, using the Rapid Model option. For more information, see Developing ML models using Rapid model recipe.

  • Add a segment to this dataset, using the Segment option. For more information, see Creating a custom segment. This option is available only for the source dataset.

  • Download the dataset, using the Export option. For more information, see Exporting a dataset to the local system.

  • Delete the dataset and associated recipes with the dataset, using the Delete option.

Exporting a dataset to the local system

Use this procedure to download input and output datasets to your local system in the CSV file format.

To export a dataset:

  1. Select the dataset block that you have uploaded onto the canvas. This opens the pull-out window.

  2. Click Export to download the dataset in csv format to your local system.

../_images/exportdataset.png

You can also export the dataset from the dataset page, using the EXPORT option. The dataset page is displayed when you click VIEW DATA in the pull-out window.

Deleting the uploaded dataset

Use this procedure to delete a dataset block from the canvas.

To delete a dataset block:

  1. Select the dataset block that you want to delete from the canvas. This opens the pull-out window.

  2. Click Delete to delete the dataset.

../_images/deletedataset.png
  3. A dialog box warns that deleting the dataset also deletes the recipes associated with it.

  4. Click Delete to delete the dataset permanently from the project, or click Cancel to discard the action.

Reloading latest datasets from Fivetran connector

Use this procedure to reload fresh data from the Fivetran connector. The dataset syncs with the remote storage and retrieves the latest data.

To reload a dataset:

  1. Select the dataset block that you want to reload from the canvas. This opens the pull-out window.

  2. Click Reload to reload the dataset.

../_images/reloaddataset.png

  3. A dialog appears. Click Reload to fetch the latest dataset. Ensure that the schema of this dataset is the same as the current one.
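
One way to confirm that the schema is unchanged before a reload is to compare the column names of the current file with those of the latest file. The sketch below does this for two small CSV samples. It is an illustration only: it compares column names and not data types, and the function names header_of and schema_diff and the sample data are hypothetical.

```python
import csv
import io


def header_of(csv_text: str) -> list[str]:
    """Return the column names from the first row of CSV text."""
    return next(csv.reader(io.StringIO(csv_text)), [])


def schema_diff(current: list[str], reloaded: list[str]) -> dict[str, list[str]]:
    """List columns that would disappear or appear after a reload."""
    return {
        "removed": [c for c in current if c not in reloaded],
        "added": [c for c in reloaded if c not in current],
    }


current = header_of("id,name,amount\n1,Ana,10\n")
reloaded = header_of("id,name,total\n1,Ana,10\n")
print(schema_diff(current, reloaded))
# {'removed': ['amount'], 'added': ['total']}
```

An empty result for both lists means that the column names match.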