from utils.rc.client.requests import Requests
from utils.rc.client.auth import AuthClient
from utils.rc.dtos.project import Project
from utils.rc.dtos.dataset import Dataset
from utils.rc.dtos.recipe import Recipe
from utils.rc.dtos.env import Env
from utils.rc.dtos.env import EnvType
from utils.rc.dtos.transform import Transform
from utils.rc.dtos.template_v2 import TemplateV2, TemplateTransformV2
from utils.rc.dtos.segment import Segment, ItemExpression, Operator
from utils.rc.dtos.scenario import Scenario
from utils.rc.dtos.global_variable import GlobalVariable
from utils.rc.dtos.segment import RecipeExpression
from utils.rc.dtos.project_run import ProjectRun
from utils.rc.dtos.dataSource import DataSource
from utils.rc.dtos.dataSource import DataSourceType
from utils.rc.dtos.dataSource import GcpConfig
import pandas as pd
import logging
from utils.utils.log_util import LogUtil
LogUtil.set_basic_config(format='%(levelname)s:%(message)s', level=logging.INFO)
Creating a job
Use the following call to create a job within a project for a specific scenario.
project_run = ProjectRun.create_project_run(
    project.id,        # ID of the project (str)
    name,              # name of the job (str)
    schedule,          # job frequency (str)
    job_scenario._id   # ID of the scenario to run (str)
)
Sample code:
project_run = ProjectRun.create_project_run(
    'a3f5cc16-12ca-46cb-972a-c1703b85b4f1',
    "jobrun3",
    "0 9 * * *",
    'eac24236-3e6b-4944-8709-b6cccf52d369'
)
Parameters
The following table describes each parameter used to create a job.
| Parameter name | Parameter description | Data type | Required | Example |
|---|---|---|---|---|
| project.id | The id for the project. | String | Yes | 'e71dbffc-1d62-4bb8-928d-f7dd8c214cb8' |
| job name | The name of the job. | String | No | test-run-v101 |
| Job frequency | The frequency at which the job should run. | | Yes | |
| job_scenario._id | The scenario id of the job. If the scenario ID is not provided, the job runs on a default scenario. | String | Yes | '4dae540d-1717-4b87-8c1e-e60357c7f7f4' |
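The schedule value in the sample call ("0 9 * * *") looks like a standard five-field cron expression. The source does not state the expected format, so treat that as an assumption. Under that assumption, the sketch below splits a schedule string into named fields so it can be checked before a job is created. `describe_cron` is a hypothetical helper and is not part of the SDK.

```python
# Field order of a standard five-field cron expression.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression: str) -> dict:
    """Split a five-field cron expression into named fields.

    Hypothetical helper: assumes the schedule uses standard cron syntax.
    """
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}: {expression!r}"
        )
    return dict(zip(CRON_FIELDS, parts))


fields = describe_cron("0 9 * * *")
print(fields["hour"], fields["minute"])
```

In standard cron semantics, "0 9 * * *" means minute 0 of hour 9 on every day. Which time zone the platform applies is not stated in the source.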
Running the job manually
The following code block can be used to run the scheduled job manually.
project_run.run_manually()
{'id': 'e996eec1-3cf7-4d39-aa40-9011fe5264df',
'created': 1686554009657,
'updated': 1686554022781,
'status': 'SUCCESS',
'trigger': 'MANUAL',
'error': None,
'variables': None,
'creator': 'srujana@rapidcanvas.ai',
'updater': 'srujana@rapidcanvas.ai',
'runId': '2023-06-12-07-13-29-655',
'endTime': 1686554022779,
'outEntityNames': ['output_suffix', 'Car_Price']}
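The `created`, `updated`, and `endTime` values in the response appear to be Unix timestamps in milliseconds; this is an assumption, although it agrees with the `runId` value in the sample. Under that assumption, the sketch below converts a timestamp to a UTC datetime and computes the time between `created` and `endTime`. The helper names are hypothetical, and the dictionary is a subset of the sample response.

```python
from datetime import datetime, timezone

# Subset of the sample response; values copied from the output above.
run = {
    "created": 1686554009657,
    "endTime": 1686554022779,
    "status": "SUCCESS",
}


def to_utc(epoch_ms: int) -> datetime:
    """Convert epoch milliseconds (assumed unit) to an aware UTC datetime."""
    return datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)


def run_duration_seconds(entry: dict) -> float:
    """Seconds elapsed between the 'created' and 'endTime' timestamps."""
    return (entry["endTime"] - entry["created"]) / 1000


print(to_utc(run["created"]).strftime("%Y-%m-%d %H:%M:%S"))
print(run_duration_seconds(run))
```

For the sample response, the two timestamps are 13122 milliseconds apart, so the run took about 13 seconds.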
Updating the job schedule
The following code block is used to update the job schedule.
project_run.update_project_run('0 5 * * *')
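If the schedule string is a standard five-field cron expression (an assumption; the source only shows example values), a daily schedule can be built from an hour and a minute instead of being written by hand. `daily_cron` below is a hypothetical helper, not an SDK function.

```python
def daily_cron(hour: int, minute: int = 0) -> str:
    """Build a cron expression that fires once a day at hour:minute.

    Hypothetical helper: assumes standard five-field cron syntax.
    """
    if not 0 <= hour <= 23:
        raise ValueError(f"hour must be in 0..23, got {hour}")
    if not 0 <= minute <= 59:
        raise ValueError(f"minute must be in 0..59, got {minute}")
    # Cron field order: minute, hour, day of month, month, day of week.
    return f"{minute} {hour} * * *"


new_schedule = daily_cron(5)
print(new_schedule)
```

The resulting string could then be passed to `project_run.update_project_run(new_schedule)`.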
Enabling and disabling the project run
The following code block disables and then enables the project run. Calling disable_project_run() makes the job inactive and sets its status to inactive. Calling enable_project_run() makes the job active again and sets its status to active.
project_run.disable_project_run()
project_run.enable_project_run()
Deleting the project run
Use this code block to delete the job you have created for the scenario within a project.
project_run.delete_project_run()
INFO:Project Run deleted
Finding a job by name
Use this code block to find the job by name.
project_run.find_project_run_by_name(project_id: str, name: str)
Sample code:
project_run.find_project_run_by_name("a3f5cc16-12ca-46cb-972a-c1703b85b4f1","newjob")
Finding the run history of a specific job in a project
Use this code block to fetch the run history of a specific job.
project_run.find_all_project_run_entries()
[{'id': 'aec588fe-43d5-4d28-996f-496b494e043e',
'created': 1686626974000,
'updated': 1686626985000,
'status': 'SUCCESS',
'trigger': 'MANUAL',
'error': None,
'variables': {},
'creator': 'srujana@rapidcanvas.ai',
'updater': 'srujana@rapidcanvas.ai',
'runId': '2023-06-13-03-29-33-521',
'endTime': 1686626985000,
'outEntityNames': ['output_suffix', 'Car_Price']}]
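The run history is shown as a list of dictionaries, so it can be summarized with plain Python. The sketch below assumes that every entry has the `status` and `created` keys seen in the sample output. It counts entries per status and picks the most recently created one. The helper names are hypothetical, and the sample list is trimmed from the output above.

```python
from collections import Counter

# Trimmed copy of the sample history entry shown above.
history = [
    {
        "runId": "2023-06-13-03-29-33-521",
        "status": "SUCCESS",
        "trigger": "MANUAL",
        "created": 1686626974000,
    },
]


def count_by_status(entries: list) -> Counter:
    """Count history entries per 'status' value."""
    return Counter(entry["status"] for entry in entries)


def latest_entry(entries: list):
    """Return the entry with the largest 'created' value, or None if empty."""
    if not entries:
        return None
    return max(entries, key=lambda entry: entry["created"])


print(count_by_status(history))
print(latest_entry(history)["runId"])
```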
Finding all jobs in a project
Use this code block to find all jobs created in a project.
project_run.find_all_project_runs(project_id: str)
Sample code:
project_run.find_all_project_runs("a3f5cc16-12ca-46cb-972a-c1703b85b4f1")
Configuring a data source destination for a specific job
Use this code block to configure an external data source as the destination for a particular job. After the job runs, the output dataset is saved to this destination. The entity_id parameter is the dataset ID.
project_run.add_project_run_sync(entity_id: str, data_source_id: str, data_source_options: dict)
Sample code:
project_run.add_project_run_sync(
    "0685591b-db69-4750-9ff9-e4b4a8398ec4",  # entity_id (dataset ID)
    "5c9a6630-958b-4bd1-bdd6-5bf5bf73a853",  # data_source_id
    {
        GcpConfig.OUTPUT_FILE_DIRECTORY: "test/",
        GcpConfig.OUTPUT_FILE_NAME: "dataset.csv"
    }
)