from utils.rc.client.requests import Requests
from utils.rc.client.auth import AuthClient

from utils.rc.dtos.project import Project
from utils.rc.dtos.dataset import Dataset
from utils.rc.dtos.recipe import Recipe
from utils.rc.dtos.env import Env
from utils.rc.dtos.env import EnvType
from utils.rc.dtos.transform import Transform
from utils.rc.dtos.template_v2 import TemplateV2, TemplateTransformV2
from utils.rc.dtos.segment import Segment, ItemExpression, Operator, RecipeExpression
from utils.rc.dtos.scenario import Scenario
from utils.rc.dtos.global_variable import GlobalVariable
from utils.rc.dtos.project_run import ProjectRun

from utils.rc.dtos.dataSource import DataSource
from utils.rc.dtos.dataSource import DataSourceType
from utils.rc.dtos.dataSource import GcpConfig

import pandas as pd
import logging
from utils.utils.log_util import LogUtil
LogUtil.set_basic_config(format='%(levelname)s:%(message)s', level=logging.INFO)

Creating a job

The following code block is used to create a job within a project for a specific scenario.

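project_run = ProjectRun.create_project_run(
    project.id,        # ID of the project (str)
    name,              # name of the job (str)
    schedule,          # job frequency (str), for example "0 9 * * *"
    job_scenario._id   # ID of the scenario the job runs on (str)
)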

Sample code:

project_run = ProjectRun.create_project_run(
    'a3f5cc16-12ca-46cb-972a-c1703b85b4f1',
    "jobrun3",
    "0 9 * * *",
    'eac24236-3e6b-4944-8709-b6cccf52d369'
)

Parameters

The following table describes each parameter used to create a job.

| Parameter name | Parameter description | Data type | Required | Example |
| --- | --- | --- | --- | --- |
| project.id | The ID of the project. | String | Yes | 'e71dbffc-1d62-4bb8-928d-f7dd8c214cb8' |
| Job name (name) | The name of the job. | String | No | test-run-v101 |
| Job frequency (schedule) | The frequency at which the job should run, written as a cron-style expression. | String | Yes | "0 9 * * *" |
| job_scenario._id | The scenario ID of the job. If the scenario ID is not provided, the job runs on a default scenario. | String | Yes | '4dae540d-1717-4b87-8c1e-e60357c7f7f4' |
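The schedule used in the sample code, "0 9 * * *", has the shape of a standard five-field cron expression (minute, hour, day of month, month, day of week). The sketch below assumes the job frequency follows that standard format and splits such a string into named fields, which can be a quick sanity check before the value is passed to create_project_run. The helper name split_cron is hypothetical and is not part of the SDK.

```python
# Hypothetical helper, not part of the SDK: names the five fields of a
# standard cron expression (assumed to be the format the schedule uses).
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron(expression: str) -> dict:
    """Return a mapping from cron field name to its raw string value."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}: {expression!r}"
        )
    return dict(zip(CRON_FIELDS, parts))


fields = split_cron("0 9 * * *")
print(fields["hour"], fields["minute"])  # hour 9, minute 0: every day at 09:00
```

An expression with too few or too many fields raises ValueError instead of being sent to the platform.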

Running the job manually

The following code block can be used to run the scheduled job manually.

project_run.run_manually()
{'id': 'e996eec1-3cf7-4d39-aa40-9011fe5264df',
 'created': 1686554009657,
 'updated': 1686554022781,
 'status': 'SUCCESS',
 'trigger': 'MANUAL',
 'error': None,
 'variables': None,
 'creator': 'srujana@rapidcanvas.ai',
 'updater': 'srujana@rapidcanvas.ai',
 'runId': '2023-06-12-07-13-29-655',
 'endTime': 1686554022779,
 'outEntityNames': ['output_suffix', 'Car_Price']}
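The created, updated, and endTime values in this output look like Unix timestamps in milliseconds; read that way, created lines up with the time encoded in runId. Under that assumption, the sketch below converts such a value to a UTC datetime and computes how long the run took. The helper names ms_to_utc and run_duration_seconds are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_utc(ms: int) -> datetime:
    """Convert epoch milliseconds (assumed unit) to an aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)


def run_duration_seconds(entry: dict) -> float:
    """Seconds between the 'created' and 'endTime' fields of a run entry."""
    return (entry["endTime"] - entry["created"]) / 1000


# Values copied from the sample output above.
entry = {"created": 1686554009657, "endTime": 1686554022779}
print(ms_to_utc(entry["created"]).isoformat())
print(run_duration_seconds(entry))
```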

Updating the job schedule

The following code block is used to update the job schedule.

project_run.update_project_run('0 5 * * *')
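To illustrate what a schedule such as '0 5 * * *' means, the sketch below computes the next run time for the simple daily case, where only the minute and hour fields are set. It assumes standard cron semantics and ignores time zones; how the platform interprets the schedule, including its time zone, is not stated here. The function next_daily_run is hypothetical and is not part of the SDK.

```python
from datetime import datetime, timedelta


def next_daily_run(expression: str, now: datetime) -> datetime:
    """Next run time for a daily 'M H * * *' cron expression.

    Only the simple daily form is handled; anything else raises ValueError.
    """
    parts = expression.split()
    if len(parts) != 5 or parts[2:] != ["*", "*", "*"]:
        raise ValueError(f"not a simple daily schedule: {expression!r}")
    minute, hour = int(parts[0]), int(parts[1])
    if not (0 <= minute <= 59 and 0 <= hour <= 23):
        raise ValueError(f"minute or hour out of range: {expression!r}")
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # If today's slot has already passed, the next run is tomorrow.
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


now = datetime(2023, 6, 12, 7, 13)
print(next_daily_run("0 5 * * *", now))  # 05:00 has passed, so 2023-06-13 05:00:00
print(next_daily_run("0 9 * * *", now))  # 09:00 is still ahead: 2023-06-12 09:00:00
```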

Enabling and disabling the project run

The following code block disables and then re-enables the project run. Calling disable_project_run() makes the job inactive and sets its status to inactive, whereas calling enable_project_run() makes the job active again and sets its status to active.

project_run.disable_project_run()
project_run.enable_project_run()
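One possible use of this pair of calls is to pause a job while some maintenance work is done and to make sure it is re-enabled afterwards. The sketch below wraps the two methods in a context manager; it relies only on disable_project_run() and enable_project_run() as shown above. The names paused and FakeProjectRun are hypothetical, and FakeProjectRun is only a stand-in so the example runs without the SDK.

```python
from contextlib import contextmanager


@contextmanager
def paused(project_run):
    """Disable a project run inside the block and re-enable it afterwards."""
    project_run.disable_project_run()
    try:
        yield project_run
    finally:
        # Runs even if the block raises, so the job is not left inactive.
        project_run.enable_project_run()


class FakeProjectRun:
    """Stand-in for a real ProjectRun object; it only records its state."""

    def __init__(self):
        self.active = True

    def disable_project_run(self):
        self.active = False

    def enable_project_run(self):
        self.active = True


job = FakeProjectRun()
with paused(job):
    print(job.active)  # False while the block runs
print(job.active)      # True again afterwards
```

With a real project_run object in place of the stand-in, the same with block would apply unchanged.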

Deleting the project run

Use this code block to delete the job you have created for the scenario within a project.

project_run.delete_project_run()
INFO:Project Run deleted

Finding a job by name

Use this code block to find the job by name.

project_run.find_project_run_by_name(project_id: str, name: str)

Sample code:

project_run.find_project_run_by_name("a3f5cc16-12ca-46cb-972a-c1703b85b4f1","newjob")

Finding the run history of a specific job in a project

Use this code block to fetch the run history of a specific job.

project_run.find_all_project_run_entries()
[{'id': 'aec588fe-43d5-4d28-996f-496b494e043e',
  'created': 1686626974000,
  'updated': 1686626985000,
  'status': 'SUCCESS',
  'trigger': 'MANUAL',
  'error': None,
  'variables': {},
  'creator': 'srujana@rapidcanvas.ai',
  'updater': 'srujana@rapidcanvas.ai',
  'runId': '2023-06-13-03-29-33-521',
  'endTime': 1686626985000,
  'outEntityNames': ['output_suffix', 'Car_Price']}]
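Each entry in the returned list is a dictionary, so the history can be summarised with ordinary Python. The sketch below assumes the list shape shown in the output above, counts entries per status, and picks the most recently created one. The function summarise_history is hypothetical, and the second entry in the sample data is invented for illustration, including its 'FAILURE' status name, which is an assumption.

```python
from collections import Counter


def summarise_history(entries: list) -> dict:
    """Count run entries per status and find the most recently created one."""
    latest = max(entries, key=lambda e: e["created"]) if entries else None
    return {
        "by_status": dict(Counter(e["status"] for e in entries)),
        "latest_run_id": latest["runId"] if latest else None,
    }


history = [
    # Trimmed copy of the entry shown in the output above.
    {"created": 1686626974000, "status": "SUCCESS",
     "runId": "2023-06-13-03-29-33-521"},
    # Hypothetical older entry, invented for illustration only; the
    # 'FAILURE' status name is an assumption.
    {"created": 1686540574000, "status": "FAILURE",
     "runId": "hypothetical-older-run"},
]
print(summarise_history(history))
```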

Finding all jobs in a project

Use this code block to find all jobs created in a project.

project_run.find_all_project_runs(project.id)

Sample code:

project_run.find_all_project_runs("a3f5cc16-12ca-46cb-972a-c1703b85b4f1")

Configuring a data source destination for a specific job

Use this code block to configure an external data source as the destination for a particular job, so that the output dataset is saved to that destination after the job runs.

# entity_id is the ID of the dataset to save
project_run.add_project_run_sync(entity_id: str, data_source_id: str, data_source_options: dict)

Sample code:

project_run.add_project_run_sync(
    "0685591b-db69-4750-9ff9-e4b4a8398ec4",  # entity_id (dataset ID)
    "5c9a6630-958b-4bd1-bdd6-5bf5bf73a853",  # data_source_id
    {
        GcpConfig.OUTPUT_FILE_DIRECTORY: "test/",
        GcpConfig.OUTPUT_FILE_NAME: "dataset.csv"
    }
)
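The options dictionary pairs an output directory with an output file name. As an optional convenience, the sketch below builds such a dictionary and normalises the directory to end with a single slash, mirroring the "test/" form used in the sample; whether the trailing slash is actually required is an assumption. The helper name build_sync_options is hypothetical, and the keys are passed in as arguments because the values behind GcpConfig.OUTPUT_FILE_DIRECTORY and GcpConfig.OUTPUT_FILE_NAME are not shown here. The placeholder strings in the demo are not the SDK's real keys.

```python
def build_sync_options(directory_key, file_name_key, directory: str, file_name: str) -> dict:
    """Build the data_source_options dict for a sync destination."""
    if not file_name or "/" in file_name:
        raise ValueError(f"file name must be a bare name: {file_name!r}")
    # Mirror the "test/" form from the sample: exactly one trailing slash.
    normalised = directory.rstrip("/") + "/"
    return {directory_key: normalised, file_name_key: file_name}


# Placeholder keys so the example runs without the SDK; in real use, pass
# GcpConfig.OUTPUT_FILE_DIRECTORY and GcpConfig.OUTPUT_FILE_NAME instead.
options = build_sync_options("DIR_KEY", "NAME_KEY", "test", "dataset.csv")
print(options)  # {'DIR_KEY': 'test/', 'NAME_KEY': 'dataset.csv'}
```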