Jobs

Jobs let you run your own code on the DataRobot platform to implement various workloads, such as tests and metrics.

Manage jobs

Use the following methods to manage jobs:

Create job

To create a job, use dr.registry.Job.create, as shown in the following example:

import os
import datarobot as dr

# add file contents inline using the `file_data` argument
job = dr.registry.Job.create(
    "my-job",
    environment_id="65c4f3ed001d3e27a382608f",
    file_data={"run.sh": "echo 'hello world'"},
)

# or add files from the folder
job_folder = "my-folder/files"

job_2 = dr.registry.Job.create(
    "my-job",
    environment_id="65c4f3ed001d3e27a382608f",
    folder_path=job_folder,
)

# or add files as a list of individual file paths
job_3 = dr.registry.Job.create(
    "my-job",
    environment_id="65c4f3ed001d3e27a382608f",
    files=[(os.path.join(job_folder, 'run.sh'), 'run.sh')],
)

# if the files should be added to the root of the job filesystem
# with the same names as on the local file system, the above can be simplified to the following:
job_4 = dr.registry.Job.create(
    "my-job",
    environment_id="65c4f3ed001d3e27a382608f",
    files=[os.path.join(job_folder, 'run.sh')],
)

# or a job can be created without the files,
# and the files can be added later using the `update` method
job_5 = dr.registry.Job.create("my-job")
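
When a folder contains nested files, the (local path, path in job) pairs passed to the `files` argument above can be generated instead of typed by hand. The following is a minimal sketch under stated assumptions: `build_file_pairs` is a hypothetical helper, not part of the datarobot client, and it assumes that paths inside the job use forward slashes.

```python
import os
from typing import List, Tuple


def build_file_pairs(folder: str) -> List[Tuple[str, str]]:
    """Return (local_path, path_in_job) pairs for every file under `folder`.

    Hypothetical helper: the second element is the path relative to
    `folder`, normalized to forward slashes (an assumption about how
    job paths should look).
    """
    pairs = []
    for root, _dirs, names in os.walk(folder):
        for name in names:
            local_path = os.path.join(root, name)
            relative = os.path.relpath(local_path, folder).replace(os.sep, "/")
            pairs.append((local_path, relative))
    # Sort by the in-job path so the result is deterministic.
    return sorted(pairs, key=lambda pair: pair[1])
```

The result could then be passed as `files=build_file_pairs(job_folder)` in a call to dr.registry.Job.create like the ones above.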

Create hosted custom metric job from template

To create a hosted custom metric job from a gallery template, use dr.registry.Job.create_from_custom_metric_gallery_template, as shown in the following example:

import datarobot as dr

templates = dr.models.deployment.custom_metrics.HostedCustomMetricTemplate.list()
template_id = templates[0].id

job = dr.registry.Job.create_from_custom_metric_gallery_template(
    template_id=template_id,
    name="my-job",
)
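
Taking `templates[0]` selects whichever template happens to be listed first. Looking a template up by name is one way to make the choice explicit. The sketch below assumes that each template object exposes a `name` attribute; `find_by_name` is a hypothetical helper, not part of the datarobot client.

```python
from typing import Iterable, Optional, TypeVar

T = TypeVar("T")


def find_by_name(items: Iterable[T], name: str) -> Optional[T]:
    """Return the first item whose `name` attribute equals `name`, else None."""
    for item in items:
        # getattr with a default keeps the helper safe for objects without `name`.
        if getattr(item, "name", None) == name:
            return item
    return None
```

With the list from the example above, `find_by_name(templates, "<template name>")` would return the matching template, or `None` when no template has that name, so check the result before reading its `id`.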

List jobs

To list all jobs available to the current user, use dr.registry.Job.list, as in the following example:

import datarobot as dr

jobs = dr.registry.Job.list()

jobs
>>> [Job('my-job')]

Retrieve jobs

To get a job by unique identifier, use dr.registry.Job.get, as in the following example:

import datarobot as dr

job = dr.registry.Job.get("65f4453e6ea907cb0405ff7f")

job
>>> Job('my-job')

Update jobs

To get a job by unique identifier and update it, use dr.registry.Job.get() and then update(), as in the following example:

import datarobot as dr

job = dr.registry.Job.get("65f4453e6ea907cb0405ff7f")

# local folder whose files are added to the job
job_folder = "my-folder/files"

job.update(
    environment_id="65c4f3ed001d3e27a382608f",
    description="My Job",
    folder_path=job_folder,
    file_data={"README.md": "My README file"},
)

Delete jobs

To get a job by unique identifier and delete it, use dr.registry.Job.get() and then delete(), as in the following example:

import datarobot as dr

job = dr.registry.Job.get("65f4453e6ea907cb0405ff7f")
job.delete()

Manage job runs

Use the following methods to manage job runs:

Create job runs

To create a job run, use dr.registry.JobRun.create, as shown in the following example:

import datarobot as dr
import time

job_id = "65f4453e6ea907cb0405ff7f"

# block until job run is finished
job_run = dr.registry.JobRun.create(job_id)

# or run job without blocking the thread, and check the job run status manually
job_run = dr.registry.JobRun.create(job_id, max_wait=None)

while job_run.status == dr.registry.JobRunStatus.RUNNING:
    time.sleep(1)
    job_run.refresh()
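
The polling loop above runs for as long as the job run reports a running status. One way to bound the wait is to wrap the loop in a helper with a timeout. This is a minimal sketch: `wait_for_run` is a hypothetical helper, not part of the datarobot client, and it relies only on the `status` attribute and `refresh()` method used in the example above. The caller supplies the set of statuses that count as "still running".

```python
import time
from typing import Any, Collection


def wait_for_run(
    job_run: Any,
    running_statuses: Collection[Any],
    poll_interval: float = 1.0,
    timeout: float = 600.0,
) -> Any:
    """Poll `job_run` until its status leaves `running_statuses`.

    Returns the final status, or raises TimeoutError if the run is
    still in a running status after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while job_run.status in running_statuses:
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job run still in status {job_run.status!r}")
        time.sleep(poll_interval)
        job_run.refresh()  # re-read the status from the server
    return job_run.status
```

For the example above, the call would look like `wait_for_run(job_run, {dr.registry.JobRunStatus.RUNNING})`.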

List job runs

To list all runs of a job, use dr.registry.JobRun.list, as in the following example:

import datarobot as dr

job_id = "65f4453e6ea907cb0405ff7f"

job_runs = dr.registry.JobRun.list(job_id)

job_runs
>>> [JobRun('65f856957d897d46b0e54b37'),
     JobRun('65f8567f7d897d46b0e54b32'),
     JobRun('65f856617d897d46b0e54b2d')]

Retrieve job runs

To get a job run by unique identifier, use dr.registry.JobRun.get, as in the following example:

import datarobot as dr

job_id = "65f4453e6ea907cb0405ff7f"

job_run = dr.registry.JobRun.get(job_id, "65f856957d897d46b0e54b37")

job_run
>>> JobRun('65f856957d897d46b0e54b37')

Update job runs

To get a job run by unique identifier and update it, use dr.registry.JobRun.get() and then update(), as in the following example:

import datarobot as dr

job_id = "65f4453e6ea907cb0405ff7f"

job_run = dr.registry.JobRun.get(job_id, "65f856957d897d46b0e54b37")

job_run.update(description="The description of this job run")

Cancel a job run

To get a running job run by identifier and cancel it, use dr.registry.JobRun.get() and then cancel(), as in the following example:

import datarobot as dr

job_id = "65f4453e6ea907cb0405ff7f"

job_run = dr.registry.JobRun.get(job_id, "65f856957d897d46b0e54b37")

job_run.cancel()

Retrieve job run logs

To get job run logs, use dr.registry.JobRun.get_logs, as in the following example:

import datarobot as dr

job_id = "65f4453e6ea907cb0405ff7f"

job_run = dr.registry.JobRun.get(job_id, "65f856957d897d46b0e54b37")

job_run.get_logs()
>>> 2024-03-18T16:06:46.044946476Z Some log output
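
Each log line shown above starts with a UTC timestamp, followed by a space and the message. If the logs need to be filtered or sorted, they can be split into (timestamp, message) pairs. The sketch below assumes the logs arrive as a single string with one entry per line, in the format of the sample output. `parse_log_lines` is a hypothetical helper, not part of the datarobot client.

```python
from datetime import datetime, timezone
from typing import List, Tuple


def parse_log_lines(logs: str) -> List[Tuple[datetime, str]]:
    """Split log text into (timestamp, message) pairs.

    Assumes every non-empty line starts with a timestamp such as
    2024-03-18T16:06:46.044946476Z; a line in any other format
    raises ValueError.
    """
    entries = []
    for line in logs.splitlines():
        if not line.strip():
            continue
        stamp, _, message = line.partition(" ")
        base, _, fraction = stamp.rstrip("Z").partition(".")
        # datetime stores microseconds only, so keep six fractional digits.
        micros = (fraction + "000000")[:6]
        when = datetime.strptime(f"{base}.{micros}", "%Y-%m-%dT%H:%M:%S.%f")
        entries.append((when.replace(tzinfo=timezone.utc), message))
    return entries
```

Timestamps are truncated from nanosecond to microsecond precision, which is the finest resolution that `datetime` supports.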

Delete job run logs

To delete job run logs, use dr.registry.JobRun.delete_logs, as in the following example:

import datarobot as dr

job_id = "65f4453e6ea907cb0405ff7f"

job_run = dr.registry.JobRun.get(job_id, "65f856957d897d46b0e54b37")

job_run.delete_logs()