Jobs

You can create custom jobs to automate tasks, such as custom tests and custom metrics, for models and deployments. Each job is an automated workload whose exit code determines whether it passed or failed. You can run a custom job against one or more models or deployments. The workload defined by a custom job can make prediction requests, fetch inputs, and store outputs using DataRobot’s Public API.
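Because the exit code is what marks a job as passed or failed, a job's entry point only needs to end with a zero status on success and a non-zero status on failure. The following is a minimal sketch of that convention, not a DataRobot-provided script; `check_metric`, `exit_code_for`, and the example metric value and threshold are hypothetical names and numbers chosen for illustration.

```python
def check_metric(value: float, threshold: float) -> bool:
    """Hypothetical custom test: pass when the metric meets the threshold."""
    return value >= threshold


def exit_code_for(passed: bool) -> int:
    """Map a pass/fail result to a process exit code (0 means passed)."""
    return 0 if passed else 1


def main(value: float = 0.92, threshold: float = 0.9) -> int:
    """Run the hypothetical check and return the exit code to report."""
    passed = check_metric(value, threshold)
    print("passed" if passed else "failed")
    return exit_code_for(passed)


# In a real job script, the last line would hand the result to the
# interpreter as the process exit status: raise SystemExit(main())
```

A `run.sh` file in the job could then invoke this script, so that the script's exit status becomes the result of the job run.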

Manage jobs

The sections below outline how to manage custom jobs.

Create job

To create a job, use dr.registry.Job.create:

import os
import datarobot as dr

# Add file contents using the `file_data` argument
job = dr.registry.Job.create(
    "my-job",
    environment_id="65c4f3ed001d3e27a382608f",
    file_data={"run.sh": "echo 'hello world'"},
)

# Add files from the folder
job_folder = "my-folder/files"

job_2 = dr.registry.Job.create(
    "my-job",
    environment_id="65c4f3ed001d3e27a382608f",
    folder_path=job_folder,
)

# Add files as a list of individual file paths
job_3 = dr.registry.Job.create(
    "my-job",
    environment_id="65c4f3ed001d3e27a382608f",
    files=[(os.path.join(job_folder, 'run.sh'), 'run.sh')],
)

# If the files should be added to the root of the job filesystem with
# the same names as on the local file system, the above can be simplified to the following:
job_4 = dr.registry.Job.create(
    "my-job",
    environment_id="65c4f3ed001d3e27a382608f",
    files=[os.path.join(job_folder, 'run.sh')],
)

# Alternatively, a job can be created without files,
# and the files can be added later using the `update` method
job_5 = dr.registry.Job.create("my-job")
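When file contents need to be generated or patched in memory before upload, the `file_data` mapping can be built from a local folder. The helper below is a sketch under stated assumptions: `build_file_data` is a hypothetical name, all files are assumed to be UTF-8 text, and `file_data` is assumed to accept a mapping of file names to string contents, as in the first example above. Whether nested relative paths are accepted as keys is not confirmed here and should be verified.

```python
from pathlib import Path


def build_file_data(folder: str) -> dict[str, str]:
    """Read every file under `folder` into a {relative path: text} mapping."""
    root = Path(folder)
    file_data = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Use POSIX-style paths relative to the folder as keys, e.g. "run.sh".
            key = path.relative_to(root).as_posix()
            file_data[key] = path.read_text(encoding="utf-8")
    return file_data
```

The result could then be passed as `file_data=build_file_data("my-folder/files")` in place of a hand-written dictionary.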

Create hosted custom metric job from a template

To create a hosted custom metric job from a gallery template, use dr.registry.Job.create_from_custom_metric_gallery_template:

import datarobot as dr

templates = dr.models.deployment.custom_metrics.HostedCustomMetricTemplate.list()
template_id = templates[0].id

job = dr.registry.Job.create_from_custom_metric_gallery_template(
    template_id = template_id,
    name = "my-job",
)

List jobs

To list all jobs available to the current user, use dr.registry.Job.list:

import datarobot as dr

jobs = dr.registry.Job.list()

jobs
>>> [Job('my-job')]

Retrieve jobs

To get a job by unique identifier, use dr.registry.Job.get:

import datarobot as dr

job = dr.registry.Job.get("65f4453e6ea907cb0405ff7f")

job
>>> Job('my-job')

Update jobs

To get a job by unique identifier and update it, use dr.registry.Job.get() and then update():

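import datarobot as dr

# Local folder with the files to upload (same example path as in "Create job")
job_folder = "my-folder/files"

job = dr.registry.Job.get("65f4453e6ea907cb0405ff7f")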

job.update(
    environment_id="65c4f3ed001d3e27a382608f",
    description="My Job",
    folder_path=job_folder,
    file_data={"README.md": "My README file"},
)

Delete jobs

To get a job by unique identifier and delete it, use dr.registry.Job.get() and then delete():

import datarobot as dr

job = dr.registry.Job.get("65f4453e6ea907cb0405ff7f")
job.delete()

Manage job runs

The sections below outline how to manage job runs.

Create job runs

To create a job run, use dr.registry.JobRun.create:

import datarobot as dr
import time

job_id = "65f4453e6ea907cb0405ff7f"

# Block until the job run finishes
job_run = dr.registry.JobRun.create(job_id)

# Or start the job run without blocking the thread and check its status manually
job_run = dr.registry.JobRun.create(job_id, max_wait=None)

while job_run.status == dr.registry.JobRunStatus.RUNNING:
    time.sleep(1)
    job_run.refresh()
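The manual polling loop above never gives up if a run stays in the running state. A bounded variant might look like the sketch below. `wait_for_run` is a hypothetical helper, not part of the DataRobot client; it relies only on the `status` attribute and `refresh()` method used in the example above, and the default timeout and poll interval are arbitrary.

```python
import time


def wait_for_run(job_run, running_status, timeout=600.0, poll_interval=1.0):
    """Poll `job_run` until it leaves `running_status` or `timeout` seconds pass.

    Returns the final status; raises TimeoutError if the run is still running.
    """
    deadline = time.monotonic() + timeout
    while job_run.status == running_status:
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job run still running after {timeout} seconds")
        time.sleep(poll_interval)
        # Re-fetch the run so that `status` reflects the latest state.
        job_run.refresh()
    return job_run.status
```

With the objects from the example above, a call could look like `wait_for_run(job_run, dr.registry.JobRunStatus.RUNNING, timeout=300)`.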

List job runs

To list all runs of a job, use dr.registry.JobRun.list:

import datarobot as dr

job_id = "65f4453e6ea907cb0405ff7f"

job_runs = dr.registry.JobRun.list(job_id)

job_runs
>>> [JobRun('65f856957d897d46b0e54b37'),
     JobRun('65f8567f7d897d46b0e54b32'),
     JobRun('65f856617d897d46b0e54b2d')]

Retrieve job runs

To get a job run by its unique identifier, use dr.registry.JobRun.get:

import datarobot as dr

job_id = "65f4453e6ea907cb0405ff7f"

job_run = dr.registry.JobRun.get(job_id, "65f856957d897d46b0e54b37")

job_run
>>> JobRun('65f856957d897d46b0e54b37')

Update job runs

To get a job run by unique identifier and update it, use dr.registry.JobRun.get() and then update():

import datarobot as dr

job_id = "65f4453e6ea907cb0405ff7f"

job_run = dr.registry.JobRun.get(job_id, "65f856957d897d46b0e54b37")

job_run.update(description="The description of this job run")

Cancel a job run

To get a running job run by identifier and cancel it, use dr.registry.JobRun.get() and then cancel():

import datarobot as dr

job_id = "65f4453e6ea907cb0405ff7f"

job_run = dr.registry.JobRun.get(job_id, "65f856957d897d46b0e54b37")

job_run.cancel()

Retrieve job run logs

To get job run logs, use dr.registry.JobRun.get_logs:

import datarobot as dr

job_id = "65f4453e6ea907cb0405ff7f"

job_run = dr.registry.JobRun.get(job_id, "65f856957d897d46b0e54b37")

job_run.get_logs()
>>> 2024-03-18T16:06:46.044946476Z Some log output
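For filtering or sorting log output, each line can be split into a timestamp and a message. The sketch below assumes that every line starts with a UTC timestamp ending in `Z`, followed by a single space and the message, as in the sample output above; the exact log format is not confirmed here, and `parse_log_line` is a hypothetical helper.

```python
from datetime import datetime, timezone


def parse_log_line(line: str):
    """Split one log line into (UTC datetime, message)."""
    stamp, _, message = line.partition(" ")
    # Drop the trailing "Z" and separate the fractional seconds.
    date_part, _, fraction = stamp.rstrip("Z").partition(".")
    # datetime stores microseconds, so trim (or pad) the fraction to 6 digits.
    micros = (fraction + "000000")[:6]
    parsed = datetime.strptime(f"{date_part}.{micros}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc), message
```

Note that the nanosecond precision in the sample timestamp is truncated to microseconds by this approach.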

Delete job run logs

To delete job run logs, use dr.registry.JobRun.delete_logs:

import datarobot as dr

job_id = "65f4453e6ea907cb0405ff7f"

job_run = dr.registry.JobRun.get(job_id, "65f856957d897d46b0e54b37")

job_run.delete_logs()