Composable ML

Composable ML consists of two major components: the DataRobot Blueprint Workshop and custom tasks, detailed below.

Custom tasks let you train models with arbitrary code in an environment that you define.

For details on using environments, see: Manage Execution Environments.

Manage Custom Tasks

Before you can upload code for a custom task, you need to create the entity that holds all the metadata.

import datarobot as dr
from datarobot.enums import CUSTOM_TASK_TARGET_TYPE

transform = dr.CustomTask.create(
    name="a convenient display name",  # required
    target_type=CUSTOM_TASK_TARGET_TYPE.TRANSFORM,  # required
    language="python",
    description="a longer description of the task"
)

binary = dr.CustomTask.create(
    name="this or that",
    target_type=CUSTOM_TASK_TARGET_TYPE.BINARY,
)

A task by itself is an empty metadata container. Before you can use a task, you need to create a CustomTaskVersion associated with it. A task that is ready for use has its latest_version field populated with that version.

binary.latest_version
>>> None

execution_environment = dr.ExecutionEnvironment.create(
    name="Python3 PyTorch Environment",
    description="This environment contains Python3 pytorch library.",
)
custom_task_folder = "datarobot-user-tasks/task_templates/python3_pytorch"
task_version = dr.CustomTaskVersion.create_clean(
    custom_task_id=binary.id,
    base_environment_id=execution_environment.id,
    folder_path=custom_task_folder,
)

binary.refresh()  # In order to see the change, you need to GET it from DataRobot
binary.latest_version
>>> CustomTaskVersion('v1.0')

When you create a new version, it becomes the task's latest_version. You can download the latest version as a zip file.

binary.latest_version
>>> CustomTaskVersion('v1.0')

custom_task_folder = "/home/my-user-name/tasks/my-updated-task/"
task_version = dr.CustomTaskVersion.create_clean(
    custom_task_id=binary.id,
    base_environment_id=execution_environment.id,
    folder_path=custom_task_folder,
)

binary.refresh()
binary.latest_version
>>> CustomTaskVersion('v2.0')

binary.download_latest_version("/home/my-user-name/downloads/my-task-files.zip")

You can get, list, and copy tasks exactly as you would expect. The copy method makes a complete copy of the task: new copies of the metadata, of the versions, and of the files uploaded for those versions.

all_tasks = dr.CustomTask.list()
assert {el.id for el in all_tasks} == {binary.id, transform.id}

# copy returns a new task with its own versions and files
new_binary = dr.CustomTask.copy(binary.id)
assert new_binary.latest_version.id != binary.latest_version.id

# get fetches a separate local object for the same task
original_binary = dr.CustomTask.get(binary.id)

assert len(dr.CustomTask.list()) == 3

You can update the metadata of a task. The object you call update on is refreshed with the latest data; other local objects that refer to the same task stay stale until you call refresh on them.

assert binary.description == new_binary.description
binary.update(description="totally new description")

assert binary.description != new_binary.description
assert original_binary.description != binary.description  # hasn't refreshed from the server yet

original_binary.refresh()
assert original_binary.description == binary.description

Finally, you can delete a task only if it is not in use by any of the following:

  • Trained models
  • Deployments
  • Blueprints in the AI catalog

Once you have deleted the objects that use the task, you will be able to delete the task itself.
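
The delete step described above could be wrapped so that a refusal is reported rather than raised. This is only a sketch: the helper name delete_if_unused is hypothetical, it assumes the task object exposes a delete() method and an id attribute, and which exception the client raises when a task is still in use is an assumption to verify (with the DataRobot client you might pass datarobot.errors.ClientError as refusal_errors).

```python
def delete_if_unused(task, refusal_errors=(Exception,)):
    """Attempt to delete a task.

    Returns True if the delete call succeeded, or False if it raised one of
    `refusal_errors` (for example because the task is still in use).
    """
    try:
        task.delete()  # assumed: the task object exposes delete()
    except refusal_errors as exc:
        print(f"Task {task.id} was not deleted: {exc}")
        return False
    return True
```

A call such as delete_if_unused(binary) would then return False while trained models, deployments, or blueprints still reference the task.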

Manage Custom Task Versions

Code for Custom Tasks can be uploaded by creating a Custom Task Version. When creating a Custom Task Version, the version must be associated with a base execution environment. If the base environment supports additional task dependencies (R or Python environments) and the Custom Task Version contains a valid requirements.txt file, the task version will run in an environment based on the base environment with the additional dependencies installed.
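
Before uploading, it can help to confirm locally that the task folder actually contains a requirements.txt and to see which dependencies it declares. The sketch below uses only the standard library. The helper name read_requirements is hypothetical and not part of the DataRobot client, and it handles only simple files with one requirement per line (no includes, no line continuations, no inline comments).

```python
from pathlib import Path


def read_requirements(task_folder):
    """Return the requirement lines declared in <task_folder>/requirements.txt.

    Returns None when the file is absent, in which case there are no
    additional dependencies to install on top of the base environment.
    """
    req_file = Path(task_folder) / "requirements.txt"
    if not req_file.is_file():
        return None
    requirements = []
    for raw_line in req_file.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines and full-line comments.
        if line and not line.startswith("#"):
            requirements.append(line)
    return requirements
```

For example, read_requirements(custom_task_folder) could be called just before creating the version, to catch a missing or empty file early.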

Create Custom Task Version

Upload actual custom task content by creating a clean Custom Task Version:

import datarobot as dr

custom_task_id = binary.id
custom_task_folder = "datarobot-user-tasks/task_templates/python3_pytorch"

# add files from the folder to the custom task
task_version = dr.CustomTaskVersion.create_clean(
    custom_task_id=custom_task_id,
    base_environment_id=execution_environment.id,
    folder_path=custom_task_folder,
)

To create a new Custom Task Version from a previous one, with just some files added or removed, do the following:

import datarobot as dr

new_files_folder = "datarobot-user-tasks/task_templates/my_files_to_add_to_pytorch_task"

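# ID of a file in the previous version that should not be carried over
file_to_delete = task_version.items[0].id

task_version_2 = dr.CustomTaskVersion.create_from_previous(
    custom_task_id=custom_task_id,
    base_environment_id=execution_environment.id,
    folder_path=new_files_folder,  # files to add
    files_to_delete=[file_to_delete],  # file IDs to remove
)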

Please refer to CustomTaskFileItem for a description of custom task file properties.
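
Picking files by position, as with items[0], is fragile. A sketch of selecting them by path instead follows. The helper name file_ids_for_paths is hypothetical, and it assumes each file item exposes id and file_path attributes; check the CustomTaskFileItem reference before relying on file_path. The returned ids play the same role as file_to_delete above.

```python
def file_ids_for_paths(items, paths_to_remove):
    """Return the ids of file items whose file_path is in paths_to_remove.

    Raises ValueError if any requested path is not present among the items.
    """
    wanted = set(paths_to_remove)
    # Map each item's path to its id (assumed attributes: file_path, id).
    by_path = {item.file_path: item.id for item in items}
    missing = wanted - by_path.keys()
    if missing:
        raise ValueError(f"No such files in this version: {sorted(missing)}")
    return [by_path[path] for path in sorted(wanted)]
```

Failing loudly on an unknown path avoids silently creating a new version that still contains a file you meant to remove.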

List Custom Task Versions

Use the following call to list the Custom Task Versions of a task:

import datarobot as dr

dr.CustomTaskVersion.list(custom_task_id)

>>> [CustomTaskVersion('v2.0'), CustomTaskVersion('v1.0')]

Retrieve Custom Task Version

To retrieve a specific Custom Task Version, run:

import datarobot as dr

dr.CustomTaskVersion.get(custom_task_id, custom_task_version_id='5ebe96b84024035cc6a6560b')

>>> CustomTaskVersion('v2.0')

Update Custom Task Version

To update the description of a Custom Task Version, run:

import datarobot as dr

custom_task_version = dr.CustomTaskVersion.get(
    custom_task_id,
    custom_task_version_id='5ebe96b84024035cc6a6560b',
)

custom_task_version.update(description='new description')

custom_task_version.description
>>> 'new description'

Download Custom Task Version

Download the content of a Custom Task Version as a ZIP archive:

import datarobot as dr

path_to_download = '/home/user/Documents/myTask.zip'

custom_task_version = dr.CustomTaskVersion.get(
    custom_task_id,
    custom_task_version_id='5ebe96b84024035cc6a6560b',
)

custom_task_version.download(path_to_download)
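
After downloading, you may want to confirm what the archive contains before unpacking it. A minimal sketch using the standard-library zipfile module follows. The helper name list_archive_files is hypothetical, and it assumes only that the downloaded file is a regular ZIP archive, as stated above.

```python
import zipfile


def list_archive_files(zip_path):
    """Return the sorted file names stored in a ZIP archive, skipping directory entries."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(
            info.filename for info in archive.infolist() if not info.is_dir()
        )
```

Calling list_archive_files(path_to_download) would then show the task files without extracting them.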

Preparing a Custom Task Version for Use

If your custom task version has dependencies, a dependency build must be completed before the task can be used. The dependency build installs your task’s dependencies into the base environment associated with the task version.

See: Preparing a Custom Model Version for Use.