Segmented Modeling

API reference for the entities used in Segmented Modeling. See the dedicated User Guide for examples.

class datarobot.CombinedModel(id=None, project_id=None, segmentation_task_id=None, is_active_combined_model=False)

A model from a segmented project: a combination of the ordinary models from its child segment projects.

Attributes:
id : str

The id of the model.

project_id : str

The id of the project the model belongs to.

segmentation_task_id : str

The id of the segmentation task used in this model.

is_active_combined_model : bool

Flag indicating whether this is the active combined model in the segmented project.

classmethod get(project_id, combined_model_id)

Retrieve a combined model.

Parameters:
project_id : str

The project’s id.

combined_model_id : str

The id of the combined model.

Returns:
CombinedModel

The queried combined model.

Return type:

CombinedModel

classmethod set_segment_champion(project_id, model_id, clone=False)

Update a segment champion in a combined model by setting the model_id that belongs to the child project_id as the champion.

Parameters:
project_id : str

The id of the child project that contains the model.

model_id : str

The id of the model to mark as the champion.

clone : bool

(New in version v2.29) Optional, defaults to False. Whether to clone the combined model before setting the champion. If True, the champion is set on the new (cloned) combined model.

Returns:
combined_model_id : str

The id of the combined model that was updated.

Return type:

str
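
The `project_id` argument here is the child (segment) project, not the parent segmented project, which is easy to mix up. A minimal sketch of a wrapper that makes this explicit is shown below. `promote_champion` and its `combined_model_cls` parameter are hypothetical names introduced only for illustration; the injected class lets the sketch be exercised without a DataRobot connection.

```python
def promote_champion(child_project_id, model_id, clone=False, combined_model_cls=None):
    """Mark `model_id` as its segment's champion; return the combined model id."""
    if combined_model_cls is None:
        # Imported lazily so the sketch can be read and tested without the client.
        from datarobot import CombinedModel as combined_model_cls
    # `project_id` must be the child project that owns `model_id`.
    return combined_model_cls.set_segment_champion(
        project_id=child_project_id,
        model_id=model_id,
        clone=clone,
    )
```

With `clone=True`, the returned id is presumably that of the cloned combined model rather than the original, so callers that cache the combined model id should store the returned value.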

get_segments_info()

Retrieve segment information for the combined model.

Returns:
list[SegmentInfo]

List of segments

Return type:

List[SegmentInfo]
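
As a rough sketch of how the returned list might be used, the helper below picks out segments that are not yet ready to contribute to the combined model. It relies only on the `segment`, `autopilot_done`, `project_status_error`, and `model_id` attributes documented for SegmentInfo. `pending_segments` is a hypothetical helper, and the stand-in objects exist only so the sketch runs without a DataRobot connection.

```python
from types import SimpleNamespace


def pending_segments(segments):
    """Return names of segments that are unfinished, errored, or lack a champion."""
    pending = []
    for info in segments:
        if not info.autopilot_done or info.project_status_error or info.model_id is None:
            pending.append(info.segment)
    return pending


# Stand-ins shaped like SegmentInfo; real objects would come from
# combined_model.get_segments_info().
demo_segments = [
    SimpleNamespace(segment="north", autopilot_done=True, project_status_error="", model_id="m1"),
    SimpleNamespace(segment="south", autopilot_done=False, project_status_error="", model_id=None),
]
print(pending_segments(demo_segments))
```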

get_segments_as_dataframe(encoding='utf-8')

Retrieve the combined model segments as a DataFrame.

Parameters:
encoding : str, optional

A string representing the encoding to use in the output CSV file. Defaults to ‘utf-8’.

Returns:
DataFrame

Combined model segments

Return type:

DataFrame

get_segments_as_csv(filename, encoding='utf-8')

Save the combined model segments to a CSV file.

Parameters:
filename : str or file object

The path or file object to save the data to.

encoding : str, optional

A string representing the encoding to use in the output CSV file. Defaults to ‘utf-8’.

Return type:

None
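
Because the method returns None, a caller that wants confirmation has to inspect the file itself. The sketch below writes the segments to a path and reads the file back with the standard `csv` module. `export_segments` is a hypothetical helper, and the assumption that the first row is a header is not stated in this reference.

```python
import csv


def export_segments(combined_model, path, encoding="utf-8"):
    """Write the segments to `path` and return the number of data rows written."""
    combined_model.get_segments_as_csv(path, encoding=encoding)
    # Read the file back to confirm that something was written.
    with open(path, newline="", encoding=encoding) as handle:
        rows = list(csv.reader(handle))
    # Assumption: the first row is a header, so it is not counted.
    return max(len(rows) - 1, 0)
```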

train(sample_pct=None, featurelist_id=None, scoring_type=None, training_row_count=None, monotonic_increasing_featurelist_id=<object object>, monotonic_decreasing_featurelist_id=<object object>)

Inherited from Model - CombinedModels cannot be retrained directly

Return type:

NoReturn

train_datetime(featurelist_id=None, training_row_count=None, training_duration=None, time_window_sample_pct=None, monotonic_increasing_featurelist_id=<object object>, monotonic_decreasing_featurelist_id=<object object>, use_project_settings=False, sampling_method=None, n_clusters=None)

Inherited from Model - CombinedModels cannot be retrained directly

Return type:

NoReturn

retrain(sample_pct=None, featurelist_id=None, training_row_count=None, n_clusters=None)

Inherited from Model - CombinedModels cannot be retrained directly

Return type:

NoReturn

request_frozen_model(sample_pct=None, training_row_count=None)

Inherited from Model - CombinedModels cannot be retrained as frozen

Return type:

NoReturn

request_frozen_datetime_model(training_row_count=None, training_duration=None, training_start_date=None, training_end_date=None, time_window_sample_pct=None, sampling_method=None)

Inherited from Model - CombinedModels cannot be retrained as frozen

Return type:

NoReturn

cross_validate()

Inherited from Model - CombinedModels cannot request cross validation

Return type:

NoReturn
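
The `NoReturn` annotation on the inherited methods above means they never return normally. A minimal sketch of guarding a generic training loop against that is shown below, assuming the call fails by raising an exception. The exact exception type is not stated in this reference, so the `NotImplementedError` raised by the stand-in class is an assumption, and `StandInCombinedModel` and `try_train` are hypothetical names.

```python
class StandInCombinedModel:
    """Hypothetical stand-in for illustration; not the DataRobot class."""

    def train(self, *args, **kwargs):
        # Assumed behaviour: the real method refuses to train.
        raise NotImplementedError("CombinedModels cannot be retrained directly")


def try_train(model):
    """Return True if training was requested, False if the model refuses it."""
    try:
        model.train()
    except Exception as exc:  # broad on purpose: the real exception type is assumed
        print(f"training skipped: {exc}")
        return False
    return True
```

To retrain, work with the ordinary models in the child segment projects rather than with the combined model.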

class datarobot.SegmentationTask(id, project_id, name, type, created, segments_count, segments, metadata, data)

A Segmentation Task is used for segmenting an existing project into multiple child projects. Each child project (or segment) will be a separate autopilot run. Currently only user defined segmentation is supported.

Example for creating a new SegmentationTask for Time Series segmentation with a user defined id column:

from datarobot import SegmentationTask

# Create the SegmentationTask
segmentation_task_results = SegmentationTask.create(
    project_id=project.id,
    target=target,
    use_time_series=True,
    datetime_partition_column=datetime_partition_column,
    multiseries_id_columns=[multiseries_id_column],
    user_defined_segment_id_columns=[user_defined_segment_id_column]
)

# Retrieve the completed SegmentationTask object from the job results
segmentation_task = segmentation_task_results['completedJobs'][0]

Attributes:
id : ObjectId

The id of the segmentation task.

project_id : ObjectId

The associated id of the parent project.

type : str

What type of job the segmentation task is associated with, e.g. auto_ml or auto_ts.

created : datetime

The date this segmentation task was created.

segments_count : int

The number of segments the segmentation task generated.

segments : list of strings

The segment names that the segmentation task generated.

metadata : dict

List of features that help to identify the parameters used by the segmentation task.

data : dict

Optional parameters that are associated with enabled metadata for the segmentation task.

classmethod from_data(data)

Instantiate an object of this class using a dict.

Parameters:
data : dict

Correctly snake_cased keys and their values.

Return type:

SegmentationTask
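
Since `from_data` expects snake_cased keys, a dictionary with camelCase keys would need converting first. The sketch below shows one way to do that. That the source dictionary uses camelCase keys is an assumption, and `to_snake_case` and `snake_case_keys` are hypothetical helpers, not part of the client. Only top-level keys are converted.

```python
import re


def to_snake_case(name):
    """Convert a camelCase name such as 'segmentsCount' to 'segments_count'."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


def snake_case_keys(payload):
    """Return a copy of `payload` with its top-level keys snake_cased."""
    return {to_snake_case(key): value for key, value in payload.items()}


# Illustrative payload; the values are made up.
demo_payload = {"id": "abc", "projectId": "def", "segmentsCount": 2}
print(snake_case_keys(demo_payload))
```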

collect_payload()

Convert the record to a dictionary

Return type:

Dict[str, str]

classmethod create(project_id, target, use_time_series=False, datetime_partition_column=None, multiseries_id_columns=None, user_defined_segment_id_columns=None, max_wait=600, model_package_id=None)

Creates segmentation tasks for the project based on the defined parameters.

Parameters:
project_id : str

The associated id of the parent project.

target : str

The column that represents the target in the dataset.

use_time_series : bool

Whether AutoTS or AutoML segmentations should be generated.

datetime_partition_column : str or None

Required for Time Series. The name of the column whose values as dates are used to assign a row to a particular partition.

multiseries_id_columns : list of str or None

Required for Time Series. A list of the names of multiseries id columns to define series within the training data. Currently only one multiseries id column is supported.

user_defined_segment_id_columns : list of str or None

Required when using a column for segmentation. A list of the segment id columns used to manually segment the data. Currently only one user defined segment id column is supported.

model_package_id : str

Required when using automated segmentation. The associated id of the model in the DataRobot Model Registry that will be used to perform automated segmentation on a dataset.

max_wait : integer

The maximum number of seconds to wait for the segmentation tasks to finish. Defaults to 600.

Returns:
segmentation_tasks : dict

Dictionary containing the numberOfJobs, completedJobs, and failedJobs. completedJobs is a list of SegmentationTask objects, while failedJobs is a list of dictionaries indicating problems with submitted tasks.

Return type:

SegmentationTaskCreatedResponse
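
Indexing `completedJobs` directly fails with an IndexError when every job failed, so it can help to inspect `failedJobs` first. The sketch below splits the result using only the keys documented above. `split_create_result` is a hypothetical helper, and the demo dictionary is a stand-in for a real `SegmentationTask.create(...)` result.

```python
def split_create_result(result):
    """Return (completed_tasks, failed_jobs) from a SegmentationTask.create result."""
    completed = list(result["completedJobs"])
    failed = list(result["failedJobs"])
    for problem in failed:
        # Each entry is a dictionary describing what went wrong.
        print(f"segmentation task failed: {problem}")
    if not completed:
        raise RuntimeError("no segmentation task completed")
    return completed, failed


# Stand-in result with the documented keys.
demo_result = {"numberOfJobs": 1, "completedJobs": ["task-object"], "failedJobs": []}
tasks, failures = split_create_result(demo_result)
print(len(tasks), len(failures))
```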

classmethod list(project_id)

List all of the segmentation tasks that have been created for a specific project_id.

Parameters:
project_id : str

The id of the parent project.

Returns:
segmentation_tasks : list of SegmentationTask

List of instances with initialized data.

Return type:

List[SegmentationTask]
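
A project may accumulate several segmentation tasks, so the listed tasks often need filtering. The sketch below filters by the documented `type` attribute. `tasks_of_type` is a hypothetical helper, and the stand-in objects replace what `SegmentationTask.list(project_id)` would return.

```python
from types import SimpleNamespace


def tasks_of_type(tasks, job_type):
    """Return the tasks whose `type` equals `job_type`, e.g. 'auto_ts'."""
    return [task for task in tasks if task.type == job_type]


# Stand-ins shaped like SegmentationTask objects; ids are made up.
demo_tasks = [
    SimpleNamespace(id="t1", type="auto_ml", segments_count=3),
    SimpleNamespace(id="t2", type="auto_ts", segments_count=5),
]
print([task.id for task in tasks_of_type(demo_tasks, "auto_ts")])
```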

classmethod get(project_id, segmentation_task_id)

Retrieve information for a single segmentation task associated with a project_id.

Parameters:
project_id : str

The id of the parent project.

segmentation_task_id : str

The id of the segmentation task.

Returns:
segmentation_task : SegmentationTask

Instance with initialized data.

Return type:

SegmentationTask

class datarobot.SegmentInfo(project_id, segment, project_stage, project_status_error, autopilot_done, model_count=None, model_id=None)

A SegmentInfo is an object containing information about a combined model segment.

Attributes:
project_id : str

The associated id of the child project.

segment : str

The name of the segment.

project_stage : str

A description of the current stage of the project.

project_status_error : str

Project status error message.

autopilot_done : bool

Whether autopilot has finished for the project.

model_count : int

Count of trained models in the project.

model_id : str

The id of the segment champion model.

classmethod list(project_id, model_id)

List all of the segments that have been created for a specific project_id.

Parameters:
project_id : str

The id of the parent project.

model_id : str

The id of the combined model whose segments are listed.

Returns:
segments : list of datarobot.models.segmentation.SegmentInfo

List of instances with initialized data.

Return type:

List[SegmentInfo]
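
One common use of the listed segments is a lookup from segment name to its champion model. The sketch below builds that mapping from the documented `segment` and `model_id` attributes and skips segments without a champion. `champions_by_segment` is a hypothetical helper, and the stand-in objects replace real SegmentInfo instances.

```python
from types import SimpleNamespace


def champions_by_segment(segments):
    """Map each segment name to its champion model id, skipping segments without one."""
    return {info.segment: info.model_id for info in segments if info.model_id}


# Stand-ins shaped like SegmentInfo; ids are made up.
demo_infos = [
    SimpleNamespace(segment="north", model_id="m1"),
    SimpleNamespace(segment="south", model_id=None),
    SimpleNamespace(segment="west", model_id="m7"),
]
print(champions_by_segment(demo_infos))
```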

class datarobot.models.segmentation.SegmentationTask(id, project_id, name, type, created, segments_count, segments, metadata, data)

This class is documented above under datarobot.SegmentationTask; its description, example, attributes, and methods are identical.

class datarobot.models.segmentation.SegmentationTaskCreatedResponse(*args, **kwargs)