Segmented Modeling
API reference for entities used in Segmented Modeling. See the dedicated User Guide for examples.
- class datarobot.CombinedModel(id=None, project_id=None, segmentation_task_id=None, is_active_combined_model=False)
A model from a segmented project. It combines the ordinary models from the child segment projects.
- Attributes:
- id : str
The id of the model.
- project_id : str
The id of the project the model belongs to.
- segmentation_task_id : str
The id of the segmentation task used in this model.
- is_active_combined_model : bool
Flag indicating whether this is the active combined model in the segmented project.
- classmethod get(project_id, combined_model_id)
Retrieve combined model
- Parameters:
- project_id : str
The project’s id.
- combined_model_id : str
Id of the combined model.
- Returns:
- CombinedModel
The queried combined model.
- Return type:
CombinedModel
- classmethod set_segment_champion(project_id, model_id, clone=False)
Update a segment champion in a combined model by setting the model_id that belongs to the child project_id as the champion.
- Parameters:
- project_id : str
The id of the child project that contains the model.
- model_id : str
The id of the model to mark as the champion.
- clone : bool
(New in version v2.29) Optional, defaults to False. Whether to clone the combined model before setting the champion. If True, the champion is set on the new (cloned) combined model.
- Returns:
- combined_model_id : str
Id of the combined model that was updated.
- Return type:
str
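The champion update described above can be sketched as follows. This is a minimal illustration, assuming a configured DataRobot client; `promote_champion` and its `combined_model_cls` parameter are hypothetical names introduced here so the call can be exercised against a stand-in class without a live connection.

```python
def promote_champion(child_project_id, model_id, clone=True, combined_model_cls=None):
    """Set `model_id` as the champion of its segment and return the combined model id.

    `combined_model_cls` is a hypothetical injection point for testing; when it
    is omitted, the real datarobot.CombinedModel class is used.
    """
    if combined_model_cls is None:
        # Imported lazily so the helper can be defined without the client installed.
        from datarobot import CombinedModel as combined_model_cls

    # With clone=True the combined model is cloned first and the champion is
    # set on the clone; the returned value is the id of the updated combined model.
    return combined_model_cls.set_segment_champion(
        project_id=child_project_id,
        model_id=model_id,
        clone=clone,
    )
```

Passing `clone=True` is a design choice for this sketch: it leaves the existing combined model unchanged, which may be preferable when that model is already in use.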
- get_segments_info()
Retrieve Combined Model segments info
- Returns:
- list[SegmentInfo]
List of segments
- Return type:
List[SegmentInfo]
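One possible use of the list returned by get_segments_info() is to check which segments are ready. The sketch below relies only on the documented SegmentInfo attributes (segment, project_status_error, autopilot_done, model_id). The helper name `split_segments_by_status` is hypothetical, and treating a non-empty project_status_error as a failure is an assumption of this example, not documented behavior.

```python
def split_segments_by_status(segments):
    """Partition SegmentInfo-like objects into ready, pending, and failed segment names."""
    ready, pending, failed = [], [], []
    for info in segments:
        if info.project_status_error:
            # Assumption: any non-empty error message marks the segment as failed.
            failed.append(info.segment)
        elif info.autopilot_done and info.model_id:
            # Autopilot has finished and a champion model id is present.
            ready.append(info.segment)
        else:
            pending.append(info.segment)
    return {"ready": ready, "pending": pending, "failed": failed}


# Intended usage with a live client (not executed here):
# combined_model = CombinedModel.get(project_id, combined_model_id)
# status = split_segments_by_status(combined_model.get_segments_info())
```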
- get_segments_as_dataframe(encoding='utf-8')
Retrieve Combined Model segments as a DataFrame.
- Parameters:
- encoding : str, optional
A string representing the encoding to use in the output csv file. Defaults to ‘utf-8’.
- Returns:
- DataFrame
Combined model segments
- Return type:
DataFrame
- get_segments_as_csv(filename, encoding='utf-8')
Save the Combined Model segments to a CSV file.
- Parameters:
- filename : str or file object
The path or file object to save the data to.
- encoding : str, optional
A string representing the encoding to use in the output csv file. Defaults to ‘utf-8’.
- Return type:
None
- train(sample_pct=None, featurelist_id=None, scoring_type=None, training_row_count=None, monotonic_increasing_featurelist_id=<object object>, monotonic_decreasing_featurelist_id=<object object>)
Inherited from Model - CombinedModels cannot be retrained directly
- Return type:
NoReturn
- train_datetime(featurelist_id=None, training_row_count=None, training_duration=None, time_window_sample_pct=None, monotonic_increasing_featurelist_id=<object object>, monotonic_decreasing_featurelist_id=<object object>, use_project_settings=False, sampling_method=None, n_clusters=None)
Inherited from Model - CombinedModels cannot be retrained directly
- Return type:
NoReturn
- retrain(sample_pct=None, featurelist_id=None, training_row_count=None, n_clusters=None)
Inherited from Model - CombinedModels cannot be retrained directly
- Return type:
NoReturn
- request_frozen_model(sample_pct=None, training_row_count=None)
Inherited from Model - CombinedModels cannot be retrained as frozen
- Return type:
NoReturn
- request_frozen_datetime_model(training_row_count=None, training_duration=None, training_start_date=None, training_end_date=None, time_window_sample_pct=None, sampling_method=None)
Inherited from Model - CombinedModels cannot be retrained as frozen
- Return type:
NoReturn
- cross_validate()
Inherited from Model - CombinedModels cannot request cross validation
- Return type:
NoReturn
- class datarobot.SegmentationTask(id, project_id, name, type, created, segments_count, segments, metadata, data)
A Segmentation Task is used for segmenting an existing project into multiple child projects. Each child project (or segment) will be a separate autopilot run. Currently only user defined segmentation is supported.
Example for creating a new SegmentationTask for Time Series segmentation with a user defined id column:
    from datarobot import SegmentationTask

    # Create the SegmentationTask
    segmentation_task_results = SegmentationTask.create(
        project_id=project.id,
        target=target,
        use_time_series=True,
        datetime_partition_column=datetime_partition_column,
        multiseries_id_columns=[multiseries_id_column],
        user_defined_segment_id_columns=[user_defined_segment_id_column]
    )

    # Retrieve the completed SegmentationTask object from the job results
    segmentation_task = segmentation_task_results['completedJobs'][0]
- Attributes:
- id : ObjectId
The id of the segmentation task.
- project_id : ObjectId
The associated id of the parent project.
- type : str
What type of job the segmentation task is associated with, e.g. auto_ml or auto_ts.
- created : datetime
The date this segmentation task was created.
- segments_count : int
The number of segments the segmentation task generated.
- segments : list of strings
The segment names that the segmentation task generated.
- metadata : dict
List of features that help to identify the parameters used by the segmentation task.
- data : dict
Optional parameters that are associated with enabled metadata for the segmentation task.
- classmethod from_data(data)
Instantiate an object of this class using a dict.
- Parameters:
- data : dict
Correctly snake_cased keys and their values.
- Return type:
SegmentationTask
- collect_payload()
Convert the record to a dictionary
- Return type:
Dict[str, str]
- classmethod create(project_id, target, use_time_series=False, datetime_partition_column=None, multiseries_id_columns=None, user_defined_segment_id_columns=None, max_wait=600, model_package_id=None)
Creates segmentation tasks for the project based on the defined parameters.
- Parameters:
- project_id : str
The associated id of the parent project.
- target : str
The column that represents the target in the dataset.
- use_time_series : bool
Whether to generate AutoTS (True) or AutoML (False) segmentations.
- datetime_partition_column : str or null
Required for Time Series. The name of the column whose values as dates are used to assign a row to a particular partition.
- multiseries_id_columns : list of str or null
Required for Time Series. A list of the names of multiseries id columns to define series within the training data. Currently only one multiseries id column is supported.
- user_defined_segment_id_columns : list of str or null
Required when using a column for segmentation. A list of the segment id columns used to manually segment the data. Currently only one user defined segment id column is supported.
- model_package_id : str
Required when using automated segmentation. The associated id of the model in the DataRobot Model Registry that will be used to perform automated segmentation on a dataset.
- max_wait : integer
The maximum number of seconds to wait. Defaults to 600.
- Returns:
- segmentation_tasks : dict
Dictionary containing the numberOfJobs, completedJobs, and failedJobs. completedJobs is a list of SegmentationTask objects, while failedJobs is a list of dictionaries indicating problems with submitted tasks.
- Return type:
SegmentationTaskCreatedResponse
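Because create() reports failures through the failedJobs list rather than through the completed tasks, callers may want to check both lists before indexing into completedJobs. The following is a minimal sketch under that reading of the return value; `completed_tasks_or_raise` is a hypothetical helper, and raising RuntimeError is a choice made for this example.

```python
def completed_tasks_or_raise(results):
    """Return the completed SegmentationTask objects, or raise if any job failed."""
    failed = results.get("failedJobs") or []
    if failed:
        # Each entry is a dictionary describing a problem with a submitted task.
        raise RuntimeError(f"{len(failed)} segmentation job(s) failed: {failed}")

    completed = results.get("completedJobs") or []
    if not completed:
        raise RuntimeError("No segmentation task was completed")
    return completed


# Intended usage with a live client (not executed here):
# results = SegmentationTask.create(project_id=project.id, target=target, ...)
# segmentation_task = completed_tasks_or_raise(results)[0]
```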
- classmethod list(project_id)
List all of the segmentation tasks that have been created for a specific project_id.
- Parameters:
- project_idstr
The id of the parent project
- Returns:
- segmentation_tasks : list of SegmentationTask
List of instances with initialized data.
- Return type:
List[SegmentationTask]
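When a project has several segmentation tasks, the list returned by list() can be narrowed using the documented `type` and `created` attributes. The sketch below picks the most recently created task, optionally restricted to one type. `latest_task` is a hypothetical helper, and it assumes `created` values are comparable datetimes, as the attribute description suggests.

```python
def latest_task(tasks, task_type=None):
    """Return the most recently created task, optionally filtered by type.

    Returns None when no task matches.
    """
    candidates = [t for t in tasks if task_type is None or t.type == task_type]
    if not candidates:
        return None
    # `created` is documented as a datetime, so max() orders tasks chronologically.
    return max(candidates, key=lambda t: t.created)


# Intended usage with a live client (not executed here):
# task = latest_task(SegmentationTask.list(project_id), task_type="auto_ts")
```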
- classmethod get(project_id, segmentation_task_id)
Retrieve information for a single segmentation task associated with a project_id.
- Parameters:
- project_id : str
The id of the parent project
- segmentation_task_id : str
The id of the segmentation task
- Returns:
- segmentation_task : SegmentationTask
Instance with initialized data.
- Return type:
SegmentationTask
- class datarobot.SegmentInfo(project_id, segment, project_stage, project_status_error, autopilot_done, model_count=None, model_id=None)
A SegmentInfo is an object containing information about the combined model segments
- Attributes:
- project_id : str
The associated id of the child project.
- segment : str
The name of the segment.
- project_stage : str
A description of the current stage of the project.
- project_status_error : str
Project status error message.
- autopilot_done : bool
Whether autopilot is done for the project.
- model_count : int
Count of trained models in the project.
- model_id : str
ID of the segment champion model.
- classmethod list(project_id, model_id)
List all of the segments that have been created for a specific project_id.
- Parameters:
- project_id : str
The id of the parent project
- model_id : str
The id of the combined model
- Returns:
- segments : list of datarobot.models.segmentation.SegmentInfo
List of instances with initialized data.
- Return type:
List[SegmentInfo]
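A common follow-up to listing segments is to look up each segment's champion. The sketch below builds that mapping from the documented `segment` and `model_id` attributes. `champions_by_segment` is a hypothetical helper, and the usage comment assumes that the `model_id` argument of list() is the id of the combined model.

```python
def champions_by_segment(segments):
    """Map each segment name to its champion model id, skipping segments without one."""
    return {info.segment: info.model_id for info in segments if info.model_id}


# Intended usage with a live client (not executed here):
# segments = SegmentInfo.list(project_id, combined_model_id)
# champions = champions_by_segment(segments)
```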
- class datarobot.models.segmentation.SegmentationTask(id, project_id, name, type, created, segments_count, segments, metadata, data)
- class datarobot.models.segmentation.SegmentationTaskCreatedResponse(*args, **kwargs)