Clustering
- class datarobot.models.ClusteringModel(id=None, processes=None, featurelist_name=None, featurelist_id=None, project_id=None, sample_pct=None, model_type=None, model_category=None, is_frozen=None, is_n_clusters_dynamically_determined=None, blueprint_id=None, metrics=None, monotonic_increasing_featurelist_id=None, monotonic_decreasing_featurelist_id=None, n_clusters=None, has_empty_clusters=None, supports_monotonic_constraints=None, is_starred=None, prediction_threshold=None, prediction_threshold_read_only=None, model_number=None, parent_model_id=None, supports_composable_ml=None, training_row_count=None, training_duration=None, training_start_date=None, training_end_date=None, data_selection_method=None, time_window_sample_pct=None, sampling_method=None, model_family_full_name=None, is_trained_into_validation=None, is_trained_into_holdout=None)
ClusteringModel extends the Model class. It provides properties and methods specific to clustering projects.
- compute_insights(max_wait=600)
Compute and retrieve cluster insights for the model. This method waits for the cluster insights job to complete and then returns the results. If the computation takes longer than the specified max_wait, an exception is raised.
- Parameters:
- project_id: str
Project to start creation in.
- model_id: str
Project’s model to start creation in.
- max_wait: int
Maximum number of seconds to wait before giving up
- Returns:
- List of ClusterInsight
- Raises:
- ClientError
Server rejected creation due to client error. Most likely cause is a bad project_id or model_id.
- AsyncFailureError
If any of the responses from the server are unexpected
- AsyncProcessUnsuccessfulError
If the cluster insights computation has failed or was cancelled.
- AsyncTimeoutError
If the cluster insights computation did not resolve in time
- Return type:
List[ClusterInsight]
- property insights: List[ClusterInsight]
Return the list of cluster insights if they have already been computed.
- Returns:
- List of ClusterInsight
- update_cluster_names(cluster_name_mappings)
Change many cluster names at once based on a list of name mappings.
- Parameters:
- cluster_name_mappings: List of tuples
Cluster name mappings, each consisting of the current cluster name and the new cluster name. Example:
cluster_name_mappings = [ ("current cluster name 1", "new cluster name 1"), ("current cluster name 2", "new cluster name 2")]
- Returns:
- List of Cluster
- Raises:
- datarobot.errors.ClientError
Server rejected update of cluster names. Possible reasons include: incorrect format of mapping, mapping introduces duplicates.
- Return type:
List[Cluster]
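Because the server rejects mappings that are badly formatted or that introduce duplicates, it can help to check a mapping locally before sending it. The following is a sketch only: build_name_mappings is a hypothetical helper, not part of the client, and its checks are assumptions about what a sensible mapping looks like rather than a copy of the server's validation.

```python
def build_name_mappings(renames, existing_names):
    """Turn a {current_name: new_name} dict into a validated list of tuples.

    Raises ValueError if a current name is unknown or if the renames would
    leave two clusters with the same name.
    """
    existing = list(existing_names)
    unknown = [name for name in renames if name not in existing]
    if unknown:
        raise ValueError(f"Unknown cluster names: {unknown}")

    # Names every cluster would have once the renames are applied.
    final_names = [renames.get(name, name) for name in existing]
    if len(set(final_names)) != len(final_names):
        raise ValueError("Renaming would introduce duplicate cluster names")

    return [(current, new) for current, new in renames.items()]


mappings = build_name_mappings(
    {"Cluster 1": "High spenders", "Cluster 2": "Occasional buyers"},
    existing_names=["Cluster 1", "Cluster 2", "Cluster 3"],
)
# With a real ClusteringModel: model.update_cluster_names(mappings)
```

The resulting list has the (current name, new name) tuple shape shown in the example for cluster_name_mappings.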
- update_cluster_name(current_name, new_name)
Change cluster name from current_name to new_name.
- Parameters:
- current_name: str
Current cluster name.
- new_name: str
New cluster name.
- Returns:
- List of Cluster
- Raises:
- datarobot.errors.ClientError
Server rejected update of cluster names.
- Return type:
List[Cluster]
- class datarobot.models.cluster.Cluster(**kwargs)
Representation of a single cluster.
- Attributes:
- name: str
Current cluster name
- percent: float
Percent of data contained in the cluster. This value is reported after cluster insights are computed for the model.
- classmethod list(project_id, model_id)
Retrieve a list of clusters in the model.
- Parameters:
- project_id: str
ID of the project that the model is part of.
- model_id: str
ID of the model.
- Returns:
- List of clusters
- Return type:
List[Cluster]
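As an illustration of working with the name and percent attributes, the sketch below formats a list of clusters for display, largest share first. describe_clusters is a hypothetical helper. It assumes percent is None until cluster insights have been computed and that it is expressed on a 0 to 100 scale; neither is confirmed by the text above. SimpleNamespace objects stand in for real Cluster instances.

```python
from types import SimpleNamespace


def describe_clusters(clusters):
    """Return one display line per cluster, largest share first."""

    def share(cluster):
        # Clusters without a reported percent sort last.
        return cluster.percent if cluster.percent is not None else -1.0

    lines = []
    for cluster in sorted(clusters, key=share, reverse=True):
        if cluster.percent is None:
            lines.append(f"{cluster.name}: n/a")
        else:
            lines.append(f"{cluster.name}: {cluster.percent:.1f}%")
    return lines


# In practice the list would come from Cluster.list(project_id, model_id).
sample = [
    SimpleNamespace(name="A", percent=25.0),
    SimpleNamespace(name="B", percent=None),
    SimpleNamespace(name="C", percent=75.0),
]
lines = describe_clusters(sample)
```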
- classmethod update_multiple_names(project_id, model_id, cluster_name_mappings)
Update many cluster names at once based on a list of name mappings.
- Parameters:
- project_id: str
ID of the project that the model is part of.
- model_id: str
ID of the model.
- cluster_name_mappings: List of tuples
Cluster name mappings, each consisting of the current and new name for a cluster. Example:
cluster_name_mappings = [ ("current cluster name 1", "new cluster name 1"), ("current cluster name 2", "new cluster name 2")]
- Returns:
- List of clusters
- Raises:
- datarobot.errors.ClientError
Server rejected update of cluster names.
- ValueError
Invalid cluster name mapping provided.
- Return type:
List[Cluster]
- classmethod update_name(project_id, model_id, current_name, new_name)
Change cluster name from current_name to new_name.
- Parameters:
- project_id: str
ID of the project that the model is part of.
- model_id: str
ID of the model.
- current_name: str
Current cluster name
- new_name: str
New cluster name
- Returns:
- List of Cluster
- Return type:
List[Cluster]
- class datarobot.models.cluster_insight.ClusterInsight(**kwargs)
Holds data on all insights related to a feature, as well as the breakdown per cluster.
- Parameters:
- feature_name: str
Name of a feature from the dataset.
- feature_type: str
Type of feature.
- insights: List of classes (ClusterInsight)
The list provides information regarding the importance of a specific feature in relation to each cluster. The results help explain how the model is grouping data and what each cluster represents.
- feature_impact: float
Impact of a feature ranging from 0 to 1.
- classmethod compute(project_id, model_id, max_wait=600)
Starts creation of cluster insights for the model and, if successful, returns the computed ClusterInsights. This method allows the calculation to continue for a specified time and, if it is not complete by then, cancels the request.
- Parameters:
- project_id: str
ID of the project to begin creation of cluster insights for.
- model_id: str
ID of the project model to begin creation of cluster insights for.
- max_wait: int
Maximum number of seconds to wait before canceling the request.
- Returns:
- List[ClusterInsight]
- Raises:
- ClientError
Server rejected creation due to client error. Most likely cause is a bad project_id or model_id.
- AsyncFailureError
Raised if any of the responses from the server are unexpected.
- AsyncProcessUnsuccessfulError
Raised if the cluster insights computation failed or was cancelled.
- AsyncTimeoutError
Raised if the cluster insights computation did not resolve within the specified time limit (max_wait).
- Return type:
List[ClusterInsight]
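A short sketch of how the result of ClusterInsight.compute might be used: rank features by feature_impact and keep the strongest ones. top_features and compute_top_features are hypothetical helpers. Returning an empty list on timeout is a design choice made for this example, not client behavior, and the sketch assumes AsyncTimeoutError is importable from datarobot.errors. The client import is done lazily inside the second function so that the ranking helper can be used on its own.

```python
from types import SimpleNamespace


def top_features(insights, n=5):
    """Return (feature_name, feature_impact) pairs, highest impact first."""
    scored = [i for i in insights if i.feature_impact is not None]
    scored.sort(key=lambda i: i.feature_impact, reverse=True)
    return [(i.feature_name, i.feature_impact) for i in scored[:n]]


def compute_top_features(project_id, model_id, n=5, max_wait=600):
    """Compute insights through the client, then rank them."""
    # Imported lazily so top_features() works without the client installed.
    from datarobot.errors import AsyncTimeoutError
    from datarobot.models.cluster_insight import ClusterInsight

    try:
        insights = ClusterInsight.compute(project_id, model_id, max_wait=max_wait)
    except AsyncTimeoutError:
        # The job did not finish within max_wait seconds.
        return []
    return top_features(insights, n=n)


# Stand-ins for ClusterInsight objects, for illustration only.
sample = [
    SimpleNamespace(feature_name="age", feature_impact=0.2),
    SimpleNamespace(feature_name="income", feature_impact=0.9),
    SimpleNamespace(feature_name="zip", feature_impact=None),
]
ranked = top_features(sample, n=2)
```

ClientError, AsyncFailureError, and AsyncProcessUnsuccessfulError are deliberately left uncaught here so that configuration and server problems surface to the caller.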