Clustering
- class datarobot.models.ClusteringModel
ClusteringModel extends the Model class. It provides properties and methods specific to clustering projects.
- compute_insights(max_wait=600)
Compute and retrieve cluster insights for the model. This method waits for the job that computes cluster insights to complete and returns the results once it has finished. If the computation takes longer than the specified max_wait, an exception is raised.
- Parameters:
project_id (str) – Project to start creation in.
model_id (str) – Project’s model to start creation in.
max_wait (int) – Maximum number of seconds to wait before giving up.
- Return type:
List of ClusterInsight
- Raises:
ClientError – Server rejected creation due to client error. The most likely cause is a bad project_id or model_id.
AsyncFailureError – If any of the responses from the server are unexpected.
AsyncProcessUnsuccessfulError – If the cluster insights computation has failed or was cancelled.
AsyncTimeoutError – If the cluster insights computation did not resolve in time.
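As a rough illustration of the call described above, the sketch below wraps compute_insights in a small helper that indexes the returned insights by feature name. The helper name impact_by_feature and the stand-in model object are hypothetical and exist only so the snippet runs offline. In real use, `model` would be a ClusteringModel that has already been retrieved, and the call would contact the DataRobot server.

```python
from types import SimpleNamespace


def impact_by_feature(model, max_wait=600):
    """Compute cluster insights and map each feature name to its impact.

    `model` is expected to behave like a ClusteringModel: it must expose
    compute_insights(max_wait=...) and return objects that carry
    feature_name and feature_impact attributes.
    """
    insights = model.compute_insights(max_wait=max_wait)
    return {insight.feature_name: insight.feature_impact for insight in insights}


# Stand-in for a real ClusteringModel (hypothetical data, no server involved).
fake_insights = [
    SimpleNamespace(feature_name="age", feature_impact=0.8),
    SimpleNamespace(feature_name="income", feature_impact=0.3),
]
fake_model = SimpleNamespace(compute_insights=lambda max_wait=600: fake_insights)

impacts = impact_by_feature(fake_model, max_wait=60)
print(impacts)
```

Passing a smaller max_wait makes the call give up sooner, so the caller should be prepared for the timeout exception listed under Raises.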
- property insights: List[ClusterInsight]
Return the list of cluster insights if they have already been computed.
- Return type:
List of ClusterInsight
- update_cluster_names(cluster_name_mappings)
Change many cluster names at once based on a list of name mappings.
- Parameters:
cluster_name_mappings (List of tuples) – Cluster name mappings, each consisting of the current cluster name and the new cluster name. Example:
cluster_name_mappings = [("current cluster name 1", "new cluster name 1"), ("current cluster name 2", "new cluster name 2")]
- Return type:
List of Cluster
- Raises:
datarobot.errors.ClientError – Server rejected update of cluster names. Possible reasons include: incorrect format of mapping, mapping introduces duplicates.
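Because the server rejects badly formatted mappings and mappings that introduce duplicates, it can help to check a mapping list locally before sending it. The following is a minimal sketch of such a check. The helper name check_name_mappings is hypothetical and not part of the library, and the server remains the final authority on what is accepted.

```python
def check_name_mappings(cluster_name_mappings):
    """Raise ValueError if a mapping list is malformed or repeats a name.

    Each entry must be a (current_name, new_name) pair. The same current
    name may not be renamed twice, and the same new name may not be
    assigned to two clusters.
    """
    current_names = set()
    new_names = set()
    for pair in cluster_name_mappings:
        if not isinstance(pair, (tuple, list)) or len(pair) != 2:
            raise ValueError(f"Expected a (current, new) pair, got {pair!r}")
        current, new = pair
        if current in current_names:
            raise ValueError(f"Cluster {current!r} is renamed more than once")
        if new in new_names:
            raise ValueError(f"New name {new!r} is used more than once")
        current_names.add(current)
        new_names.add(new)
    return list(cluster_name_mappings)


mappings = check_name_mappings(
    [("Cluster 1", "Frequent buyers"), ("Cluster 2", "Dormant")]
)
print(mappings)

# With a retrieved ClusteringModel, the checked list would then be passed on:
#   model.update_cluster_names(mappings)
```

This local check cannot see clusters that are not being renamed, so a new name that collides with an existing, untouched cluster would still only be caught by the server.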
- update_cluster_name(current_name, new_name)
Change cluster name from current_name to new_name.
- Parameters:
current_name (str) – Current cluster name.
new_name (str) – New cluster name.
- Return type:
List of Cluster
- Raises:
datarobot.errors.ClientError – Server rejected update of cluster names.
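Since update_cluster_name returns a list of Cluster objects, one possible pattern is to confirm that the new name appears in the result. The sketch below assumes the returned list reflects the state after the rename, which the text above does not spell out. The helper rename_and_confirm and the FakeClusteringModel class are hypothetical stand-ins so the snippet runs without a server.

```python
from types import SimpleNamespace


def rename_and_confirm(model, current_name, new_name):
    """Rename one cluster and return the cluster names reported afterwards."""
    clusters = model.update_cluster_name(current_name, new_name)
    names = [cluster.name for cluster in clusters]
    if new_name not in names:
        raise RuntimeError(f"{new_name!r} not found after rename: {names!r}")
    return names


class FakeClusteringModel:
    """Offline stand-in that mimics update_cluster_name on a list of names."""

    def __init__(self, names):
        self.names = list(names)

    def update_cluster_name(self, current_name, new_name):
        self.names = [new_name if n == current_name else n for n in self.names]
        return [SimpleNamespace(name=n, percent=None) for n in self.names]


fake_model = FakeClusteringModel(["Cluster 1", "Cluster 2"])
names = rename_and_confirm(fake_model, "Cluster 1", "Frequent buyers")
print(names)
```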
- class datarobot.models.cluster.Cluster
Representation of a single cluster.
- Variables:
name (str) – Current cluster name.
percent (float) – Percent of data contained in the cluster. This value is reported after cluster insights are computed for the model.
- classmethod list(project_id, model_id)
Retrieve a list of clusters in the model.
- Parameters:
project_id (str) – ID of the project that the model is part of.
model_id (str) – ID of the model.
- Return type:
List of clusters
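The list returned by Cluster.list can be summarized with ordinary Python. The sketch below orders clusters by their share of the data. It assumes that percent is None until cluster insights have been computed, which is an interpretation of the note on the percent variable rather than documented behavior. The helper name clusters_by_size and the sample objects are hypothetical.

```python
from types import SimpleNamespace


def clusters_by_size(clusters):
    """Return (name, percent) pairs, largest cluster first.

    Clusters whose percent is None (assumed to mean "not reported yet")
    are placed at the end.
    """
    def sort_key(cluster):
        missing = cluster.percent is None
        return (missing, -(cluster.percent or 0.0))

    ordered = sorted(clusters, key=sort_key)
    return [(cluster.name, cluster.percent) for cluster in ordered]


# Stand-ins for the objects Cluster.list(project_id, model_id) would return.
sample = [
    SimpleNamespace(name="Cluster 1", percent=20.0),
    SimpleNamespace(name="Cluster 2", percent=55.5),
    SimpleNamespace(name="Cluster 3", percent=None),
]
summary = clusters_by_size(sample)
print(summary)
```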
- classmethod update_multiple_names(project_id, model_id, cluster_name_mappings)
Update many clusters at once based on a list of name mappings.
- Parameters:
project_id (str) – ID of the project that the model is part of.
model_id (str) – ID of the model.
cluster_name_mappings (List of tuples) – Cluster name mappings, consisting of the current and new names for each cluster. Example:
cluster_name_mappings = [("current cluster name 1", "new cluster name 1"), ("current cluster name 2", "new cluster name 2")]
- Return type:
List of clusters
- Raises:
datarobot.errors.ClientError – Server rejected update of cluster names.
ValueError – Invalid cluster name mapping provided.
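When renames are kept in a dictionary, they need to be converted to the list-of-tuples form this method expects. The sketch below shows one way to do that. The helper mappings_from_dict and the cluster names are hypothetical, and the commented call at the end uses placeholder IDs.

```python
def mappings_from_dict(renames):
    """Convert {current_name: new_name} into a list of (current, new) tuples.

    Entries whose name would not change are skipped, since they carry no
    information for the update.
    """
    return [(current, new) for current, new in renames.items() if current != new]


renames = {
    "Cluster 1": "Frequent buyers",
    "Cluster 2": "Cluster 2",  # unchanged, so it is dropped
    "Cluster 3": "Dormant",
}
mappings = mappings_from_dict(renames)
print(mappings)

# With a real project, the list would then be passed on, for example:
#   from datarobot.models.cluster import Cluster
#   Cluster.update_multiple_names(project_id, model_id, mappings)
```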
- classmethod update_name(project_id, model_id, current_name, new_name)
Change cluster name from current_name to new_name.
- Parameters:
project_id (str) – ID of the project that the model is part of.
model_id (str) – ID of the model.
current_name (str) – Current cluster name.
new_name (str) – New cluster name.
- Return type:
List of Cluster
- class datarobot.models.cluster_insight.ClusterInsight
Holds data on all insights related to a feature, as well as a breakdown per cluster.
- Parameters:
feature_name (str) – Name of a feature from the dataset.
feature_type (str) – Type of feature.
insights (List[ClusterInsight]) – List providing information about the importance of a specific feature in relation to each cluster. The results help explain how the model groups data and what each cluster represents.
feature_impact (float) – Impact of a feature ranging from 0 to 1.
- classmethod compute(project_id, model_id, max_wait=600)
Starts creation of cluster insights for the model and, if successful, returns the computed ClusterInsights. This method allows the calculation to continue for a specified time and, if it is not complete by then, cancels the request.
- Parameters:
project_id (str) – ID of the project to begin creation of cluster insights for.
model_id (str) – ID of the project model to begin creation of cluster insights for.
max_wait (int) – Maximum number of seconds to wait before canceling the request.
- Return type:
List[ClusterInsight]
- Raises:
ClientError – Server rejected creation due to client error. The most likely cause is a bad project_id or model_id.
AsyncFailureError – Raised if any of the responses from the server are unexpected.
AsyncProcessUnsuccessfulError – Raised if the cluster insights computation failed or was cancelled.
AsyncTimeoutError – Raised if the cluster insights computation did not resolve within the specified time limit (max_wait).
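Once a list of ClusterInsight objects is available, the feature_impact value can be used to rank features. The sketch below is one way to pick the most influential features. The helper top_features and the sample objects are hypothetical, and skipping entries with a None impact is a defensive assumption rather than documented behavior.

```python
from types import SimpleNamespace


def top_features(insights, n=3):
    """Return the names of the n features with the highest feature_impact."""
    scored = [i for i in insights if i.feature_impact is not None]
    ranked = sorted(scored, key=lambda i: i.feature_impact, reverse=True)
    return [i.feature_name for i in ranked[:n]]


# Stand-ins for the objects that
# ClusterInsight.compute(project_id, model_id, max_wait=600) would return.
sample_insights = [
    SimpleNamespace(feature_name="age", feature_impact=0.42),
    SimpleNamespace(feature_name="income", feature_impact=0.91),
    SimpleNamespace(feature_name="region", feature_impact=None),
    SimpleNamespace(feature_name="tenure", feature_impact=0.67),
]
best = top_features(sample_insights, n=2)
print(best)
```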