Feature association

class datarobot.models.FeatureAssociationMatrix

Feature association statistics for a project.

Note

Projects created prior to v2.17 are not supported by this feature.

Variables:
  • project_id (str) – Id of the associated project.

  • strengths (list of dict) – Pairwise statistics for the available features as structured below.

  • features (list of dict) – Metadata for each feature and where it goes in the matrix.

Examples

import datarobot as dr
from datarobot import enums

# retrieve feature association matrix
feature_association_matrix = dr.FeatureAssociationMatrix.get(project_id)
feature_association_matrix.strengths
feature_association_matrix.features

# retrieve feature association matrix for a metric, association type or a feature list
feature_association_matrix = dr.FeatureAssociationMatrix.get(
    project_id,
    metric=enums.FEATURE_ASSOCIATION_METRIC.SPEARMAN,
    association_type=enums.FEATURE_ASSOCIATION_TYPE.CORRELATION,
    featurelist_id=featurelist_id,
)
classmethod get(project_id, metric=None, association_type=None, featurelist_id=None)

Get feature association statistics.

Parameters:
  • project_id (str) – Id of the project that contains the requested associations.

  • metric (enums.FEATURE_ASSOCIATION_METRIC) – The name of a metric to get pairwise data for. Since ‘v2.19’ this is optional and defaults to enums.FEATURE_ASSOCIATION_METRIC.MUTUAL_INFO.

  • association_type (enums.FEATURE_ASSOCIATION_TYPE) – The type of dependence for the data. Since ‘v2.19’ this is optional and defaults to enums.FEATURE_ASSOCIATION_TYPE.ASSOCIATION.

  • featurelist_id (str or None) – Optional, the feature list to look up FAM data for. By default, the “Informative Features” or “Timeseries Informative Features” list is used, depending on the type of the project. (New in version v2.19)

Returns:

Feature association pairwise metric strength data, feature clustering data, and ordering data for Feature Association Matrix visualization.

Return type:

FeatureAssociationMatrix
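The pairwise records in strengths can be reshaped into a symmetric lookup table for easier inspection. The following is a minimal sketch, not part of the client: it assumes each strengths entry carries the keys feature1, feature2, and statistic, and that each features entry carries the key feature. The helper name strengths_to_matrix and the sample data are hypothetical.

```python
def strengths_to_matrix(strengths, features):
    """Build a nested dict so that matrix[a][b] is the statistic for the pair (a, b).

    Assumes each record in `strengths` has the keys "feature1", "feature2",
    and "statistic", and each record in `features` has the key "feature".
    """
    # Seed the rows in the order given by the feature metadata.
    matrix = {f["feature"]: {} for f in features}
    for record in strengths:
        a, b = record["feature1"], record["feature2"]
        value = record["statistic"]
        # Pairwise statistics are symmetric, so fill both directions.
        matrix.setdefault(a, {})[b] = value
        matrix.setdefault(b, {})[a] = value
    return matrix


# Hypothetical sample data shaped like the attributes described above.
sample_strengths = [
    {"feature1": "age", "feature2": "age", "statistic": 1.0},
    {"feature1": "age", "feature2": "income", "statistic": 0.42},
    {"feature1": "income", "feature2": "income", "statistic": 1.0},
]
sample_features = [{"feature": "age"}, {"feature": "income"}]

matrix = strengths_to_matrix(sample_strengths, sample_features)
print(matrix["income"]["age"])
```

With a real project, the same helper could be called as strengths_to_matrix(feature_association_matrix.strengths, feature_association_matrix.features), provided the records use the assumed keys.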

classmethod create(project_id, featurelist_id)

Compute the Feature Association Matrix for a feature list.

Parameters:
  • project_id (str) – The ID of the project that the feature list belongs to.

  • featurelist_id (str) – The ID of the feature list for which insights are requested.

Returns:

status_check_job – An object that contains the logic needed to periodically check the status of an async job.

Return type:

StatusCheckJob
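Because create returns a job rather than the matrix itself, a typical flow is to wait for the job and then call get for the same feature list. The sketch below illustrates that flow under stated assumptions: it assumes StatusCheckJob exposes wait_for_completion(max_wait=...), and the helper name compute_matrix and its matrix_cls parameter are hypothetical conveniences that allow the flow to be exercised with a stand-in class.

```python
def compute_matrix(project_id, featurelist_id, max_wait=600, matrix_cls=None):
    """Request a Feature Association Matrix, wait for it, then fetch it.

    `matrix_cls` is a hypothetical hook for substituting a stand-in class;
    by default the real client class is used.
    """
    if matrix_cls is None:
        # Imported lazily so that this sketch can be loaded without the client.
        import datarobot as dr

        matrix_cls = dr.FeatureAssociationMatrix

    # Start the asynchronous computation for the given feature list.
    job = matrix_cls.create(project_id, featurelist_id)

    # Assumption: the returned job offers wait_for_completion(max_wait=...).
    job.wait_for_completion(max_wait=max_wait)

    # Once the job has finished, retrieve the matrix for the same feature list.
    return matrix_cls.get(project_id, featurelist_id=featurelist_id)
```

A call such as compute_matrix(project_id, featurelist_id) would then return a FeatureAssociationMatrix for that feature list, assuming the job completes within max_wait seconds.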

Feature association matrix details

class datarobot.models.FeatureAssociationMatrixDetails

Plotting details for a pair of passed features present in the feature association matrix.

Note

Projects created prior to v2.17 are not supported by this feature.

Variables:
  • project_id (str) – Id of the project that contains the requested associations.

  • chart_type (str) – The type of plot used for the pair of features in the UI, e.g. ‘HORIZONTAL_BOX’, ‘VERTICAL_BOX’, ‘SCATTER’ or ‘CONTINGENCY’.

  • values (list) – The data triplets for pairwise plotting, e.g. {“values”: [[460.0, 428.5, 0.001], [1679.3, 259.0, 0.001], …]}. The first entry of each triplet is a value of feature1, the second is a value of feature2, and the third is the relative frequency of that pair of data points in the sample.

  • features (list) – A list of the requested features, [feature1, feature2]

  • types (list) – The type of feature1 and feature2. Possible values: “CATEGORICAL”, “NUMERIC”

  • featurelist_id (str) – Id of the feature list to look up FAM details for.

classmethod get(project_id, feature1, feature2, featurelist_id=None)

Get a sample of the actual values used to measure the association between a pair of features.

Added in version v2.17.

Parameters:
  • project_id (str) – Id of the project of interest.

  • feature1 (str) – Feature name for the first feature of interest.

  • feature2 (str) – Feature name for the second feature of interest.

  • featurelist_id (str) – Optional, the feature list to look up FAM data for. By default, the “Informative Features” or “Timeseries Informative Features” list is used, depending on the type of the project.

Returns:

The feature association plotting details for the provided pair of features.

Return type:

FeatureAssociationMatrixDetails
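Since the third entry of each triplet in values is a relative frequency, it can serve as a weight when summarizing the sample. The following sketch computes frequency-weighted means for a pair of features. It only makes sense when both features are numeric, and the helper name weighted_means is hypothetical. The sample triplets are the ones quoted in the values description.

```python
def weighted_means(values):
    """Return the frequency-weighted means of feature1 and feature2.

    Each item in `values` is a triplet [value1, value2, relative_frequency].
    Only meaningful when both features are numeric.
    """
    total = sum(weight for _, _, weight in values)
    if total == 0:
        raise ValueError("relative frequencies sum to zero")
    mean1 = sum(v1 * weight for v1, _, weight in values) / total
    mean2 = sum(v2 * weight for _, v2, weight in values) / total
    return mean1, mean2


# Triplets taken from the example in the `values` description.
sample_values = [[460.0, 428.5, 0.001], [1679.3, 259.0, 0.001]]
mean1, mean2 = weighted_means(sample_values)
print(mean1, mean2)
```

With a real details object, sample_values would be replaced by its values attribute, after checking that both entries of types are “NUMERIC”.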

Feature association featurelists

class datarobot.models.FeatureAssociationFeaturelists

Featurelists with feature association matrix availability flags for a project.

Variables:
  • project_id (str) – Id of the project that contains the requested associations.

  • featurelists (list of dict) – The featurelists, each with its featurelist_id, title, and has_fam flag.

classmethod get(project_id)

Get featurelists with feature association status for each.

Parameters:

project_id (str) – Id of the project of interest.

Returns:

The featurelists, each with its feature association status.

Return type:

FeatureAssociationFeaturelists
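The has_fam flag makes it possible to find feature lists that do not yet have a Feature Association Matrix. A minimal sketch, assuming each entry in featurelists is a dict with the keys featurelist_id, title, and has_fam as described above; the helper name featurelists_without_fam and the sample entries are hypothetical.

```python
def featurelists_without_fam(featurelists):
    """Return the ids of feature lists whose has_fam flag is not set."""
    return [
        entry["featurelist_id"]
        for entry in featurelists
        if not entry.get("has_fam")
    ]


# Hypothetical sample shaped like the `featurelists` attribute.
sample_featurelists = [
    {"featurelist_id": "fl-1", "title": "Informative Features", "has_fam": True},
    {"featurelist_id": "fl-2", "title": "My custom list", "has_fam": False},
]

missing = featurelists_without_fam(sample_featurelists)
print(missing)
```

With a real project, the input would be dr.FeatureAssociationFeaturelists.get(project_id).featurelists, and each returned id could then be passed to FeatureAssociationMatrix.create.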