Anomaly Assessment

class datarobot.models.anomaly_assessment.AnomalyAssessmentRecord(status, status_details, start_date, end_date, prediction_threshold, preview_location, delete_location, latest_explanations_location, **record_kwargs)

Object that holds metadata about an anomaly assessment insight for a particular subset, backtest, and series, along with the links used to retrieve the anomaly assessment data.

Added in version v2.25.

Notes

Record contains:

  • record_id : the ID of the record.

  • project_id : the project ID of the record.

  • model_id : the model ID of the record.

  • backtest : the backtest of the record.

  • source : the source of the record.

  • series_id : the series id of the record for the multiseries projects.

  • status : the status of the insight.

  • status_details : the explanation of the status.

  • start_date : the ISO-formatted timestamp of the first prediction in the subset. Will be None if status is not AnomalyAssessmentStatus.COMPLETED.

  • end_date : the ISO-formatted timestamp of the last prediction in the subset. Will be None if status is not AnomalyAssessmentStatus.COMPLETED.

  • prediction_threshold : the threshold; all rows with anomaly scores greater than or equal to it have shap explanations computed. Will be None if status is not AnomalyAssessmentStatus.COMPLETED.

  • preview_location : the URL to retrieve the predictions preview for the subset. Will be None if status is not AnomalyAssessmentStatus.COMPLETED.

  • latest_explanations_location : the URL to retrieve the latest predictions with the shap explanations. Will be None if status is not AnomalyAssessmentStatus.COMPLETED.

  • delete_location : the URL to delete anomaly assessment record and relevant insight data.
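Because several of the fields above are None until the insight has completed, callers will usually want to check status before reading them. The following is a minimal sketch of that check, not part of the library: summarize_record is a hypothetical helper, the completed status is supplied by the caller (for example AnomalyAssessmentStatus.COMPLETED), and SimpleNamespace objects with placeholder status strings stand in for real records so the snippet runs without a DataRobot connection.

```python
from types import SimpleNamespace
from typing import Any


def summarize_record(record: Any, completed_status: str) -> str:
    """Describe a record without reading fields that may be None."""
    if record.status != completed_status:
        # start_date, end_date, prediction_threshold, and the preview and
        # explanations locations are None here, so report the status instead.
        return f"{record.record_id}: {record.status} ({record.status_details})"
    return (
        f"{record.record_id}: {record.start_date} to {record.end_date}, "
        f"threshold {record.prediction_threshold}"
    )


# Stand-in objects; "DONE" and "OTHER" are placeholders, not real enum values.
done = SimpleNamespace(
    record_id="r1", status="DONE", status_details="ok",
    start_date="2020-01-01T00:00:00.000000Z",
    end_date="2020-02-01T00:00:00.000000Z",
    prediction_threshold=0.5,
)
pending = SimpleNamespace(
    record_id="r2", status="OTHER", status_details="no data",
    start_date=None, end_date=None, prediction_threshold=None,
)

print(summarize_record(done, completed_status="DONE"))
print(summarize_record(pending, completed_status="DONE"))
```

With a real record, pass the enum member as completed_status rather than a literal string.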

Attributes:
record_id: str

The ID of the record.

project_id: str

The ID of the project the record belongs to.

model_id: str

The ID of the model the record belongs to.

backtest: int or “holdout”

The backtest of the record.

source: “training” or “validation”

The source of the record.

series_id: str or None

The series id of the record for the multiseries projects. Defined only for the multiseries projects.

status: str

The status of the insight. One of datarobot.enums.AnomalyAssessmentStatus.

status_details: str

The explanation of the status.

start_date: str or None

See start_date info in Notes for more details.

end_date: str or None

See end_date info in Notes for more details.

prediction_threshold: float or None

See prediction_threshold info in Notes for more details.

preview_location: str or None

See preview_location info in Notes for more details.

latest_explanations_location: str or None

See latest_explanations_location info in Notes for more details.

delete_location: str

The URL to delete anomaly assessment record and relevant insight data.

classmethod list(project_id, model_id, backtest=None, source=None, series_id=None, limit=100, offset=0, with_data_only=False)

Retrieve the list of the anomaly assessment records for the project and model. Output can be filtered and limited.

Parameters:
project_id: str

The ID of the project the record belongs to.

model_id: str

The ID of the model the record belongs to.

backtest: int or “holdout”, optional

The backtest to filter records by.

source: “training” or “validation”, optional

The source to filter records by.

series_id: str, optional

The series id to filter records by. Can be specified for multiseries projects.

limit: int, optional

100 by default. At most this many results are returned.

offset: int, optional

This many results will be skipped.

with_data_only: bool, False by default

Filter by status == AnomalyAssessmentStatus.COMPLETED. If True, records whose insight has no data or is not supported are omitted.

Returns:
list of AnomalyAssessmentRecord

The anomaly assessment records that match the filters.

Return type:

List[AnomalyAssessmentRecord]

classmethod compute(project_id, model_id, backtest, source, series_id=None)

Request anomaly assessment insight computation on the specified subset.

Parameters:
project_id: str

The ID of the project to compute insight for.

model_id: str

The ID of the model to compute insight for.

backtest: int or “holdout”

The backtest to compute insight for.

source: “training” or “validation”

The source to compute insight for.

series_id: str, optional

The series id to compute insight for. Required for multiseries projects.

Returns:
AnomalyAssessmentRecord

The anomaly assessment record.

Return type:

AnomalyAssessmentRecord

delete()

Delete the anomaly assessment record along with its preview and explanations.

Return type:

None

get_predictions_preview()

Retrieve aggregated predictions statistics for the anomaly assessment record.

Returns:
AnomalyAssessmentPredictionsPreview
Return type:

AnomalyAssessmentPredictionsPreview

get_latest_explanations()

Retrieve latest predictions along with shap explanations for the most anomalous records.

Returns:
AnomalyAssessmentExplanations
Return type:

AnomalyAssessmentExplanations

get_explanations(start_date=None, end_date=None, points_count=None)

Retrieve predictions along with shap explanations for the most anomalous records, either within the specified date range or for a defined number of points. Two of the three parameters start_date, end_date, and points_count must be specified.

Parameters:
start_date: str, optional

The start of the date range to get explanations in. Example: 2020-01-01T00:00:00.000000Z

end_date: str, optional

The end of the date range to get explanations in. Example: 2020-10-01T00:00:00.000000Z

points_count: int, optional

The number of rows to return.

Returns:
AnomalyAssessmentExplanations
Return type:

AnomalyAssessmentExplanations
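The two-of-three rule can be checked on the client side before a request is made. The sketch below is a hypothetical helper, not part of the library, and it reads the rule as exactly two arguments; whether supplying all three is accepted is not stated here, so treat that reading as an assumption.

```python
from typing import Optional


def check_range_arguments(
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
    points_count: Optional[int] = None,
) -> None:
    """Raise ValueError unless exactly two of the three arguments are given."""
    provided = sum(
        value is not None for value in (start_date, end_date, points_count)
    )
    if provided != 2:
        raise ValueError(
            "Specify exactly two of start_date, end_date, and points_count; "
            f"got {provided}."
        )


# A full date range, and a start date plus a row count, both satisfy the rule.
check_range_arguments(
    start_date="2020-01-01T00:00:00.000000Z",
    end_date="2020-10-01T00:00:00.000000Z",
)
check_range_arguments(start_date="2020-01-01T00:00:00.000000Z", points_count=25)
```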

get_explanations_data_in_regions(regions, prediction_threshold=0.0)

Get predictions along with explanations for the specified regions, sorted by predictions in descending order.

Parameters:
regions: list of preview_bins

Explanations are retrieved for each region and merged.

prediction_threshold: float, optional

If specified, only points with a score greater than or equal to the threshold will be returned.

Returns:
dict in a form of {‘explanations’: explanations, ‘shap_base_value’: shap_base_value}
Return type:

RegionExplanationsData

class datarobot.models.anomaly_assessment.AnomalyAssessmentExplanations(shap_base_value, data, start_date, end_date, count, **record_kwargs)

Object that holds predictions along with shap explanations for the most anomalous records, either within the specified date range or for a defined number of points.

Added in version v2.25.

Notes

AnomalyAssessmentExplanations contains:

  • record_id : the ID of the corresponding anomaly assessment record.

  • project_id : the project ID of the corresponding anomaly assessment record.

  • model_id : the model ID of the corresponding anomaly assessment record.

  • backtest : the backtest of the corresponding anomaly assessment record.

  • source : the source of the corresponding anomaly assessment record.

  • series_id : the series id of the corresponding anomaly assessment record for the multiseries projects.

  • start_date : the ISO-formatted first timestamp in the response. Will be None if there is no data in the specified range.

  • end_date : the ISO-formatted last timestamp in the response. Will be None if there is no data in the specified range.

  • count : the number of points in the response.

  • shap_base_value : the shap base value.

  • data : list of DataPoint objects in the specified date range.

DataPoint contains:

  • shap_explanation : None or an array of up to 10 ShapleyFeatureContribution objects. Only rows with the highest anomaly scores have Shapley explanations calculated. Value is None if prediction is lower than prediction_threshold.

  • timestamp (str) : ISO-formatted timestamp for the row.

  • prediction (float) : The output of the model for this row.

ShapleyFeatureContribution contains:

  • feature_value (str) : the feature value for this row. First 50 characters are returned.

  • strength (float) : the shap value for this feature and row.

  • feature (str) : the feature name.
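To illustrate how these nested structures fit together, the sketch below walks a list of data points and reports the feature with the largest absolute shap value for each explained row. It assumes each DataPoint and ShapleyFeatureContribution is a plain dict keyed by the names listed above; strongest_features is a hypothetical helper and the sample values are made up.

```python
from typing import Any, Dict, List, Tuple


def strongest_features(
    data_points: List[Dict[str, Any]],
) -> List[Tuple[str, float, str, float]]:
    """Return (timestamp, prediction, feature, strength) per explained row."""
    rows = []
    for point in data_points:
        contributions = point["shap_explanation"]
        if not contributions:
            # None means the prediction fell below prediction_threshold.
            continue
        top = max(contributions, key=lambda c: abs(c["strength"]))
        rows.append(
            (point["timestamp"], point["prediction"],
             top["feature"], top["strength"])
        )
    return rows


# Made-up sample data in the documented shape.
sample = [
    {
        "timestamp": "2020-01-01T00:00:00.000000Z",
        "prediction": 0.91,
        "shap_explanation": [
            {"feature": "temperature", "feature_value": "41.5", "strength": 0.30},
            {"feature": "pressure", "feature_value": "1.2", "strength": -0.45},
        ],
    },
    {
        "timestamp": "2020-01-02T00:00:00.000000Z",
        "prediction": 0.10,
        "shap_explanation": None,
    },
]

rows = strongest_features(sample)
print(rows)
```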

Attributes:
record_id: str

The ID of the record.

project_id: str

The ID of the project the record belongs to.

model_id: str

The ID of the model the record belongs to.

backtest: int or “holdout”

The backtest of the record.

source: “training” or “validation”

The source of the record.

series_id: str or None

The series id of the record for the multiseries projects. Defined only for the multiseries projects.

start_date: str or None

The ISO-formatted datetime of the first row in the data.

end_date: str or None

The ISO-formatted datetime of the last row in the data.

data: array of `data_point` objects or None

See data info in Notes for more details.

shap_base_value: float

Shap base value.

count: int

The number of points in the data.

classmethod get(project_id, record_id, start_date=None, end_date=None, points_count=None)

Retrieve predictions along with shap explanations for the most anomalous records, either within the specified date range or for a defined number of points. Two of the three parameters start_date, end_date, and points_count must be specified.

Parameters:
project_id: str

The ID of the project.

record_id: str

The ID of the anomaly assessment record.

start_date: str, optional

The start of the date range to get explanations in. Example: 2020-01-01T00:00:00.000000Z

end_date: str, optional

The end of the date range to get explanations in. Example: 2020-10-01T00:00:00.000000Z

points_count: int, optional

The number of rows to return.

Returns:
AnomalyAssessmentExplanations
Return type:

AnomalyAssessmentExplanations

class datarobot.models.anomaly_assessment.AnomalyAssessmentPredictionsPreview(start_date, end_date, preview_bins, **record_kwargs)

Aggregated predictions over time for the corresponding anomaly assessment record. Intended for finding the bins with the highest anomaly scores.

Added in version v2.25.

Notes

AnomalyAssessmentPredictionsPreview contains:

  • record_id : the ID of the corresponding anomaly assessment record.

  • project_id : the project ID of the corresponding anomaly assessment record.

  • model_id : the model ID of the corresponding anomaly assessment record.

  • backtest : the backtest of the corresponding anomaly assessment record.

  • source : the source of the corresponding anomaly assessment record.

  • series_id : the series id of the corresponding anomaly assessment record for the multiseries projects.

  • start_date : the ISO-formatted timestamp of the first prediction in the subset.

  • end_date : the ISO-formatted timestamp of the last prediction in the subset.

  • preview_bins : list of PreviewBin objects. The aggregated predictions for the subset. Bin boundaries may differ from actual start/end dates because this is an aggregation.

PreviewBin contains:

  • start_date (str) : the ISO-formatted datetime of the start of the bin.

  • end_date (str) : the ISO-formatted datetime of the end of the bin.

  • avg_predicted (float or None) : the average prediction of the model in the bin. None if there are no entries in the bin.

  • max_predicted (float or None) : the maximum prediction of the model in the bin. None if there are no entries in the bin.

  • frequency (int) : the number of rows in the bin.

Attributes:
record_id: str

The ID of the record.

project_id: str

The ID of the project the record belongs to.

model_id: str

The ID of the model the record belongs to.

backtest: int or “holdout”

The backtest of the record.

source: “training” or “validation”

The source of the record.

series_id: str or None

The series id of the record for the multiseries projects. Defined only for the multiseries projects.

start_date: str

The ISO-formatted timestamp of the first prediction in the subset.

end_date: str

The ISO-formatted timestamp of the last prediction in the subset.

preview_bins: list of preview_bin objects.

The aggregated predictions for the subset. See more info in Notes.

classmethod get(project_id, record_id)

Retrieve aggregated predictions over time.

Parameters:
project_id: str

The ID of the project.

record_id: str

The ID of the anomaly assessment record.

Returns:
AnomalyAssessmentPredictionsPreview
Return type:

AnomalyAssessmentPredictionsPreview

find_anomalous_regions(max_prediction_threshold=0.0)
Sort preview bins by max_predicted value and select those whose max_predicted value is greater than or equal to max_prediction_threshold. The result is sorted by max_predicted value in descending order.

Parameters:
max_prediction_threshold: float, optional

Return bins with a maximum anomaly score greater than or equal to max_prediction_threshold.

Returns:
preview_bins: list of preview_bin

Filtered and sorted preview bins.

Return type:

List[AnomalyAssessmentPreviewBin]
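The selection described for find_anomalous_regions can be reproduced on a plain list of bins, which is handy for experimenting with thresholds offline. The sketch below is an illustration under stated assumptions rather than the library's implementation: it assumes each bin is a dict with the PreviewBin keys, and it skips bins whose max_predicted is None (empty bins), which is a guess about how such bins should be treated. select_anomalous_bins and the sample values are hypothetical.

```python
from typing import Any, Dict, List


def select_anomalous_bins(
    preview_bins: List[Dict[str, Any]],
    max_prediction_threshold: float = 0.0,
) -> List[Dict[str, Any]]:
    """Keep bins at or above the threshold, highest max_predicted first."""
    candidates = [
        b for b in preview_bins
        if b["max_predicted"] is not None
        and b["max_predicted"] >= max_prediction_threshold
    ]
    return sorted(candidates, key=lambda b: b["max_predicted"], reverse=True)


# Made-up bins; the second one is empty, so its statistics are None.
bins = [
    {"start_date": "2020-01-01T00:00:00.000000Z",
     "end_date": "2020-01-02T00:00:00.000000Z",
     "avg_predicted": 0.2, "max_predicted": 0.4, "frequency": 24},
    {"start_date": "2020-01-02T00:00:00.000000Z",
     "end_date": "2020-01-03T00:00:00.000000Z",
     "avg_predicted": None, "max_predicted": None, "frequency": 0},
    {"start_date": "2020-01-03T00:00:00.000000Z",
     "end_date": "2020-01-04T00:00:00.000000Z",
     "avg_predicted": 0.5, "max_predicted": 0.8, "frequency": 24},
]

selected = select_anomalous_bins(bins, max_prediction_threshold=0.3)
print([b["max_predicted"] for b in selected])
```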