Anomaly Assessment

class datarobot.models.anomaly_assessment.AnomalyAssessmentRecord(status, status_details, start_date, end_date, prediction_threshold, preview_location, delete_location, latest_explanations_location, **record_kwargs)

Object that holds metadata about the anomaly assessment insight for a particular subset, backtest, and series, together with the links used to retrieve the anomaly assessment data.

Added in version v2.25.

Notes

Record contains:

record_id : the ID of the record.
project_id : the project ID of the record.
model_id : the model ID of the record.
backtest : the backtest of the record.
source : the source of the record.
series_id : the series id of the record for the multiseries projects.
status : the status of the insight.
status_details : the explanation of the status.
start_date : the ISO-formatted timestamp of the first prediction in the subset. Will be None if status is not AnomalyAssessmentStatus.COMPLETED.
end_date : the ISO-formatted timestamp of the last prediction in the subset. Will be None if status is not AnomalyAssessmentStatus.COMPLETED.
prediction_threshold : the threshold; SHAP explanations are computed for all rows with anomaly scores greater than or equal to it. Will be None if status is not AnomalyAssessmentStatus.COMPLETED.
preview_location : URL to retrieve predictions preview for the subset. Will be None if status is not AnomalyAssessmentStatus.COMPLETED.
latest_explanations_location : the URL to retrieve the latest predictions with the shap explanations. Will be None if status is not AnomalyAssessmentStatus.COMPLETED.
delete_location : the URL to delete anomaly assessment record and relevant insight data.

Attributes:

record_id: str: The ID of the record.
project_id: str: The ID of the project the record belongs to.
model_id: str: The ID of the model the record belongs to.
backtest: int or “holdout”: The backtest of the record.
source: “training” or “validation”: The source of the record.
series_id: str or None: The series ID of the record. Defined only for multiseries projects.
status: str: The status of the insight. One of datarobot.enums.AnomalyAssessmentStatus.
status_details: str: The explanation of the status.
start_date: str or None: See start_date info in Notes for more details.
end_date: str or None: See end_date info in Notes for more details.
prediction_threshold: float or None: See prediction_threshold info in Notes for more details.
preview_location: str or None: See preview_location info in Notes for more details.
latest_explanations_location: str or None: See latest_explanations_location info in Notes for more details.
delete_location: str: The URL to delete anomaly assessment record and relevant insight data.
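
The Notes above state that several fields stay None until the insight has completed. A minimal sketch of a guard built on that rule follows. `has_insight_data` and `COMPLETED_ONLY_FIELDS` are hypothetical names, not part of the client, and the helper works on any object that exposes the documented attributes; the stand-in values are illustrative only.

```python
from types import SimpleNamespace

# Fields documented as None unless status is AnomalyAssessmentStatus.COMPLETED.
COMPLETED_ONLY_FIELDS = (
    "start_date",
    "end_date",
    "prediction_threshold",
    "preview_location",
    "latest_explanations_location",
)


def has_insight_data(record):
    """Return True when every completed-only field on the record is populated."""
    return all(getattr(record, name, None) is not None for name in COMPLETED_ONLY_FIELDS)


# Stand-in objects; a real AnomalyAssessmentRecord exposes the same attribute names.
pending = SimpleNamespace(
    start_date=None,
    end_date=None,
    prediction_threshold=None,
    preview_location=None,
    latest_explanations_location=None,
)
done = SimpleNamespace(
    start_date="2020-01-01T00:00:00.000000Z",
    end_date="2020-10-01T00:00:00.000000Z",
    prediction_threshold=0.5,
    preview_location="<preview URL>",
    latest_explanations_location="<explanations URL>",
)

print(has_insight_data(pending), has_insight_data(done))
```

Checking the fields rather than a status string avoids hard-coding enum values that are not listed in this section.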

classmethod list(project_id, model_id, backtest=None, source=None, series_id=None, limit=100, offset=0, with_data_only=False)

Retrieve the list of the anomaly assessment records for the project and model. Output can be filtered and limited.

Parameters:

project_id: str: The ID of the project the record belongs to.
model_id: str: The ID of the model the record belongs to.
backtest: int or “holdout”, optional: The backtest to filter records by.
source: “training” or “validation”, optional: The source to filter records by.
series_id: str, optional: The series ID to filter records by. Can be specified for multiseries projects.
limit: int, optional: 100 by default. At most this many results are returned.
offset: int, optional: This many results will be skipped.
with_data_only: bool, False by default: If True, only records with status == AnomalyAssessmentStatus.COMPLETED are returned; records that have no data or are not supported are omitted.

Returns:

The list of anomaly assessment records.

Return type:

List[AnomalyAssessmentRecord]
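
Because `list` returns at most `limit` results and skips `offset` results, retrieving every record means walking pages. The following is a sketch of that loop, assuming a page shorter than the requested size marks the end of the results. `iter_records` and `fake_list` are hypothetical names; `fake_list` only imitates the limit/offset behaviour so the loop can run without a DataRobot connection.

```python
def iter_records(list_fn, project_id, model_id, page_size=100, **filters):
    """Yield every record by requesting successive pages with limit/offset."""
    offset = 0
    while True:
        page = list_fn(project_id, model_id, limit=page_size, offset=offset, **filters)
        yield from page
        # Assumption: a short (or empty) page means there is nothing left to fetch.
        if len(page) < page_size:
            break
        offset += page_size


def fake_list(project_id, model_id, limit=100, offset=0, **filters):
    """Stand-in for a paginated listing: slices a fixed collection of 250 items."""
    items = list(range(250))
    return items[offset:offset + limit]


records = list(iter_records(fake_list, "project-id", "model-id"))
print(len(records))
```

With the client installed, `AnomalyAssessmentRecord.list` could be passed as `list_fn`, and keyword filters such as `with_data_only=True` would be forwarded unchanged.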

classmethod compute(project_id, model_id, backtest, source, series_id=None)

Request anomaly assessment insight computation on the specified subset.

Parameters:

project_id: str: The ID of the project to compute insight for.
model_id: str: The ID of the model to compute insight for.
backtest: int or “holdout”: The backtest to compute insight for.
source: “training” or “validation”: The source to compute insight for.
series_id: str, optional: The series id to compute insight for. Required for multiseries projects.

Returns:

AnomalyAssessmentRecord: The anomaly assessment record.

Return type:

AnomalyAssessmentRecord

delete()

Delete anomaly assessment record with preview and explanations.

Return type:

None

get_predictions_preview()

Retrieve aggregated predictions statistics for the anomaly assessment record.

Returns:

AnomalyAssessmentPredictionsPreview

Return type:

AnomalyAssessmentPredictionsPreview

get_latest_explanations()

Retrieve latest predictions along with shap explanations for the most anomalous records.

Returns:

AnomalyAssessmentExplanations

Return type:

AnomalyAssessmentExplanations

get_explanations(start_date=None, end_date=None, points_count=None)

Retrieve predictions along with SHAP explanations for the most anomalous records in the specified date range or for a defined number of points. Two of the three parameters start_date, end_date, and points_count must be specified.

Parameters:

start_date: str, optional: The start of the date range to get explanations in. Example: 2020-01-01T00:00:00.000000Z
end_date: str, optional: The end of the date range to get explanations in. Example: 2020-10-01T00:00:00.000000Z
points_count: int, optional: The number of rows to return.

Returns:

AnomalyAssessmentExplanations

Return type:

AnomalyAssessmentExplanations
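
The "two of three" rule can be checked before a request is made. The sketch below reads the rule as "exactly two", which is an assumption; the server may validate differently. `explanation_query` is a hypothetical helper, not part of the client.

```python
def explanation_query(start_date=None, end_date=None, points_count=None):
    """Return the supplied arguments as a dict, or raise if the count is not two."""
    candidates = {
        "start_date": start_date,
        "end_date": end_date,
        "points_count": points_count,
    }
    given = {name: value for name, value in candidates.items() if value is not None}
    # Assumption: "two out of three" is read as exactly two.
    if len(given) != 2:
        raise ValueError(
            "Specify exactly two of start_date, end_date, points_count; got "
            + (", ".join(given) or "none")
        )
    return given


query = explanation_query(start_date="2020-01-01T00:00:00.000000Z", points_count=10)
print(query)
```

The returned dict could then be unpacked into `record.get_explanations(**query)`.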

get_explanations_data_in_regions(regions, prediction_threshold=0.0)

Get predictions along with explanations for the specified regions, sorted by predictions in descending order.

Parameters:

regions: list of preview_bins: For each region explanations will be retrieved and merged.
prediction_threshold: float, optional: If specified, only points with a score greater than or equal to the threshold will be returned.

Returns:

A dict of the form {'explanations': explanations, 'shap_base_value': shap_base_value}.

Return type:

RegionExplanationsData
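
The methods documented on the record can be chained: fetch the preview, pick the most anomalous regions, then request explanations for those regions. The sketch below shows one possible chaining. `explain_top_regions`, `StubPreview`, and `StubRecord` are hypothetical; the stubs only imitate the documented method names so the helper can run without a DataRobot connection, and their return values are placeholders.

```python
def explain_top_regions(record, max_prediction_threshold=0.0, prediction_threshold=0.0):
    """Find the most anomalous preview bins, then fetch explanations for them."""
    preview = record.get_predictions_preview()
    regions = preview.find_anomalous_regions(
        max_prediction_threshold=max_prediction_threshold
    )
    if not regions:
        return None  # no bin reached the threshold, so skip the second request
    return record.get_explanations_data_in_regions(
        regions, prediction_threshold=prediction_threshold
    )


# Stand-ins used only to exercise the helper; they are not the client classes.
class StubPreview:
    def __init__(self, bins):
        self._bins = bins

    def find_anomalous_regions(self, max_prediction_threshold=0.0):
        return [b for b in self._bins if b["max_predicted"] >= max_prediction_threshold]


class StubRecord:
    def __init__(self, bins):
        self._bins = bins
        self.requested_regions = None  # remembers what the helper asked for

    def get_predictions_preview(self):
        return StubPreview(self._bins)

    def get_explanations_data_in_regions(self, regions, prediction_threshold=0.0):
        self.requested_regions = regions
        return {"explanations": [], "shap_base_value": 0.0}


stub_record = StubRecord([{"max_predicted": 0.9}, {"max_predicted": 0.2}])
result = explain_top_regions(stub_record, max_prediction_threshold=0.5)
print(result, stub_record.requested_regions)
```

Returning None when no region qualifies is a design choice of this sketch; a caller might prefer an empty result instead.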

class datarobot.models.anomaly_assessment.AnomalyAssessmentExplanations(shap_base_value, data, start_date, end_date, count, **record_kwargs)

Object that holds predictions along with SHAP explanations for the most anomalous records in the specified date range or for a defined number of points.

Added in version v2.25.

Notes

AnomalyAssessmentExplanations contains:

record_id : the id of the corresponding anomaly assessment record.
project_id : the project ID of the corresponding anomaly assessment record.
model_id : the model ID of the corresponding anomaly assessment record.
backtest : the backtest of the corresponding anomaly assessment record.
source : the source of the corresponding anomaly assessment record.
series_id : the series id of the corresponding anomaly assessment record for the multiseries projects.
start_date : the ISO-formatted first timestamp in the response. Will be None if there is no data in the specified range.
end_date : the ISO-formatted last timestamp in the response. Will be None if there is no data in the specified range.
count : the number of points in the response.
shap_base_value : the shap base value.
data : list of DataPoint objects in the specified date range.

DataPoint contains:

shap_explanation : None or an array of up to 10 ShapleyFeatureContribution objects. Only rows with the highest anomaly scores have Shapley explanations calculated. Value is None if prediction is lower than prediction_threshold.

timestamp (str) : ISO-formatted timestamp for the row.

prediction (float) : The output of the model for this row.

ShapleyFeatureContribution contains:

feature_value (str) : the feature value for this row. First 50 characters are returned.

strength (float) : the shap value for this feature and row.

feature (str) : the feature name.
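
The nested structure described above (data points that each carry an optional list of feature contributions) can be walked as follows. This is a sketch assuming each data point and contribution is dict-like with the documented keys; `top_contributions` is a hypothetical helper and the sample values are made up for illustration.

```python
def top_contributions(data, top_n=3):
    """For each explained point, list its top_n features by absolute SHAP strength."""
    result = []
    for point in data:
        explanation = point["shap_explanation"]
        if explanation is None:
            continue  # no explanation was computed for this row
        ranked = sorted(explanation, key=lambda c: abs(c["strength"]), reverse=True)
        result.append((point["timestamp"], [c["feature"] for c in ranked[:top_n]]))
    return result


# Illustrative sample shaped like the documented DataPoint entries.
sample_data = [
    {
        "timestamp": "2020-01-01T00:00:00.000000Z",
        "prediction": 0.9,
        "shap_explanation": [
            {"feature": "temperature", "feature_value": "81.5", "strength": 0.4},
            {"feature": "pressure", "feature_value": "3.1", "strength": -0.7},
        ],
    },
    {
        "timestamp": "2020-01-02T00:00:00.000000Z",
        "prediction": 0.1,
        "shap_explanation": None,
    },
]

print(top_contributions(sample_data, top_n=1))
```

Ranking by absolute strength treats strongly negative and strongly positive contributions alike; ranking by signed value would be an equally valid choice.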

Attributes:

record_id: str: The ID of the record.
project_id: str: The ID of the project the record belongs to.
model_id: str: The ID of the model the record belongs to.
backtest: int or “holdout”: The backtest of the record.
source: “training” or “validation”: The source of the record.
series_id: str or None: The series id of the record for the multiseries projects. Defined only for the multiseries projects.
start_date: str or None: The ISO-formatted datetime of the first row in the data.
end_date: str or None: The ISO-formatted datetime of the last row in the data.
data: array of `data_point` objects or None: See data info in Notes for more details.
shap_base_value: float: Shap base value.
count: int: The number of points in the data.

classmethod get(project_id, record_id, start_date=None, end_date=None, points_count=None)

Retrieve predictions along with SHAP explanations for the most anomalous records in the specified date range or for a defined number of points. Two of the three parameters start_date, end_date, and points_count must be specified.

Parameters:

project_id: str: The ID of the project.
record_id: str: The ID of the anomaly assessment record.
start_date: str, optional: The start of the date range to get explanations in. Example: 2020-01-01T00:00:00.000000Z
end_date: str, optional: The end of the date range to get explanations in. Example: 2020-10-01T00:00:00.000000Z
points_count: int, optional: The number of rows to return.

Returns:

AnomalyAssessmentExplanations

Return type:

AnomalyAssessmentExplanations

class datarobot.models.anomaly_assessment.AnomalyAssessmentPredictionsPreview(start_date, end_date, preview_bins, **record_kwargs)

Aggregated predictions over time for the corresponding anomaly assessment record. Intended to find the bins with highest anomaly scores.

Added in version v2.25.

Notes

AnomalyAssessmentPredictionsPreview contains:

record_id : the id of the corresponding anomaly assessment record.
project_id : the project ID of the corresponding anomaly assessment record.
model_id : the model ID of the corresponding anomaly assessment record.
backtest : the backtest of the corresponding anomaly assessment record.
source : the source of the corresponding anomaly assessment record.
series_id : the series id of the corresponding anomaly assessment record for the multiseries projects.
start_date : the ISO-formatted timestamp of the first prediction in the subset.
end_date : the ISO-formatted timestamp of the last prediction in the subset.
preview_bins : list of PreviewBin objects. The aggregated predictions for the subset. Bin boundaries may differ from the actual start/end dates because this is an aggregation.

PreviewBin contains:

start_date (str) : the ISO-formatted datetime of the start of the bin.
end_date (str) : the ISO-formatted datetime of the end of the bin.
avg_predicted (float or None) : the average prediction of the model in the bin. None if there are no entries in the bin.
max_predicted (float or None) : the maximum prediction of the model in the bin. None if there are no entries in the bin.
frequency (int) : the number of rows in the bin.

Attributes:

record_id: str: The ID of the record.
project_id: str: The ID of the project the record belongs to.
model_id: str: The ID of the model the record belongs to.
backtest: int or “holdout”: The backtest of the record.
source: “training” or “validation”: The source of the record.
series_id: str or None: The series ID of the record. Defined only for multiseries projects.
start_date: str: The ISO-formatted timestamp of the first prediction in the subset.
end_date: str: The ISO-formatted timestamp of the last prediction in the subset.
preview_bins: list of preview_bin objects: The aggregated predictions for the subset. See more info in Notes.

classmethod get(project_id, record_id)

Retrieve aggregated predictions over time.

Parameters:

project_id: str: The ID of the project.
record_id: str: The ID of the anomaly assessment record.

Returns:

AnomalyAssessmentPredictionsPreview

Return type:

AnomalyAssessmentPredictionsPreview

find_anomalous_regions(max_prediction_threshold=0.0)

Select the preview bins whose max_predicted value is greater than or equal to max_prediction_threshold, and return them sorted by max_predicted in descending order.

Return type:

List[AnomalyAssessmentPreviewBin]

Parameters:

max_prediction_threshold: float, optional: Return bins with a maximum anomaly score greater than or equal to max_prediction_threshold.

Returns:

preview_bins: list of preview_bin: Filtered and sorted preview bins.
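
The filter-and-sort behaviour described for find_anomalous_regions can be illustrated in plain Python. This is a sketch of the documented behaviour, not the library's implementation. `find_anomalous_bins` is a hypothetical name, bins are assumed to be dict-like with the PreviewBin keys, and skipping bins whose max_predicted is None is an assumption, since empty bins have no score to compare.

```python
def find_anomalous_bins(preview_bins, max_prediction_threshold=0.0):
    """Keep bins at or above the threshold, sorted by max_predicted descending."""
    hits = [
        b for b in preview_bins
        # Assumption: empty bins (max_predicted is None) are left out.
        if b["max_predicted"] is not None
        and b["max_predicted"] >= max_prediction_threshold
    ]
    return sorted(hits, key=lambda b: b["max_predicted"], reverse=True)


# Illustrative bins; the values are made up.
sample_bins = [
    {"start_date": "2020-01-01T00:00:00.000000Z", "max_predicted": 0.3, "frequency": 12},
    {"start_date": "2020-01-02T00:00:00.000000Z", "max_predicted": None, "frequency": 0},
    {"start_date": "2020-01-03T00:00:00.000000Z", "max_predicted": 0.8, "frequency": 9},
    {"start_date": "2020-01-04T00:00:00.000000Z", "max_predicted": 0.5, "frequency": 7},
]

top = find_anomalous_bins(sample_bins, max_prediction_threshold=0.5)
print([b["max_predicted"] for b in top])
```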