Reason Codes API

class datarobot.ReasonCodesInitialization(project_id, model_id, reason_codes_sample=None)

Represents a reason codes initialization of a model.

Attributes

project_id (str) id of the project the model belongs to
model_id (str) id of the model reason codes initialization is for
reason_codes_sample (list of dict) a small sample of reason codes that could be generated for the model

classmethod get(project_id, model_id)

Retrieve the reason codes initialization for a model.

Reason codes initializations are a prerequisite for computing reason codes, and include a sample of what the computed reason codes for a prediction dataset would look like.

Parameters:

project_id : str

id of the project the model belongs to

model_id : str

id of the model reason codes initialization is for

Returns:

reason_codes_initialization : ReasonCodesInitialization

The queried instance.

Raises:

ClientError (404)

If the project or model does not exist or the initialization has not been computed.

classmethod create(project_id, model_id)

Create a reason codes initialization for the specified model.

Parameters:

project_id : str

id of the project the model belongs to

model_id : str

id of the model for which initialization is requested

Returns:

job : Job

an instance of the created async job
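
A common pattern is to fetch the initialization and create it only when the 404 described for get() is raised. The following is a minimal sketch, not a helper shipped with the client: ensure_initialization is a hypothetical name, and the client module is passed in as dr so the sketch can be tried against a stub. It assumes that ClientError is reachable as dr.errors.ClientError with a status_code attribute, and that the returned Job offers get_result_when_complete().

```python
def ensure_initialization(dr, project_id, model_id):
    """Return the model's reason codes initialization, computing it if missing.

    `dr` is the imported client module (``import datarobot as dr``).
    """
    try:
        return dr.ReasonCodesInitialization.get(project_id, model_id)
    except dr.errors.ClientError as exc:
        # Assumption: the error carries the HTTP status as `status_code`.
        if getattr(exc, "status_code", None) != 404:
            raise
    # Not computed yet: start the async job and block until it finishes.
    job = dr.ReasonCodesInitialization.create(project_id, model_id)
    return job.get_result_when_complete()
```

Note that a 404 is also raised when the project or model does not exist, in which case the create() call in this sketch would be expected to fail as well.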

delete()

Delete this reason codes initialization.

class datarobot.ReasonCodes(id, project_id, model_id, dataset_id, max_codes, num_columns, finish_time, reason_codes_location, threshold_low=None, threshold_high=None)

Represents reason codes metadata and provides access to computation results.

Examples

import datarobot as dr

# project_id and reason_codes_id identify an existing reason codes record
reason_codes = dr.ReasonCodes.get(project_id, reason_codes_id)
for row in reason_codes.get_rows():
    print(row)  # row is an instance of ReasonCodesRow

Attributes

id (str) id of the record and reason codes computation result
project_id (str) id of the project the model belongs to
model_id (str) id of the model reason codes were computed for
dataset_id (str) id of the prediction dataset reason codes were computed for
max_codes (int) maximum number of reason codes to supply per row of the dataset
threshold_low (float) the lower threshold, below which a prediction must score in order for reason codes to be computed for a row in the dataset
threshold_high (float) the high threshold, above which a prediction must score in order for reason codes to be computed for a row in the dataset
num_columns (int) the number of columns reason codes were computed for
finish_time (float) timestamp referencing when computation for these reason codes finished
reason_codes_location (str) where to retrieve the reason codes

classmethod get(project_id, reason_codes_id)

Retrieve a specific reason codes record.

Parameters:

project_id : str

id of the project the model belongs to

reason_codes_id : str

id of the reason codes

Returns:

reason_codes : ReasonCodes

The queried instance.

classmethod create(project_id, model_id, dataset_id, max_codes=None, threshold_low=None, threshold_high=None)

Compute reason codes for the specified dataset.

In order to create ReasonCodes for a particular model and dataset, you must first:

  • Compute feature impact for the model via datarobot.Model.get_feature_impact()
  • Compute a ReasonCodesInitialization for the model via datarobot.ReasonCodesInitialization.create(project_id, model_id)
  • Compute predictions for the model and dataset via datarobot.Model.request_predictions(dataset_id)

threshold_high and threshold_low are optional filters applied to speed up computation. When at least one is specified, only the selected outlier rows will have reason codes computed. Rows are considered to be outliers if their predicted value (in case of regression projects) or probability of being the positive class (in case of classification projects) is less than threshold_low or greater than threshold_high. If neither is specified, reason codes will be computed for all rows.
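
The selection rule above can be written out as a small predicate. This is only an illustration of the documented rule, not the server-side implementation; is_outlier and the sample scores are hypothetical, and how the server treats scores exactly equal to a threshold is not stated here.

```python
def is_outlier(score, threshold_low=None, threshold_high=None):
    """Return True if a row with this score would get reason codes.

    `score` is the predicted value (regression) or the positive-class
    probability (classification).
    """
    if threshold_low is None and threshold_high is None:
        return True  # no filter: every row qualifies
    below = threshold_low is not None and score < threshold_low
    above = threshold_high is not None and score > threshold_high
    return below or above


scores = [0.05, 0.40, 0.93]
selected = [s for s in scores if is_outlier(s, threshold_low=0.1, threshold_high=0.9)]
print(selected)  # [0.05, 0.93]
```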

Parameters:

project_id : str

id of the project the model belongs to

model_id : str

id of the model for which reason codes are requested

dataset_id : str

id of the prediction dataset for which reason codes are requested

threshold_low : float, optional

the lower threshold, below which a prediction must score in order for reason codes to be computed for a row in the dataset. If neither threshold_high nor threshold_low is specified, reason codes will be computed for all rows.

threshold_high : float, optional

the high threshold, above which a prediction must score in order for reason codes to be computed. If neither threshold_high nor threshold_low is specified, reason codes will be computed for all rows.

max_codes : int, optional

the maximum number of reason codes to supply per row of the dataset, default: 3.

Returns:

job: Job

an instance of the created async job
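
The prerequisites and the final request can be chained as in the sketch below. It is an outline under stated assumptions rather than a definitive recipe: request_reason_codes is a hypothetical name, the client module is passed in as dr so the function can be exercised with a stub, feature impact is assumed to be started with Model.request_feature_impact(), and every job is assumed to offer wait_for_completion() or get_result_when_complete().

```python
def request_reason_codes(dr, project_id, model_id, dataset_id,
                         max_codes=3, threshold_low=None, threshold_high=None):
    """Run the three prerequisites, then compute reason codes.

    `dr` is the imported client module (``import datarobot as dr``).
    """
    model = dr.Model.get(project=project_id, model_id=model_id)

    # 1. Feature impact (assumption: request_feature_impact() starts the job).
    model.request_feature_impact().wait_for_completion()

    # 2. Reason codes initialization for the model.
    dr.ReasonCodesInitialization.create(project_id, model_id).wait_for_completion()

    # 3. Predictions for the model on the prediction dataset.
    model.request_predictions(dataset_id).wait_for_completion()

    # All prerequisites are in place: request the reason codes themselves.
    job = dr.ReasonCodes.create(
        project_id, model_id, dataset_id,
        max_codes=max_codes,
        threshold_low=threshold_low,
        threshold_high=threshold_high,
    )
    return job.get_result_when_complete()
```

If a prerequisite has already been computed, the corresponding request may be rejected, so a real script would likely need to handle that case for each step.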

classmethod list(project_id, model_id=None, limit=None, offset=None)

List reason codes records for a specified project.

Parameters:

project_id : str

id of the project to list reason codes for

model_id : str, optional

if specified, only reason codes computed for this model will be returned

limit : int or None

at most this many results are returned, default: no limit

offset : int or None

this many results will be skipped, default: 0

Returns:

reason_codes : list[ReasonCodes]

get_rows(batch_size=None, exclude_adjusted_predictions=True)

Retrieve reason codes rows.

Parameters:

batch_size : int

maximum number of reason codes rows to retrieve per request

exclude_adjusted_predictions : bool

Optional, defaults to True. Set to False to include adjusted predictions, which will differ from the predictions on some projects, e.g. those with an exposure column specified.

Yields:

reason_codes_row : ReasonCodesRow

Represents reason codes computed for a prediction row.

get_all_as_dataframe(exclude_adjusted_predictions=True)

Retrieve all reason codes rows and return them as a pandas.DataFrame.

Returned dataframe has the following structure:

  • row_id : row id from prediction dataset
  • prediction : the output of the model for this row
  • adjusted_prediction : adjusted prediction values (only appears for projects that utilize prediction adjustments, e.g. projects with an exposure column)
  • class_0_label : a class level from the target (only appears for classification projects)
  • class_0_probability : the probability that the target is this class (only appears for classification projects)
  • class_1_label : a class level from the target (only appears for classification projects)
  • class_1_probability : the probability that the target is this class (only appears for classification projects)
  • reason_0_feature : the name of the feature contributing to the prediction for this reason
  • reason_0_feature_value : the value the feature took on
  • reason_0_label : the output being driven by this reason. For regression projects, this is the name of the target feature. For classification projects, this is the class label whose probability increasing would correspond to a positive strength.
  • reason_0_qualitative_strength : a human-readable description of how strongly the feature affected the prediction (e.g. ‘+++’, ‘--’, ‘+’) for this reason
  • reason_0_strength : the amount this feature’s value affected the prediction
  • ...
  • reason_N_feature : the name of the feature contributing to the prediction for this reason
  • reason_N_feature_value : the value the feature took on
  • reason_N_label : the output being driven by this reason. For regression projects, this is the name of the target feature. For classification projects, this is the class label whose probability increasing would correspond to a positive strength.
  • reason_N_qualitative_strength : a human-readable description of how strongly the feature affected the prediction (e.g. ‘+++’, ‘--’, ‘+’) for this reason
  • reason_N_strength : the amount this feature’s value affected the prediction

Parameters:

exclude_adjusted_predictions : bool

Optional, defaults to True. Set this to False to include adjusted prediction values in the returned dataframe.

Returns:

dataframe: pandas.DataFrame
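
To make the column layout concrete, the sketch below flattens one row, given as a plain dict, into the column names listed above. It is illustrative only and not the client's implementation: flatten_row and the sample values are hypothetical, and the adjusted_prediction column is left out for brevity.

```python
def flatten_row(row, classification=True):
    """Flatten one reason codes row (a dict) into the column layout above."""
    flat = {"row_id": row["row_id"], "prediction": row["prediction"]}
    if classification:
        # One label/probability pair per class level.
        for i, pv in enumerate(row["prediction_values"]):
            flat["class_%d_label" % i] = pv["label"]
            flat["class_%d_probability" % i] = pv["value"]
    # One group of five columns per reason code.
    for i, rc in enumerate(row["reason_codes"]):
        for key in ("feature", "feature_value", "label",
                    "qualitative_strength", "strength"):
            flat["reason_%d_%s" % (i, key)] = rc[key]
    return flat


# Hypothetical row from a binary classification project.
row = {
    "row_id": 0,
    "prediction": 1.0,
    "prediction_values": [
        {"label": 0.0, "value": 0.2},
        {"label": 1.0, "value": 0.8},
    ],
    "reason_codes": [
        {"feature": "age", "feature_value": 71, "label": 1.0,
         "qualitative_strength": "++", "strength": 0.42},
    ],
}
flat = flatten_row(row)
print(flat["class_1_probability"], flat["reason_0_feature"])  # 0.8 age
```

A list of such flat dicts can be passed to the pandas.DataFrame constructor to obtain a table with one row per prediction row.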

download_to_csv(filename, encoding='utf-8', exclude_adjusted_predictions=True)

Save reason codes rows into a CSV file.

Parameters:

filename : str or file object

path or file object to save reason codes rows to

encoding : string, optional

A string representing the encoding to use in the output file, defaults to ‘utf-8’

exclude_adjusted_predictions : bool

Optional, defaults to True. Set to False to include adjusted predictions, which will differ from the predictions on some projects, e.g. those with an exposure column specified.

get_reason_codes_page(limit=None, offset=None, exclude_adjusted_predictions=True)

Get reason codes.

If you don’t want to use the generator interface, you can access paginated reason codes directly.

Parameters:

limit : int or None

the number of records to return, the server will use a (possibly finite) default if not specified

offset : int or None

the number of records to skip, default 0

exclude_adjusted_predictions : bool

Optional, defaults to True. Set to False to include adjusted predictions, which will differ from the predictions on some projects, e.g. those with an exposure column specified.

Returns:

reason_codes : ReasonCodesPage
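
Manual paging with limit and offset could look like the sketch below. It does not describe how get_rows is implemented; iter_raw_rows is a hypothetical helper that accepts any callable taking limit and offset, such as a bound get_reason_codes_page method, and it assumes that requesting an offset past the last row returns a page with empty data rather than raising an error.

```python
def iter_raw_rows(fetch_page, page_size=100):
    """Yield raw reason codes rows by requesting one page at a time.

    `fetch_page(limit=..., offset=...)` must return an object whose `data`
    attribute is the list of rows on that page.
    """
    offset = 0
    while True:
        page = fetch_page(limit=page_size, offset=offset)
        if not page.data:
            return  # an empty page means there is nothing left
        for row in page.data:
            yield row
        # Advance by what was actually returned, in case the server
        # delivers fewer rows than requested.
        offset += len(page.data)
```

With a retrieved ReasonCodes instance named reason_codes, this would be called as iter_raw_rows(reason_codes.get_reason_codes_page, page_size=500).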

delete()

Delete these reason codes.

class datarobot.models.reason_codes.ReasonCodesRow(row_id, prediction, prediction_values, reason_codes=None, adjusted_prediction=None, adjusted_prediction_values=None)

Represents reason codes computed for a prediction row.

Notes

PredictionValue contains:

  • label : describes what this model output corresponds to. For regression projects, it is the name of the target feature. For classification projects, it is a level from the target feature.
  • value : the output of the prediction. For regression projects, it is the predicted value of the target. For classification projects, it is the predicted probability the row belongs to the class identified by the label.

ReasonCode contains:

  • label : describes what output was driven by this reason code. For regression projects, it is the name of the target feature. For classification projects, it is the class whose probability increasing would correspond to a positive strength of this reason code.
  • feature : the name of the feature contributing to the prediction
  • feature_value : the value the feature took on for this row
  • strength : the amount this feature’s value affected the prediction
  • qualitative_strength : a human-readable description of how strongly the feature affected the prediction (e.g. ‘+++’, ‘--’, ‘+’)
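
Since each ReasonCode is a plain dictionary, ordinary Python is enough to inspect it. As a small illustration, the sketch below picks the reason code with the largest absolute strength for a row; strongest_reason and the sample values are hypothetical.

```python
def strongest_reason(reason_codes):
    """Return the reason code with the largest absolute strength, or None."""
    if not reason_codes:
        return None
    return max(reason_codes, key=lambda rc: abs(rc["strength"]))


# Hypothetical values, shaped like the ReasonCode schema above.
codes = [
    {"label": "readmitted", "feature": "age", "feature_value": 71, "strength": 0.42},
    {"label": "readmitted", "feature": "num_visits", "feature_value": 0, "strength": -0.61},
]
top = strongest_reason(codes)
print(top["feature"], top["strength"])  # num_visits -0.61
```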

Attributes

row_id (int) which row this ReasonCodesRow describes
prediction (float) the output of the model for this row
adjusted_prediction (float or None) adjusted prediction value for projects that provide this information, None otherwise
prediction_values (list) an array of dictionaries with a schema described as PredictionValue
adjusted_prediction_values (list) same as prediction_values but for adjusted predictions
reason_codes (list) an array of dictionaries with a schema described as ReasonCode

class datarobot.models.reason_codes.ReasonCodesPage(id, count=None, previous=None, next=None, data=None, reason_codes_record_location=None, adjustment_method=None)

Represents a batch of reason codes received in one request.

Attributes

id (str) id of the reason codes computation result
data (list[dict]) list of raw reason codes, each row corresponds to a row of the prediction dataset
count (int) total number of rows computed
previous_page (str) where to retrieve previous page of reason codes, None if current page is the first
next_page (str) where to retrieve next page of reason codes, None if current page is the last
reason_codes_record_location (str) where to retrieve the reason codes metadata
adjustment_method (str) Adjustment method that was applied to predictions, or ‘N/A’ if no adjustments were done.

classmethod get(project_id, reason_codes_id, limit=None, offset=0, exclude_adjusted_predictions=True)

Retrieve reason codes.

Parameters:

project_id : str

id of the project the model belongs to

reason_codes_id : str

id of the reason codes

limit : int or None

the number of records to return, the server will use a (possibly finite) default if not specified

offset : int or None

the number of records to skip, default 0

exclude_adjusted_predictions : bool

Optional, defaults to True. Set to False to include adjusted predictions, which will differ from the predictions on some projects, e.g. those with an exposure column specified.

Returns:

reason_codes : ReasonCodesPage

The queried instance.