Advanced Options

class datarobot.helpers.AdvancedOptions(weights=None, response_cap=None, blueprint_threshold=None, seed=None, smart_downsampled=None, majority_downsampling_rate=None, offset=None, exposure=None, accuracy_optimized_mb=None, scaleout_modeling_mode=None, events_count=None, monotonic_increasing_featurelist_id=None, monotonic_decreasing_featurelist_id=None, only_include_monotonic_blueprints=None, allowed_pairwise_interaction_groups=None, blend_best_models=None, scoring_code_only=None, prepare_model_for_deployment=None, consider_blenders_in_recommendation=None, min_secondary_validation_model_count=None, shap_only_mode=None, autopilot_data_sampling_method=None, run_leakage_removed_feature_list=None, autopilot_with_feature_discovery=False, feature_discovery_supervised_feature_reduction=None, exponentially_weighted_moving_alpha=None, external_time_series_baseline_dataset_id=None, use_supervised_feature_reduction=True, primary_location_column=None, protected_features=None, preferable_target_value=None, fairness_metrics_set=None, fairness_threshold=None, bias_mitigation_feature_name=None, bias_mitigation_technique=None, include_bias_mitigation_feature_as_predictor_variable=None, default_monotonic_increasing_featurelist_id=None, default_monotonic_decreasing_featurelist_id=None, model_group_id=None, model_regime_id=None, model_baselines=None, series_id=None, forecast_distance=None, forecast_offsets=None, incremental_learning_only_mode=None, incremental_learning_on_best_model=None, number_of_incremental_learning_iterations_before_best_model_selection=None, chunk_definition_id=None, incremental_learning_early_stopping_rounds=None)

Used when setting the target of a project to set advanced options of the modeling process.

Parameters:
weights: string, optional

The name of a column indicating the weight of each row.

response_cap: bool or float in [0.5, 1), optional

Defaults to None here, but the server default is False. If specified, it is the quantile of the response distribution to use for response capping.
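
As a rough illustration of what a quantile cap means, the sketch below checks a candidate response_cap against the documented range and clips a list of target values at that quantile. The helper names and the nearest-rank quantile definition are assumptions made for this example; how the server computes the quantile is not described here.

```python
import math


def validate_response_cap(value):
    """Return True if value is a bool or a number in [0.5, 1)."""
    if isinstance(value, bool):
        return True
    return isinstance(value, (int, float)) and 0.5 <= value < 1


def cap_at_quantile(values, q):
    """Clip values at the q-th quantile (nearest-rank definition, an assumption)."""
    ordered = sorted(values)
    # Nearest-rank: smallest value with at least a fraction q of the data at or below it.
    rank = max(1, math.ceil(q * len(ordered)))
    cap = ordered[rank - 1]
    return [min(v, cap) for v in values]


# Hypothetical target values with one extreme outlier.
targets = [1, 2, 3, 4, 5, 6, 7, 100]
capped = cap_at_quantile(targets, 0.75)
```

With these inputs, every value above the 0.75 quantile is replaced by the value at that quantile, so the outlier no longer dominates the response distribution.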

blueprint_threshold: int, optional

Number of hours models are permitted to run before being excluded from later autopilot stages. Minimum 1.

seed: int, optional

A seed to use for randomization.

smart_downsampled: bool, optional

Whether to use smart downsampling to throw away excess rows of the majority class. Only applicable to classification and zero-boosted regression projects.

majority_downsampling_rate: float, optional

The percentage, between 0 and 100, of the majority rows that should be kept. Specify only if using smart downsampling. The rate may not cause the majority class to become smaller than the minority class.
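
The constraint on the majority class can be checked with simple arithmetic before submitting a rate. The following is a minimal sketch with hypothetical helper names and row counts; any rounding the server applies is not documented here and is not modeled.

```python
def min_majority_rate(majority_rows, minority_rows):
    """Smallest keep-percentage that leaves the majority class at least as large as the minority."""
    if majority_rows <= 0 or not 0 <= minority_rows <= majority_rows:
        raise ValueError("expected 0 <= minority_rows <= majority_rows and majority_rows > 0")
    return 100.0 * minority_rows / majority_rows


def rate_is_acceptable(rate, majority_rows, minority_rows):
    """Check the documented range and the majority-not-smaller-than-minority rule."""
    if not 0 <= rate <= 100:
        return False
    kept_rows = majority_rows * rate / 100.0
    return kept_rows >= minority_rows


# Hypothetical class counts: 8000 majority rows, 2000 minority rows.
lowest_rate = min_majority_rate(8000, 2000)
ok_at_75 = rate_is_acceptable(75.0, 8000, 2000)
ok_at_20 = rate_is_acceptable(20.0, 8000, 2000)
```

For these counts, a rate of 75.0 keeps 6000 majority rows and satisfies the rule, while 20.0 would keep only 1600 and would not.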

offset: list of str, optional

(New in version v2.6) the list of the names of the columns containing the offset of each row

exposure: string, optional

(New in version v2.6) the name of a column containing the exposure of each row

accuracy_optimized_mb: bool, optional

(New in version v2.6) Include additional, longer-running models that will be run by the autopilot and available to run manually.

scaleout_modeling_mode: string, optional

(Deprecated in 2.28. Will be removed in 2.30) DataRobot no longer supports scaleout models. Please remove any usage of this parameter as it will be removed from the API soon.

events_count: string, optional

(New in version v2.8) the name of a column specifying events count.

monotonic_increasing_featurelist_id: string, optional

(new in version 2.11) the id of the featurelist that defines the set of features with a monotonically increasing relationship to the target. If None, no such constraints are enforced. When specified, this will set a default for the project that can be overridden at model submission time if desired.

monotonic_decreasing_featurelist_id: string, optional

(new in version 2.11) the id of the featurelist that defines the set of features with a monotonically decreasing relationship to the target. If None, no such constraints are enforced. When specified, this will set a default for the project that can be overridden at model submission time if desired.

only_include_monotonic_blueprints: bool, optional

(new in version 2.11) when true, only blueprints that support enforcing monotonic constraints will be available in the project or selected for the autopilot.

allowed_pairwise_interaction_groups: list of tuple, optional

(New in version v2.19) For GA2M models - specify groups of columns for which pairwise interactions will be allowed. E.g. if set to [(A, B, C), (C, D)] then GA2M models will allow interactions between columns A x B, B x C, A x C, C x D. All others (A x D, B x D) will not be considered.
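
The expansion from groups to allowed pairs described above can be sketched as follows. The function name is hypothetical and the code only illustrates the rule stated in the description; it is not part of the client.

```python
from itertools import combinations


def allowed_pairs(groups):
    """Expand interaction groups into the set of allowed column pairs."""
    pairs = set()
    for group in groups:
        # Every unordered pair of distinct columns within a group is allowed.
        for a, b in combinations(sorted(set(group)), 2):
            pairs.add((a, b))
    return pairs


groups = [("A", "B", "C"), ("C", "D")]
pairs = allowed_pairs(groups)
```

Pairs that never share a group, such as A x D and B x D, do not appear in the result.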

blend_best_models: bool, optional

(New in version v2.19) blend best models during Autopilot run.

scoring_code_only: bool, optional

(New in version v2.19) Keep only models that can be converted to scorable Java code during Autopilot run.

shap_only_mode: bool, optional

(New in version v2.21) Keep only models that support SHAP values during Autopilot run. Use SHAP-based insights wherever possible. Defaults to False.

prepare_model_for_deployment: bool, optional

(New in version v2.19) Prepare model for deployment during Autopilot run. The preparation includes creating reduced feature list models, retraining best model on higher sample size, computing insights and assigning “RECOMMENDED FOR DEPLOYMENT” label.

consider_blenders_in_recommendation: bool, optional

(New in version 2.22.0) Include blenders when selecting a model to prepare for deployment in an Autopilot Run. Defaults to False.

min_secondary_validation_model_count: int, optional

(New in version v2.19) Compute “All backtest” scores (datetime models) or cross validation scores for the specified number of the highest ranking models on the Leaderboard, if over the Autopilot default.

autopilot_data_sampling_method: str, optional

(New in version v2.23) one of datarobot.enums.DATETIME_AUTOPILOT_DATA_SAMPLING_METHOD. Applicable for OTV projects only, defines if autopilot uses “random” or “latest” sampling when iteratively building models on various training samples. Defaults to “random” for duration-based projects and to “latest” for row-based projects.

run_leakage_removed_feature_list: bool, optional

(New in version v2.23) Run Autopilot on Leakage Removed feature list (if exists).

autopilot_with_feature_discovery: bool, default ``False``, optional

(New in version v2.23) If true, autopilot will run on a feature list that includes features found via search for interactions.

feature_discovery_supervised_feature_reduction: bool, optional

(New in version v2.23) Run supervised feature reduction for feature discovery projects.

exponentially_weighted_moving_alpha: float, optional

(New in version v2.26) defaults to None, value between 0 and 1 (inclusive), indicates alpha parameter used in exponentially weighted moving average within feature derivation window.
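
To show how alpha shapes the average, here is a minimal sketch of one common recursive form of the exponentially weighted moving average. The exact variant DataRobot applies within the feature derivation window is not stated here, so treat the formula and the function name as assumptions.

```python
def ewma(values, alpha):
    """Recursive EWMA: s[0] = x[0], s[t] = alpha * x[t] + (1 - alpha) * s[t - 1]."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1 (inclusive)")
    smoothed = []
    for x in values:
        if not smoothed:
            smoothed.append(float(x))  # seed the average with the first observation
        else:
            smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical values inside a derivation window.
window = [10.0, 20.0, 30.0]
result = ewma(window, 0.5)
```

An alpha close to 1 makes the average track the most recent value, while an alpha close to 0 keeps it near the earliest values in the window.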

external_time_series_baseline_dataset_id: str, optional

(New in version v2.26) If provided, will generate metrics scaled by external model predictions metric for time series projects. The external predictions catalog must be validated before autopilot starts, see Project.validate_external_time_series_baseline and external baseline predictions documentation for further explanation.

use_supervised_feature_reduction: bool, default ``True``, optional

Time Series only. When true, during feature generation DataRobot runs a supervised algorithm to retain only qualifying features. Setting to false can severely impact autopilot duration, especially for datasets with many features.

primary_location_column: str, optional

The name of the primary location column.

protected_features: list of str, optional

(New in version v2.24) A list of project features to mark as protected for Bias and Fairness testing calculations. Max number of protected features allowed is 10.

preferable_target_value: str, optional

(New in version v2.24) A target value that should be treated as a favorable outcome for the prediction. For example, if we want to check gender discrimination for giving a loan and our target is named is_bad, then the positive outcome for the prediction would be No, which means that the loan is good and that’s what we treat as a favorable result for the loaner.

fairness_metrics_set: str, optional

(New in version v2.24) Metric to use for calculating fairness. Can be one of proportionalParity, equalParity, predictionBalance, trueFavorableAndUnfavorableRateParity or favorableAndUnfavorablePredictiveValueParity. Used and required only if Bias & Fairness in AutoML feature is enabled.
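
The documented limits on these settings can be checked on the client side before a project is configured. The sketch below is a hypothetical pre-flight check, not part of the client; it uses only the metric names and the protected-feature limit listed in this section.

```python
# Metric names as listed in the fairness_metrics_set description.
FAIRNESS_METRICS = {
    "proportionalParity",
    "equalParity",
    "predictionBalance",
    "trueFavorableAndUnfavorableRateParity",
    "favorableAndUnfavorablePredictiveValueParity",
}
MAX_PROTECTED_FEATURES = 10


def check_fairness_options(protected_features, fairness_metrics_set):
    """Return a list of human-readable problems; an empty list means no problem was found."""
    problems = []
    if len(protected_features) > MAX_PROTECTED_FEATURES:
        problems.append("at most 10 protected features are allowed")
    if fairness_metrics_set not in FAIRNESS_METRICS:
        problems.append("unknown fairness metric: " + str(fairness_metrics_set))
    return problems


# Hypothetical feature and metric choices.
no_problems = check_fairness_options(["gender"], "proportionalParity")
two_problems = check_fairness_options(["f%d" % i for i in range(11)], "bogus")
```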

fairness_threshold: str, optional

(New in version v2.24) Threshold value for the fairness metric. Can be in a range of [0.0, 1.0]. If the relative (i.e. normalized) fairness score is below the threshold, then the user will see a visual indication on the

bias_mitigation_feature_name: str, optional

The feature from protected features that will be used in a bias mitigation task to mitigate bias.

bias_mitigation_technique: str, optional

One of datarobot.enums.BiasMitigationTechnique. Options:

- ‘preprocessingReweighing’
- ‘postProcessingRejectionOptionBasedClassification’

The technique used to mitigate bias, which determines which bias mitigation task is inserted into blueprints.

include_bias_mitigation_feature_as_predictor_variable: bool, optional

Whether to also use the mitigation feature as an input to the modeler, just like any other categorical feature used for training; that is, whether the model should “train on” this feature in addition to using it for bias mitigation.

default_monotonic_increasing_featurelist_id: str, optional

Returned from the server on a Project GET request. Cannot be updated by the user.

default_monotonic_decreasing_featurelist_id: str, optional

Returned from the server on a Project GET request. Cannot be updated by the user.

model_group_id: str, optional

(New in version v3.3) The name of a column containing the model group id for each row.

model_regime_id: str, optional

(New in version v3.3) The name of a column containing the model regime id for each row.

model_baselines: list of str, optional

(New in version v3.3) The list of the names of the columns containing the model baselines.

series_id: str, optional

(New in version v3.6) The name of a column containing the series id for each row.

forecast_distance: str, optional

(New in version v3.6) The name of a column containing the forecast distance for each row.

forecast_offsets: list of str, optional

(New in version v3.6) The list of the names of the columns containing the forecast offsets for each row.

incremental_learning_only_mode: bool, optional

(New in version v3.4) Keep only models that support incremental learning during Autopilot run.

incremental_learning_on_best_model: bool, optional

(New in version v3.4) Run incremental learning on the best model during Autopilot run.

chunk_definition_id: string, optional

(New in version v3.4) Unique definition for chunks needed to run automated incremental learning.

incremental_learning_early_stopping_rounds: int, optional

(New in version v3.4) Early stopping rounds used in the automated incremental learning service.

number_of_incremental_learning_iterations_before_best_model_selection: int, optional

Number of iterations the top 5 models complete before the best model is selected.

Examples

import datarobot as dr
advanced_options = dr.AdvancedOptions(
    weights='weights_column',
    offset=['offset_column'],
    exposure='exposure_column',
    response_cap=0.7,
    blueprint_threshold=2,
    smart_downsampled=True,
    majority_downsampling_rate=75.0,
)

get(_AdvancedOptions__key, _AdvancedOptions__default=None)

Return the value for key if key is in the dictionary, else default.

Return type:

Optional[Any]

pop(_AdvancedOptions__key)

Remove the specified key and return the corresponding value. If the key is not found, return the default if given; otherwise, raise a KeyError.

Return type:

Optional[Any]

update_individual_options(**kwargs)

Update individual attributes of an instance of AdvancedOptions.

Return type:

None