Recipes

class datarobot.models.recipe.Recipe(dialect, recipe_id, status, inputs, operations=None, downsampling=None, settings=None)

Data wrangling entity that contains all the information needed to transform a dataset and generate SQL.

classmethod update_downsampling(recipe_id, downsampling)

Set the downsampling for the recipe. Downsampling is applied when the recipe is published.

Return type:

Recipe

retrieve_preview(max_wait=600, number_of_operations_to_use=None)

Retrieve the preview, computing it first if it does not exist yet.

Parameters:
max_wait: int

The number of seconds to wait for the result.

number_of_operations_to_use: int, optional

Request a preview for the specified number of operations.

Returns:
preview: dict
Return type:

Dict[str, Any]

retrieve_insights(max_wait=600, number_of_operations_to_use=None)

Retrieve insights for the sample. When a preview is requested, the insights job starts automatically.

Parameters:
max_wait: int

The number of seconds to wait for the result.

number_of_operations_to_use: int, optional

Retrieve insights for the specified number of operations. A preview computation for the same number of operations must be submitted first.

Return type:

Any

classmethod set_inputs(recipe_id, inputs)

Set inputs for the recipe.

Return type:

Recipe

classmethod set_operations(recipe_id, operations)

Set operations for the recipe.

Return type:

Recipe

get_sql(operations=None)

Generate SQL for the recipe transiently; the recipe itself is not modified. If operations is None, the recipe's own operations are used to generate the SQL. If operations is an empty list, the recipe's operations are ignored during SQL generation. If operations is a non-empty list, SQL is generated for those operations instead.

Return type:

str
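
The three cases for the operations argument of get_sql can be sketched as a small selection function. This only illustrates the rule described above and is not the client's implementation: resolve_operations is a hypothetical helper, and operations are represented as plain strings rather than operation objects.

```python
from typing import List, Optional, Sequence


def resolve_operations(
    recipe_operations: Sequence[str],
    operations: Optional[Sequence[str]] = None,
) -> List[str]:
    """Pick the operations SQL generation would use (hypothetical helper)."""
    if operations is None:
        # No override given: fall back to the operations stored on the recipe.
        return list(recipe_operations)
    # An explicit list always wins. This includes the empty list, which
    # means the stored operations are ignored.
    return list(operations)


stored = ["filter", "drop-columns"]
print(resolve_operations(stored))              # stored operations are used
print(resolve_operations(stored, []))          # stored operations are ignored
print(resolve_operations(stored, ["rename"]))  # the override is used
```

The check is written as "is None" rather than a truthiness test because an empty list and None must be treated differently.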

classmethod from_data_store(use_case, data_store, data_source_type, dialect, data_source_inputs)

Create a wrangling recipe from a data store.

Return type:

Recipe

classmethod from_dataset(use_case, dataset, dialect=None, inputs=None)

Create a wrangling recipe from a dataset.

Return type:

Recipe
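
As a rough sketch of how the methods above might be combined, the function below creates a recipe from a dataset, sets one operation, and returns the generated SQL together with the preview. It uses only the signatures listed in this reference. wrangle_dataset is a hypothetical helper, and reading the recipe ID from an id attribute is an assumption, not something this reference states. The imports are done inside the function so the snippet can be loaded without a configured client.

```python
from typing import Any, Dict, List, Tuple


def wrangle_dataset(use_case: Any, dataset: Any, drop: List[str]) -> Tuple[str, Dict[str, Any]]:
    """Create a recipe, drop some columns, and return (sql, preview)."""
    # Lazy imports: the datarobot client is only needed when the function runs.
    from datarobot.models.recipe import Recipe
    from datarobot.models.recipe_operation import DropColumnsOperation

    # dialect and inputs are left at their defaults (None).
    recipe = Recipe.from_dataset(use_case, dataset)

    # set_operations is a classmethod keyed by recipe ID and returns a Recipe.
    # Assumption: the instance exposes its ID as `.id`.
    recipe = Recipe.set_operations(recipe.id, [DropColumnsOperation(columns=drop)])

    sql = recipe.get_sql()  # operations=None, so the recipe's own operations are used
    preview = recipe.retrieve_preview(max_wait=600)
    return sql, preview
```

Calling the function requires an authenticated client plus existing use case and dataset objects, which are not shown here.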

class datarobot.models.recipe.RecipeSettings(target=None, weights_feature=None)

Recipe settings, for example those applied at the downsampling stage.

class datarobot.models.recipe.RecipeDatasetInput(input_type, dataset_id, dataset_version_id, sampling=None, alias=None)

Object describing the inputs for recipe transformations.

class datarobot.models.recipe.DatasetInput(sampling)
class datarobot.models.recipe.DataSourceInput(canonical_name, table, schema=None, catalog=None, sampling=None)

Inputs required to create a new recipe from a data store.

Recipe Operations

class datarobot.models.recipe_operation.WranglingOperation(directive, arguments)
class datarobot.models.recipe_operation.DownsamplingOperation(directive, arguments)
class datarobot.models.recipe_operation.SamplingOperation(directive, arguments)
class datarobot.models.recipe_operation.BaseTimeAwareTask(name, arguments)
class datarobot.models.recipe_operation.TaskPlanElement(column, task_list)
class datarobot.models.recipe_operation.CategoricalStats(methods, window_size)
class datarobot.models.recipe_operation.NumericStats(methods, window_size)
class datarobot.models.recipe_operation.Lags(orders)
class datarobot.models.recipe_operation.LagsOperation(column, orders, datetime_partition_column, multiseries_id_column=None)

Generate lags in a window.
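
To make the idea of a lag feature concrete, here is a minimal pure-Python sketch. It assumes the values belong to a single series and are already sorted by the datetime partition column, and that a lag of order k is the value k rows earlier. The function add_lags and the lag_k column names are hypothetical and do not reflect the operation's actual output naming.

```python
from typing import Dict, List, Optional, Sequence


def add_lags(values: Sequence[float], orders: Sequence[int]) -> Dict[str, List[Optional[float]]]:
    """Return one lagged column per order (orders are positive integers)."""
    columns: Dict[str, List[Optional[float]]] = {}
    for k in orders:
        # Rows without k earlier observations have no lag value.
        columns[f"lag_{k}"] = [values[i - k] if i >= k else None for i in range(len(values))]
    return columns


print(add_lags([10.0, 20.0, 30.0, 40.0], orders=[1, 2]))
```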

class datarobot.models.recipe_operation.WindowCategoricalStatsOperation(column, window_size, methods, datetime_partition_column, multiseries_id_column=None, rolling_most_frequent_udf=None)

Generate rolling statistics in a window for categorical features.

class datarobot.models.recipe_operation.WindowNumericStatsOperation(column, window_size, methods, datetime_partition_column, multiseries_id_column=None, rolling_median_udf=None)

Generate various rolling numeric statistics in a window. The output may span several columns.
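
The sketch below illustrates why several methods produce several output columns. It is a simplified assumption, not the operation's implementation: the window is taken as the trailing window_size rows including the current row, the data is a single series sorted by time, and the method names in METHODS are hypothetical.

```python
from statistics import mean, median
from typing import Callable, Dict, List, Sequence

# Hypothetical method names mapped to plain Python aggregations.
METHODS: Dict[str, Callable[[Sequence[float]], float]] = {
    "avg": mean,
    "min": min,
    "max": max,
    "median": median,
}


def rolling_stats(values: Sequence[float], window_size: int, methods: Sequence[str]) -> Dict[str, List[float]]:
    """Compute one output column per requested method over a trailing window."""
    out: Dict[str, List[float]] = {name: [] for name in methods}
    for i in range(len(values)):
        # Trailing window that ends at (and includes) row i.
        window = values[max(0, i - window_size + 1): i + 1]
        for name in methods:
            out[name].append(METHODS[name](window))
    return out


print(rolling_stats([1.0, 2.0, 3.0, 4.0], window_size=2, methods=["avg", "max"]))
```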

class datarobot.models.recipe_operation.TimeSeriesOperation(target_column, datetime_partition_column, forecast_distances, task_plan, baseline_periods=None, known_in_advance_columns=None, multiseries_id_column=None, rolling_median_udf=None, rolling_most_frequent_udf=None, forecast_point=None)

Operation that generates a dataset ready for time series modeling, including the forecast point, forecast distances, known-in-advance columns, and so on.

class datarobot.models.recipe_operation.ComputeNewOperation(expression, new_feature_name)
class datarobot.models.recipe_operation.RenameColumnsOperation(column_mappings)
class datarobot.models.recipe_operation.FilterCondition
class datarobot.models.recipe_operation.FilterOperation(conditions, keep_rows=True, operator='and')

Filter rows.
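
One plausible reading of the conditions, keep_rows, and operator parameters is sketched below in plain Python. It assumes that operator combines the conditions with logical "and" or "or", and that keep_rows=True keeps the matching rows while keep_rows=False drops them. Conditions are modelled as ordinary callables because the fields of FilterCondition are not listed here; filter_rows is a hypothetical helper.

```python
from typing import Any, Callable, Dict, List, Sequence

Row = Dict[str, Any]
Predicate = Callable[[Row], bool]


def filter_rows(
    rows: Sequence[Row],
    conditions: Sequence[Predicate],
    keep_rows: bool = True,
    operator: str = "and",
) -> List[Row]:
    """Keep (or drop) the rows that match the combined conditions."""
    if operator not in ("and", "or"):
        raise ValueError("operator must be 'and' or 'or'")
    combine = all if operator == "and" else any
    result = []
    for row in rows:
        matched = combine(cond(row) for cond in conditions)
        # keep_rows=True keeps matches; keep_rows=False keeps non-matches.
        if matched == keep_rows:
            result.append(row)
    return result


rows = [{"x": 1}, {"x": 5}, {"x": 10}]
conditions = [lambda r: r["x"] > 2, lambda r: r["x"] < 8]
print(filter_rows(rows, conditions))
print(filter_rows(rows, conditions, keep_rows=False))
```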

class datarobot.models.recipe_operation.DropColumnsOperation(columns)
class datarobot.models.recipe_operation.RandomSamplingOperation(rows, seed=None)
class datarobot.models.recipe_operation.DatetimeSamplingOperation(datetime_partition_column, rows, strategy=None, multiseries_id_column=None, selected_series=None)