Dataset definition
- class datarobot.helpers.feature_discovery.DatasetDefinition
Dataset definition for Feature Discovery.
Added in version v2.25.
- Variables:
  - identifier (str) – Alias of the dataset (used directly as part of the generated feature names)
  - catalog_id (Optional[str]) – Identifier of the catalog item
  - catalog_version_id (str) – Identifier of the catalog item version
  - primary_temporal_key (Optional[str]) – Name of the column indicating the time of record creation
  - feature_list_id (Optional[str]) – Identifier of the feature list. This decides which columns in the dataset are used for feature generation
  - snapshot_policy (Optional[str]) – Policy to use when creating a project or making predictions. If omitted, the endpoint defaults to ‘latest’. Must be one of the following values:
    - ‘specified’: Use the specific snapshot specified by catalogVersionId
    - ‘latest’: Use the latest snapshot from the same catalog item
    - ‘dynamic’: Get data from the source (only applicable for JDBC datasets)
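The constraints on these variables can be checked before a definition is built. The following is a minimal sketch, not part of the datarobot client: `build_definition_kwargs` and `VALID_SNAPSHOT_POLICIES` are hypothetical names introduced here for illustration. It validates `snapshot_policy` against the three documented values and leaves out any optional variable that was not supplied, so an omitted `snapshot_policy` is left to the endpoint default.

```python
from typing import Dict, Optional

# The three values documented for snapshot_policy.
VALID_SNAPSHOT_POLICIES = ("specified", "latest", "dynamic")


def build_definition_kwargs(
    identifier: str,
    catalog_version_id: str,
    catalog_id: Optional[str] = None,
    primary_temporal_key: Optional[str] = None,
    feature_list_id: Optional[str] = None,
    snapshot_policy: Optional[str] = None,
) -> Dict[str, str]:
    """Hypothetical helper: collect the documented variables into a dict."""
    if snapshot_policy is not None and snapshot_policy not in VALID_SNAPSHOT_POLICIES:
        raise ValueError(
            f"snapshot_policy must be one of {VALID_SNAPSHOT_POLICIES}, "
            f"got {snapshot_policy!r}"
        )

    # Required variables are always present.
    kwargs = {
        "identifier": identifier,
        "catalog_version_id": catalog_version_id,
    }

    # Optional variables are included only when a value was given.
    optional = {
        "catalog_id": catalog_id,
        "primary_temporal_key": primary_temporal_key,
        "feature_list_id": feature_list_id,
        "snapshot_policy": snapshot_policy,
    }
    kwargs.update({name: value for name, value in optional.items() if value is not None})
    return kwargs


kwargs = build_definition_kwargs(
    identifier="transaction",
    catalog_id="5ec4aec1f072bc028e3471ae",
    catalog_version_id="5ec4aec2f072bc028e3471b1",
    primary_temporal_key="Date",
)

# Assumption: the constructor accepts keyword arguments named after the
# variables listed above, e.g.
#   dataset_definition = dr.DatasetDefinition(**kwargs)
```

Validating locally surfaces a mistyped policy name as a `ValueError` before any request is made; whether the client performs a similar check itself is not stated here.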
Examples
import datarobot as dr

dataset_definition = dr.DatasetDefinition(
    identifier='profile',
    catalog_id='5ec4aec1f072bc028e3471ae',
    catalog_version_id='5ec4aec2f072bc028e3471b1',
)

dataset_definition = dr.DatasetDefinition(
    identifier='transaction',
    catalog_id='5ec4aec1f072bc028e3471ae',
    catalog_version_id='5ec4aec2f072bc028e3471b1',
    primary_temporal_key='Date'
)