Dataset Definition
- class datarobot.helpers.feature_discovery.DatasetDefinition(identifier, catalog_id, catalog_version_id, snapshot_policy='latest', feature_list_id=None, primary_temporal_key=None)
Dataset definition for Feature Discovery.
Added in version v2.25.
Examples
import datarobot as dr

dataset_definition = dr.DatasetDefinition(
    identifier='profile',
    catalog_id='5ec4aec1f072bc028e3471ae',
    catalog_version_id='5ec4aec2f072bc028e3471b1',
)

dataset_definition = dr.DatasetDefinition(
    identifier='transaction',
    catalog_id='5ec4aec1f072bc028e3471ae',
    catalog_version_id='5ec4aec2f072bc028e3471b1',
    primary_temporal_key='Date',
)
- Attributes:
- identifier: string
Alias of the dataset (used directly as part of the generated feature names)
- catalog_id: string, optional
Identifier of the catalog item
- catalog_version_id: string
Identifier of the catalog item version
- primary_temporal_key: string, optional
Name of the column indicating time of record creation
- feature_list_id: string, optional
Identifier of the feature list. Determines which columns in the dataset are used for feature generation
- snapshot_policy: string, optional
Policy to use when creating a project or making predictions. If omitted, the endpoint uses 'latest'. Must be one of the following values:
- 'specified': Use the specific snapshot identified by catalogVersionId
- 'latest': Use the latest snapshot from the same catalog item
- 'dynamic': Get data from the source (only applicable for JDBC datasets)
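The snapshot_policy values listed above can be checked on the client side before a definition is built. The following is a minimal sketch, not part of the datarobot package: the helper build_dataset_definition_kwargs and the constant VALID_SNAPSHOT_POLICIES are hypothetical names introduced here for illustration. The helper only assembles keyword arguments matching the constructor signature shown above, so it runs without the datarobot package installed.

```python
# Hypothetical helper (not part of the datarobot package): validates
# snapshot_policy and assembles keyword arguments for DatasetDefinition.
VALID_SNAPSHOT_POLICIES = ("specified", "latest", "dynamic")


def build_dataset_definition_kwargs(
    identifier,
    catalog_id,
    catalog_version_id,
    snapshot_policy="latest",
    feature_list_id=None,
    primary_temporal_key=None,
):
    """Return keyword arguments mirroring the DatasetDefinition signature."""
    if snapshot_policy not in VALID_SNAPSHOT_POLICIES:
        raise ValueError(
            f"snapshot_policy must be one of {VALID_SNAPSHOT_POLICIES}, "
            f"got {snapshot_policy!r}"
        )
    kwargs = {
        "identifier": identifier,
        "catalog_id": catalog_id,
        "catalog_version_id": catalog_version_id,
        "snapshot_policy": snapshot_policy,
    }
    # Include the optional fields only when they were actually provided.
    if feature_list_id is not None:
        kwargs["feature_list_id"] = feature_list_id
    if primary_temporal_key is not None:
        kwargs["primary_temporal_key"] = primary_temporal_key
    return kwargs


kwargs = build_dataset_definition_kwargs(
    identifier="transaction",
    catalog_id="5ec4aec1f072bc028e3471ae",
    catalog_version_id="5ec4aec2f072bc028e3471b1",
    snapshot_policy="specified",
    primary_temporal_key="Date",
)

# Assumption: with the datarobot package installed, the result could be
# passed on as
#     import datarobot as dr
#     dataset_definition = dr.DatasetDefinition(**kwargs)
```

Validating locally surfaces a mistyped policy as a ValueError at construction time rather than as a later server-side error; whether the library performs its own validation is not stated in this section.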