Secondary Dataset
- class datarobot.helpers.feature_discovery.SecondaryDataset(identifier, catalog_id, catalog_version_id, snapshot_policy='latest')
A secondary dataset to be used for feature discovery
Added in version v2.25.
Examples
import datarobot as dr
dataset_definition = dr.SecondaryDataset(
    identifier='profile',
    catalog_id='5ec4aec1f072bc028e3471ae',
    catalog_version_id='5ec4aec2f072bc028e3471b1',
)
- Attributes:
- identifier: string
Alias of the dataset (used directly as part of the generated feature names)
- catalog_id: string
Identifier of the catalog item
- catalog_version_id: string
Identifier of the catalog item version
- snapshot_policy: string, optional
Policy to use while creating a project or making predictions. If omitted, the endpoint uses ‘latest’ by default. Must be one of the following values:
‘specified’: Use the specific snapshot specified by catalogVersionId
‘latest’: Use the latest snapshot from the same catalog item
‘dynamic’: Get data from the source (only applicable for JDBC datasets)
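The three allowed snapshot_policy values can be checked before a SecondaryDataset is built. The following is a minimal sketch, not part of the client: `SNAPSHOT_POLICIES` and `secondary_dataset_kwargs` are hypothetical names introduced here for illustration, and the sketch only assumes the constructor arguments listed above.

```python
# Allowed values, taken from the snapshot_policy description above.
SNAPSHOT_POLICIES = ("specified", "latest", "dynamic")


def secondary_dataset_kwargs(identifier, catalog_id, catalog_version_id,
                             snapshot_policy="latest"):
    """Hypothetical helper: validate the arguments and return them as a dict.

    The dict can be unpacked into dr.SecondaryDataset(**kwargs).
    """
    if snapshot_policy not in SNAPSHOT_POLICIES:
        raise ValueError(
            f"snapshot_policy must be one of {SNAPSHOT_POLICIES}, "
            f"got {snapshot_policy!r}"
        )
    return {
        "identifier": identifier,
        "catalog_id": catalog_id,
        "catalog_version_id": catalog_version_id,
        "snapshot_policy": snapshot_policy,
    }


# Pin the dataset to one catalog version instead of following the latest snapshot.
pinned = secondary_dataset_kwargs(
    identifier="profile",
    catalog_id="5ec4aec1f072bc028e3471ae",
    catalog_version_id="5ec4aec2f072bc028e3471b1",
    snapshot_policy="specified",
)
# With the datarobot client installed: dataset = dr.SecondaryDataset(**pinned)
```

Validating locally is a convenience only; the description above does not say how the server reports an unsupported value.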
Secondary Dataset Configurations
- class datarobot.models.SecondaryDatasetConfigurations(id, project_id, config=None, secondary_datasets=None, name=None, creator_full_name=None, creator_user_id=None, created=None, featurelist_id=None, credential_ids=None, is_default=None, project_version=None)
Create secondary dataset configurations for a given project
Added in version v2.20.
- Attributes:
- id: str
Id of this secondary dataset configuration
- project_id: str
Id of the associated project.
- config: list of DatasetConfiguration (Deprecated in version v2.23)
List of secondary dataset configurations
- secondary_datasets: list of SecondaryDataset (new in v2.23)
List of secondary datasets (secondaryDataset)
- name: str
Verbose name of the SecondaryDatasetConfig. null if it wasn’t specified.
- created: datetime.datetime
DR-formatted datetime. null for legacy (before DR 6.0) db records.
- creator_user_id: str
Id of the user who created this config.
- creator_full_name: str
Full name or email of the user who created this config.
- featurelist_id: str, optional
Id of the feature list. null if it wasn’t specified.
- credential_ids: list of DatasetsCredentials, optional
Credentials used by the secondary datasets, if the datasets used in the configuration come from a data source
- is_default: bool, optional
Whether this is the default config created during the feature discovery aim
- project_version: str, optional
Version of the project when it was created (release version)
- classmethod create(project_id, secondary_datasets, name, featurelist_id=None)
Create secondary dataset configurations.
Return type: SecondaryDatasetConfigurations
Added in version v2.20.
- Parameters:
- project_id: str
Id of the associated project.
- secondary_datasets: list of SecondaryDataset (New in version v2.23)
List of secondary datasets used by the configuration. Each element is a datarobot.helpers.feature_discovery.SecondaryDataset.
- name: str (New in version v2.23)
Name of the secondary datasets configuration
- featurelist_id: str or None (New in version v2.23)
Id of the featurelist
- Returns:
- an instance of SecondaryDatasetConfigurations
- Raises:
- ClientError
raised if incorrect configuration parameters are provided
Examples
profile_secondary_dataset = dr.SecondaryDataset(
    identifier='profile',
    catalog_id='5ec4aec1f072bc028e3471ae',
    catalog_version_id='5ec4aec2f072bc028e3471b1',
    snapshot_policy='latest'
)
transaction_secondary_dataset = dr.SecondaryDataset(
    identifier='transaction',
    catalog_id='5ec4aec268f0f30289a03901',
    catalog_version_id='5ec4aec268f0f30289a03900',
    snapshot_policy='latest'
)
secondary_datasets = [profile_secondary_dataset, transaction_secondary_dataset]
new_secondary_dataset_config = dr.SecondaryDatasetConfigurations.create(
    project_id=project.id,
    name='My config',
    secondary_datasets=secondary_datasets
)
>>> new_secondary_dataset_config.id
'5fd1e86c589238a4e635e93d'
- delete()
Removes the secondary datasets configuration.
Return type: None
Added in version v2.21.
- Raises:
- ClientError
Raised if an invalid or already deleted secondary dataset config id is provided
Examples
# Deleting with a valid secondary_dataset_config id
status_code = dr.SecondaryDatasetConfigurations.delete(some_config_id)
status_code
>>> 204
- get()
Retrieve a single secondary dataset configuration for a given id.
Return type: SecondaryDatasetConfigurations
Added in version v2.21.
- Returns:
- secondary_dataset_configurations: SecondaryDatasetConfigurations
The requested secondary dataset configurations
Examples
config_id = '5fd1e86c589238a4e635e93d'
secondary_dataset_config = dr.SecondaryDatasetConfigurations(id=config_id).get()
>>> secondary_dataset_config
{
    'created': datetime.datetime(2020, 12, 9, 6, 16, 22, tzinfo=tzutc()),
    'creator_full_name': u'[email protected]',
    'creator_user_id': u'asdf4af1gf4bdsd2fba1de0a',
    'credential_ids': None,
    'featurelist_id': None,
    'id': u'5fd1e86c589238a4e635e93d',
    'is_default': True,
    'name': u'My config',
    'project_id': u'5fd06afce2456ec1e9d20457',
    'project_version': None,
    'secondary_datasets': [
        {
            'snapshot_policy': u'latest',
            'identifier': u'profile',
            'catalog_version_id': u'5fd06b4af24c641b68e4d88f',
            'catalog_id': u'5fd06b4af24c641b68e4d88e'
        },
        {
            'snapshot_policy': u'dynamic',
            'identifier': u'transaction',
            'catalog_version_id': u'5fd1e86c589238a4e635e98e',
            'catalog_id': u'5fd1e86c589238a4e635e98d'
        }
    ]
}
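Once a configuration has been retrieved, its secondary_datasets entries can be inspected, for example to see which datasets read from the source dynamically. The sketch below is illustrative only: `identifiers_by_policy` and `_field` are hypothetical helpers, and because the example output shows dict-like entries while the attribute list describes SecondaryDataset objects, the accessor is written to tolerate both shapes.

```python
from collections import defaultdict


def _field(obj, name):
    """Read a field from either a dict or an object with attributes."""
    return obj[name] if isinstance(obj, dict) else getattr(obj, name)


def identifiers_by_policy(secondary_datasets):
    """Hypothetical helper: map each snapshot policy to dataset identifiers."""
    grouped = defaultdict(list)
    for dataset in secondary_datasets:
        policy = _field(dataset, "snapshot_policy")
        grouped[policy].append(_field(dataset, "identifier"))
    return dict(grouped)


# Stand-in data copied from the example output above (ids omitted for brevity).
secondary_datasets = [
    {"snapshot_policy": "latest", "identifier": "profile"},
    {"snapshot_policy": "dynamic", "identifier": "transaction"},
]
grouped = identifiers_by_policy(secondary_datasets)
```

With a real client object, the same helper would be called as `identifiers_by_policy(secondary_dataset_config.secondary_datasets)`, assuming that attribute is populated as described in the attribute list.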
- classmethod list(project_id, featurelist_id=None, limit=None, offset=None)
Returns a list of secondary dataset configurations.
Return type: List[SecondaryDatasetConfigurations]
Added in version v2.23.
- Parameters:
- project_id: str
Id of the project
- featurelist_id: str, optional
Id of the feature list to filter the secondary datasets configurations
- Returns:
- secondary_dataset_configurations: list of SecondaryDatasetConfigurations
The requested list of secondary dataset configurations for a given project
Examples
pid = '5fd06afce2456ec1e9d20457'
secondary_dataset_configs = dr.SecondaryDatasetConfigurations.list(pid)
>>> secondary_dataset_configs[0]
{
    'created': datetime.datetime(2020, 12, 9, 6, 16, 22, tzinfo=tzutc()),
    'creator_full_name': u'[email protected]',
    'creator_user_id': u'asdf4af1gf4bdsd2fba1de0a',
    'credential_ids': None,
    'featurelist_id': None,
    'id': u'5fd1e86c589238a4e635e93d',
    'is_default': True,
    'name': u'My config',
    'project_id': u'5fd06afce2456ec1e9d20457',
    'project_version': None,
    'secondary_datasets': [
        {
            'snapshot_policy': u'latest',
            'identifier': u'profile',
            'catalog_version_id': u'5fd06b4af24c641b68e4d88f',
            'catalog_id': u'5fd06b4af24c641b68e4d88e'
        },
        {
            'snapshot_policy': u'dynamic',
            'identifier': u'transaction',
            'catalog_version_id': u'5fd1e86c589238a4e635e98e',
            'catalog_id': u'5fd1e86c589238a4e635e98d'
        }
    ]
}
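A common follow-up to listing is picking out the default configuration by its is_default flag. The sketch below is one possible way to do that, not a client feature: `default_configuration` is a hypothetical helper, and the SimpleNamespace objects (including the second id and name) are made-up stand-ins for the instances that list returns.

```python
from types import SimpleNamespace


def default_configuration(configs):
    """Hypothetical helper: return the first config flagged is_default, else None."""
    return next((c for c in configs if getattr(c, "is_default", None)), None)


# Stand-ins for the objects returned by SecondaryDatasetConfigurations.list.
configs = [
    SimpleNamespace(id="placeholder-config-id", name="Other config", is_default=False),
    SimpleNamespace(id="5fd1e86c589238a4e635e93d", name="My config", is_default=True),
]
default = default_configuration(configs)
```

Because is_default is documented as optional, the helper treats a missing or None flag the same as False.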