Database Connectivity
- class datarobot.DataDriver(id=None, creator=None, base_names=None, class_name=None, canonical_name=None, database_driver=None, type=None, version=None)
A data driver
- Attributes:
- id: str
the id of the driver.
- class_name: str
the Java class name for the driver.
- canonical_name: str
the user-friendly name of the driver.
- creator: str
the id of the user who created the driver.
- base_names: list of str
a list of the file name(s) of the jar files.
- classmethod list(typ=None)
Returns list of available drivers.
- Parameters:
- typ: DataDriverListTypes
If specified, filters by specified driver type.
- Returns:
- drivers: list of DataDriver instances
contains a list of available drivers.
- Return type:
List[DataDriver]
Examples
>>> import datarobot as dr
>>> drivers = dr.DataDriver.list()
>>> drivers
[DataDriver('mysql'), DataDriver('RedShift'), DataDriver('PostgreSQL')]
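A common follow-up to listing drivers is picking one by its user-friendly name. The sketch below is a hypothetical helper (find_driver is not part of the datarobot package) that relies only on the canonical_name attribute documented above. It accepts any iterable of objects exposing that attribute, so it runs here against stand-in objects rather than a live server.

```python
from types import SimpleNamespace


def find_driver(drivers, canonical_name):
    """Return the first driver whose canonical_name matches, ignoring case.

    `drivers` is any iterable of objects with a `canonical_name` attribute,
    for example the list returned by dr.DataDriver.list(). Returns None when
    nothing matches. Illustrative helper, not part of datarobot.
    """
    wanted = canonical_name.casefold()
    for driver in drivers:
        if (driver.canonical_name or "").casefold() == wanted:
            return driver
    return None


# Stand-ins for DataDriver instances so the sketch runs offline.
sample = [
    SimpleNamespace(id="1", canonical_name="mysql"),
    SimpleNamespace(id="2", canonical_name="PostgreSQL"),
]
match = find_driver(sample, "postgresql")
```

With a configured client, the equivalent call would be find_driver(dr.DataDriver.list(), 'PostgreSQL').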
- classmethod get(driver_id)
Gets the driver.
- Parameters:
- driver_id: str
the identifier of the driver.
- Returns:
- driver: DataDriver
the requested driver.
- Return type:
DataDriver
Examples
>>> import datarobot as dr
>>> driver = dr.DataDriver.get('5ad08a1889453d0001ea7c5c')
>>> driver
DataDriver('PostgreSQL')
- classmethod create(class_name, canonical_name, files=None, typ=None, database_driver=None)
Creates the driver. Only available to admin users.
- Parameters:
- class_name: str
the Java class name for the driver. Specify None if typ is DataDriverTypes.DR_DATABASE_V1.
- canonical_name: str
the user-friendly name of the driver.
- files: list of str
a list of file system paths to the file(s) for the driver.
- typ: str
Optional. Specify the type of the driver. Defaults to DataDriverTypes.JDBC, may also be DataDriverTypes.DR_DATABASE_V1.
- database_driver: str
Optional. Specify when typ is DataDriverTypes.DR_DATABASE_V1 to create a native database driver. See the DrDatabaseV1Types enum for some of the types, but that list may not be exhaustive.
- Returns:
- driver: DataDriver
the created driver.
- Raises:
- ClientError
raised if the user has not been granted the Can manage JDBC database drivers feature.
- Return type:
DataDriver
Examples
>>> import datarobot as dr
>>> driver = dr.DataDriver.create(
...     class_name='org.postgresql.Driver',
...     canonical_name='PostgreSQL',
...     files=['/tmp/postgresql-42.2.2.jar']
... )
>>> driver
DataDriver('PostgreSQL')
- update(class_name=None, canonical_name=None)
Updates the driver. Only available to admin users.
- Parameters:
- class_name: str
the Java class name for the driver.
- canonical_name: str
the user-friendly name of the driver.
- Raises:
- ClientError
raised if the user has not been granted the Can manage JDBC database drivers feature.
- Return type:
None
Examples
>>> import datarobot as dr
>>> driver = dr.DataDriver.get('5ad08a1889453d0001ea7c5c')
>>> driver.canonical_name
'PostgreSQL'
>>> driver.update(canonical_name='postgres')
>>> driver.canonical_name
'postgres'
- delete()
Removes the driver. Only available to admin users.
- Raises:
- ClientError
raised if the user has not been granted the Can manage JDBC database drivers feature.
- Return type:
None
- class datarobot.Connector(id=None, creator_id=None, configuration_id=None, base_name=None, canonical_name=None, connector_type=None)
A connector
- Attributes:
- id: str
the id of the connector.
- creator_id: str
the id of the user who created the connector.
- base_name: str
the file name of the jar file.
- canonical_name: str
the user-friendly name of the connector.
- configuration_id: str
the id of the configuration of the connector.
- classmethod list()
Returns list of available connectors.
- Returns:
- connectors: list of Connector instances
contains a list of available connectors.
- Return type:
List[Connector]
Examples
>>> import datarobot as dr
>>> connectors = dr.Connector.list()
>>> connectors
[Connector('ADLS Gen2 Connector'), Connector('S3 Connector')]
- classmethod get(connector_id)
Gets the connector.
- Parameters:
- connector_id: str
the identifier of the connector.
- Returns:
- connector: Connector
the requested connector.
- Return type:
Connector
Examples
>>> import datarobot as dr
>>> connector = dr.Connector.get('5fe1063e1c075e0245071446')
>>> connector
Connector('ADLS Gen2 Connector')
- classmethod create(file_path=None, connector_type=None)
Creates the connector from a jar file. Only available to admin users.
- Parameters:
- file_path: str
(Deprecated in version v3.6) the file system path to the file for the Java-based connector.
- connector_type: str
The type of the native connector to create.
- Returns:
- connector: Connector
the created connector.
- Raises:
- ClientError
raised if the user has not been granted the Can manage connectors feature.
- Return type:
Connector
Examples
>>> import datarobot as dr
>>> connector = dr.Connector.create('/tmp/connector-adls-gen2.jar')
>>> connector
Connector('ADLS Gen2 Connector')
- update(file_path)
Updates the connector with new jar file. Only available to admin users.
- Parameters:
- file_path: str
(Deprecated in version v3.6) the file system path to the file for the Java-based connector.
- Returns:
- connector: Connector
the updated connector.
- Raises:
- ClientError
raised if the user has not been granted the Can manage connectors feature.
- Return type:
Connector
Examples
>>> import datarobot as dr
>>> connector = dr.Connector.get('5fe1063e1c075e0245071446')
>>> connector.base_name
'connector-adls-gen2.jar'
>>> connector.update('/tmp/connector-s3.jar')
>>> connector.base_name
'connector-s3.jar'
- delete()
Removes the connector. Only available to admin users.
- Raises:
- ClientError
raised if the user has not been granted the Can manage connectors feature.
- Return type:
None
- class datarobot.DataStore(data_store_id=None, data_store_type=None, canonical_name=None, creator=None, updated=None, params=None, role=None)
A data store. Represents a database.
- Attributes:
- id: str
The id of the data store.
- data_store_type: str
The type of data store.
- canonical_name: str
The user-friendly name of the data store.
- creator: str
The id of the user who created the data store.
- updated: datetime.datetime
The time of the last update.
- params: DataStoreParameters
A list specifying data store parameters.
- role: str
Your access role for this data store.
- classmethod list(typ=None, name=None)
Returns list of available data stores.
- Parameters:
- typ: str
If specified, filters by specified data store type. If not specified, the default is DataStoreListTypes.JDBC.
- name: str
If specified, filters by data store names that match or contain this name. The search is case-insensitive.
- Returns:
- data_stores: list of DataStore instances
contains a list of available data stores.
- Return type:
List[DataStore]
Examples
>>> import datarobot as dr
>>> data_stores = dr.DataStore.list()
>>> data_stores
[DataStore('Demo'), DataStore('Airlines')]
- classmethod get(data_store_id)
Gets the data store.
- Parameters:
- data_store_id: str
the identifier of the data store.
- Returns:
- data_store: DataStore
the requested data store.
- Return type:
DataStore
Examples
>>> import datarobot as dr
>>> data_store = dr.DataStore.get('5a8ac90b07a57a0001be501e')
>>> data_store
DataStore('Demo')
- classmethod create(data_store_type, canonical_name, driver_id=None, jdbc_url=None, fields=None, connector_id=None)
Creates the data store.
- Parameters:
- data_store_type: str or DataStoreTypes
the type of data store.
- canonical_name: str
the user-friendly name of the data store.
- driver_id: str
Optional. The identifier of the DataDriver if data_store_type is DataStoreListTypes.JDBC or DataStoreListTypes.DR_DATABASE_V1.
- jdbc_url: str
Optional. The full JDBC URL (for example: jdbc:postgresql://my.dbaddress.org:5432/my_db).
- fields: list
Optional. If the type is dr-database-v1, then the fields specify the configuration.
- connector_id: str
Optional. The identifier of the Connector if data_store_type is DataStoreListTypes.DR_CONNECTOR_V1.
- Returns:
- data_store: DataStore
the created data store.
- Return type:
DataStore
Examples
>>> import datarobot as dr
>>> data_store = dr.DataStore.create(
...     data_store_type='jdbc',
...     canonical_name='Demo DB',
...     driver_id='5a6af02eb15372000117c040',
...     jdbc_url='jdbc:postgresql://my.db.address.org:5432/perftest'
... )
>>> data_store
DataStore('Demo DB')
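The parameter descriptions above imply that different data store types need different identifiers. The sketch below collects those pairings in a hypothetical helper (data_store_create_kwargs is not part of datarobot) that assembles keyword arguments before calling DataStore.create. The type strings 'jdbc' and 'dr-database-v1' appear in this reference; 'dr-connector-v1' is an assumption made by analogy with DR_CONNECTOR_V1, and the required-argument rules are a reading of the descriptions, not confirmed server behavior.

```python
def data_store_create_kwargs(data_store_type, canonical_name, driver_id=None,
                             jdbc_url=None, fields=None, connector_id=None):
    """Assemble keyword arguments for DataStore.create, one rule per type.

    Assumed rules (read off the parameter descriptions, not verified):
      'jdbc'            -> driver_id and jdbc_url
      'dr-database-v1'  -> driver_id and fields
      'dr-connector-v1' -> connector_id (type string is an assumption)
    """
    kwargs = {"data_store_type": data_store_type,
              "canonical_name": canonical_name}
    if data_store_type == "jdbc":
        if driver_id is None or jdbc_url is None:
            raise ValueError("jdbc data stores need driver_id and jdbc_url")
        kwargs.update(driver_id=driver_id, jdbc_url=jdbc_url)
    elif data_store_type == "dr-database-v1":
        if driver_id is None or fields is None:
            raise ValueError("dr-database-v1 data stores need driver_id and fields")
        kwargs.update(driver_id=driver_id, fields=fields)
    elif data_store_type == "dr-connector-v1":
        if connector_id is None:
            raise ValueError("dr-connector-v1 data stores need connector_id")
        kwargs.update(connector_id=connector_id)
    else:
        raise ValueError(f"unsupported data store type: {data_store_type!r}")
    return kwargs


jdbc_kwargs = data_store_create_kwargs(
    "jdbc",
    "Demo DB",
    driver_id="5a6af02eb15372000117c040",
    jdbc_url="jdbc:postgresql://my.db.address.org:5432/perftest",
)
```

The resulting dict could then be unpacked into the call, as in dr.DataStore.create(**jdbc_kwargs).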
- update(canonical_name=None, driver_id=None, connector_id=None, jdbc_url=None, fields=None)
Updates the data store.
- Parameters:
- canonical_name: str
optional, the user-friendly name of the data store.
- driver_id: str
Optional. The identifier of the DataDriver, if the type is one of DataStoreTypes.DR_DATABASE_V1 or DataStoreTypes.JDBC.
- connector_id: str
Optional. The identifier of the Connector, if the type is DataStoreTypes.DR_CONNECTOR_V1.
- jdbc_url: str
Optional. The full JDBC URL (for example: jdbc:postgresql://my.dbaddress.org:5432/my_db).
- fields: list
Optional. If the type is dr-database-v1, then the fields specify the configuration.
- Return type:
None
Examples
>>> import datarobot as dr
>>> data_store = dr.DataStore.get('5ad5d2afef5cd700014d3cae')
>>> data_store
DataStore('Demo DB')
>>> data_store.update(canonical_name='Demo DB updated')
>>> data_store
DataStore('Demo DB updated')
- delete()
Removes the DataStore
- Return type:
None
- test(username=None, password=None, credential_id=None, use_kerberos=None, credential_data=None)
Tests the database connection.
Changed in version v3.2: Added credential_id, use_kerberos and credential_data optional params and made username and password optional.
- Parameters:
- username: str
optional, the username for database authentication.
- password: str
optional, the password for database authentication. The password is encrypted on the server side and is never saved or stored.
- credential_id: str
optional, the id of the set of credentials to use instead of username and password.
- use_kerberos: bool
optional, whether to use Kerberos for data store authentication.
- credential_data: dict
optional, the credentials to authenticate with the database, to use instead of username and password or a credential ID.
- Returns:
- message: dict
message with status.
- Return type:
TestResponse
Examples
>>> import datarobot as dr
>>> data_store = dr.DataStore.get('5ad5d2afef5cd700014d3cae')
>>> data_store.test(username='db_username', password='db_password')
{'message': 'Connection successful'}
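Because test() describes three alternative ways to authenticate, it can help to pick exactly one before making the call. The following is a minimal sketch of a hypothetical helper (connection_test_kwargs is not part of datarobot). It assumes the credential sources are mutually exclusive, which the reference implies but does not state outright, and the credential id used at the end is a made-up placeholder.

```python
def connection_test_kwargs(username=None, password=None, credential_id=None,
                           credential_data=None, use_kerberos=None):
    """Build keyword arguments for DataStore.test from one credential source.

    Assumption: at most one of (username + password), credential_id, or
    credential_data should be supplied. How the server resolves a mix is
    not documented here, so a mix is rejected up front.
    """
    by_password = username is not None or password is not None
    chosen = [
        name
        for name, present in (
            ("username/password", by_password),
            ("credential_id", credential_id is not None),
            ("credential_data", credential_data is not None),
        )
        if present
    ]
    if len(chosen) > 1:
        raise ValueError("pick one credential source, got: " + ", ".join(chosen))
    if by_password and (username is None or password is None):
        raise ValueError("username and password must be supplied together")
    candidates = {
        "username": username,
        "password": password,
        "credential_id": credential_id,
        "credential_data": credential_data,
        "use_kerberos": use_kerberos,
    }
    # Drop unset values so the method's own defaults apply.
    return {key: value for key, value in candidates.items() if value is not None}


# Placeholder credential id, for illustration only.
test_kwargs = connection_test_kwargs(credential_id="hypothetical-credential-id")
```

With a real data store, the result would be passed as data_store.test(**test_kwargs).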
- schemas(username, password)
Returns list of available schemas.
- Parameters:
- username: str
the username for database authentication.
- password: str
the password for database authentication. The password is encrypted on the server side and is never saved or stored.
- Returns:
- response: dict
dict with the database name and a list of str with the available schemas.
- Return type:
dict
Examples
>>> import datarobot as dr
>>> data_store = dr.DataStore.get('5ad5d2afef5cd700014d3cae')
>>> data_store.schemas(username='db_username', password='db_password')
{'catalog': 'perftest', 'schemas': ['demo', 'information_schema', 'public']}
- tables(username, password, schema=None)
Returns list of available tables in schema.
- Parameters:
- username: str
optional, the username for database authentication.
- password: str
optional, the password for database authentication. The password is encrypted on the server side and is never saved or stored.
- schema: str
optional, the schema name.
- Returns:
- response: dict
dict with the catalog name and tables info.
- Return type:
dict
Examples
>>> import datarobot as dr
>>> data_store = dr.DataStore.get('5ad5d2afef5cd700014d3cae')
>>> data_store.tables(username='db_username', password='db_password', schema='demo')
{'tables': [{'type': 'TABLE', 'name': 'diagnosis', 'schema': 'demo'}, {'type': 'TABLE', 'name': 'kickcars', 'schema': 'demo'}, {'type': 'TABLE', 'name': 'patient', 'schema': 'demo'}, {'type': 'TABLE', 'name': 'transcript', 'schema': 'demo'}], 'catalog': 'perftest'}
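The schemas() and tables() calls can be combined to build an overview of everything a set of database credentials can see. The sketch below assumes the response shapes shown in the two examples above (a 'schemas' list and a 'tables' list of dicts with a 'name' key). list_all_tables and FakeStore are hypothetical names used for illustration; FakeStore only mimics those shapes so the code runs without a server.

```python
def list_all_tables(data_store, username, password):
    """Map each schema name to the sorted table names it contains.

    `data_store` is expected to behave like a DataStore: schemas() returns a
    dict with a 'schemas' list, and tables() returns a dict with a 'tables'
    list of dicts that each carry a 'name' key.
    """
    result = {}
    for schema in data_store.schemas(username, password)["schemas"]:
        response = data_store.tables(username, password, schema=schema)
        result[schema] = sorted(table["name"] for table in response["tables"])
    return result


class FakeStore:
    """Offline stand-in that mimics the documented response shapes."""

    _tables = {"demo": ["patient", "diagnosis"], "public": []}

    def schemas(self, username, password):
        return {"catalog": "perftest", "schemas": list(self._tables)}

    def tables(self, username, password, schema=None):
        rows = [{"type": "TABLE", "name": name, "schema": schema}
                for name in self._tables[schema]]
        return {"catalog": "perftest", "tables": rows}


tables_by_schema = list_all_tables(FakeStore(), "db_username", "db_password")
```

Note that this issues one tables() request per schema, so it may be slow against databases with many schemas.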
- classmethod from_server_data(data, keep_attrs=None)
Instantiate an object of this class using the data directly from the server, meaning that the keys may have the wrong camel casing.
- Parameters:
- data: dict
The directly translated dict of JSON from the server. No casing fixes have taken place.
- keep_attrs: iterable
List, set or tuple of the dotted namespace notations for attributes to keep within the object structure even if their values are None.
- Return type:
DataStore
- get_access_list()
Retrieve what users have access to this data store.
Added in version v2.14.
- Returns:
- list of SharingAccess
- Return type:
List[SharingAccess]
- get_shared_roles()
Retrieve what users have access to this data store.
Added in version v3.2.
- Returns:
- list of SharingRole
- Return type:
List[SharingRole]
- share(access_list)
Modify the ability of users to access this data store.
Added in version v2.14.
- Parameters:
- access_list: list of SharingRole
the modifications to make.
- Raises:
- datarobot.ClientError
if you do not have permission to share this data store, if the user you’re sharing with doesn’t exist, if the same user appears multiple times in the access_list, or if these changes would leave the data store without an owner.
- Return type:
None
Examples
The SharingRole class is needed in order to share a Data Store with one or more users.
For example, suppose you had a list of user IDs you wanted to share this DataStore with. You could use a loop to generate a list of SharingRole objects for them, and bulk share this Data Store.
>>> import datarobot as dr
>>> from datarobot.models.sharing import SharingRole
>>> from datarobot.enums import SHARING_ROLE, SHARING_RECIPIENT_TYPE
>>>
>>> user_ids = ["60912e09fd1f04e832a575c1", "639ce542862e9b1b1bfa8f1b", "63e185e7cd3a5f8e190c6393"]
>>> sharing_roles = []
>>> for user_id in user_ids:
...     new_sharing_role = SharingRole(
...         role=SHARING_ROLE.CONSUMER,
...         share_recipient_type=SHARING_RECIPIENT_TYPE.USER,
...         id=user_id,
...         can_share=True,
...     )
...     sharing_roles.append(new_sharing_role)
>>> dr.DataStore.get('my-data-store-id').share(sharing_roles)
Similarly, a SharingRole instance can be used to remove a user’s access if the role is set to SHARING_ROLE.NO_ROLE, like in this example:
>>> import datarobot as dr
>>> from datarobot.models.sharing import SharingRole
>>> from datarobot.enums import SHARING_ROLE, SHARING_RECIPIENT_TYPE
>>>
>>> user_to_remove = "[email protected]"
>>> remove_sharing_role = SharingRole(
...     role=SHARING_ROLE.NO_ROLE,
...     share_recipient_type=SHARING_RECIPIENT_TYPE.USER,
...     username=user_to_remove,
...     can_share=False,
... )
>>> dr.DataStore.get('my-data-store-id').share(access_list=[remove_sharing_role])
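Since sharing raises ClientError when the same user appears more than once in the access list, it may be worth removing duplicate user IDs before building the SharingRole objects. A minimal sketch, using a hypothetical helper name and made-up IDs:

```python
def unique_in_order(user_ids):
    """Drop repeated IDs while keeping first-seen order.

    dict preserves insertion order, so dict.fromkeys keeps the first
    occurrence of each ID and discards later repeats.
    """
    return list(dict.fromkeys(user_ids))


# Made-up IDs, for illustration only.
cleaned_ids = unique_in_order(["a1", "b2", "a1", "c3", "b2"])
```

The cleaned list can then feed the loop that builds one SharingRole per user.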
- class datarobot.DataSource(data_source_id=None, data_source_type=None, canonical_name=None, creator=None, updated=None, params=None, role=None)
A data source. Represents a data request.
- Attributes:
- id: str
the id of the data source.
- type: str
the type of data source.
- canonical_name: str
the user-friendly name of the data source.
- creator: str
the id of the user who created the data source.
- updated: datetime.datetime
the time of the last update.
- params: DataSourceParameters
a list specifying data source parameters.
- role: str or None
if a string, represents a particular level of access and should be one of datarobot.enums.SHARING_ROLE. For more information on the specific access levels, see the sharing documentation. If None, can be passed to a share function to revoke access for a specific user.
- classmethod list(typ=None)
Returns list of available data sources.
- Parameters:
- typ: DataStoreListTypes
If specified, filters by specified data source type. If not specified, it defaults to DataStoreListTypes.DATABASES.
- Returns:
- data_sources: list of DataSource instances
contains a list of available data sources.
- Return type:
List[DataSource]
Examples
>>> import datarobot as dr
>>> data_sources = dr.DataSource.list()
>>> data_sources
[DataSource('Diagnostics'), DataSource('Airlines 100mb'), DataSource('Airlines 10mb')]
- classmethod get(data_source_id)
Gets the data source.
- Parameters:
- data_source_id: str
the identifier of the data source.
- Returns:
- data_source: DataSource
the requested data source.
- Return type:
TypeVar(TDataSource, bound=DataSource)
Examples
>>> import datarobot as dr
>>> data_source = dr.DataSource.get('5a8ac9ab07a57a0001be501f')
>>> data_source
DataSource('Diagnostics')
- classmethod create(data_source_type, canonical_name, params)
Creates the data source.
- Parameters:
- data_source_type: str or DataStoreTypes
the type of data source.
- canonical_name: str
the user-friendly name of the data source.
- params: DataSourceParameters
a list specifying data source parameters.
- Returns:
- data_source: DataSource
the created data source.
- Return type:
TypeVar(TDataSource, bound=DataSource)
Examples
>>> import datarobot as dr
>>> params = dr.DataSourceParameters(
...     data_store_id='5a8ac90b07a57a0001be501e',
...     query='SELECT * FROM airlines10mb WHERE "Year" >= 1995;'
... )
>>> data_source = dr.DataSource.create(
...     data_source_type='jdbc',
...     canonical_name='airlines stats after 1995',
...     params=params
... )
>>> data_source
DataSource('airlines stats after 1995')
- update(canonical_name=None, params=None)
Updates the data source.
- Parameters:
- canonical_name: str
optional, the user-friendly name of the data source.
- params: DataSourceParameters
optional, the data source parameters.
- Return type:
None
Examples
>>> import datarobot as dr
>>> data_source = dr.DataSource.get('5ad840cc613b480001570953')
>>> data_source
DataSource('airlines stats after 1995')
>>> params = dr.DataSourceParameters(
...     query='SELECT * FROM airlines10mb WHERE "Year" >= 1990;'
... )
>>> data_source.update(
...     canonical_name='airlines stats after 1990',
...     params=params
... )
>>> data_source
DataSource('airlines stats after 1990')
- delete()
Removes the DataSource
- Return type:
None
- classmethod from_server_data(data, keep_attrs=None)
Instantiate an object of this class using the data directly from the server, meaning that the keys may have the wrong camel casing.
- Parameters:
- data: dict
The directly translated dict of JSON from the server. No casing fixes have taken place.
- keep_attrs: iterable
List, set or tuple of the dotted namespace notations for attributes to keep within the object structure even if their values are None.
- Return type:
TypeVar(TDataSource, bound=DataSource)
- get_access_list()
Retrieve what users have access to this data source.
Added in version v2.14.
- Returns:
- list of SharingAccess
- Return type:
List[SharingAccess]
- share(access_list)
Modify the ability of users to access this data source.
Added in version v2.14.
- Parameters:
- access_list: list of SharingAccess
The modifications to make.
- Raises:
- datarobot.ClientError
If you do not have permission to share this data source, if the user you’re sharing with doesn’t exist, if the same user appears multiple times in the access_list, or if these changes would leave the data source without an owner.
- Return type:
None
Examples
Transfer access to the data source from old_user@datarobot.com to new_user@datarobot.com:
from datarobot.enums import SHARING_ROLE
from datarobot.models.data_source import DataSource
from datarobot.models.sharing import SharingAccess

new_access = SharingAccess(
    "new_user@datarobot.com",
    SHARING_ROLE.OWNER,
    can_share=True,
)
access_list = [
    SharingAccess("old_user@datarobot.com", SHARING_ROLE.OWNER, can_share=True),
    new_access,
]
DataSource.get('my-data-source-id').share(access_list)
- create_dataset(username=None, password=None, do_snapshot=None, persist_data_after_ingestion=None, categories=None, credential_id=None, use_kerberos=None)
Create a Dataset from this data source.
Added in version v2.22.
- Parameters:
- username: string, optional
The username for database authentication.
- password: string, optional
The password (in cleartext) for database authentication. The password will be encrypted on the server side within the scope of the HTTP request and never saved or stored.
- do_snapshot: bool, optional
If unset, uses the server default: True. If true, creates a snapshot dataset; if false, creates a remote dataset. Creating snapshots from non-file sources requires an additional permission, Enable Create Snapshot Data Source.
- persist_data_after_ingestion: bool, optional
If unset, uses the server default: True. If true, will enforce saving all data (for download and sampling) and will allow a user to view the extended data profile (which includes data statistics like min/max/median/mean, histogram, etc.). If false, will not enforce saving data. The data schema (feature names and types) will still be available. Setting this parameter to false and do_snapshot to true will result in an error.
- categories: list[string], optional
An array of strings describing the intended use of the dataset. The currently supported options are “TRAINING” and “PREDICTION”.
- credential_id: string, optional
The ID of the set of credentials to use instead of username and password. Note that with this change, username and password will become optional.
- use_kerberos: bool, optional
If unset, uses the server default: False. If true, use Kerberos authentication for database authentication.
- Returns:
- response: Dataset
The Dataset created from the uploaded data.
- Return type:
Dataset
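The one flag combination that create_dataset documents as an error can be caught before the request is sent. The sketch below is a hypothetical pre-flight check (snapshot_flags is not part of datarobot). It treats an unset do_snapshot as True, following the server default stated above; that interpretation is an assumption about how the server applies its defaults.

```python
def snapshot_flags(do_snapshot=None, persist_data_after_ingestion=None):
    """Return the explicitly set flags, rejecting the documented bad pairing.

    The reference says persist_data_after_ingestion=False together with
    do_snapshot=True is an error. Assumption: an unset do_snapshot counts as
    True because that is the documented server default.
    """
    effective_snapshot = True if do_snapshot is None else do_snapshot
    if persist_data_after_ingestion is False and effective_snapshot:
        raise ValueError(
            "persist_data_after_ingestion=False requires do_snapshot=False"
        )
    flags = {
        "do_snapshot": do_snapshot,
        "persist_data_after_ingestion": persist_data_after_ingestion,
    }
    # Leave unset flags out so the server defaults still apply.
    return {name: value for name, value in flags.items() if value is not None}


remote_flags = snapshot_flags(do_snapshot=False, persist_data_after_ingestion=False)
```

The returned dict could be unpacked into the call, for example data_source.create_dataset(credential_id=..., **remote_flags).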
- class datarobot.DataSourceParameters(data_store_id=None, catalog=None, table=None, schema=None, partition_column=None, query=None, fetch_size=None, path=None)
Data request configuration
- Attributes:
- data_store_id: str
the id of the DataStore.
- table: str
Optional. The name of the specified database table.
- schema: str
Optional. The name of the schema associated with the table.
- partition_column: str
Optional. The name of the partition column.
- query: str
Optional. The user-specified SQL query.
- fetch_size: int
Optional. A user-specified fetch size in the range [1, 20000]. By default, a fetchSize will be assigned to balance throughput and memory usage.
- path: str
Optional. The user-specified path for BLOB storage.
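The fetch_size range above is easy to check locally before constructing the parameters object. The following is a minimal sketch of a hypothetical helper (source_param_kwargs is not part of datarobot) that validates the documented [1, 20000] range and drops unset values; whether the client or server performs the same check is not stated in this reference.

```python
def source_param_kwargs(data_store_id, table=None, schema=None, query=None,
                        fetch_size=None):
    """Assemble keyword arguments for DataSourceParameters.

    Only the documented fetch_size range [1, 20000] is validated here; all
    other values are passed through unchanged, and unset ones are omitted.
    """
    if fetch_size is not None and not 1 <= fetch_size <= 20000:
        raise ValueError("fetch_size must be in the range [1, 20000]")
    candidates = {
        "data_store_id": data_store_id,
        "table": table,
        "schema": schema,
        "query": query,
        "fetch_size": fetch_size,
    }
    return {name: value for name, value in candidates.items() if value is not None}


param_kwargs = source_param_kwargs(
    "5a8ac90b07a57a0001be501e",
    query='SELECT * FROM airlines10mb WHERE "Year" >= 1995;',
    fetch_size=5000,
)
```

With the client installed, the dict would be used as dr.DataSourceParameters(**param_kwargs).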