Database connectivity

class datarobot.DataDriver

A data driver

Variables:

id (str) – the id of the driver.
class_name (str) – the Java class name for the driver.
canonical_name (str) – the user-friendly name of the driver.
creator (str) – the id of the user who created the driver.
base_names (List[str]) – a list of the file name(s) of the jar files.

classmethod list(typ=None)

Returns list of available drivers.

Parameters:: typ (DataDriverListTypes) – If specified, filters by specified driver type.
Returns:: drivers – contains a list of available drivers.
Return type:: list of DataDriver instances

Examples

>>> import datarobot as dr
>>> drivers = dr.DataDriver.list()
>>> drivers
[DataDriver('mysql'), DataDriver('RedShift'), DataDriver('PostgreSQL')]
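
The returned list can be searched on the client side. The following is a minimal sketch that assumes only the canonical_name attribute documented above; the helper find_driver_by_name is hypothetical and not part of the datarobot package, and stand-in objects replace real DataDriver instances so the snippet runs without a server.

```python
from types import SimpleNamespace


def find_driver_by_name(drivers, canonical_name):
    """Return the first driver whose canonical_name matches (ignoring case), else None."""
    wanted = canonical_name.lower()
    for driver in drivers:
        if driver.canonical_name.lower() == wanted:
            return driver
    return None


# Stand-in objects; in practice the list would come from dr.DataDriver.list().
drivers = [
    SimpleNamespace(id="1", canonical_name="mysql"),
    SimpleNamespace(id="2", canonical_name="PostgreSQL"),
]
match = find_driver_by_name(drivers, "postgresql")
```

The case-insensitive comparison is a design choice of this sketch, not documented behavior of the client.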

classmethod get(driver_id)

Gets the driver.

Parameters:: driver_id (str) – the identifier of the driver.
Returns:: driver – the required driver.
Return type:: DataDriver

Examples

>>> import datarobot as dr
>>> driver = dr.DataDriver.get('5ad08a1889453d0001ea7c5c')
>>> driver
DataDriver('PostgreSQL')

classmethod create(class_name, canonical_name, files=None, typ=None, database_driver=None)

Creates the driver. Only available to admin users.

Parameters:

class_name (str) – the Java class name for the driver. Specify None if typ is DataDriverTypes.DR_DATABASE_V1.
canonical_name (str) – the user-friendly name of the driver.
files (List[str]) – a list of local file system paths to the jar file(s) for the driver.
typ (str) – Optional. Specify the type of the driver. Defaults to DataDriverTypes.JDBC, may also be DataDriverTypes.DR_DATABASE_V1.
database_driver (str) – Optional. Specify when typ is DataDriverTypes.DR_DATABASE_V1 to create a native database driver. See DrDatabaseV1Types enum for some of the types, but that list may not be exhaustive.

Returns:

driver – the created driver.

Return type:

DataDriver

Raises:

ClientError – raised if the user has not been granted the Can manage JDBC database drivers feature.

Examples

>>> import datarobot as dr
>>> driver = dr.DataDriver.create(
...     class_name='org.postgresql.Driver',
...     canonical_name='PostgreSQL',
...     files=['/tmp/postgresql-42.2.2.jar']
... )
>>> driver
DataDriver('PostgreSQL')
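
The parameter notes above distinguish JDBC drivers from native ones. As a rough sketch, the arguments could be assembled as follows before calling DataDriver.create. The helper build_driver_kwargs is hypothetical, and the string values 'jdbc' and 'dr-database-v1' are assumed to be what DataDriverTypes.JDBC and DataDriverTypes.DR_DATABASE_V1 resolve to; the database_driver value shown is a placeholder, not a real DrDatabaseV1Types member.

```python
def build_driver_kwargs(canonical_name, class_name=None, files=None, database_driver=None):
    """Assemble keyword arguments for DataDriver.create (illustrative only)."""
    if database_driver is not None:
        # Native driver: class_name must be None for DR_DATABASE_V1.
        if class_name is not None:
            raise ValueError("class_name must be None for a native database driver")
        return {
            "class_name": None,
            "canonical_name": canonical_name,
            "typ": "dr-database-v1",  # assumed value of DataDriverTypes.DR_DATABASE_V1
            "database_driver": database_driver,
        }
    # JDBC driver: the Java class name is needed.
    if class_name is None:
        raise ValueError("a JDBC driver needs the Java class name")
    kwargs = {
        "class_name": class_name,
        "canonical_name": canonical_name,
        "typ": "jdbc",  # assumed value of DataDriverTypes.JDBC
    }
    if files is not None:
        kwargs["files"] = list(files)
    return kwargs


jdbc_kwargs = build_driver_kwargs(
    "PostgreSQL",
    class_name="org.postgresql.Driver",
    files=["/tmp/postgresql-42.2.2.jar"],
)
# Placeholder native driver identifier, for illustration only.
native_kwargs = build_driver_kwargs("Native example", database_driver="example-native-driver")
```

The resulting dictionary could then be passed as dr.DataDriver.create(**jdbc_kwargs).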

update(class_name=None, canonical_name=None)

Updates the driver. Only available to admin users.

Parameters:

class_name (str) – the Java class name for the driver.
canonical_name (str) – the user-friendly name of the driver.

Raises:

ClientError – raised if the user has not been granted the Can manage JDBC database drivers feature.

Return type:

None

Examples

>>> import datarobot as dr
>>> driver = dr.DataDriver.get('5ad08a1889453d0001ea7c5c')
>>> driver.canonical_name
'PostgreSQL'
>>> driver.update(canonical_name='postgres')
>>> driver.canonical_name
'postgres'

delete()

Removes the driver. Only available to admin users.

Raises:: ClientError – raised if the user has not been granted the Can manage JDBC database drivers feature.
Return type:: None

class datarobot.Connector

A connector

Variables:

id (str) – the id of the connector.
creator_id (str) – the id of the user who created the connector.
base_name (str) – the file name of the jar file.
canonical_name (str) – the user-friendly name of the connector.
configuration_id (str) – the id of the configuration of the connector.

classmethod list()

Returns list of available connectors.

Returns:: connectors – contains a list of available connectors.
Return type:: list of Connector instances

Examples

>>> import datarobot as dr
>>> connectors = dr.Connector.list()
>>> connectors
[Connector('ADLS Gen2 Connector'), Connector('S3 Connector')]

classmethod get(connector_id)

Gets the connector.

Parameters:: connector_id (str) – the identifier of the connector.
Returns:: connector – the required connector.
Return type:: Connector

Examples

>>> import datarobot as dr
>>> connector = dr.Connector.get('5fe1063e1c075e0245071446')
>>> connector
Connector('ADLS Gen2 Connector')

classmethod create(file_path=None, connector_type=None)

Creates the connector from a jar file. Only available to admin users.

Parameters:

file_path (str) – (Deprecated in version v3.6) the path on the local file system to the jar file for the Java-based connector.
connector_type (str) – the type of the native connector to create.

Returns:

connector – the created connector.

Return type:

Connector

Raises:

ClientError – raised if the user has not been granted the Can manage connectors feature.

Examples

>>> import datarobot as dr
>>> connector = dr.Connector.create('/tmp/connector-adls-gen2.jar')
>>> connector
Connector('ADLS Gen2 Connector')

update(file_path)

Updates the connector with new jar file. Only available to admin users.

Parameters:: file_path (str) – (Deprecated in version v3.6) the path on the local file system to the jar file for the Java-based connector.
Returns:: connector – the updated connector.
Return type:: Connector
Raises:: ClientError – raised if the user has not been granted the Can manage connectors feature.

Examples

>>> import datarobot as dr
>>> connector = dr.Connector.get('5fe1063e1c075e0245071446')
>>> connector.base_name
'connector-adls-gen2.jar'
>>> connector.update('/tmp/connector-s3.jar')
>>> connector.base_name
'connector-s3.jar'

delete()

Removes the connector. Only available to admin users.

Raises:: ClientError – raised if the user has not been granted the Can manage connectors feature.
Return type:: None

class datarobot.DataStore

A data store. Represents a database.

Variables:

id (str) – The id of the data store.
data_store_type (str) – The type of data store.
canonical_name (str) – The user-friendly name of the data store.
creator (str) – The id of the user who created the data store.
updated (datetime.datetime) – The time of the last update
params (DataStoreParameters) – A list specifying data store parameters.
role (str) – Your access role for this data store.

classmethod list(typ=None, name=None)

Returns list of available data stores.

Parameters:

typ (str) – If specified, filters by specified data store type. If not specified, the default is DataStoreListTypes.JDBC.
name (str) – If specified, filters by data store names that match or contain this name. The search is case-insensitive.

Returns:

data_stores – contains a list of available data stores.

Return type:

list of DataStore instances

Examples

>>> import datarobot as dr
>>> data_stores = dr.DataStore.list()
>>> data_stores
[DataStore('Demo'), DataStore('Airlines')]
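
The name filter is described as a case-insensitive "match or contain" search. A minimal sketch of the equivalent client-side check is shown below; name_matches is a hypothetical helper, and the server-side matching may differ in details such as Unicode handling.

```python
def name_matches(canonical_name, query):
    """Case-insensitive 'match or contain' check, mirroring the name filter described above."""
    return query.lower() in canonical_name.lower()


names = ["Demo", "Airlines", "demo-archive"]
matching = [name for name in names if name_matches(name, "DEMO")]
```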

classmethod get(data_store_id)

Gets the data store.

Parameters:: data_store_id (str) – the identifier of the data store.
Returns:: data_store – the required data store.
Return type:: DataStore

Examples

>>> import datarobot as dr
>>> data_store = dr.DataStore.get('5a8ac90b07a57a0001be501e')
>>> data_store
DataStore('Demo')

classmethod create(data_store_type, canonical_name, driver_id=None, jdbc_url=None, fields=None, connector_id=None)

Creates the data store.

Parameters:

data_store_type (str or DataStoreTypes) – the type of data store.
canonical_name (str) – the user-friendly name of the data store.
driver_id (str) – Optional. The identifier of the DataDriver if data_store_type is DataStoreListTypes.JDBC or DataStoreListTypes.DR_DATABASE_V1.
jdbc_url (str) – Optional. The full JDBC URL (for example: jdbc:postgresql://my.dbaddress.org:5432/my_db).
fields (list) – Optional. If the type is dr-database-v1, then the fields specify the configuration.
connector_id (str) – Optional. The identifier of the Connector if data_store_type is DataStoreListTypes.DR_CONNECTOR_V1

Returns:

data_store – the created data store.

Return type:

DataStore

Examples

>>> import datarobot as dr
>>> data_store = dr.DataStore.create(
...     data_store_type='jdbc',
...     canonical_name='Demo DB',
...     driver_id='5a6af02eb15372000117c040',
...     jdbc_url='jdbc:postgresql://my.db.address.org:5432/perftest'
... )
>>> data_store
DataStore('Demo DB')
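
Which identifier applies depends on the data store type. The sketch below reads the parameter notes above as: driver_id for 'jdbc' and 'dr-database-v1', connector_id for the connector type. The helper build_data_store_kwargs is hypothetical, and 'dr-connector-v1' is an assumed string value for DR_CONNECTOR_V1.

```python
DRIVER_BASED_TYPES = ("jdbc", "dr-database-v1")
CONNECTOR_BASED_TYPE = "dr-connector-v1"  # assumed string value of DR_CONNECTOR_V1


def build_data_store_kwargs(data_store_type, canonical_name, driver_id=None,
                            jdbc_url=None, fields=None, connector_id=None):
    """Check the type-specific identifier and drop unset arguments (illustrative only)."""
    if data_store_type in DRIVER_BASED_TYPES:
        if driver_id is None:
            raise ValueError("driver_id is expected for type " + data_store_type)
    elif data_store_type == CONNECTOR_BASED_TYPE:
        if connector_id is None:
            raise ValueError("connector_id is expected for type " + data_store_type)
    else:
        raise ValueError("unknown data store type: " + data_store_type)
    candidate = {
        "data_store_type": data_store_type,
        "canonical_name": canonical_name,
        "driver_id": driver_id,
        "jdbc_url": jdbc_url,
        "fields": fields,
        "connector_id": connector_id,
    }
    # Keep only the arguments that were actually provided.
    return {key: value for key, value in candidate.items() if value is not None}


kwargs = build_data_store_kwargs(
    "jdbc",
    "Demo DB",
    driver_id="5a6af02eb15372000117c040",
    jdbc_url="jdbc:postgresql://my.db.address.org:5432/perftest",
)
```

The result could be passed as dr.DataStore.create(**kwargs).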

update(canonical_name=None, driver_id=None, connector_id=None, jdbc_url=None, fields=None)

Updates the data store.

Parameters:

canonical_name (str) – optional, the user-friendly name of the data store.
driver_id (str) – Optional. The identifier of the DataDriver, if the type is one of DataStoreTypes.DR_DATABASE_V1 or DataStoreTypes.JDBC.
connector_id (str) – Optional. The identifier of the Connector, if the type is DataStoreTypes.DR_CONNECTOR_V1.
jdbc_url (str) – Optional. The full JDBC URL (for example: jdbc:postgresql://my.dbaddress.org:5432/my_db).
fields (list) – Optional. If the type is dr-database-v1, then the fields specify the configuration.

Return type:

None

Examples

>>> import datarobot as dr
>>> data_store = dr.DataStore.get('5ad5d2afef5cd700014d3cae')
>>> data_store
DataStore('Demo DB')
>>> data_store.update(canonical_name='Demo DB updated')
>>> data_store
DataStore('Demo DB updated')

delete()

Removes the DataStore

Return type:: None

test(username=None, password=None, credential_id=None, use_kerberos=None, credential_data=None)

Tests database connection.

Changed in version v3.2: Added credential_id, use_kerberos and credential_data optional params and made username and password optional.

Parameters:

username (str) – optional, the username for database authentication.
password (str) – optional, the password for database authentication. The password is encrypted on the server side and never saved or stored.
credential_id (str) – optional, id of the set of credentials to use instead of username and password
use_kerberos (bool) – optional, whether to use Kerberos for data store authentication
credential_data (dict) – optional, the credentials to authenticate with the database, to use instead of user/password or credential ID

Returns:

message – message with status.

Return type:

dict

Examples

>>> import datarobot as dr
>>> data_store = dr.DataStore.get('5ad5d2afef5cd700014d3cae')
>>> data_store.test(username='db_username', password='db_password')
{'message': 'Connection successful'}
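
The parameters above describe three alternative ways to authenticate. A minimal sketch for choosing exactly one of them follows; build_auth_kwargs is a hypothetical helper, and treating the three options as mutually exclusive is this sketch's reading of the "instead of" wording, not a documented rule.

```python
def build_auth_kwargs(username=None, password=None, credential_id=None,
                      credential_data=None, use_kerberos=None):
    """Return keyword arguments for DataStore.test using a single credential option."""
    options = []
    if username is not None or password is not None:
        if username is None or password is None:
            raise ValueError("username and password must be given together")
        options.append({"username": username, "password": password})
    if credential_id is not None:
        options.append({"credential_id": credential_id})
    if credential_data is not None:
        options.append({"credential_data": credential_data})
    if len(options) > 1:
        raise ValueError("choose only one of username/password, credential_id, credential_data")
    kwargs = dict(options[0]) if options else {}
    if use_kerberos is not None:
        kwargs["use_kerberos"] = use_kerberos
    return kwargs


auth = build_auth_kwargs(username="db_username", password="db_password")
```

The result could be passed as data_store.test(**auth).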

schemas(username, password)

Returns list of available schemas.

Parameters:

username (str) – the username for database authentication.
password (str) – the password for database authentication. The password is encrypted on the server side and never saved or stored.

Returns:

response – dict with database name and list of str - available schemas

Return type:

dict

Examples

>>> import datarobot as dr
>>> data_store = dr.DataStore.get('5ad5d2afef5cd700014d3cae')
>>> data_store.schemas(username='db_username', password='db_password')
{'catalog': 'perftest', 'schemas': ['demo', 'information_schema', 'public']}

tables(username, password, schema=None)

Returns list of available tables in schema.

Parameters:

username (str) – the username for database authentication.
password (str) – the password for database authentication. The password is encrypted on the server side and never saved or stored.
schema (str) – optional, the schema name.

Returns:

response – dict with catalog name and tables info

Return type:

dict

Examples

>>> import datarobot as dr
>>> data_store = dr.DataStore.get('5ad5d2afef5cd700014d3cae')
>>> data_store.tables(username='db_username', password='db_password', schema='demo')
{'tables': [{'type': 'TABLE', 'name': 'diagnosis', 'schema': 'demo'}, {'type': 'TABLE',
'name': 'kickcars', 'schema': 'demo'}, {'type': 'TABLE', 'name': 'patient',
'schema': 'demo'}, {'type': 'TABLE', 'name': 'transcript', 'schema': 'demo'}],
'catalog': 'perftest'}
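
The response is a plain dict, so it can be reshaped with ordinary Python. As an illustration, the sketch below groups table names by schema using the response shown above; tables_by_schema is a hypothetical helper.

```python
from collections import defaultdict


def tables_by_schema(response):
    """Group table names by schema from a tables() response dict."""
    grouped = defaultdict(list)
    for table in response["tables"]:
        grouped[table["schema"]].append(table["name"])
    return dict(grouped)


# Sample response copied from the example above.
response = {
    "tables": [
        {"type": "TABLE", "name": "diagnosis", "schema": "demo"},
        {"type": "TABLE", "name": "kickcars", "schema": "demo"},
        {"type": "TABLE", "name": "patient", "schema": "demo"},
        {"type": "TABLE", "name": "transcript", "schema": "demo"},
    ],
    "catalog": "perftest",
}
grouped = tables_by_schema(response)
```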

classmethod from_server_data(data, keep_attrs=None)

Instantiate an object of this class using the data directly from the server, meaning that the keys may have the wrong camel casing

Parameters:

data (dict) – The directly translated dict of JSON from the server. No casing fixes have taken place
keep_attrs (iterable) – List, set or tuple of the dotted namespace notations for attributes to keep within the object structure even if their values are None

Return type:

DataStore

get_shared_roles()

Retrieve what users have access to this data store

Added in version v3.2.

Return type:: list of SharingRole

share(access_list)

Modify the ability of users to access this data store

Added in version v2.14.

Parameters:: access_list (list of SharingRole) – the modifications to make.
Raises:: datarobot.ClientError – if you do not have permission to share this data store, if the user you’re sharing with doesn’t exist, if the same user appears multiple times in the access_list, or if these changes would leave the data store without an owner.
Return type:: None

Examples

The SharingRole class is needed in order to share a Data Store with one or more users.

For example, suppose you had a list of user IDs you wanted to share this DataStore with. You could use a loop to generate a list of SharingRole objects for them, and bulk share this Data Store.

>>> import datarobot as dr
>>> from datarobot.models.sharing import SharingRole
>>> from datarobot.enums import SHARING_ROLE, SHARING_RECIPIENT_TYPE
>>>
>>> user_ids = ["60912e09fd1f04e832a575c1", "639ce542862e9b1b1bfa8f1b", "63e185e7cd3a5f8e190c6393"]
>>> sharing_roles = []
>>> for user_id in user_ids:
...     new_sharing_role = SharingRole(
...         role=SHARING_ROLE.CONSUMER,
...         share_recipient_type=SHARING_RECIPIENT_TYPE.USER,
...         id=user_id,
...         can_share=True,
...     )
...     sharing_roles.append(new_sharing_role)
>>> dr.DataStore.get('my-data-store-id').share(sharing_roles)

Similarly, a SharingRole instance can be used to remove a user’s access if the role is set to SHARING_ROLE.NO_ROLE, like in this example:

>>> import datarobot as dr
>>> from datarobot.models.sharing import SharingRole
>>> from datarobot.enums import SHARING_ROLE, SHARING_RECIPIENT_TYPE
>>>
>>> user_to_remove = "[email protected]"
>>> remove_sharing_role = SharingRole(
...     role=SHARING_ROLE.NO_ROLE,
...     share_recipient_type=SHARING_RECIPIENT_TYPE.USER,
...     username=user_to_remove,
...     can_share=False,
... )
>>> dr.DataStore.get('my-data-store-id').share(access_list=[remove_sharing_role])
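
Because share raises ClientError when the same user appears more than once in access_list, it can help to check for duplicates before calling it. A minimal sketch follows; duplicate_recipients is a hypothetical helper, it assumes each entry exposes the id or username it was constructed with, and stand-in objects replace real SharingRole instances so the snippet runs on its own.

```python
from collections import Counter
from types import SimpleNamespace


def duplicate_recipients(access_list):
    """Return the sorted ids or usernames that appear more than once in access_list."""
    keys = [
        getattr(role, "id", None) or getattr(role, "username", None)
        for role in access_list
    ]
    counts = Counter(keys)
    return sorted(key for key, count in counts.items() if key is not None and count > 1)


# Stand-in entries; real code would pass SharingRole objects.
roles = [
    SimpleNamespace(id="60912e09fd1f04e832a575c1", username=None),
    SimpleNamespace(id="639ce542862e9b1b1bfa8f1b", username=None),
    SimpleNamespace(id="60912e09fd1f04e832a575c1", username=None),
]
repeated = duplicate_recipients(roles)
```

An empty result means no recipient is repeated; this check does not cover the other error conditions listed under Raises.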

class datarobot.DataSource

A data source. Represents a data request.

Variables:

id (str) – the id of the data source.
type (str) – the type of data source.
canonical_name (str) – the user-friendly name of the data source.
creator (str) – the id of the user who created the data source.
updated (datetime.datetime) – the time of the last update.
params (DataSourceParameters) – a list specifying data source parameters.
role (str or None) – if a string, represents a particular level of access and should be one of datarobot.enums.SHARING_ROLE. For more information on the specific access levels, see the sharing documentation. If None, can be passed to a share function to revoke access for a specific user.

classmethod list(typ=None)

Returns list of available data sources.

Parameters:: typ (DataStoreListTypes) – If specified, filters by the specified data source type. If not specified, defaults to DataStoreListTypes.DATABASES.
Returns:: data_sources – contains a list of available data sources.
Return type:: list of DataSource instances

Examples

>>> import datarobot as dr
>>> data_sources = dr.DataSource.list()
>>> data_sources
[DataSource('Diagnostics'), DataSource('Airlines 100mb'), DataSource('Airlines 10mb')]

classmethod get(data_source_id)

Gets the data source.

Parameters:: data_source_id (str) – the identifier of the data source.
Returns:: data_source – the requested data source.
Return type:: DataSource

Examples

>>> import datarobot as dr
>>> data_source = dr.DataSource.get('5a8ac9ab07a57a0001be501f')
>>> data_source
DataSource('Diagnostics')

classmethod create(data_source_type, canonical_name, params)

Creates the data source.

Parameters:

data_source_type (str or DataStoreTypes) – the type of data source.
canonical_name (str) – the user-friendly name of the data source.
params (DataSourceParameters) – a list specifying data source parameters.

Returns:

data_source – the created data source.

Return type:

DataSource

Examples

>>> import datarobot as dr
>>> params = dr.DataSourceParameters(
...     data_store_id='5a8ac90b07a57a0001be501e',
...     query='SELECT * FROM airlines10mb WHERE "Year" >= 1995;'
... )
>>> data_source = dr.DataSource.create(
...     data_source_type='jdbc',
...     canonical_name='airlines stats after 1995',
...     params=params
... )
>>> data_source
DataSource('airlines stats after 1995')

update(canonical_name=None, params=None)

Updates the data source.

Parameters:

canonical_name (str) – optional, the user-friendly name of the data source.
params (DataSourceParameters) – optional, the data source parameters.

Return type:

None

Examples

>>> import datarobot as dr
>>> data_source = dr.DataSource.get('5ad840cc613b480001570953')
>>> data_source
DataSource('airlines stats after 1995')
>>> params = dr.DataSourceParameters(
...     query='SELECT * FROM airlines10mb WHERE "Year" >= 1990;'
... )
>>> data_source.update(
...     canonical_name='airlines stats after 1990',
...     params=params
... )
>>> data_source
DataSource('airlines stats after 1990')

delete()

Removes the DataSource

Return type:: None

classmethod from_server_data(data, keep_attrs=None)

Instantiate an object of this class using the data directly from the server, meaning that the keys may have the wrong camel casing

Parameters:

data (dict) – The directly translated dict of JSON from the server. No casing fixes have taken place
keep_attrs (iterable) – List, set or tuple of the dotted namespace notations for attributes to keep within the object structure even if their values are None

Return type:

DataSource

get_access_list()

Retrieve what users have access to this data source

Added in version v2.14.

Return type:: list of SharingAccess

share(access_list)

Modify the ability of users to access this data source

Added in version v2.14.

Parameters:: access_list (list of SharingAccess) – The modifications to make.
Raises:: datarobot.ClientError – If you do not have permission to share this data source, if the user you’re sharing with doesn’t exist, if the same user appears multiple times in the access_list, or if these changes would leave the data source without an owner.
Return type:: None

Examples

Transfer access to the data source from old_user@datarobot.com to new_user@datarobot.com

from datarobot.enums import SHARING_ROLE
from datarobot.models.data_source import DataSource
from datarobot.models.sharing import SharingAccess

new_access = SharingAccess(
    "new_user@datarobot.com",
    SHARING_ROLE.OWNER,
    can_share=True,
)
access_list = [
    SharingAccess("old_user@datarobot.com", SHARING_ROLE.OWNER, can_share=True),
    new_access,
]

DataSource.get('my-data-source-id').share(access_list)

create_dataset(username=None, password=None, do_snapshot=None, persist_data_after_ingestion=None, categories=None, credential_id=None, use_kerberos=None)

Create a Dataset from this data source.

Added in version v2.22.

Parameters:

username (string, optional) – The username for database authentication.
password (string, optional) – The password (in cleartext) for database authentication. The password is encrypted on the server side within the scope of the HTTP request and is never saved or stored.
do_snapshot (Optional[bool]) – If unset, uses the server default: True. If true, creates a snapshot dataset; if false, creates a remote dataset. Creating snapshots from non-file sources requires an additional permission, Enable Create Snapshot Data Source.
persist_data_after_ingestion (Optional[bool]) – If unset, uses the server default: True. If true, enforces saving all data (for download and sampling) and allows a user to view the extended data profile (which includes data statistics such as min/max/median/mean, histogram, etc.). If false, does not enforce saving data; the data schema (feature names and types) is still available. Setting this parameter to false and do_snapshot to true results in an error.
categories (list[string], optional) – An array of strings describing the intended use of the dataset. The current supported options are “TRAINING” and “PREDICTION”.
credential_id (string, optional) – The ID of the set of credentials to use instead of user and password. Note that with this change, username and password will become optional.
use_kerberos (Optional[bool]) – If unset, uses the server default: False. If true, use kerberos authentication for database authentication.

Returns:

response – The Dataset created from the uploaded data

Return type:

Dataset
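
The combination of do_snapshot and persist_data_after_ingestion can be checked before the request is sent. The sketch below is illustrative only: check_snapshot_options is a hypothetical helper, and it assumes that an unset value behaves like the documented server default of True when the two options are compared.

```python
def check_snapshot_options(do_snapshot=None, persist_data_after_ingestion=None):
    """Resolve unset options to the documented defaults and reject the invalid combination."""
    # Assumption: an unset option behaves like the server default (True).
    effective_snapshot = True if do_snapshot is None else do_snapshot
    effective_persist = (
        True if persist_data_after_ingestion is None else persist_data_after_ingestion
    )
    if effective_snapshot and not effective_persist:
        raise ValueError(
            "persist_data_after_ingestion=False cannot be combined with do_snapshot=True"
        )
    return effective_snapshot, effective_persist


defaults = check_snapshot_options()
remote = check_snapshot_options(do_snapshot=False, persist_data_after_ingestion=False)
```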

class datarobot.DataSourceParameters

Data request configuration

Variables:

data_store_id (str) – the id of the DataStore.
table (str) – Optional. The name of specified database table.
schema (str) – Optional. The name of the schema associated with the table.
partition_column (str) – Optional. The name of the partition column.
query (str) – Optional. The user specified SQL query.
fetch_size (int) – Optional. A user-specified fetch size in the range [1, 20000]. By default, a fetch size is assigned to balance throughput and memory usage.
path (str) – Optional. The user-specified path for BLOB storage
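
The documented range for fetch_size can be checked locally before building the parameters. A minimal sketch, in which validate_fetch_size is a hypothetical helper and not part of the datarobot package:

```python
MIN_FETCH_SIZE = 1
MAX_FETCH_SIZE = 20000


def validate_fetch_size(fetch_size):
    """Return fetch_size if it is None or an int in [1, 20000]; raise otherwise."""
    if fetch_size is None:
        # Unset: a fetch size is assigned by default.
        return None
    if isinstance(fetch_size, bool) or not isinstance(fetch_size, int):
        raise TypeError("fetch_size must be an int")
    if not MIN_FETCH_SIZE <= fetch_size <= MAX_FETCH_SIZE:
        raise ValueError("fetch_size must be in the range [1, 20000]")
    return fetch_size


checked = validate_fetch_size(5000)
```

The validated value could then be passed as dr.DataSourceParameters(..., fetch_size=checked).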