Database Connectivity

class datarobot.DataDriver(id=None, creator=None, base_names=None, class_name=None, canonical_name=None, database_driver=None, type=None, version=None)

A data driver

Attributes:
id: str

the id of the driver.

class_name: str

the Java class name for the driver.

canonical_name: str

the user-friendly name of the driver.

creator: str

the id of the user who created the driver.

base_names: list of str

a list of the file name(s) of the jar files.

classmethod list(typ=None)

Returns list of available drivers.

Parameters:
typ: DataDriverListTypes

If specified, filters by specified driver type.

Returns:
drivers: list of DataDriver instances

contains a list of available drivers.

Return type:

List[DataDriver]

Examples

>>> import datarobot as dr
>>> drivers = dr.DataDriver.list()
>>> drivers
[DataDriver('mysql'), DataDriver('RedShift'), DataDriver('PostgreSQL')]
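
Because `get` takes a driver id rather than a name, a common follow-up is to pick a driver out of the `list()` result by its `canonical_name`. The following is a minimal sketch, assuming only that each returned object exposes the `canonical_name` attribute documented above. `find_driver_by_name` and `find_available_driver` are hypothetical helpers, not part of the datarobot package.

```python
from typing import Iterable, Optional


def find_driver_by_name(drivers: Iterable, canonical_name: str) -> Optional[object]:
    """Return the first driver whose canonical_name matches, ignoring case.

    Hypothetical helper: it relies only on the documented ``canonical_name``
    attribute and returns None when nothing matches.
    """
    wanted = canonical_name.casefold()
    for driver in drivers:
        name = getattr(driver, "canonical_name", None)
        if name is not None and name.casefold() == wanted:
            return driver
    return None


def find_available_driver(canonical_name: str):
    """Look the driver up on the server; needs a configured datarobot client."""
    import datarobot as dr  # imported lazily so the helper above stays standalone

    return find_driver_by_name(dr.DataDriver.list(), canonical_name)
```

With a configured client, `find_available_driver('PostgreSQL')` would return the matching `DataDriver`, or `None` if no driver with that name is registered.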
classmethod get(driver_id)

Gets the driver.

Parameters:
driver_id: str

the identifier of the driver.

Returns:
driver: DataDriver

the required driver.

Return type:

DataDriver

Examples

>>> import datarobot as dr
>>> driver = dr.DataDriver.get('5ad08a1889453d0001ea7c5c')
>>> driver
DataDriver('PostgreSQL')
classmethod create(class_name, canonical_name, files=None, typ=None, database_driver=None)

Creates the driver. Only available to admin users.

Parameters:
class_name: str

the Java class name for the driver. Specify None if typ is DataDriverTypes.DR_DATABASE_V1.

canonical_name: str

the user-friendly name of the driver.

files: list of str

a list of paths on the local file system to the jar file(s) for the driver.

typ: str

Optional. Specify the type of the driver. Defaults to DataDriverTypes.JDBC, may also be DataDriverTypes.DR_DATABASE_V1.

database_driver: str

Optional. Specify when typ is DataDriverTypes.DR_DATABASE_V1 to create a native database driver. See DrDatabaseV1Types enum for some of the types, but that list may not be exhaustive.

Returns:
driver: DataDriver

the created driver.

Raises:
ClientError

raised if the user has not been granted the Can manage JDBC database drivers feature.

Return type:

DataDriver

Examples

>>> import datarobot as dr
>>> driver = dr.DataDriver.create(
...     class_name='org.postgresql.Driver',
...     canonical_name='PostgreSQL',
...     files=['/tmp/postgresql-42.2.2.jar']
... )
>>> driver
DataDriver('PostgreSQL')
update(class_name=None, canonical_name=None)

Updates the driver. Only available to admin users.

Parameters:
class_name: str

the Java class name for the driver.

canonical_name: str

the user-friendly name of the driver.

Raises:
ClientError

raised if the user has not been granted the Can manage JDBC database drivers feature.

Return type:

None

Examples

>>> import datarobot as dr
>>> driver = dr.DataDriver.get('5ad08a1889453d0001ea7c5c')
>>> driver.canonical_name
'PostgreSQL'
>>> driver.update(canonical_name='postgres')
>>> driver.canonical_name
'postgres'
delete()

Removes the driver. Only available to admin users.

Raises:
ClientError

raised if the user has not been granted the Can manage JDBC database drivers feature.

Return type:

None

class datarobot.Connector(id=None, creator_id=None, configuration_id=None, base_name=None, canonical_name=None, connector_type=None)

A connector

Attributes:
id: str

the id of the connector.

creator_id: str

the id of the user who created the connector.

base_name: str

the file name of the jar file.

canonical_name: str

the user-friendly name of the connector.

configuration_id: str

the id of the configuration of the connector.

classmethod list()

Returns list of available connectors.

Returns:
connectors: list of Connector instances

contains a list of available connectors.

Return type:

List[Connector]

Examples

>>> import datarobot as dr
>>> connectors = dr.Connector.list()
>>> connectors
[Connector('ADLS Gen2 Connector'), Connector('S3 Connector')]
classmethod get(connector_id)

Gets the connector.

Parameters:
connector_id: str

the identifier of the connector.

Returns:
connector: Connector

the required connector.

Return type:

Connector

Examples

>>> import datarobot as dr
>>> connector = dr.Connector.get('5fe1063e1c075e0245071446')
>>> connector
Connector('ADLS Gen2 Connector')
classmethod create(file_path=None, connector_type=None)

Creates the connector from a jar file. Only available to admin users.

Parameters:
file_path: str

(Deprecated in version v3.6) the path on the local file system to the jar file of the Java-based connector.

connector_type: str

The type of the native connector to create.

Returns:
connector: Connector

the created connector.

Raises:
ClientError

raised if the user has not been granted the Can manage connectors feature.

Return type:

Connector

Examples

>>> import datarobot as dr
>>> connector = dr.Connector.create('/tmp/connector-adls-gen2.jar')
>>> connector
Connector('ADLS Gen2 Connector')
update(file_path)

Updates the connector with new jar file. Only available to admin users.

Parameters:
file_path: str

(Deprecated in version v3.6) the path on the local file system to the jar file of the Java-based connector.

Returns:
connector: Connector

the updated connector.

Raises:
ClientError

raised if the user has not been granted the Can manage connectors feature.

Return type:

Connector

Examples

>>> import datarobot as dr
>>> connector = dr.Connector.get('5fe1063e1c075e0245071446')
>>> connector.base_name
'connector-adls-gen2.jar'
>>> connector.update('/tmp/connector-s3.jar')
>>> connector.base_name
'connector-s3.jar'
delete()

Removes the connector. Only available to admin users.

Raises:
ClientError

raised if the user has not been granted the Can manage connectors feature.

Return type:

None

class datarobot.DataStore(data_store_id=None, data_store_type=None, canonical_name=None, creator=None, updated=None, params=None, role=None)

A data store. Represents a database.

Attributes:
id: str

The id of the data store.

data_store_type: str

The type of data store.

canonical_name: str

The user-friendly name of the data store.

creator: str

The id of the user who created the data store.

updated: datetime.datetime

The time of the last update.

params: DataStoreParameters

A list specifying data store parameters.

role: str

Your access role for this data store.

classmethod list(typ=None, name=None)

Returns list of available data stores.

Parameters:
typ: str

If specified, filters by specified data store type. If not specified, the default is DataStoreListTypes.JDBC.

name: str

If specified, filters by data store names that match or contain this name. The search is case-insensitive.

Returns:
data_stores: list of DataStore instances

contains a list of available data stores.

Return type:

List[DataStore]

Examples

>>> import datarobot as dr
>>> data_stores = dr.DataStore.list()
>>> data_stores
[DataStore('Demo'), DataStore('Airlines')]
classmethod get(data_store_id)

Gets the data store.

Parameters:
data_store_id: str

the identifier of the data store.

Returns:
data_store: DataStore

the required data store.

Return type:

DataStore

Examples

>>> import datarobot as dr
>>> data_store = dr.DataStore.get('5a8ac90b07a57a0001be501e')
>>> data_store
DataStore('Demo')
classmethod create(data_store_type, canonical_name, driver_id=None, jdbc_url=None, fields=None, connector_id=None)

Creates the data store.

Parameters:
data_store_type: str or DataStoreTypes

the type of data store.

canonical_name: str

the user-friendly name of the data store.

driver_id: str

Optional. The identifier of the DataDriver if data_store_type is DataStoreListTypes.JDBC or DataStoreListTypes.DR_DATABASE_V1.

jdbc_url: str

Optional. The full JDBC URL (for example: jdbc:postgresql://my.dbaddress.org:5432/my_db).

fields: list

Optional. If the type is dr-database-v1, then the fields specify the configuration.

connector_id: str

Optional. The identifier of the Connector if data_store_type is DataStoreListTypes.DR_CONNECTOR_V1.

Returns:
data_store: DataStore

the created data store.

Return type:

DataStore

Examples

>>> import datarobot as dr
>>> data_store = dr.DataStore.create(
...     data_store_type='jdbc',
...     canonical_name='Demo DB',
...     driver_id='5a6af02eb15372000117c040',
...     jdbc_url='jdbc:postgresql://my.db.address.org:5432/perftest'
... )
>>> data_store
DataStore('Demo DB')
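
The `jdbc_url` argument is a plain string, so it can be assembled from its parts before calling `create` or `update`. The following is a minimal sketch for the PostgreSQL form used in the examples. `postgresql_jdbc_url` is a hypothetical helper, not part of the datarobot package; other drivers use different URL schemes, and the sketch does no escaping of special characters.

```python
def postgresql_jdbc_url(host: str, port: int, database: str) -> str:
    """Assemble a PostgreSQL JDBC URL of the form jdbc:postgresql://host:port/db."""
    if not host or not database:
        raise ValueError("host and database must be non-empty")
    return f"jdbc:postgresql://{host}:{port}/{database}"
```

The result can be passed directly as `jdbc_url=postgresql_jdbc_url('my.dbaddress.org', 5432, 'my_db')`.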
update(canonical_name=None, driver_id=None, connector_id=None, jdbc_url=None, fields=None)

Updates the data store.

Parameters:
canonical_name: str

Optional. The user-friendly name of the data store.

driver_id: str

Optional. The identifier of the DataDriver, if the type is one of DataStoreTypes.DR_DATABASE_V1 or DataStoreTypes.JDBC.

connector_id: str

Optional. The identifier of the Connector, if the type is DataStoreTypes.DR_CONNECTOR_V1.

jdbc_url: str

Optional. The full JDBC URL (for example: jdbc:postgresql://my.dbaddress.org:5432/my_db).

fields: list

Optional. If the type is dr-database-v1, then the fields specify the configuration.

Return type:

None

Examples

>>> import datarobot as dr
>>> data_store = dr.DataStore.get('5ad5d2afef5cd700014d3cae')
>>> data_store
DataStore('Demo DB')
>>> data_store.update(canonical_name='Demo DB updated')
>>> data_store
DataStore('Demo DB updated')
delete()

Removes the DataStore

Return type:

None

test(username=None, password=None, credential_id=None, use_kerberos=None, credential_data=None)

Tests the database connection.

Return type:

TestResponse

Changed in version v3.2: Added credential_id, use_kerberos and credential_data optional params and made username and password optional.

Parameters:
username: str

optional, the username for database authentication.

password: str

optional, the password for database authentication. The password is encrypted on the server side and never saved or stored.

credential_id: str

optional, the id of the set of credentials to use instead of username and password.

use_kerberos: bool

optional, whether to use Kerberos for data store authentication.

credential_data: dict

optional, the credentials to authenticate with the database, to use instead of username/password or credential id.

Returns:
message: dict

message with status.

Examples

>>> import datarobot as dr
>>> data_store = dr.DataStore.get('5ad5d2afef5cd700014d3cae')
>>> data_store.test(username='db_username', password='db_password')
{'message': 'Connection successful'}
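
Since v3.2, `test` also accepts stored credentials in place of a username and password. The following is a minimal sketch of choosing between the two styles. `check_connection` is a hypothetical wrapper, not part of the datarobot package, and it only forwards the keyword arguments documented above.

```python
from typing import Any, Optional


def check_connection(
    data_store: Any,
    credential_id: Optional[str] = None,
    username: Optional[str] = None,
    password: Optional[str] = None,
):
    """Call data_store.test() with exactly one authentication style.

    Hypothetical wrapper: prefers a stored credential id and falls back to
    username/password, so the two are never sent together.
    """
    if credential_id is not None:
        return data_store.test(credential_id=credential_id)
    if username is not None and password is not None:
        return data_store.test(username=username, password=password)
    raise ValueError("Provide credential_id, or both username and password")
```

Given a data store obtained with `dr.DataStore.get(...)`, the call would be `check_connection(data_store, credential_id=...)` with the id of a credential set you have already stored.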
schemas(username, password)

Returns list of available schemas.

Parameters:
username: str

the username for database authentication.

password: str

the password for database authentication. The password is encrypted on the server side and never saved or stored.

Returns:
response: dict

dict with the catalog name and a list of str with the available schemas.

Return type:

SchemasResponse

Examples

>>> import datarobot as dr
>>> data_store = dr.DataStore.get('5ad5d2afef5cd700014d3cae')
>>> data_store.schemas(username='db_username', password='db_password')
{'catalog': 'perftest', 'schemas': ['demo', 'information_schema', 'public']}
tables(username, password, schema=None)

Returns list of available tables in schema.

Parameters:
username: str

the username for database authentication.

password: str

the password for database authentication. The password is encrypted on the server side and never saved or stored.

schema: str

optional, the schema name.

Returns:
response: dict

dict with the catalog name and tables info.

Return type:

TablesResponse

Examples

>>> import datarobot as dr
>>> data_store = dr.DataStore.get('5ad5d2afef5cd700014d3cae')
>>> data_store.tables(username='db_username', password='db_password', schema='demo')
{'tables': [{'type': 'TABLE', 'name': 'diagnosis', 'schema': 'demo'}, {'type': 'TABLE',
'name': 'kickcars', 'schema': 'demo'}, {'type': 'TABLE', 'name': 'patient',
'schema': 'demo'}, {'type': 'TABLE', 'name': 'transcript', 'schema': 'demo'}],
'catalog': 'perftest'}
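
The two calls combine naturally: `schemas` yields the schema names, and `tables` can then be called once per schema. The following is a minimal sketch that relies only on the response shapes shown in the examples above. `tables_by_schema` is a hypothetical helper, not part of the datarobot package, and it issues one request per schema.

```python
from typing import Any, Dict, List


def tables_by_schema(data_store: Any, username: str, password: str) -> Dict[str, List[str]]:
    """Map every schema name to the names of the tables it contains.

    Assumes the response shapes from the examples:
    schemas() -> {'catalog': ..., 'schemas': [...]}
    tables()  -> {'catalog': ..., 'tables': [{'name': ..., 'schema': ..., 'type': ...}]}
    """
    result: Dict[str, List[str]] = {}
    schema_names = data_store.schemas(username=username, password=password)["schemas"]
    for schema in schema_names:
        response = data_store.tables(username=username, password=password, schema=schema)
        result[schema] = [table["name"] for table in response["tables"]]
    return result
```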
classmethod from_server_data(data, keep_attrs=None)

Instantiate an object of this class using the data directly from the server, meaning that the keys may have the wrong camel casing

Parameters:
data: dict

The directly translated dict of JSON from the server. No casing fixes have taken place.

keep_attrs: iterable

List, set or tuple of the dotted namespace notations for attributes to keep within the object structure even if their values are None.

Return type:

DataStore

get_access_list()

Retrieve which users have access to this data store.

Return type:

List[SharingAccess]

Added in version v2.14.

Returns:
list of SharingAccess
get_shared_roles()

Retrieve which users have access to this data store.

Return type:

List[SharingRole]

Added in version v3.2.

Returns:
list of SharingRole
share(access_list)

Modify the ability of users to access this data store.

Return type:

None

Added in version v2.14.

Parameters:
access_list: list of SharingRole

the modifications to make.

Raises:
datarobot.ClientError

if you do not have permission to share this data store, if the user you’re sharing with doesn’t exist, if the same user appears multiple times in the access_list, or if these changes would leave the data store without an owner.

Examples

The SharingRole class is needed in order to share a Data Store with one or more users.

For example, suppose you had a list of user IDs you wanted to share this DataStore with. You could use a loop to generate a list of SharingRole objects for them, and bulk share this Data Store.

>>> import datarobot as dr
>>> from datarobot.models.sharing import SharingRole
>>> from datarobot.enums import SHARING_ROLE, SHARING_RECIPIENT_TYPE
>>>
>>> user_ids = ["60912e09fd1f04e832a575c1", "639ce542862e9b1b1bfa8f1b", "63e185e7cd3a5f8e190c6393"]
>>> sharing_roles = []
>>> for user_id in user_ids:
...     new_sharing_role = SharingRole(
...         role=SHARING_ROLE.CONSUMER,
...         share_recipient_type=SHARING_RECIPIENT_TYPE.USER,
...         id=user_id,
...         can_share=True,
...     )
...     sharing_roles.append(new_sharing_role)
>>> dr.DataStore.get('my-data-store-id').share(sharing_roles)

Similarly, a SharingRole instance can be used to remove a user’s access if the role is set to SHARING_ROLE.NO_ROLE, like in this example:

>>> import datarobot as dr
>>> from datarobot.models.sharing import SharingRole
>>> from datarobot.enums import SHARING_ROLE, SHARING_RECIPIENT_TYPE
>>>
>>> user_to_remove = "[email protected]"
>>> remove_sharing_role = SharingRole(
...     role=SHARING_ROLE.NO_ROLE,
...     share_recipient_type=SHARING_RECIPIENT_TYPE.USER,
...     username=user_to_remove,
...     can_share=False,
... )
>>> dr.DataStore.get('my-data-store-id').share([remove_sharing_role])
class datarobot.DataSource(data_source_id=None, data_source_type=None, canonical_name=None, creator=None, updated=None, params=None, role=None)

A data source. Represents a data request.

Attributes:
id: str

the id of the data source.

type: str

the type of data source.

canonical_name: str

the user-friendly name of the data source.

creator: str

the id of the user who created the data source.

updated: datetime.datetime

the time of the last update.

params: DataSourceParameters

a list specifying data source parameters.

role: str or None

if a string, represents a particular level of access and should be one of datarobot.enums.SHARING_ROLE. For more information on the specific access levels, see the sharing documentation. If None, can be passed to a share function to revoke access for a specific user.

classmethod list(typ=None)

Returns list of available data sources.

Parameters:
typ: DataStoreListTypes

If specified, filters by the specified data source type. If not specified, defaults to DataStoreListTypes.DATABASES.

Returns:
data_sources: list of DataSource instances

contains a list of available data sources.

Return type:

List[DataSource]

Examples

>>> import datarobot as dr
>>> data_sources = dr.DataSource.list()
>>> data_sources
[DataSource('Diagnostics'), DataSource('Airlines 100mb'), DataSource('Airlines 10mb')]
classmethod get(data_source_id)

Gets the data source.

Parameters:
data_source_id: str

the identifier of the data source.

Returns:
data_source: DataSource

the requested data source.

Return type:

TypeVar(TDataSource, bound= DataSource)

Examples

>>> import datarobot as dr
>>> data_source = dr.DataSource.get('5a8ac9ab07a57a0001be501f')
>>> data_source
DataSource('Diagnostics')
classmethod create(data_source_type, canonical_name, params)

Creates the data source.

Parameters:
data_source_type: str or DataStoreTypes

the type of data source.

canonical_name: str

the user-friendly name of the data source.

params: DataSourceParameters

a list specifying data source parameters.

Returns:
data_source: DataSource

the created data source.

Return type:

TypeVar(TDataSource, bound= DataSource)

Examples

>>> import datarobot as dr
>>> params = dr.DataSourceParameters(
...     data_store_id='5a8ac90b07a57a0001be501e',
...     query='SELECT * FROM airlines10mb WHERE "Year" >= 1995;'
... )
>>> data_source = dr.DataSource.create(
...     data_source_type='jdbc',
...     canonical_name='airlines stats after 1995',
...     params=params
... )
>>> data_source
DataSource('airlines stats after 1995')
update(canonical_name=None, params=None)

Updates the data source.

Parameters:
canonical_name: str

optional, the user-friendly name of the data source.

params: DataSourceParameters

optional, the data source parameters to update.

Return type:

None

Examples

>>> import datarobot as dr
>>> data_source = dr.DataSource.get('5ad840cc613b480001570953')
>>> data_source
DataSource('airlines stats after 1995')
>>> params = dr.DataSourceParameters(
...     query='SELECT * FROM airlines10mb WHERE "Year" >= 1990;'
... )
>>> data_source.update(
...     canonical_name='airlines stats after 1990',
...     params=params
... )
>>> data_source
DataSource('airlines stats after 1990')
delete()

Removes the DataSource

Return type:

None

classmethod from_server_data(data, keep_attrs=None)

Instantiate an object of this class using the data directly from the server, meaning that the keys may have the wrong camel casing

Parameters:
data: dict

The directly translated dict of JSON from the server. No casing fixes have taken place.

keep_attrs: iterable

List, set or tuple of the dotted namespace notations for attributes to keep within the object structure even if their values are None.

Return type:

TypeVar(TDataSource, bound= DataSource)

get_access_list()

Retrieve which users have access to this data source.

Return type:

List[SharingAccess]

Added in version v2.14.

Returns:
list of SharingAccess
share(access_list)

Modify the ability of users to access this data source.

Return type:

None

Added in version v2.14.

Parameters:
access_list: list of SharingAccess

The modifications to make.

Raises:
datarobot.ClientError

If you do not have permission to share this data source, if the user you’re sharing with doesn’t exist, if the same user appears multiple times in the access_list, or if these changes would leave the data source without an owner.

Examples

Transfer access to the data source from old_user@datarobot.com to new_user@datarobot.com

from datarobot.enums import SHARING_ROLE
from datarobot.models.data_source import DataSource
from datarobot.models.sharing import SharingAccess

new_access = SharingAccess(
    "new_user@datarobot.com",
    SHARING_ROLE.OWNER,
    can_share=True,
)
access_list = [
    SharingAccess("old_user@datarobot.com", SHARING_ROLE.OWNER, can_share=True),
    new_access,
]

DataSource.get('my-data-source-id').share(access_list)
create_dataset(username=None, password=None, do_snapshot=None, persist_data_after_ingestion=None, categories=None, credential_id=None, use_kerberos=None)

Create a Dataset from this data source.

Return type:

Dataset

Added in version v2.22.

Parameters:
username: string, optional

The username for database authentication.

password: string, optional

The password (in cleartext) for database authentication. The password will be encrypted on the server side in scope of HTTP request and never saved or stored.

do_snapshot: bool, optional

If unset, uses the server default: True. If true, creates a snapshot dataset; if false, creates a remote dataset. Creating snapshots from non-file sources requires an additional permission, Enable Create Snapshot Data Source.

persist_data_after_ingestion: bool, optional

If unset, uses the server default: True. If true, enforces saving all data (for download and sampling) and allows a user to view the extended data profile (which includes data statistics like min/max/median/mean, histogram, etc.). If false, does not enforce saving data; the data schema (feature names and types) will still be available. Setting this parameter to false and do_snapshot to true results in an error.

categories: list[string], optional

An array of strings describing the intended use of the dataset. The current supported options are “TRAINING” and “PREDICTION”.

credential_id: string, optional

The ID of the set of credentials to use instead of user and password. Note that with this change, username and password will become optional.

use_kerberos: bool, optional

If unset, uses the server default: False. If true, use kerberos authentication for database authentication.

Returns:
response: Dataset

The Dataset created from the uploaded data
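
The parameter descriptions above name one combination that fails: `persist_data_after_ingestion=False` together with `do_snapshot=True`. The following is a minimal sketch that catches this on the client before the request is sent. `build_create_dataset_kwargs` is a hypothetical helper, not part of the datarobot package; unset values are omitted so that the server defaults apply.

```python
from typing import Any, Dict, List, Optional


def build_create_dataset_kwargs(
    credential_id: Optional[str] = None,
    do_snapshot: Optional[bool] = None,
    persist_data_after_ingestion: Optional[bool] = None,
    categories: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments for DataSource.create_dataset().

    Hypothetical helper: rejects the combination the documentation describes
    as an error and drops every argument that was left unset.
    """
    if persist_data_after_ingestion is False and do_snapshot is True:
        raise ValueError(
            "persist_data_after_ingestion=False cannot be combined with do_snapshot=True"
        )
    candidates = {
        "credential_id": credential_id,
        "do_snapshot": do_snapshot,
        "persist_data_after_ingestion": persist_data_after_ingestion,
        "categories": categories,
    }
    # Keep explicit False values; only drop arguments that are None.
    return {key: value for key, value in candidates.items() if value is not None}
```

The result can be unpacked into the call, for example `data_source.create_dataset(**build_create_dataset_kwargs(categories=['TRAINING']))`.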

class datarobot.DataSourceParameters(data_store_id=None, catalog=None, table=None, schema=None, partition_column=None, query=None, fetch_size=None, path=None)

Data request configuration

Attributes:
data_store_id: str

the id of the DataStore.

table: str

Optional. The name of specified database table.

schema: str

Optional. The name of the schema associated with the table.

partition_column: str

Optional. The name of the partition column.

query: str

Optional. The user specified SQL query.

fetch_size: int

Optional. A user specified fetch size in the range [1, 20000]. By default a fetchSize will be assigned to balance throughput and memory usage.

path: str

Optional. The user-specified path for BLOB storage
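
The examples earlier in this section build parameters from a `query`; a table-based request uses `schema` and `table` instead. The following is a minimal sketch that also checks `fetch_size` against the documented range before the object is built. `validate_fetch_size` and `table_params` are hypothetical helpers, not part of the datarobot package, and whether the server accepts a given combination of fields is not verified here.

```python
def validate_fetch_size(fetch_size: int) -> int:
    """Return fetch_size unchanged if it lies in the documented range [1, 20000]."""
    if isinstance(fetch_size, bool) or not isinstance(fetch_size, int):
        raise TypeError("fetch_size must be an int")
    if not 1 <= fetch_size <= 20000:
        raise ValueError("fetch_size must be in the range [1, 20000]")
    return fetch_size


def table_params(data_store_id: str, schema: str, table: str, fetch_size: int):
    """Build parameters that read a whole table instead of running a query."""
    import datarobot as dr  # lazy import: only needed when the object is built

    return dr.DataSourceParameters(
        data_store_id=data_store_id,
        schema=schema,
        table=table,
        fetch_size=validate_fetch_size(fetch_size),
    )
```

The object returned by `table_params` can be passed as `params` to `DataSource.create`.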