Data exports

Use deployment data export to retrieve the data sent for predictions along with the associated predictions.

Prediction data export

Use the following commands to manage prediction data exports:

Create a prediction data export

To create a prediction data export, use PredictionDataExport.create, defining the time window to include in the export using the start and end parameters, as shown in the following example:

from datetime import datetime, timedelta
from datarobot.models.deployment import PredictionDataExport

now = datetime.now()

prediction_data_export = PredictionDataExport.create(
    deployment_id='5c939e08962d741e34f609f0', start=now - timedelta(days=7), end=now)

To export data for a specific model, specify the model ID. Otherwise, the champion model ID is used by default:

from datetime import datetime, timedelta
from datarobot.models.deployment import PredictionDataExport

now = datetime.now()

prediction_data_export = PredictionDataExport.create(
    deployment_id='5c939e08962d741e34f609f0',
    model_id='6444482e5583f6ee2e572265',
    start=now - timedelta(days=7),
    end=now
)

For deployments in batch mode, provide batch IDs to export prediction data for those batches:

from datetime import datetime, timedelta
from datarobot.models.deployment import PredictionDataExport

now = datetime.now()

prediction_data_export = PredictionDataExport.create(
    deployment_id='5c939e08962d741e34f609f0',
    model_id='6444482e5583f6ee2e572265',
    start=now - timedelta(days=7),
    end=now,
    batch_ids=['6572db2c9f9d4ad3b9de33d0', '6572db2c9f9d4ad3b9de33d0']
)

The start and end of the export can be defined as a datetime or string type.
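As a minimal sketch of building that time window, the helper below returns the last N days either as datetime objects or as strings. The helper name export_window is hypothetical, and ISO 8601 is only an assumed string format; the accepted string format should be confirmed against the API reference. The sketch uses timezone-aware UTC values so the window does not depend on the local time zone of the machine running the code.

```python
from datetime import datetime, timedelta, timezone


def export_window(days, as_string=False, now=None):
    """Return (start, end) covering the last `days` days, in UTC.

    `export_window` is a hypothetical helper, not part of the datarobot package.
    """
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(days=days)
    if as_string:
        # Assumption: ISO 8601 strings are accepted for start and end.
        return start.isoformat(), end.isoformat()
    return start, end


# datetime objects for the last 7 days
start, end = export_window(7)

# the same kind of window expressed as strings
start_str, end_str = export_window(7, as_string=True)
```

Either pair could then be passed as the start and end arguments of PredictionDataExport.create.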

List prediction data exports

To list prediction data exports, use PredictionDataExport.list, as in the following example:

from datarobot.models.deployment import PredictionDataExport

prediction_data_exports = PredictionDataExport.list(deployment_id='5c939e08962d741e34f609f0', limit=0)

prediction_data_exports
>>> [PredictionDataExport('65fbe59aaa3f847bd5acc75b'),
     PredictionDataExport('65fbe59aaa3f847bd5acc75c'),
     PredictionDataExport('65fbe59aaa3f847bd5acc75a')]

To list all prediction data exports, set the limit to 0.

Adjust additional parameters to filter the data as needed:

from datarobot.enums import ExportStatus
from datarobot.models.deployment import PredictionDataExport

prediction_data_exports = PredictionDataExport.list(deployment_id='5c939e08962d741e34f609f0', limit=100, offset=100)

# use additional filters
prediction_data_exports = PredictionDataExport.list(
    deployment_id='5c939e08962d741e34f609f0',
    model_id="6444482e5583f6ee2e572265",
    batch=False,
    status=ExportStatus.FAILED
)
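When the number of exports is large, the limit and offset parameters can be combined to walk through the results page by page instead of requesting everything at once. The following is a sketch only: iter_pages is a hypothetical helper, and fake_list is a stand-in used here so the example runs without a DataRobot connection.

```python
def iter_pages(list_fn, page_size=100, **filters):
    """Yield items from a limit/offset style listing call, one page at a time.

    `iter_pages` is a hypothetical helper, not part of the datarobot package.
    """
    offset = 0
    while True:
        page = list_fn(limit=page_size, offset=offset, **filters)
        if not page:
            break
        yield from page
        if len(page) < page_size:
            # A short page means there is nothing left to request.
            break
        offset += page_size


def fake_list(deployment_id, limit, offset):
    # Stand-in for a real listing call; returns plain strings, not export objects.
    records = [f"export-{i}" for i in range(250)]
    return records[offset:offset + limit]


all_exports = list(
    iter_pages(fake_list, page_size=100, deployment_id='5c939e08962d741e34f609f0')
)
print(len(all_exports))  # 250
```

With the real client, PredictionDataExport.list would presumably take the place of fake_list, since it accepts the same deployment_id, limit, and offset keyword arguments shown above.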

Retrieve a prediction data export

To get a prediction data export by identifier, use PredictionDataExport.get, as in the following example:

from datarobot.models.deployment import PredictionDataExport

prediction_data_export = PredictionDataExport.get(
    deployment_id='5c939e08962d741e34f609f0', export_id='65fbe59aaa3f847bd5acc75b'
    )

prediction_data_export
>>> PredictionDataExport('65fbe59aaa3f847bd5acc75b')

Fetch prediction export datasets

To return data from a prediction export as dr.Dataset, use the fetch_data method, as in the following example:

from datarobot.models.deployment import PredictionDataExport

prediction_data_export = PredictionDataExport.get(
    deployment_id='5c939e08962d741e34f609f0', export_id='65fbe59aaa3f847bd5acc75b'
    )
prediction_datasets = prediction_data_export.fetch_data()

prediction_datasets
>>> [Dataset(name='Deployment prediction data', id='65f240b0e37a9f1a104bf450')]

prediction_dataset = prediction_datasets[0]

df = prediction_dataset.get_as_dataframe()
df.head(2)
>>>    DR_RESERVED_PREDICTION_TIMESTAMP  ...    upstream_x_datarobot_version
    0  2024-03-13 23:00:38.998000+00:00  ...               predictionapi/X/X
    1  2024-03-13 23:00:38.998000+00:00  ...               predictionapi/X/X

This method returns a list of datasets. The list usually contains a single dataset, but in some cases, such as time series, it contains more than one. Each returned dataset can then be converted, for example, into a pandas DataFrame.
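Because more than one dataset may come back, code that reads only the first element can silently skip data. The sketch below converts every returned dataset. The helper frames_from_export is hypothetical, and the stub classes exist only so the example runs without a DataRobot connection; they mimic the fetch_data and get_as_dataframe calls shown above.

```python
def frames_from_export(export):
    """Convert every dataset returned by fetch_data() via get_as_dataframe().

    `frames_from_export` is a hypothetical helper, not part of the datarobot package.
    """
    result = export.fetch_data()
    # Accept either a list of datasets or a single dataset object.
    datasets = result if isinstance(result, (list, tuple)) else [result]
    return [dataset.get_as_dataframe() for dataset in datasets]


class StubDataset:
    """Stand-in for dr.Dataset; returns plain rows instead of a DataFrame."""

    def __init__(self, rows):
        self._rows = rows

    def get_as_dataframe(self):
        return self._rows


class StubExport:
    """Stand-in for an export object that exposes fetch_data()."""

    def __init__(self, result):
        self._result = result

    def fetch_data(self):
        return self._result


export = StubExport([StubDataset([{"row": 1}]), StubDataset([{"row": 2}])])
frames = frames_from_export(export)
print(len(frames))  # 2
```

With real export objects, each element of frames would be a pandas DataFrame, and the frames could be combined with pandas.concat if they share the same columns.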

Actuals data export

Use the following commands to manage actuals data exports:

Create an actuals data export

To create an actuals data export, use ActualsDataExport.create, defining the time window to include in the export using the start and end parameters, as shown in the following example:

from datetime import datetime, timedelta
from datarobot.models.deployment import ActualsDataExport

now = datetime.now()
actuals_data_export = ActualsDataExport.create(
    deployment_id='5c939e08962d741e34f609f0', start=now - timedelta(days=7), end=now
    )

To export data for a specific model, specify the model ID. Otherwise, the champion model ID is used by default:

from datetime import datetime, timedelta
from datarobot.models.deployment import ActualsDataExport

now = datetime.now()
actuals_data_export = ActualsDataExport.create(
    deployment_id='5c939e08962d741e34f609f0',
    model_id="6444482e5583f6ee2e572265",
    start=now - timedelta(days=7),
    end=now,
    )

To export only actuals that are matched to predictions, set only_matched_predictions to True. By default, all available actuals are exported:

from datetime import datetime, timedelta
from datarobot.models.deployment import ActualsDataExport

now = datetime.now()
actuals_data_export = ActualsDataExport.create(
    deployment_id='5c939e08962d741e34f609f0',
    only_matched_predictions=True,
    start=now - timedelta(days=7),
    end=now,
    )

The start and end of the export can be defined as a datetime or string type.

List actuals data exports

To list actuals data exports, use ActualsDataExport.list, as in the following example:

from datarobot.models.deployment import ActualsDataExport

actuals_data_exports = ActualsDataExport.list(deployment_id='5c939e08962d741e34f609f0', limit=0)

actuals_data_exports
>>> [ActualsDataExport('660456a332d0081029ee5031'),
     ActualsDataExport('660456a332d0081029ee5032'),
     ActualsDataExport('660456a332d0081029ee5033')]

To list all actuals data exports, set the limit to 0.

Adjust additional parameters to filter the data as needed:

from datarobot.enums import ExportStatus
from datarobot.models.deployment import ActualsDataExport

# use additional filters
actuals_data_exports = ActualsDataExport.list(
    deployment_id='5c939e08962d741e34f609f0',
    offset=500,
    limit=50,
    status=ExportStatus.SUCCEEDED
)

Retrieve an actuals data export

To get an actuals data export by identifier, use ActualsDataExport.get, as in the following example:

from datarobot.models.deployment import ActualsDataExport

actuals_data_export = ActualsDataExport.get(
    deployment_id='5c939e08962d741e34f609f0', export_id='660456a332d0081029ee4031'
    )

actuals_data_export
>>> ActualsDataExport('660456a332d0081029ee4031')

Fetch actuals export datasets

To return data from an actuals export as dr.Dataset, use the fetch_data method, as in the following example:

from datarobot.models.deployment import ActualsDataExport

actuals_data_export = ActualsDataExport.get(
    deployment_id='5c939e08962d741e34f609f0', export_id='660456a332d0081029ee4031'
    )
actuals_datasets = actuals_data_export.fetch_data()

actuals_datasets
>>> [Dataset(name='Deployment prediction data', id='65f240b0e37a9f1a104bf450')]

actuals_dataset = actuals_datasets[0]

df = actuals_dataset.get_as_dataframe()
df.head(2)
>>>    association_id                  timestamp  actuals  predictions
    0               1  2024-03-20 15:00:00+00:00     21.0    18.125388
    1              10  2024-03-20 15:00:00+00:00     12.0    22.805252

This method may return a list of datasets; however, it usually returns one dataset. The obtained dataset (or datasets) can be transformed into, for example, a pandas DataFrame.

Training data export

Use the following commands to manage training data exports:

Create a training data export

To create a training data export, use TrainingDataExport.create and define the deployment ID, as shown in the following example:

from datarobot.models.deployment import TrainingDataExport

dataset_id = TrainingDataExport.create(deployment_id='5c939e08962d741e34f609f0')

To export data for a specific model, specify the model ID. Otherwise, the champion model ID is used by default:

from datarobot.models.deployment import TrainingDataExport

dataset_id = TrainingDataExport.create(
    deployment_id='5c939e08962d741e34f609f0', model_id='6444482e5583f6ee2e572265')

dataset_id
>>> 65fb0c25019ca3333bbb4c10

This method returns the ID of the dataset that contains the training data. This dataset is saved in the AI Catalog.

List training data exports

To list training data exports, use TrainingDataExport.list, as in the following example:

from datarobot.models.deployment import TrainingDataExport

training_data_exports = TrainingDataExport.list(deployment_id='5c939e08962d741e34f609f0')

training_data_exports
>>> [TrainingDataExport('65fbf2356124f1daa3acc522')]

Retrieve a training data export

To get a training data export by identifier, use TrainingDataExport.get, as in the following example:

from datarobot.models.deployment import TrainingDataExport

training_data_export = TrainingDataExport.get(
    deployment_id='5c939e08962d741e34f609f0', export_id='65fbf2356124f1daa3acc522'
    )

training_data_export
>>> TrainingDataExport('65fbf2356124f1daa3acc522')

Fetch training export dataset

To return data from the training export as dr.Dataset, use fetch_data, as in the following example:

from datarobot.models.deployment import TrainingDataExport

training_data_export = TrainingDataExport.get(
    deployment_id='5c939e08962d741e34f609f0', export_id='660456a332d0081029ee4031'
    )
training_dataset = training_data_export.fetch_data()

training_dataset
>>> Dataset(name='training-data-10k_diabetes.csv', id='65fb0c25019ca3333bbb4c10')

df = training_dataset.get_as_dataframe()
df.head(2)
>>> acetohexamide  time_in_hospital  ... number_outpatient payer_code
  0            No                 1  ...                 0         YY
  1            No                 2  ...                 0         XX

This method returns a single training dataset. The obtained dataset can be transformed into, for example, a pandas DataFrame.