Data exports¶
Use deployment data export to retrieve the data sent for predictions along with the associated predictions.
Prediction data export¶
Use the following commands to manage prediction data exports:
Create a prediction data export¶
To create a prediction data export, use PredictionDataExport.create, defining the time window to include in the export using the start and end parameters, as shown in the following example:
from datetime import datetime, timedelta
from datarobot.models.deployment import PredictionDataExport
now=datetime.now()
prediction_data_export = PredictionDataExport.create(
deployment_id='5c939e08962d741e34f609f0', start=now - timedelta(days=7), end=now)
Specify the model ID for the export; otherwise, the champion model ID is used by default:
from datetime import datetime, timedelta
from datarobot.models.deployment import PredictionDataExport
now=datetime.now()
prediction_data_export = PredictionDataExport.create(
deployment_id='5c939e08962d741e34f609f0',
model_id='6444482e5583f6ee2e572265',
start=now - timedelta(days=7),
end=now
)
For deployments in batch mode, provide batch IDs to export prediction data for those batches:
from datetime import datetime, timedelta
from datarobot.models.deployment import PredictionDataExport
now=datetime.now()
prediction_data_export = PredictionDataExport.create(
deployment_id='5c939e08962d741e34f609f0',
model_id='6444482e5583f6ee2e572265',
start=now - timedelta(days=7),
end=now,
batch_ids=['6572db2c9f9d4ad3b9de33d0', '6572db2c9f9d4ad3b9de33d0']
)
The start and end parameters accept either a datetime object or a string.
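As a minimal sketch of preparing the time window, the helpers below build a start/end pair in either form. The names export_window and as_iso_strings are hypothetical and not part of the DataRobot client. The string format the API accepts is not stated here, so ISO 8601 is an assumption, as is the choice of timezone-aware UTC datetimes.

```python
from datetime import datetime, timedelta, timezone


def export_window(days, end=None):
    """Return (start, end) datetimes covering the `days` days before `end`.

    Hypothetical helper. If `end` is omitted, the current UTC time is used.
    """
    if end is None:
        end = datetime.now(timezone.utc)
    start = end - timedelta(days=days)
    return start, end


def as_iso_strings(start, end):
    """Format both bounds as ISO 8601 strings (assumed, not confirmed, format)."""
    return start.isoformat(), end.isoformat()
```

Either the datetime pair or the string pair could then be passed as the start and end arguments of PredictionDataExport.create.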
List prediction data exports¶
To list prediction data exports, use PredictionDataExport.list, as in the following example:
from datarobot.models.deployment import PredictionDataExport
prediction_data_exports = PredictionDataExport.list(deployment_id='5c939e08962d741e34f609f0', limit=0)
prediction_data_exports
>>> [PredictionDataExport('65fbe59aaa3f847bd5acc75b'),
PredictionDataExport('65fbe59aaa3f847bd5acc75c'),
PredictionDataExport('65fbe59aaa3f847bd5acc75a')]
To list all prediction data exports, set the limit to 0.
Adjust additional parameters to filter the data as needed:
from datarobot.enums import ExportStatus
from datarobot.models.deployment import PredictionDataExport
prediction_data_exports = PredictionDataExport.list(deployment_id='5c939e08962d741e34f609f0', limit=100, offset=100)
# use additional filters
prediction_data_exports = PredictionDataExport.list(
deployment_id='5c939e08962d741e34f609f0',
model_id="6444482e5583f6ee2e572265",
batch=False,
status=ExportStatus.FAILED
)
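When exports should be processed page by page rather than fetched all at once with limit=0, the limit and offset parameters can be combined in a loop. The sketch below is one way to do that; iter_all_exports is a hypothetical helper, and it assumes only that the callable it receives accepts limit and offset keyword arguments and returns a list, as PredictionDataExport.list does in the examples above.

```python
def iter_all_exports(list_page, page_size=100):
    """Yield items from a paginated list call, one page at a time.

    `list_page` is any callable accepting `limit` and `offset` keyword
    arguments and returning a list (hypothetical helper, not a client API).
    """
    offset = 0
    while True:
        page = list_page(limit=page_size, offset=offset)
        if not page:
            return
        yield from page
        # A short page means there is nothing left to request.
        if len(page) < page_size:
            return
        offset += page_size
```

To use it with the client, bind the fixed arguments first, for example with functools.partial(PredictionDataExport.list, deployment_id='5c939e08962d741e34f609f0'), and pass the result as list_page.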
Retrieve a prediction data export¶
To get a prediction data export by identifier, use PredictionDataExport.get, as in the following example:
from datarobot.models.deployment import PredictionDataExport
prediction_data_export = PredictionDataExport.get(
deployment_id='5c939e08962d741e34f609f0', export_id='65fbe59aaa3f847bd5acc75b'
)
prediction_data_export
>>> PredictionDataExport('65fbe59aaa3f847bd5acc75b')
Fetch prediction export datasets¶
To return data from a prediction export as dr.Dataset objects, use the fetch_data method, as in the following example:
from datarobot.models.deployment import PredictionDataExport
prediction_data_export = PredictionDataExport.get(
deployment_id='5c939e08962d741e34f609f0', export_id='65fbe59aaa3f847bd5acc75b'
)
prediction_datasets = prediction_data_export.fetch_data()
prediction_datasets
>>> [Dataset(name='Deployment prediction data', id='65f240b0e37a9f1a104bf450')]
prediction_dataset = prediction_datasets[0]
df = prediction_dataset.get_as_dataframe()
df.head(2)
>>> DR_RESERVED_PREDICTION_TIMESTAMP ... upstream_x_datarobot_version
0 2024-03-13 23:00:38.998000+00:00 ... predictionapi/X/X
1 2024-03-13 23:00:38.998000+00:00 ... predictionapi/X/X
This method returns a list of datasets. The list usually contains a single dataset, but in some cases, such as time series, it contains more than one. Each dataset can then be converted into, for example, a pandas DataFrame.
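Because the result is a list, indexing with [0] silently ignores any additional datasets. A minimal sketch that keeps all of them is shown below; frames_by_dataset_id is a hypothetical helper, and it assumes only that each item exposes an id attribute and a get_as_dataframe method, as the datasets in the example above do.

```python
def frames_by_dataset_id(datasets):
    """Map each dataset's ID to its contents as a DataFrame.

    Hypothetical helper: each item must expose `id` and `get_as_dataframe()`.
    """
    return {dataset.id: dataset.get_as_dataframe() for dataset in datasets}
```

Calling frames_by_dataset_id(prediction_data_export.fetch_data()) would then give one DataFrame per returned dataset, keyed by dataset ID.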
Actuals data export¶
Use the following commands to manage actuals data exports:
Create an actuals data export¶
To create an actuals data export, use ActualsDataExport.create, defining the time window to include in the export using the start and end parameters, as shown in the following example:
from datetime import datetime, timedelta
from datarobot.models.deployment import ActualsDataExport
now=datetime.now()
actuals_data_export = ActualsDataExport.create(
deployment_id='5c939e08962d741e34f609f0', start=now - timedelta(days=7), end=now
)
Specify the model ID for the export; otherwise, the champion model ID is used by default:
from datetime import datetime, timedelta
from datarobot.models.deployment import ActualsDataExport
now=datetime.now()
actuals_data_export = ActualsDataExport.create(
deployment_id='5c939e08962d741e34f609f0',
model_id="6444482e5583f6ee2e572265",
start=now - timedelta(days=7),
end=now,
)
To export only actuals that are matched to predictions, set only_matched_predictions to True. By default, all available actuals are exported:
from datetime import datetime, timedelta
from datarobot.models.deployment import ActualsDataExport
now=datetime.now()
actuals_data_export = ActualsDataExport.create(
deployment_id='5c939e08962d741e34f609f0',
only_matched_predictions=True,
start=now - timedelta(days=7),
end=now,
)
The start and end parameters accept either a datetime object or a string.
List actuals data exports¶
To list actuals data exports, use ActualsDataExport.list, as in the following example:
from datarobot.models.deployment import ActualsDataExport
actuals_data_exports = ActualsDataExport.list(deployment_id='5c939e08962d741e34f609f0', limit=0)
actuals_data_exports
>>> [ActualsDataExport('660456a332d0081029ee5031'),
ActualsDataExport('660456a332d0081029ee5032'),
ActualsDataExport('660456a332d0081029ee5033')]
To list all actuals data exports, set the limit to 0.
Adjust additional parameters to filter the data as needed:
from datarobot.enums import ExportStatus
from datarobot.models.deployment import ActualsDataExport
# use additional filters
actuals_data_exports = ActualsDataExport.list(
deployment_id='5c939e08962d741e34f609f0',
offset=500,
limit=50,
status=ExportStatus.SUCCEEDED
)
Retrieve an actuals data export¶
To get an actuals data export by identifier, use ActualsDataExport.get, as in the following example:
from datarobot.models.deployment import ActualsDataExport
actuals_data_export = ActualsDataExport.get(
deployment_id='5c939e08962d741e34f609f0', export_id='660456a332d0081029ee4031'
)
actuals_data_export
>>> ActualsDataExport('660456a332d0081029ee4031')
Fetch actuals export datasets¶
To return data from an actuals export as dr.Dataset objects, use the fetch_data method, as in the following example:
from datarobot.models.deployment import ActualsDataExport
actuals_data_export = ActualsDataExport.get(
deployment_id='5c939e08962d741e34f609f0', export_id='660456a332d0081029ee4031'
)
actuals_datasets = actuals_data_export.fetch_data()
actuals_datasets
>>> [Dataset(name='Deployment prediction data', id='65f240b0e37a9f1a104bf450')]
actuals_dataset = actuals_datasets[0]
df = actuals_dataset.get_as_dataframe()
df.head(2)
>>> association_id timestamp actuals predictions
0 1 2024-03-20 15:00:00+00:00 21.0 18.125388
1 10 2024-03-20 15:00:00+00:00 12.0 22.805252
This method returns a list of datasets, which usually contains a single dataset. Each dataset can then be converted into, for example, a pandas DataFrame.
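Once the export is available as rows, the actuals and predictions columns shown above can be compared directly. The sketch below computes a mean absolute error over such rows; mean_absolute_error is a hypothetical helper, it assumes numeric (regression-style) values, and skipping rows with a missing value is a defensive choice rather than documented export behavior.

```python
import math


def mean_absolute_error(rows):
    """Average |actuals - predictions| over rows where both values are present.

    `rows` is an iterable of dicts with "actuals" and "predictions" keys
    (hypothetical helper). Returns None if no row has both values.
    """
    errors = []
    for row in rows:
        actual = row.get("actuals")
        predicted = row.get("predictions")
        if actual is None or predicted is None:
            continue
        if math.isnan(actual) or math.isnan(predicted):
            continue
        errors.append(abs(actual - predicted))
    if not errors:
        return None
    return sum(errors) / len(errors)
```

With a pandas DataFrame, the rows can be obtained with df.to_dict("records").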
Training data export¶
Use the following commands to manage training data exports:
Create a training data export¶
To create a training data export, use TrainingDataExport.create and define the deployment ID, as shown in the following example:
from datarobot.models.deployment import TrainingDataExport
dataset_id = TrainingDataExport.create(deployment_id='5c939e08962d741e34f609f0')
Specify the model ID for the export; otherwise, the champion model ID is used by default:
from datarobot.models.deployment import TrainingDataExport
dataset_id = TrainingDataExport.create(
deployment_id='5c939e08962d741e34f609f0', model_id='6444482e5583f6ee2e572265')
dataset_id
>>> 65fb0c25019ca3333bbb4c10
This method returns the ID of the dataset that contains the training data. This dataset is saved in the AI Catalog.
List training data exports¶
To list training data exports, use TrainingDataExport.list, as in the following example:
from datarobot.models.deployment import TrainingDataExport
training_data_exports = TrainingDataExport.list(deployment_id='5c939e08962d741e34f609f0')
training_data_exports
>>> [TrainingDataExport('65fbf2356124f1daa3acc522')]
Retrieve a training data export¶
To get a training data export by identifier, use TrainingDataExport.get, as in the following example:
from datarobot.models.deployment import TrainingDataExport
training_data_export = TrainingDataExport.get(
deployment_id='5c939e08962d741e34f609f0', export_id='65fbf2356124f1daa3acc522'
)
training_data_export
>>> TrainingDataExport('65fbf2356124f1daa3acc522')
Fetch training export dataset¶
To return data from the training export as dr.Dataset, use fetch_data, as in the following example:
from datarobot.models.deployment import TrainingDataExport
training_data_export = TrainingDataExport.get(
deployment_id='5c939e08962d741e34f609f0', export_id='660456a332d0081029ee4031'
)
training_dataset = training_data_export.fetch_data()
training_dataset
>>> Dataset(name='training-data-10k_diabetes.csv', id='65fb0c25019ca3333bbb4c10')
df = training_dataset.get_as_dataframe()
df.head(2)
>>> acetohexamide time_in_hospital ... number_outpatient payer_code
0 No 1 ... 0 YY
1 No 2 ... 0 XX
This method returns a single training dataset. The obtained dataset can be transformed into, for example, a pandas DataFrame.