PredictJob API

datarobot.models.predict_job.wait_for_async_predictions(project_id, predict_job_id, max_wait=600)

Given a Project id and PredictJob id poll for status of process responsible for predictions generation until it’s finished

Parameters:

project_id : str

The identifier of the project

predict_job_id : str

The identifier of the PredictJob

max_wait : int, optional

Time in seconds after which predictions creation is considered unsuccessful

Returns:

predictions : pandas.DataFrame

Generated predictions.

Raises:

AsyncPredictionsGenerationError

Raised if status of fetched PredictJob object is error

AsyncTimeoutError

Predictions weren’t generated in time, specified by max_wait parameter

class datarobot.models.PredictJob(data, completed_resource_url=None)

Tracks asynchronous work being done within a project

Attributes

id (int) the id of the job
project_id (str) the id of the project the job belongs to
status (str) the status of the job - will be one of datarobot.enums.QUEUE_STATUS
job_type (str) what kind of work the job is doing - will be ‘predict’ for predict jobs
message (str) a message about the state of the job, typically explaining why an error occurred
classmethod from_job(job)

Transforms a generic Job into a PredictJob

Parameters:

job: Job

A generic job representing a PredictJob

Returns:

predict_job: PredictJob

A fully populated PredictJob with all the details of the job

Raises:

ValueError:

If the generic Job was not a predict job, e.g. job_type != JOB_TYPE.PREDICT

classmethod create(*args, **kwargs)

Note

Deprecated in v2.3 in favor of Project.upload_dataset and Model.request_predictions. That workflow allows you to reuse the same dataset for predictions from multiple models within one project.

Starts predictions generation for provided data using previously created model.

Parameters:

model : Model

Model to use for predictions generation

sourcedata : str, file or pandas.DataFrame

Data to be used for predictions. If this parameter is a str, it can be either a path to a local file or raw file content. If using a file on disk, the filename must consist of ASCII characters only. The file must be a CSV, and cannot be compressed

Returns:

predict_job_id : str

id of created job, can be used as parameter to PredictJob.get or PredictJob.get_predictions methods or wait_for_async_predictions function

Raises:

InputNotUnderstoodError

If the parameter for sourcedata didn’t resolve into known data types

Examples

model = Model.get('p-id', 'l-id')
predict_job = PredictJob.create(model, './data_to_predict.csv')
classmethod get(project_id, predict_job_id)

Fetches one PredictJob. If the job finished, raises PendingJobFinished exception.

Parameters:

project_id : str

The identifier of the project the model on which prediction was started belongs to

predict_job_id : str

The identifier of the predict_job

Returns:

predict_job : PredictJob

The pending PredictJob

Raises:

PendingJobFinished

If the job being queried already finished, and the server is re-routing to the finished predictions.

AsyncFailureError

Querying this resource gave a status code other than 200 or 303

classmethod get_predictions(project_id, predict_job_id, class_prefix='class_')

Fetches finished predictions from the job used to generate them.

Note

The prediction API for classifications now returns an additional prediction_values dictionary that is converted into a series of class_prefixed columns in the final dataframe. For example, <label> = 1.0 is converted to ‘class_1.0’. If you are on an older version of the client (prior to v2.8), you must update to v2.8 to correctly pivot this data.

Parameters:

project_id : str

The identifier of the project to which belongs the model used for predictions generation

predict_job_id : str

The identifier of the predict_job

class_prefix : str

The prefix to append to labels in the final dataframe (e.g., apple -> class_apple)

Returns:

predictions : pandas.DataFrame

Generated predictions

Raises:

JobNotFinished

If the job has not finished yet

AsyncFailureError

Querying the predict_job in question gave a status code other than 200 or 303

cancel()

Cancel this job. If this job has not finished running, it will be removed and canceled.

get_result()
Returns:

result : object

Return type depends on the job type:
  • for model jobs, a Model is returned
  • for predict jobs, a pandas.DataFrame (with predictions) is returned
  • for featureImpact jobs, a list of dicts (whose keys are featureName and impact)
  • for primeRulesets jobs, a list of Rulesets
  • for primeModel jobs, a PrimeModel
  • for primeDownloadValidation jobs, a PrimeFile
  • for reasonCodesInitialization jobs, a ReasonCodesInitialization
  • for reasonCodes jobs, a ReasonCodes
Raises:

JobNotFinished

If the job is not finished, the result is not available.

AsyncProcessUnsuccessfulError

If the job errored or was aborted

get_result_when_complete(max_wait=600)
Parameters:

max_wait : int, optional

How long to wait for the job to finish.

Returns:

result: object

Return type is the same as would be returned by Job.get_result.

Raises:

AsyncTimeoutError

If the job does not finish in time

AsyncProcessUnsuccessfulError

If the job errored or was aborted

refresh()

Update this object with the latest job data from the server.

wait_for_completion(max_wait=600)

Waits for job to complete.

Parameters:

max_wait : int, optional

How long to wait for the job to finish.