Compliance Documentation¶
Compliance Documentation allows users to automatically generate and download documentation to assist with deploying models in highly regulated industries. In most cases, Compliance Documentation is not available for Managed AI Cloud users. Interested users should contact their CFDS or DataRobot Support for additional information.
Generate and Download¶
Using the ComplianceDocumentation class, users can generate and download documentation as a DOCX file.
import datarobot as dr
project = dr.Project.get('5c881d7b79bffe6efc2e16f8')
model = project.get_models()[0]
# Using the default template
doc = dr.ComplianceDocumentation(project.id, model.id)
# Start a job to generate documentation
job = doc.generate()
# Once the job is complete, download as a DOCX
job.wait_for_completion()
doc.download('/path/to/save')
If no template_id is specified, DataRobot will generate compliance documentation using a default template. To create a custom template, see below:
Compliance Documentation Template¶
Using the ComplianceDocTemplate class, users can define their own templates so that generated documents match their organization's guidelines and requirements.
- Templates are created from a list of sections, which are structured as follows:
  - contentId: The identifier of the content in this section
  - sections: A list of sub-section dicts nested under the parent section
  - title: The title of the section
  - type: The type of section; must be one of datarobot, user, or table_of_contents
- Sections of type user are for custom content and support two additional fields:
  - regularText: regular text of the section, optionally separated by \n to split paragraphs
  - highlightedText: highlighted text of the section, optionally separated by \n to split paragraphs
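Putting those fields together, a minimal sketch of a section with one nested sub-section might look like this (the titles and the contentId value are hypothetical, not real DataRobot identifiers; real identifiers can be found in the default template):

```python
# Illustrative only: a 'user' section with one nested sub-section.
# The contentId below is a hypothetical placeholder.
section = {
    'title': 'Model Overview',
    'type': 'user',
    'regularText': 'High-level description of the model.\nSecond paragraph.',
    'highlightedText': '',
    'sections': [  # sub-sections nested under this parent
        {
            'title': 'Blueprint',
            'type': 'datarobot',
            'contentId': 'blueprint_diagram',  # hypothetical identifier
        }
    ],
}
```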
Within the above fields, users can embed DataRobot-generated content using tags. Each tag looks like {{ keyword }} and, on generation, is replaced with the corresponding content. Some tags also accept parameters, allowing tweakable options found in the UI to be used in templates: place a | after the keyword, in the format {{ keyword | parameter=value }}. The currently supported tags are listed in the table below:
| Tag | Type | Parameters | Content | Web Application UI Analog |
| --- | --- | --- | --- | --- |
| {{ blueprint_diagram }} | Image | | Graphical representation of the modeling pipeline. | Leaderboard >> Model >> Describe >> Blueprint |
| {{ alternative_models }} | Table | | Comparison of the model with alternatives built in the same project. Also known as challenger models. | Leaderboard |
| {{ model_features }} | Table | | Description of the model features and corresponding EDA statistics. | Data >> Project Data |
| {{ missing_values }} | Table | | Description of the missing values and their processing in the model. | Leaderboard >> Model >> Describe >> Missing Values |
| {{ partitioning }} | Image | | Graphical representation of the data partitioning. | Data >> Show Advanced Options >> Partitioning (only available before project start) |
| {{ model_scores }} | Table | | Metric scores of the model on different data sources. | Leaderboard >> Model |
| {{ lift_chart }} | Image | reverse: True, False (default); source: validation, holdout, crossValidation; bins: 10, 12, 15, 20, 30, 60 | Lift chart. | Leaderboard >> Model >> Evaluate >> Lift Chart |
| {{ feature_impact }} | Image | | Feature Impact chart. | Leaderboard >> Model >> Understand >> Feature Impact |
| {{ feature_impact_table }} | Table | sort_by: name | Table representation of Feature Impact data. | Leaderboard >> Model >> Understand >> Feature Impact >> Export |
| {{ feature_effects }} | List of images | source: validation, holdout, crossValidation; feature_names: feature1,feature2,feature3 | Feature Effects charts. | Leaderboard >> Model >> Understand >> Feature Effects |
| {{ accuracy_over_time }} | Image | | Accuracy over time chart. Available only for datetime partitioned projects. | Leaderboard >> Model >> Evaluate >> Accuracy Over Time |
| {{ cv_scores }} | Table | | Project metric scores for each fold. Available only for projects with cross validation. | Currently unavailable in the UI |
| {{ roc_curve }} | Image | source: validation, holdout, crossValidation | ROC Curve. Available only for binary classification projects. | Leaderboard >> Model >> Evaluate >> ROC Curve |
| {{ confusion_matrix_summary }} | Table | source: validation, holdout, crossValidation; threshold: value between 0 and 1 | Confusion matrix summary for the threshold with maximal F1 score value (default suggestion in UI). Available only for binary classification projects. | Leaderboard >> Model >> Evaluate >> ROC Curve |
| {{ prediction_distribution }} | Image | | Prediction distribution. Available only for binary classification projects. | Leaderboard >> Model >> Evaluate >> ROC Curve |
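For example, a user section embedding a parameterized tag might be sketched as follows (the title and wording are illustrative; only the tag syntax comes from the table above):

```python
# A 'user' section whose regularText embeds a parameterized tag.
# On generation, {{ lift_chart | source=holdout }} would be replaced
# with the lift chart computed on the holdout partition.
lift_section = {
    'title': 'Holdout Lift',
    'type': 'user',
    'highlightedText': '',
    'regularText': 'Lift on the holdout partition:\n'
                   '{{ lift_chart | source=holdout }}',
}
```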
Creating a Custom Template¶
A common workflow includes retrieving the default template and using it as a base to extend and customize.
import datarobot as dr
default_template = dr.ComplianceDocTemplate.get_default()
# Download the template and edit sections on your local machine
default_template.sections_to_json_file('path/to/save')
# Create a new template from your local file
my_template = dr.ComplianceDocTemplate.create_from_json_file(name='my_template', path='path/of/file')
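Between those two steps, the downloaded JSON file can be edited with any tool. As a sketch, assuming the file holds a JSON list of section dicts (verify against your own download; a small literal stands in for the file contents here):

```python
import json

# Sketch: edit sections from a downloaded template before re-uploading.
# A literal stands in for what sections_to_json_file() is assumed to write.
raw = '[{"title": "Overview", "type": "user"}]'
sections = json.loads(raw)

# Example tweak: prefix every top-level section title.
for section in sections:
    section['title'] = 'ACME - ' + section['title']

# The edited list can be written back with json.dump() and re-uploaded
# via ComplianceDocTemplate.create_from_json_file().
```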
Alternatively, custom templates can also be created from scratch.
sections = [{
    'title': 'Missing Values Report',
    'highlightedText': 'NOTICE',
    'regularText': 'This dataset had a lot of Missing Values. See the chart below: {{missing_values}}',
    'type': 'user'
},
{
    'title': 'Blueprints',
    'highlightedText': '',
    'regularText': '{{blueprint_diagram}} \n Blueprint for this model',
    'type': 'user'
}]
template = dr.ComplianceDocTemplate.create(name='Example', sections=sections)
# Specify the template_id to generate documentation using a custom template
doc = dr.ComplianceDocumentation(project.id, model.id, template.id)
# wait_for_completion() blocks until the job finishes (it returns None)
doc.generate().wait_for_completion()
doc.download('/path/to/save')