Feature lineage

class datarobot.models.FeatureLineage

Bases: APIObject

Lineage of an automatically engineered feature.

Variables:
  • steps (list) – list of steps which were applied to build the feature.

  • steps structure is:

  • id (int) – step id starting with 0.

  • step_type (str) – one of data, action, join, or generatedData.

  • name (str) – name of the step.

  • description (str) – description of the step.

  • parents (list[int]) – ids of the steps this step depends on.

  • is_time_aware (bool) – whether the step is time-aware. Mandatory only for action and join steps. An action step provides additional information about the feature derivation window in the timeInfo field.

  • catalog_id (str) – id of the catalog for a data step.

  • catalog_version_id (str) – id of the catalog version for a data step.

  • group_by (list[str]) – list of columns this action step aggregated by.

  • columns (list[str]) – names of the columns involved in the feature generation. Available only for data steps.

  • time_info (dict) – description of the feature derivation window which was applied to this action step.

  • join_info (list[dict]) – join step details.

  • columns structure is:

  • data_type (str) – the type of the feature, e.g. ‘Categorical’, ‘Text’.

  • is_input (bool) – indicates a feature that provided data to transform in this lineage.

  • name (str) – feature name.

  • is_cutoff (bool) – indicates a cutoff column.

  • time_info structure is:

  • latest (dict) – end of the feature derivation window applied.

  • duration (int) – size of the feature derivation window applied.

  • duration structure is:

  • time_unit (str) – time unit name like ‘MINUTE’, ‘DAY’, ‘MONTH’ etc.

  • duration – value/size of this duration object.

  • join_info structure is:

  • join_type (str) – kind of join, left/right.

  • left_table (dict) – information about the dataset used as the left side of the join.

  • right_table (dict) – information about the dataset used as the right side of the join.

  • left_table and right_table structure is:

  • columns – list of columns the datasets were joined on.

  • datasteps (list[int]) – ids of the data steps that brought the columns into the current step's dataset.
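
The step structure described above can be walked with plain dictionary access. The following is a minimal sketch, assuming each step is a dict keyed by the names listed above; summarize_steps and sample_steps are hypothetical names, and the sample data is hand-written for illustration rather than taken from a real project.

```python
def summarize_steps(steps):
    """Return one summary line per step, ordered by step id."""
    lines = []
    for step in sorted(steps, key=lambda s: s["id"]):
        # "parents" holds the ids of the steps this step depends on.
        parents = ", ".join(str(p) for p in step.get("parents", [])) or "none"
        name = step.get("name", "")
        lines.append(f'{step["id"]}: {step["step_type"]} "{name}" (parents: {parents})')
    return lines


# Hypothetical sample data, written by hand to match the documented keys.
sample_steps = [
    {
        "id": 1,
        "step_type": "action",
        "name": "Sum",
        "parents": [0],
        "is_time_aware": True,
        "group_by": ["customer_id"],
    },
    {"id": 0, "step_type": "data", "name": "transactions", "parents": []},
]

for line in summarize_steps(sample_steps):
    print(line)
```

Optional keys are read with .get() because several of them (for example group_by or time_info) are documented as present only for certain step types.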

classmethod get(project_id, id)

Retrieve a single FeatureLineage.

Parameters:
  • project_id (str) – The id of the project the feature belongs to

  • id (str) – id of a feature lineage to retrieve

Returns:

lineage – The queried instance

Return type:

FeatureLineage
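
A call to get might look like the sketch below. It assumes the datarobot client is installed and already configured with credentials, and that the returned steps use the keys listed under Variables; fetch_step_types is a hypothetical helper name, not part of the library.

```python
def fetch_step_types(project_id, lineage_id):
    """Fetch a feature lineage and return the step_type of each of its steps."""
    # Imported lazily so the sketch can be loaded without the client installed.
    from datarobot.models import FeatureLineage

    lineage = FeatureLineage.get(project_id, lineage_id)
    # Assumption: every step dict carries a "step_type" key, as documented above.
    return [step["step_type"] for step in lineage.steps]
```

Both arguments are ids supplied by the caller, for example fetch_step_types("<project id>", "<feature lineage id>") with real values substituted for the placeholders.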