Feature Lineage

class datarobot.models.FeatureLineage(steps=None)

Lineage of an automatically engineered feature.

Attributes:
steps: list

list of steps which were applied to build the feature.

`steps` structure is:
id: int

step id starting with 0.

step_type: str

one of data/action/join/generatedData.

name: str

name of the step.

description: str

description of the step.

parents: list[int]

ids of the other steps this step refers to.

is_time_aware: bool

indicates whether the step is time aware. Mandatory only for action and join steps. An action step provides additional information about the feature derivation window in the time_info field.

catalog_id: str

id of the catalog for a data step.

catalog_version_id: str

id of the catalog version for a data step.

group_by: list[str]

list of columns this action step aggregated by.

columns: list

names of columns involved in the feature generation. Available only for data steps.

time_info: dict

description of the feature derivation window which was applied to this action step.

join_info: list[dict]

join step details.
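
The `parents` references make the step list a graph that can be walked. A minimal sketch, assuming each step is a plain dict keyed by the field names listed above; the helper `ancestor_data_steps` and the sample steps are hypothetical and only illustrate the structure:

```python
def ancestor_data_steps(steps, step_id):
    """Return the sorted ids of all data steps reachable from step_id via parents."""
    by_id = {step["id"]: step for step in steps}
    seen, stack, found = set(), [step_id], []
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        step = by_id[current]
        if step["step_type"] == "data":
            found.append(current)
        # Follow the references to other steps.
        stack.extend(step.get("parents", []))
    return sorted(found)


# Hypothetical lineage, not taken from a real project.
sample_steps = [
    {"id": 0, "step_type": "generatedData", "name": "feature", "parents": [1]},
    {"id": 1, "step_type": "action", "name": "sum", "parents": [2], "is_time_aware": True},
    {"id": 2, "step_type": "join", "name": "join", "parents": [3, 4], "is_time_aware": False},
    {"id": 3, "step_type": "data", "name": "primary", "parents": []},
    {"id": 4, "step_type": "data", "name": "transactions", "parents": []},
]

print(ancestor_data_steps(sample_steps, 0))
```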

`columns` structure is:
data_type: str

the type of the feature, e.g. ‘Categorical’, ‘Text’

is_input: bool

indicates a feature which provided data to transform in this lineage.

name: str

feature name.

is_cutoff: bool

indicates a cutoff column.

`time_info` structure is:
latest: dict

end of the feature derivation window applied.

duration: dict

size of the feature derivation window applied.

`latest` and `duration` structure is:
time_unit: str

time unit name like ‘MINUTE’, ‘DAY’, ‘MONTH’ etc.

duration: int

value/size of this duration object.
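
The nested `time_info` layout can be read as follows. This is a sketch only, assuming `time_info` arrives as a dict shaped as described above; `describe_window` and the sample values are hypothetical:

```python
def describe_window(time_info):
    """Format the size and end of a feature derivation window as text."""
    duration = time_info["duration"]  # size of the window
    latest = time_info["latest"]      # end of the window
    return "size={} {}, end={} {}".format(
        duration["duration"], duration["time_unit"],
        latest["duration"], latest["time_unit"],
    )


# Hypothetical values for illustration.
sample_time_info = {
    "latest": {"time_unit": "DAY", "duration": 1},
    "duration": {"time_unit": "DAY", "duration": 30},
}

print(describe_window(sample_time_info))
```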

`join_info` structure is:
join_type: str

kind of join, left/right.

left_table: dict

information about a dataset which was considered as left.

right_table: dict

information about a dataset which was considered as right.

`left_table` and `right_table` structure is:
columns: list[str]

list of columns the datasets were joined by.

datasteps: list[int]

ids of the data steps which brought the columns into the current step dataset.
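
To show how `join_info` ties a join step back to its data steps, here is a small sketch. It assumes `join_info` is a list of dicts with `left_table` and `right_table` entries as described above; `joined_data_steps` and the sample record are hypothetical:

```python
def joined_data_steps(join_info):
    """Collect the ids of every data step referenced by either side of each join."""
    ids = set()
    for join in join_info:
        for side in ("left_table", "right_table"):
            ids.update(join[side].get("datasteps", []))
    return sorted(ids)


# Hypothetical join description for illustration.
sample_join_info = [
    {
        "join_type": "left",
        "left_table": {"columns": ["customer_id"], "datasteps": [3]},
        "right_table": {"columns": ["customer_id"], "datasteps": [4]},
    }
]

print(joined_data_steps(sample_join_info))
```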

classmethod get(project_id, id)

Retrieve a single FeatureLineage.

Parameters:
project_id: str

The id of the project the feature belongs to

id: str

id of a feature lineage to retrieve

Returns:
lineage: FeatureLineage

The queried instance
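
A possible way to use `get` is sketched below. It assumes the `datarobot` package is installed and the client is already configured, and that each entry of `steps` is a dict keyed by the field names documented above. The helpers `fetch_lineage` and `summarize` are hypothetical, and the ids in the usage comment are placeholders:

```python
def fetch_lineage(project_id, lineage_id):
    """Retrieve a FeatureLineage; requires a configured DataRobot client."""
    # Imported lazily so summarize() can be used without the client installed.
    from datarobot.models import FeatureLineage
    return FeatureLineage.get(project_id, lineage_id)


def summarize(steps):
    """Return one short line per step: id, step type, and name."""
    return ["{} [{}] {}".format(s["id"], s["step_type"], s["name"]) for s in steps]


# Usage against a real project (placeholder ids, not run here):
# lineage = fetch_lineage("<project-id>", "<feature-lineage-id>")
# for line in summarize(lineage.steps):
#     print(line)
```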