Word Cloud
- class datarobot.models.word_cloud.WordCloud(ngrams)
Word cloud data for the model.
Notes
WordCloudNgram
is a dict containing the following:
- ngram (str): Word or ngram value.
- coefficient (float): Value in the range [-1.0, 1.0] describing the effect of this ngram on the target. A large negative value means a strong effect toward the negative class in classification models and toward a smaller target value in regression models. A large positive value means a strong effect toward the positive class and toward a larger target value, respectively.
- count (int): Number of rows in the training sample where this ngram appears.
- frequency (float): Value in the range (0.0, 1.0], the frequency of this ngram relative to the most frequent ngram.
- is_stopword (bool): True for ngrams that DataRobot evaluates as stopwords.
- class (str or None): For classification, the value of the target class for the corresponding word or ngram. For regression, None.
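To make the schema concrete, here is a minimal sketch of one entry as a plain Python dict, with a small helper that reads the sign of the coefficient as described above. The values are made up, and `describe_ngram` is a hypothetical helper for illustration, not part of the DataRobot client.

```python
# Made-up entry following the WordCloudNgram schema (illustrative values only).
sample_ngram = {
    "ngram": "refund",
    "coefficient": -0.62,
    "count": 48,
    "frequency": 0.4,
    "is_stopword": False,
    "class": None,  # None, as documented for regression
}


def describe_ngram(entry):
    """Hypothetical helper: summarize the direction of an ngram's effect."""
    coef = entry["coefficient"]
    if not -1.0 <= coef <= 1.0:
        raise ValueError("coefficient is expected to lie in [-1.0, 1.0]")
    if coef < 0:
        direction = "negative class / smaller target"
    elif coef > 0:
        direction = "positive class / larger target"
    else:
        direction = "neutral"
    return f"{entry['ngram']}: {direction} (|coefficient|={abs(coef):.2f})"


summary = describe_ngram(sample_ngram)
print(summary)
```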
- Attributes:
- ngrams (list of dicts)
List of dicts with the WordCloudNgram schema described above.
- most_frequent(top_n=5)
Return most frequent ngrams in the word cloud.
- Parameters:
- top_n (int)
Number of ngrams to return
- Returns:
- list of dict
Up to top_n most frequent ngrams in the word cloud. If top_n is greater than the total number of ngrams in the word cloud, all ngrams are returned, sorted by frequency in descending order.
- Return type:
List[WordCloudNgram]
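The behavior described for most_frequent can be sketched on plain dicts as follows. This is an illustrative reimplementation under the assumption that ranking uses the frequency field in descending order; `most_frequent_sketch` and the sample data are hypothetical and are not the library's own code.

```python
# Made-up entries following the WordCloudNgram schema (illustrative only).
sample_ngrams = [
    {"ngram": "late", "coefficient": -0.8, "count": 30,
     "frequency": 0.3, "is_stopword": False, "class": None},
    {"ngram": "the", "coefficient": 0.05, "count": 100,
     "frequency": 1.0, "is_stopword": True, "class": None},
    {"ngram": "great", "coefficient": 0.6, "count": 55,
     "frequency": 0.55, "is_stopword": False, "class": None},
]


def most_frequent_sketch(ngrams, top_n=5):
    """Return up to top_n entries, highest frequency first."""
    ranked = sorted(ngrams, key=lambda entry: entry["frequency"], reverse=True)
    # Slicing past the end is safe, so a large top_n returns every entry.
    return ranked[:top_n]


top_two = [entry["ngram"] for entry in most_frequent_sketch(sample_ngrams, top_n=2)]
everything = [entry["ngram"] for entry in most_frequent_sketch(sample_ngrams, top_n=10)]
print(top_two)
print(everything)
```

With top_n larger than the list, the sketch returns all three entries, still sorted by frequency.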
- most_important(top_n=5)
Return most important ngrams in the word cloud.
- Parameters:
- top_n (int)
Number of ngrams to return
- Returns:
- list of dict
Up to top_n most important ngrams in the word cloud. If top_n is greater than the total number of ngrams in the word cloud, all ngrams are returned, sorted by absolute coefficient value in descending order.
- Return type:
List[WordCloudNgram]
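Importance here is ranked by the absolute value of the coefficient, so a strongly negative ngram can outrank a mildly positive one. A minimal sketch on plain dicts, assuming exactly that ordering; `most_important_sketch` and the sample data are hypothetical and are not the library's own code.

```python
# Made-up entries following the WordCloudNgram schema (illustrative only).
sample_ngrams = [
    {"ngram": "the", "coefficient": 0.05, "count": 100,
     "frequency": 1.0, "is_stopword": True, "class": None},
    {"ngram": "great", "coefficient": 0.6, "count": 55,
     "frequency": 0.55, "is_stopword": False, "class": None},
    {"ngram": "late", "coefficient": -0.8, "count": 30,
     "frequency": 0.3, "is_stopword": False, "class": None},
]


def most_important_sketch(ngrams, top_n=5):
    """Return up to top_n entries, largest absolute coefficient first."""
    ranked = sorted(ngrams, key=lambda entry: abs(entry["coefficient"]), reverse=True)
    return ranked[:top_n]


important = [entry["ngram"] for entry in most_important_sketch(sample_ngrams, top_n=2)]
print(important)
```

Note that "the" is the most frequent entry in this sample yet ranks last by importance, which is why the two methods can return different lists.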
- ngrams_per_class()
Split ngrams by target class value. Useful for multiclass models.
- Returns:
- dict
Dictionary in the format of (class label) -> (list of ngrams for that class)
- Return type:
Dict[Optional[str], List[WordCloudNgram]]
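The (class label) -> (list of ngrams) mapping can be sketched as a simple grouping on the class field. This is an illustrative reimplementation on plain dicts; `ngrams_per_class_sketch` and the sample class labels are hypothetical and are not the library's own code.

```python
from collections import defaultdict

# Made-up multiclass entries following the WordCloudNgram schema.
sample_ngrams = [
    {"ngram": "goal", "coefficient": 0.7, "count": 40,
     "frequency": 1.0, "is_stopword": False, "class": "sports"},
    {"ngram": "vote", "coefficient": 0.9, "count": 25,
     "frequency": 0.625, "is_stopword": False, "class": "politics"},
    {"ngram": "match", "coefficient": 0.4, "count": 20,
     "frequency": 0.5, "is_stopword": False, "class": "sports"},
]


def ngrams_per_class_sketch(ngrams):
    """Group entries by their 'class' value (None for regression)."""
    grouped = defaultdict(list)
    for entry in ngrams:
        grouped[entry["class"]].append(entry)
    return dict(grouped)


per_class = ngrams_per_class_sketch(sample_ngrams)
labels_to_ngrams = {
    label: [entry["ngram"] for entry in entries]
    for label, entries in per_class.items()
}
print(labels_to_ngrams)
```

For a regression model, where every entry has class set to None, this grouping would yield a single key, None, which matches the Optional[str] key type in the return annotation.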
- class datarobot.models.word_cloud.WordCloudNgram(*args, **kwargs)