Contribution Score
Kumo scores the overall importance of both whole tables as well as individual columns within each table to the final prediction. To enable the analysis of each table columns’ effect on the overall prediction, Kumo provides the variation of predictions of all columns across all your Kumo tables.
Variation of Predictions
column indicate how each respective column contributes to your end predictions, calculated based on the variance of these predictions relative to the underlying columns (i.e., based on both ground truth labels and predictions).
Detecting Data Leakage
Columns with variation of prediction values dramatically higher than any others may be symptoms of data leakage, a common cause of poor accuracy at prediction time. Data leakage occurs during training when the model has access to information directly correlated to the target value being predicted, but not actually known yet at the time of prediction. For example, if you are predicting whether a customer will churn in the next 30 days and you have a column indicating whether the customer will cancel their subscription in the next 30 days, this would constitute both target leakage and future information leakage; in this case, you would look for an excessively high contribution score for this column.Analyzing Individual Columns
If you click on an individual column, you can view a plot that displays that the average model prediction against the actual labels in the holdout data split for different entity populations, segmented based on how often their historic rows appear within certain past time intervals.