Explainability
KumoRFM explanations provide insight into how predictions are made by offering two complementary perspectives. The global view (cohorts) reveals broad patterns across all in-context examples, showing which data characteristics drive predictions. The local view (subgraphs) highlights the specific values in an individual entity’s data neighborhood that influenced its prediction. Together, these views answer two questions: “What patterns does the model see overall?” and “Why did this specific entity get this prediction?” Use explanations to understand model reasoning, debug unexpected results, communicate insights to stakeholders, or validate that the model is learning meaningful patterns rather than spurious correlations.

Getting Started
Enable explanations by passing an ExplainConfig object to the predict method:
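A minimal sketch of what this can look like, assuming a KumoRFM model instance and a predictive query are already set up; the import path, the keyword argument name, and the result accessor are assumptions and may differ in your SDK version:

```python
# Sketch: request an explanation alongside a prediction.
# Assumes `model` is an initialized KumoRFM instance and `query` is a
# predictive query string. The import path, the keyword argument name, and
# the result accessor below are assumptions; adjust to your SDK version.
from kumoai.experimental.rfm import ExplainConfig  # assumed import path

result = model.predict(
    query,
    explain=ExplainConfig(),  # assumed keyword argument
)
explanation = result.explanation  # assumed accessor for the returned Explanation
```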
The call returns an Explanation object with three components, covered in the sections below: the global cohort analysis, the local subgraph attribution, and a natural language summary.
Global View: Column Analysis (Cohorts)
The cohort analysis reveals how different column values correlate with prediction outcomes across all in-context examples used by the model. This global perspective helps identify which features are generally predictive.

Structure
Each cohort object contains:

- table_name: The table being analyzed (e.g., 'orders', 'users')
- column_name: The column or aggregate (e.g., 'age', 'COUNT(*)')
- hop: Distance from the entity table (0 = the entity itself, 1 = directly linked, 2 = second-degree neighbors)
- stype: Semantic type (numerical, categorical, timestamp, or None for aggregates)
- cohorts: List of value ranges (numerical/timestamp) or categories
- populations: Proportion of in-context examples in each cohort (sums to 1.0)
- targets: Average prediction score for the examples in each cohort
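A sketch of walking this structure; the accessor explanation.cohorts and attribute-style field access are assumptions based on the fields listed above:

```python
# Sketch: print the global cohort analysis from an Explanation object.
# `explanation.cohorts` and attribute-style field access are assumptions
# based on the cohort structure described above.
for col in explanation.cohorts:
    print(f"{col.table_name}.{col.column_name} (hop={col.hop}, stype={col.stype})")
    for cohort, population, target in zip(col.cohorts, col.populations, col.targets):
        # population: share of in-context examples; target: their average prediction score
        print(f"  {cohort}: {population:.1%} of examples, avg score {target:.3f}")
```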
To interpret the analysis, look at how targets vary across a column’s cohorts. Compare cohort values to identify risk factors (cohorts with elevated average scores) and protective factors (cohorts with depressed average scores).
Example: Predicting order returns
Calculating Column Importance
To quantify which columns matter most, compute the population-weighted variance of targets across each column’s cohorts (see the sketch below). Note that the same column can appear at different hops: orders.price at hop 0 is the price of the order being predicted, while orders.price at hop 2 represents the prices of other orders connected through items or users. These are treated as distinct features in the global analysis.
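A minimal sketch of this computation using the cohort fields described above; the population-weighted variance here is an illustrative importance measure, not necessarily the library’s exact definition:

```python
# Sketch: rank columns by the population-weighted variance of their targets.
# Uses the populations/targets fields described above; this formula is an
# illustrative choice, not necessarily the library's exact definition.
def column_importance(col) -> float:
    mean = sum(p * t for p, t in zip(col.populations, col.targets))
    return sum(p * (t - mean) ** 2 for p, t in zip(col.populations, col.targets))

ranked = sorted(explanation.cohorts, key=column_importance, reverse=True)
for col in ranked[:5]:
    print(f"{col.table_name}.{col.column_name} (hop={col.hop}): {column_importance(col):.4f}")
```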
Local View: Subgraph Attribution
The subgraph shows the actual data neighborhood around the specific entity being predicted, with attribution scores indicating which values influenced the prediction.

Structure
A subgraph contains:

- seed_id: Always 0 (the entity being predicted)
- seed_table: The entity’s table (e.g., 'orders')
- seed_time: The anchor time for the prediction
- tables: Dictionary mapping table names to nodes

Each node contains:

- cells: Dictionary of column names to cell objects
  - value: The actual data value (can be None if missing)
  - score: Attribution score from 0 to 1 (higher = more influential)
- links: Dictionary mapping relationship names to sets of connected node IDs
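A sketch of surfacing the most influential cells in a subgraph; the accessor explanation.subgraph and attribute-style access are assumptions based on the structure above:

```python
# Sketch: collect the highest-attribution cells from the local subgraph.
# `explanation.subgraph`, attribute-style access, and treating each table
# entry as an iterable of node objects are assumptions based on the
# structure described above.
influential = []
for table_name, nodes in explanation.subgraph.tables.items():
    for node in nodes:  # assumed: iterable of node objects
        for column_name, cell in node.cells.items():
            influential.append((cell.score, f"{table_name}.{column_name}", cell.value))

for score, name, value in sorted(influential, key=lambda x: x[0], reverse=True)[:10]:
    print(f"{name} = {value!r} (score {score:.2f})")
```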
Score Interpretation
Attribution scores indicate how much changing a value would change the prediction:

- 0.00 - 0.05: Negligible influence
- 0.05 - 0.15: Moderate influence
- 0.15 - 0.30: Strong influence
- 0.30+: Critical influence
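If it helps, these thresholds can be turned into a small labeling helper; the cutoffs mirror the list above, and the function itself is just a convenience, not part of the SDK:

```python
# Convenience sketch: map an attribution score to the labels above.
# Not part of the SDK; thresholds mirror the ranges listed in this section.
def influence_label(score: float) -> str:
    if score < 0.05:
        return "negligible"
    if score < 0.15:
        return "moderate"
    if score < 0.30:
        return "strong"
    return "critical"
```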
Natural Language Summaries
For quick interpretation or stakeholder communication, generate natural language explanations:

“The model predicts this order is very unlikely to be returned (probability near 0%). The primary driver is the order’s age—it was placed over 18 months ago, well beyond typical return windows. Historical patterns show that orders older than 2 months have a 0% return rate, while recent orders (0-2 months) have a 2% return rate. The order’s low price ($8.98) and sales channel also contribute moderately to the low return probability.”

Summaries are powered by an LLM that interprets both cohort and subgraph data, making them ideal for reports, dashboards, or explaining decisions to non-technical audiences. In any case, all of the raw data behind the summary can be recovered as shown above and parsed into whatever format you need.
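For example, a sketch of flattening the cohort data into a pandas DataFrame for a report; the accessor names follow the structure described earlier and are assumptions, and the output column names are illustrative:

```python
# Sketch: flatten the cohort analysis into a pandas DataFrame for reporting.
# Accessor names follow the structure described earlier and are assumptions.
import pandas as pd

rows = [
    {
        "table": col.table_name,
        "column": col.column_name,
        "hop": col.hop,
        "cohort": cohort,
        "population": population,
        "avg_score": target,
    }
    for col in explanation.cohorts
    for cohort, population, target in zip(col.cohorts, col.populations, col.targets)
]
report = pd.DataFrame(rows)
```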
To learn more about explainability, see this example notebook.