Use this page when you want KumoRFM to explain why it made a prediction and which parts of the graph contributed to the result. KumoRFM explainability has two layers:
  • a natural-language summary that explains the prediction in plain English
  • structured details that let you inspect the most important columns, cohorts, and subgraphs behind that prediction
The public KumoRFM explainability notebook demonstrates both styles on an e-commerce return-prediction task built from users, items, orders, and returns data.

Generate Explanations

KumoRFM can generate explanations alongside predictions by using the explain parameter. For example, a query like the following predicts whether an order will have a return within the next 30 days:
query = "PREDICT COUNT(returns.*, 0, 30, days) > 0 FOR orders.order_id=333"
Basic explanation:
explanation = model.predict(query, explain=True)

# Access the prediction DataFrame
print(explanation.prediction)

# Access the human-readable summary
print(explanation.summary)

# Pretty-print both
explanation.print()
When explain=True is enabled, the return type changes from a prediction DataFrame to an Explanation object. That object still contains the prediction output, but it also includes human-readable and structured explanation data.

To skip the natural-language summary (faster; only the prediction details are returned), pass an ExplainConfig instead:
explanation = model.predict(
    query,
    explain=ExplainConfig(skip_summary=True),
)

Working with the Explanation Object

The Explanation object supports indexing for convenience:
prediction_df = explanation[0]   # Same as explanation.prediction
summary_text = explanation[1]    # Same as explanation.summary
In practice, these are the fields most people start with:
  • explanation.prediction: the original prediction DataFrame
  • explanation.summary: a readable narrative of the key drivers
  • explanation.details: structured explainability outputs for deeper inspection
The notebook follows this pattern directly: first inspect the prediction table, then print the summary, and finally drill into the structured details when you want a more technical explanation.
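This inspect-then-drill pattern can be sketched as follows. The Explanation stand-in below is a hypothetical stub built with SimpleNamespace so the snippet is self-contained; in real use, explanation comes from model.predict(query, explain=True), and the SDK's actual object may expose more than these three fields.

```python
from types import SimpleNamespace

# Hypothetical stand-in mirroring the documented fields; in practice,
# `explanation` is returned by model.predict(query, explain=True).
explanation = SimpleNamespace(
    prediction={"order_id": 333, "prediction": 0.82},  # illustrative values
    summary="Recent order date and sales channel drive a high return risk.",
    details=SimpleNamespace(cohorts=[], subgraphs=[]),
)

# 1. Inspect the prediction table
print(explanation.prediction)

# 2. Read the human-readable summary
print(explanation.summary)

# 3. Drill into structured details for a more technical explanation
print(explanation.details.cohorts)
print(explanation.details.subgraphs)
```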

Understanding the Summary

The default summary is designed to answer a business-facing question: Why did KumoRFM make this prediction? In the notebook example, the summary highlights the most important signals behind the prediction, such as:
  • the order date
  • the sales channel
  • the order price
  • user characteristics
  • item characteristics
This is the quickest way to understand a single prediction without manually inspecting all related rows in the graph.

Structured Explanation Details

The notebook also shows that explanation.details contains richer structured outputs. In particular, it breaks the explanation into two useful views:
  1. column analysis via details.cohorts
  2. subgraph analysis via details.subgraphs
These two views answer slightly different questions.

Column Analysis

Column analysis gives you a global view of how values in a column relate to outcomes across the in-context examples KumoRFM used for the prediction. Each column-analysis object in details.cohorts includes fields such as:
  • table_name: which table is being analyzed
  • column_name: the feature or aggregate being analyzed
  • hop: how far the table is from the entity table
  • stype: the semantic type, such as numerical, categorical, or timestamp
  • cohorts: the value buckets or categories
  • populations: how much of the sampled context falls into each cohort
  • targets: the average target value associated with each cohort
This view is useful when you want to answer questions like:
  • Which value ranges generally increase or decrease risk?
  • Which categories are most associated with a positive outcome?
  • Which features appear to matter globally across similar examples?
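As a rough illustration of how you might scan those fields, here is a sketch using a hand-built column-analysis record. The dataclass and all values are assumptions for illustration, not the SDK's actual types; only the field names come from the list above.

```python
from dataclasses import dataclass

# Hypothetical structure mirroring the documented field names; the real
# objects in explanation.details.cohorts may differ in type and shape.
@dataclass
class ColumnAnalysis:
    table_name: str
    column_name: str
    hop: int           # distance from the entity table
    stype: str         # semantic type: numerical, categorical, timestamp
    cohorts: list      # value buckets or categories
    populations: list  # share of the sampled context in each cohort
    targets: list      # average target value per cohort

analysis = ColumnAnalysis(
    table_name="orders",
    column_name="price",
    hop=0,
    stype="numerical",
    cohorts=["<20", "20-100", ">100"],
    populations=[0.50, 0.35, 0.15],
    targets=[0.05, 0.12, 0.31],
)

# Which value bucket is most associated with a positive outcome (a return)?
riskiest = max(zip(analysis.cohorts, analysis.targets), key=lambda c: c[1])
print(f"{analysis.table_name}.{analysis.column_name}: "
      f"cohort {riskiest[0]} has avg target {riskiest[1]:.2f}")
```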

Subgraph Analysis

Subgraph analysis gives you a local view of the relational evidence around the specific entity being predicted. In the notebook, the subgraph explanation is used to visualize the most important neighboring records and edges around the seed entity. This helps you see not just which columns mattered, but which nearby records in the graph were most influential. This is especially useful when:
  • the query depends on several linked tables
  • the prediction is driven by specific related records
  • you want to visualize the local graph neighborhood for debugging or presentation purposes
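The kind of question subgraph analysis answers can be sketched with plain dictionaries. The record layout and importance scores below are entirely hypothetical; the actual contents of explanation.details.subgraphs are SDK-specific and best inspected in the notebook.

```python
# Hypothetical neighboring records around the seed entity, each with an
# illustrative importance score; real subgraph objects will differ.
neighbors = [
    {"table": "items",   "row_id": 17,  "importance": 0.27},
    {"table": "returns", "row_id": 901, "importance": 0.42},
    {"table": "users",   "row_id": 5,   "importance": 0.18},
]

# Rank the most influential related records for debugging or presentation
for rec in sorted(neighbors, key=lambda r: r["importance"], reverse=True):
    print(f"{rec['table']}[{rec['row_id']}]: importance={rec['importance']:.2f}")
```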

When to Use Each Explanation Type

Use the natural-language summary when you want a fast explanation for a single prediction. Use cohort analysis when you want to understand broader feature patterns in the sampled context. Use subgraph analysis when you want to understand which linked records and paths in the graph influenced the result.

Configuration Tips

If you need maximum speed, use ExplainConfig(skip_summary=True) and work only with the structured explanation details. If you are debugging a specific prediction, start with the summary and then inspect details.cohorts and details.subgraphs to validate the model’s reasoning.

Current Limitations

Explainability is currently supported only for single-entity predictions with run_mode="FAST".

More Reading