Churn Prediction
“Among members who visited in the past 60 days, which ones will have zero visits in the next 30 days?”


A real-world example
Among members who visited in the past 60 days, which ones will have zero visits in the next 30 days?
Predicting churn for all members is noisy: many already left months ago. What you really need is to identify members who are still coming but are about to stop. The backward window focuses the prediction on the members you can still save. For a gym chain with 500K members, preventing just 5% of at-risk churn can save roughly $15M a year in retained membership revenue.
Quick answer
Churn prediction identifies which active customers will stop using your product or service within a defined time window. The best models focus on recently active users (not already-churned ones) and learn from behavioral decay patterns, social connections, and cross-account signals rather than simple usage thresholds.
Approaches compared
4 ways to solve this problem
1. Rule-Based Thresholds
Set static rules like 'flag members with fewer than 2 visits in 30 days.' Simple to implement, easy to explain, and requires no ML infrastructure.
Best for
Early-stage products with small user bases where manual review is feasible.
Watch out for
Rules cannot capture interaction effects (visit decline + location switching + social churn). Expect 55-65% accuracy at best.
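A static threshold rule like the one above takes only a few lines of pandas. This is a minimal sketch on toy data; the schema (a visits table with `member_id` and `timestamp`) mirrors the tables shown later on this page, and the cutoff of 2 visits in 30 days is the illustrative rule from the text:

```python
import pandas as pd

# Toy visits table; column names follow the VISITS schema on this page.
visits = pd.DataFrame({
    "member_id": ["M001", "M001", "M002", "M003", "M003", "M003"],
    "timestamp": pd.to_datetime([
        "2025-02-10", "2025-02-28", "2025-01-05",
        "2025-02-20", "2025-02-25", "2025-03-01",
    ]),
})

as_of = pd.Timestamp("2025-03-05")
recent = visits[visits["timestamp"] >= as_of - pd.Timedelta(days=30)]
counts = recent.groupby("member_id").size()

# Flag everyone with fewer than 2 visits in the last 30 days.
all_members = visits["member_id"].unique()
at_risk = sorted(m for m in all_members if counts.get(m, 0) < 2)
print(at_risk)  # M002's only visit falls outside the window
```

The simplicity is the appeal, and also the limitation: the rule fires identically for a vacationing regular and a genuinely disengaging member.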
2. Traditional ML (XGBoost/Logistic Regression)
Train a classifier on hand-crafted features like visit frequency, recency, and plan type. Requires a feature engineering pipeline and regular retraining.
Best for
Teams with ML engineers who can maintain feature pipelines and have clean single-table data.
Watch out for
Feature engineering takes weeks per iteration. Models trained on flat tables miss multi-hop relational signals like 'workout buddies who also churned.' Typical accuracy: 70-75%.
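The hand-crafted-feature workflow can be sketched as follows. Everything here is illustrative: the features (`recency_days`, `visits_30d`, `is_premium`) are typical choices rather than a prescribed set, and the training data is synthetic, generated so that churn rises with recency and falls with visit frequency:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500
recency_days = rng.integers(0, 60, n)   # days since last visit
visits_30d = rng.integers(0, 20, n)     # visit count, last 30 days
is_premium = rng.integers(0, 2, n)      # plan-type flag

# Synthetic label: churn odds grow with recency, shrink with frequency.
logit = 0.1 * recency_days - 0.4 * visits_30d - 0.5 * is_premium
churned = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X = np.column_stack([recency_days, visits_30d, is_premium])
model = LogisticRegression().fit(X, churned)

# Score one member: 40 days absent, 1 recent visit, basic plan.
prob = model.predict_proba([[40, 1, 0]])[0, 1]
print(f"churn probability: {prob:.2f}")
```

Each row is one member, one prediction. Anything relational (a buddy's cancellation, a trainer's departure) has to be manually flattened into a column before the model can see it, which is exactly where the iteration weeks go.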
3. Deep Learning on Sequences (LSTMs/Transformers)
Model each member's visit history as a time series. Captures temporal patterns like frequency decay curves and seasonal dips.
Best for
High-frequency event data where temporal ordering matters more than cross-entity relationships.
Watch out for
Treats each member in isolation. Cannot see that 3 of a member's 4 gym buddies just canceled, or that a specific location is driving churn across many members.
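A sequence model learns temporal signals like frequency decay from raw visit histories. As a hand-computed stand-in for that learned signal, the sketch below scores a visit history with an exponentially weighted visit rate, so recent gaps pull the score down. The 14-day half-life is an arbitrary illustrative choice, not a recommended setting:

```python
import numpy as np

def decayed_visit_rate(visit_days, as_of_day, half_life=14.0):
    """Sum of per-visit weights exp(-age * ln2 / half_life)."""
    ages = as_of_day - np.asarray(visit_days, dtype=float)
    ages = ages[ages >= 0]  # ignore visits after the anchor date
    return float(np.exp(-ages * np.log(2) / half_life).sum())

steady = list(range(0, 60, 3))              # a visit every 3 days
fading = list(range(0, 30, 2)) + [40, 55]   # frequent early, then tapering
print(decayed_visit_rate(steady, 60), decayed_visit_rate(fading, 60))
```

The steady visitor scores markedly higher than the fading one even when raw visit counts are similar, which is the decay-curve pattern an LSTM or Transformer captures automatically. What no per-member sequence can capture is the context around the member, which motivates the next approach.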
4. KumoRFM (Graph Neural Networks on Relational Data)
Connects members, visits, locations, plans, and social relationships into a single graph. The GNN learns from the full relational structure without manual feature engineering. Backward-window PQL filters to recently active members automatically.
Best for
Any team with multi-table relational data who wants maximum accuracy with minimal feature engineering.
Watch out for
Requires relational data with meaningful connections between entities. If your data is truly a single flat table with no joins, the graph advantage is smaller.
Key metric: Graph-based churn models score 76.71 on RelBench benchmarks vs 62.44 for flat-table baselines, a 23% relative improvement driven by multi-table relational signals.
Why relational data changes the answer
Flat churn models see each member as an isolated row of features: visit count, plan type, tenure. They miss the rich web of relationships that actually predict churn. When Bob Garcia's two closest workout partners both canceled last month, that is a stronger churn signal than any individual metric. When members start switching between 3 locations instead of their usual 1, that restlessness pattern only appears when you join the visits table to the locations table. When a specific trainer leaves and their regulars start dropping off, you need the staff-member-visit graph to see it.
Kumo builds a heterogeneous graph connecting members to visits, locations, plans, trainers, and other members. The graph neural network propagates information across these connections, discovering compound signals that no amount of feature engineering on a flat table could replicate. In benchmarks on the RelBench dataset, graph-based models score 76.71 vs 62.44 for flat-table baselines on similar prediction tasks. The backward-window PQL filter adds another layer of precision by focusing predictions on members who visited in the past 60 days, eliminating the noise from long-gone members that inflates false positive rates in traditional approaches.
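To make the social signal concrete, here is a hand-computed 1-hop version of the "workout buddies also churned" feature on a toy buddy graph. A GNN learns this kind of neighborhood aggregation (and its multi-hop extensions) automatically rather than requiring it to be coded; member IDs and edges below are purely illustrative:

```python
# Toy buddy graph: member -> list of connected members.
buddies = {
    "M002": ["M010", "M011", "M012"],  # Bob's three workout partners
    "M001": ["M003"],
}
churned = {"M010", "M011"}             # two of Bob's buddies canceled

def buddy_churn_rate(member):
    """Fraction of a member's direct connections who have churned."""
    nbrs = buddies.get(member, [])
    if not nbrs:
        return 0.0
    return sum(n in churned for n in nbrs) / len(nbrs)

print(buddy_churn_rate("M002"))  # 2 of 3 buddies churned
```

Composing this with visit, location, and plan tables by hand is the feature-engineering burden the graph approach removes: the GNN propagates these signals across the joined tables directly.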
Think of it like a doctor diagnosing a patient. A flat model only sees the patient's vitals in isolation. A relational model sees that the patient's family members have a history of the same condition, their neighborhood has an environmental risk factor, and their pharmacy records show they stopped refilling a prescription. The diagnosis is the same task, but the relational context transforms accuracy.
How KumoRFM solves this
Relational intelligence for customer retention
Kumo's backward time window filters to recently active members before predicting forward behavior. Traditional models predict over all members, flooding retention teams with false positives from already-churned users. Kumo focuses on the members you can still save — learning from visit frequency decay, location switching patterns, and cross-member social signals in the relational graph.
From data to predictions
See the full pipeline in action
Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.
Your data
The relational tables Kumo learns from
MEMBERS
| member_id | name | plan | signup_date |
|---|---|---|---|
| M001 | Alice Chen | Premium | 2024-01-15 |
| M002 | Bob Garcia | Basic | 2023-06-20 |
| M003 | Carol Patel | Premium | 2024-03-08 |
VISITS
| visit_id | member_id | location | timestamp |
|---|---|---|---|
| V9001 | M001 | Downtown | 2025-02-28 |
| V9002 | M002 | Midtown | 2025-02-10 |
| V9003 | M003 | Downtown | 2025-03-01 |
Write your PQL query
Describe what to predict in 2–3 lines — Kumo handles the rest
PREDICT COUNT(VISITS.*, 0, 30, days) = 0 FOR EACH MEMBERS.MEMBER_ID WHERE COUNT(VISITS.*, -60, 0, days) > 0
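The query reads: for each member, predict whether their visit count in the next 30 days is zero, restricted to members with at least one visit in the past 60 days. For intuition only, here is how the equivalent training label could be computed on historical data in pandas. This is not how Kumo executes PQL, just an illustration of the window semantics on toy data:

```python
import pandas as pd

visits = pd.DataFrame({
    "member_id": ["M001", "M001", "M002", "M003"],
    "timestamp": pd.to_datetime(["2025-01-20", "2025-03-01",
                                 "2025-01-15", "2025-02-20"]),
})
as_of = pd.Timestamp("2025-02-15")  # anchor time for the label

def visit_count(member, start_days, end_days):
    """Visits in [as_of + start_days, as_of + end_days)."""
    lo = as_of + pd.Timedelta(days=start_days)
    hi = as_of + pd.Timedelta(days=end_days)
    v = visits[visits["member_id"] == member]
    return ((v["timestamp"] >= lo) & (v["timestamp"] < hi)).sum()

labels = {}
for m in visits["member_id"].unique():
    if visit_count(m, -60, 0) > 0:              # WHERE: active in past 60 days
        labels[m] = visit_count(m, 0, 30) == 0  # PREDICT: zero visits next 30 days
print(labels)
```

M003 is excluded entirely by the WHERE clause (no visits in the backward window), which is how the query keeps already-gone members out of the prediction pool.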
Prediction output
Every entity gets a score, updated continuously
| MEMBER_ID | TIMESTAMP | TARGET_PRED | TRUE_PROB |
|---|---|---|---|
| M001 | 2025-03-05 | False | 0.09 |
| M002 | 2025-03-05 | True | 0.82 |
| M003 | 2025-03-05 | False | 0.14 |
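Scores like these typically feed a retention queue. A minimal sketch of that last step, using the values from the table above (the 0.5 intervention threshold is illustrative and should be tuned to your outreach capacity):

```python
import pandas as pd

preds = pd.DataFrame({
    "MEMBER_ID": ["M001", "M002", "M003"],
    "TRUE_PROB": [0.09, 0.82, 0.14],
})

# Keep members above the intervention threshold, highest risk first.
queue = (preds[preds["TRUE_PROB"] >= 0.5]
         .sort_values("TRUE_PROB", ascending=False)
         .reset_index(drop=True))
print(queue["MEMBER_ID"].tolist())  # only Bob clears the threshold
```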
Understand why
Every prediction includes feature attributions — no black boxes
Member M002 — Bob Garcia
Predicted: True (82% churn probability)
Top contributing features
| Feature | Value | Attribution |
|---|---|---|
| Visit frequency (last 30d vs prior 30d) | -68% | 34% |
| Days since last visit | 23 days | 27% |
| Workout buddies also churning | 2 of 3 | 19% |
| Plan downgrade in last 90d | Yes | 12% |
| Location switching frequency | 3 locations | 8% |
Feature attributions are computed automatically for every prediction. No separate tooling required. Learn more about Kumo explainability
PQL Documentation
Learn the Predictive Query Language — SQL-like syntax for defining any prediction task in 2–3 lines.
Python SDK
Integrate Kumo predictions into your pipelines. Train, evaluate, and deploy models programmatically.
Explainability Docs
Understand feature attributions, model evaluation metrics, and how to build trust with stakeholders.
Frequently asked questions
Common questions about churn prediction
What is the difference between churn prediction and churn detection?
Detection tells you who already churned (backward-looking). Prediction tells you who will churn in the future (forward-looking). Prediction is far more valuable because you can still intervene. Kumo's backward-window PQL makes this distinction explicit: it filters to recently active members and predicts their future behavior.
How far ahead can you predict churn accurately?
For most subscription and membership businesses, 14-30 day prediction windows deliver the best balance of accuracy and actionability. Longer windows (60-90 days) reduce accuracy but give more intervention time. The right window depends on your intervention playbook: if your retention offers take 2 weeks to execute, you need at least a 30-day prediction horizon.
Why do most churn models have high false positive rates?
Two reasons. First, they predict over all members including those who already churned months ago, inflating the 'at-risk' pool. Second, flat-table models miss relational signals that distinguish true risk from normal usage fluctuations. Kumo's backward window solves the first problem; the graph neural network solves the second.
What data do I need for churn prediction?
At minimum: a members/customers table and an events/transactions table with timestamps. The more relational data you add (locations, plans, support tickets, referral connections), the more signals the graph can capture. Kumo works directly on your existing database tables without requiring a feature store.
How does social churn affect prediction accuracy?
Social churn (friends or colleagues leaving together) is one of the strongest predictors that flat models completely miss. In gym and SaaS contexts, when 2 of 3 connected users churn, the third follows 60-70% of the time. Graph neural networks capture this propagation effect naturally.
Bottom line: A 500K-member gym chain preventing just 5% of at-risk churn saves $15M per year. Kumo's backward window eliminates noise from already-churned members, letting retention teams focus on the members they can still save.
Related use cases
Explore more retention use cases
Topics covered
One Platform. One Model. Infinite Predictions.
KumoRFM
Relational Foundation Model
Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.
For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.
Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.




