Use Case #3 · Classification · Customer Retention

Customer Churn Prediction

“Which loyalty members will lapse?”

Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.


A real-world example

Which loyalty members will lapse?

Loyalty programs cost $10-15 per active member annually to operate, yet 54% of loyalty memberships are inactive (Bond Brand Loyalty). A retailer with 10M loyalty members has 5.4M generating zero incremental revenue. More critically, 15-20% of active members lapse each year, and re-acquiring a lapsed member costs 5-7x more than retention. The signals of impending lapse are scattered: declining visit frequency, shrinking basket size, reduced email engagement, and competitors' promotional activity. By the time a member's status flips to 'inactive,' the retention window has closed.

Quick answer

Predicting which loyalty members will lapse requires connecting purchase history, visit frequency, email engagement, and competitive signals into a single model. Single-table churn models that only look at recency-frequency-monetary (RFM) metrics miss the early warning signs: a Gold member whose email opens dropped from 7 to 1 per month, whose basket size shrank 30%, and whose zip code just got a competing loyalty program. On SAP SALT benchmarks, relational approaches hit 91% accuracy vs 75% for XGBoost on customer prediction tasks.

Approaches compared

4 ways to solve this problem

1. RFM scoring with manual thresholds

Segment members by recency, frequency, and monetary value. Flag anyone below threshold as at-risk. The oldest trick in retail CRM.

Best for

Small loyalty programs (under 100K members) where a CRM manager can manually review flagged accounts.

Watch out for

RFM is backward-looking. By the time recency drops below your threshold, the member has already mentally churned. No ability to detect early behavioral shifts.
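A minimal sketch of threshold-based RFM flagging may make the limitation concrete. The cutoffs, the scoring date, and member LM-9999 are hypothetical and not taken from this page; only the two LM-2201 orders come from the sample data further down.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Order:
    member_id: str
    total: float
    day: date


# Hypothetical cutoffs, not taken from the page.
MAX_RECENCY_DAYS = 45    # flag if the last order is older than this
MIN_ORDERS_90D = 2       # flag if fewer orders than this in 90 days
MIN_SPEND_90D = 50.0     # flag if 90-day spend is below this


def rfm_flags(orders, today):
    """Map member_id -> True when any RFM threshold is breached.

    Members with no orders in the trailing 90 days do not appear in the
    result; a real job would start from the full member list instead.
    """
    stats = {}
    for o in orders:
        age = (today - o.day).days
        if not 0 <= age <= 90:
            continue  # only the trailing 90-day window is scored
        s = stats.setdefault(o.member_id, {"recency": age, "freq": 0, "spend": 0.0})
        s["recency"] = min(s["recency"], age)
        s["freq"] += 1
        s["spend"] += o.total
    return {
        member: (
            s["recency"] > MAX_RECENCY_DAYS
            or s["freq"] < MIN_ORDERS_90D
            or s["spend"] < MIN_SPEND_90D
        )
        for member, s in stats.items()
    }


orders = [
    Order("LM-2201", 42.30, date(2025, 8, 28)),
    Order("LM-2201", 28.50, date(2025, 9, 10)),
    Order("LM-9999", 35.00, date(2025, 7, 20)),  # hypothetical member
]
flags = rfm_flags(orders, today=date(2025, 9, 30))  # assumed scoring date
```

With these assumed cutoffs LM-2201 passes every check, even though her basket has already shrunk from $42.30 to $28.50. Only the member whose last order is already old gets flagged.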

2. Logistic regression or random forest on flat features

Train a classifier on aggregated features: days since last purchase, avg basket size, total spend. Standard approach in most retail analytics teams.

Best for

Teams that need an interpretable baseline model with fast training times and minimal infrastructure.

Watch out for

Aggregating behavioral data into flat features destroys the temporal signal. A member who went from 10 visits to 2 over 90 days looks identical to one who went from 4 to 2. The trajectory matters more than the snapshot.
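The point about lost trajectory can be illustrated with two hypothetical members matching the 10-to-2 and 4-to-2 cases. The middle-period values and the feature names are assumptions made for the sketch.

```python
# Visits per 30-day bucket, oldest first. Middle values are assumed.
member_a = [10, 5, 2]   # steep decline
member_b = [4, 3, 2]    # mild decline


def snapshot_features(buckets):
    """Flat feature that only describes the most recent period."""
    return {"visits_last_30d": buckets[-1]}


def relative_change(buckets):
    """Relative change from the oldest bucket to the newest."""
    return (buckets[-1] - buckets[0]) / buckets[0]


# Both members collapse to the same flat feature row.
same_snapshot = snapshot_features(member_a) == snapshot_features(member_b)

# The trajectories differ: an 80% drop versus a 50% drop.
drop_a = relative_change(member_a)
drop_b = relative_change(member_b)
```

A classifier trained only on the snapshot cannot separate these two members, because the inputs it receives are identical.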

3. XGBoost with engineered behavioral features

Gradient-boosted trees with hand-built features like visit trend slopes, email engagement decay rates, and cross-channel activity flags.

Best for

Mature analytics teams that can invest 4-8 weeks building and maintaining feature pipelines across CRM, POS, and email systems.

Watch out for

Feature engineering is fragile. Every new data source (app usage, social media, competitor launches) requires manual pipeline work. On SAP SALT customer prediction tasks, XGBoost reaches 75% accuracy.
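As a rough sketch of what a hand-built behavioral feature looks like, the function below computes an ordinary least-squares slope over per-period visit counts and pairs it with an email open rate. It assumes the visit columns in the sample data are separate 30-day buckets (oldest first once reversed); the function and feature names are hypothetical, and the resulting row would still need to be passed to a model such as XGBoost.

```python
def trend_slope(values):
    """Ordinary least-squares slope of values against index 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    var = sum((i - mean_x) ** 2 for i in range(n))
    return cov / var


def build_features(visits_oldest_first, emails_sent, opens):
    """One engineered feature row per member (names are hypothetical)."""
    return {
        "visit_trend_slope": trend_slope(visits_oldest_first),
        "email_open_rate": opens / emails_sent if emails_sent else 0.0,
    }


# LM-2201: visits 10 -> 5 -> 2, 1 open out of 8 emails sent.
lm_2201 = build_features([10, 5, 2], emails_sent=8, opens=1)
# LM-2202: visits 3 -> 4 -> 4, 5 opens out of 8 emails sent.
lm_2202 = build_features([3, 4, 4], emails_sent=8, opens=5)
```

Each additional data source would need its own function of this kind, plus the pipeline that keeps its inputs fresh, which is where the maintenance cost comes from.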

4. KumoRFM (relational foundation model)

Connects loyalty members to their full behavioral graph: purchases, visits, email interactions, promotional responses, and store-level competitive data.

Best for

Retailers with multi-channel behavioral data who want early churn detection (45-60 days before lapse) without building custom feature pipelines.

Watch out for

Requires connecting disparate data sources (POS, email, web analytics). If all your data lives in a single CRM export, start with XGBoost.

Key metric: on SAP SALT customer prediction, relational models reach 91% accuracy vs 75% for XGBoost and 63% for rules. Relational models detect lapse 45-60 days out vs 10-15 days for flat models.

Why relational data changes the answer

A flat churn model sees that Member LM-2201 made 2 purchases last month and opened 1 email. It cannot see that her visit frequency dropped 60% over 90 days, her basket size shrank from $42 to $28, she stopped clicking promotional emails entirely, and two competing loyalty programs launched in her zip code last month. These signals live across purchase_history, email_engagement, visit_patterns, and competitive_landscape tables.

Relational models connect these tables and learn the decay patterns that precede lapse. They discover that the combination of declining visit frequency + shrinking basket + email disengagement predicts lapse with 78% probability 45-60 days before it happens. Single-table models only catch this signal 10-15 days out, when the retention window has already closed. The earlier detection alone is worth $15-30M annually for a 10M-member program because proactive offers convert at 3-5x the rate of reactive win-back campaigns.

Predicting churn from a single RFM table is like a restaurant manager who only checks the reservation book. They see a regular went from weekly to monthly visits. A relational model is like a manager who also notices the regular stopped ordering wine (lower basket), ignored the last three special-event invitations (email disengagement), and a new restaurant opened on the same block (competitive signal). The reservation book alone tells you they are leaving. The full picture tells you why, and gives you time to act.

How KumoRFM solves this

Relational intelligence built for retail and e-commerce data

Kumo connects loyalty members to their purchase history, browsing behavior, email engagement, store visits, promotional responses, and competitor pricing data. The model identifies that Member LM-2201 has dropped from weekly to bi-weekly visits, her basket size has shrunk 30%, she has stopped opening promotional emails, and a competitor just launched a rival loyalty program in her zip code. These relational signals surface lapse risk 45-60 days before inactivity, giving marketing teams time to deploy personalized win-back offers.

From data to predictions

See the full pipeline in action

Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.

Your data

The relational tables Kumo learns from

LOYALTY_MEMBERS

member_id	tier	join_date	points_balance	lifetime_spend
LM-2201	Gold	2021-03-15	12,400	$8,420
LM-2202	Silver	2023-08-01	3,200	$2,100
LM-2203	Platinum	2019-11-22	48,000	$32,500

PURCHASE_HISTORY

member_id	order_id	total	items	store_id	timestamp
LM-2201	ORD-1001	$42.30	6	S-14	2025-08-28
LM-2201	ORD-1002	$28.50	4	S-14	2025-09-10
LM-2203	ORD-1003	$187.20	12	S-22	2025-09-14

EMAIL_ENGAGEMENT

member_id	emails_sent_30d	opens	clicks	unsubscribed
LM-2201	8	1	0	False
LM-2202	8	5	2	False
LM-2203	8	7	4	False

VISIT_PATTERNS

member_id	visits_30d	visits_60d	visits_90d	trend
LM-2201	2	5	10	Declining
LM-2202	4	4	3	Stable
LM-2203	6	7	7	Stable
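One way to picture how these tables relate is that every row carries a member_id pointing back to LOYALTY_MEMBERS. The sketch below groups child rows under their member using plain dictionaries. It only illustrates the key relationship under that assumption; it is not how Kumo ingests or models the data, and the helper name is hypothetical.

```python
loyalty_members = [
    {"member_id": "LM-2201", "tier": "Gold"},
    {"member_id": "LM-2202", "tier": "Silver"},
    {"member_id": "LM-2203", "tier": "Platinum"},
]
purchase_history = [
    {"member_id": "LM-2201", "order_id": "ORD-1001", "total": 42.30},
    {"member_id": "LM-2201", "order_id": "ORD-1002", "total": 28.50},
    {"member_id": "LM-2203", "order_id": "ORD-1003", "total": 187.20},
]
email_engagement = [
    {"member_id": "LM-2201", "opens": 1, "clicks": 0},
    {"member_id": "LM-2202", "opens": 5, "clicks": 2},
    {"member_id": "LM-2203", "opens": 7, "clicks": 4},
]


def link_by_member(members, **child_tables):
    """Group the rows of each child table under the member they reference."""
    linked = {
        m["member_id"]: {"member": m, **{name: [] for name in child_tables}}
        for m in members
    }
    for name, rows in child_tables.items():
        for row in rows:
            parent = linked.get(row["member_id"])
            if parent is not None:  # skip rows with no matching member
                parent[name].append(row)
    return linked


linked = link_by_member(
    loyalty_members, purchases=purchase_history, emails=email_engagement
)
```

Keeping the child rows intact, rather than averaging them into one row per member, is what preserves the order-by-order and period-by-period detail.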

Write your PQL query

Describe what to predict in 2–3 lines — Kumo handles the rest

PQL

PREDICT BOOL(LOYALTY_MEMBERS.STATUS = 'lapsed', 0, 60, days)
FOR EACH LOYALTY_MEMBERS.MEMBER_ID
WHERE LOYALTY_MEMBERS.STATUS = 'active'

Prediction output

Every entity gets a score, updated continuously

MEMBER_ID	TIER	LAPSE_PROB	LIFETIME_SPEND	RECOMMENDED_ACTION
LM-2201	Gold	0.78	$8,420	Personal Offer + Bonus Points
LM-2202	Silver	0.23	$2,100	Standard Email
LM-2203	Platinum	0.09	$32,500	No Action Needed
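The page does not say how RECOMMENDED_ACTION is derived from LAPSE_PROB. A simple possibility is a pair of score cutoffs, sketched below. The cutoff values are hypothetical and were chosen only so that the three rows above are reproduced.

```python
# Hypothetical cutoffs; the page does not state how actions are assigned.
HIGH_RISK = 0.50
MODERATE_RISK = 0.15


def recommend_action(lapse_prob):
    """Map a lapse probability to one of the three actions in the table."""
    if not 0.0 <= lapse_prob <= 1.0:
        raise ValueError("lapse_prob must be between 0 and 1")
    if lapse_prob >= HIGH_RISK:
        return "Personal Offer + Bonus Points"
    if lapse_prob >= MODERATE_RISK:
        return "Standard Email"
    return "No Action Needed"


scores = {"LM-2201": 0.78, "LM-2202": 0.23, "LM-2203": 0.09}
actions = {member: recommend_action(p) for member, p in scores.items()}
```

In practice the cutoffs would likely vary by tier or lifetime spend, since a retained Platinum member justifies a more expensive offer than a Silver one.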

Understand why

Every prediction includes feature attributions — no black boxes

Member LM-2201 (Gold tier)

Predicted: 78% probability of lapsing within 60 days

Top contributing features

Feature	Evidence	Attribution
Visit frequency declining (-60%)	10 to 2 in 90d	30%
Basket size shrinkage	-30% avg order	24%
Email engagement collapse	1 open, 0 clicks	20%
Points redemption stalled	0 in 60d	15%
Competitor loyalty launch in zip code	2 new programs	11%

Feature attributions are computed automatically for every prediction. No separate tooling required. Learn more about Kumo explainability

PQL Documentation

Learn the Predictive Query Language — SQL-like syntax for defining any prediction task in 2–3 lines.


Python SDK

Integrate Kumo predictions into your pipelines. Train, evaluate, and deploy models programmatically.


Explainability Docs

Understand feature attributions, model evaluation metrics, and how to build trust with stakeholders.


Frequently asked questions

Common questions about customer churn prediction

How early can AI predict customer churn in retail?

It depends on the data connections. Single-table RFM models typically detect churn risk 10-15 days before lapse, which is often too late for effective intervention. Relational models that connect purchase history, email engagement, visit patterns, and competitive data can flag at-risk members 45-60 days before lapse. That extra 30-45 days is the difference between a $5 targeted offer that retains an $8,000 lifetime-value member and a $25 win-back campaign that fails 80% of the time.
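The comparison can be put in cost-per-retained-member terms. The $25 cost and 80% failure rate come from the answer above; the 25% success rate for the $5 proactive offer is an assumption, taken from the low end of the retention range in the bottom line of this page.

```python
def cost_per_retained(offer_cost, success_rate):
    """Average spend needed to keep one member, given an offer's success rate."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return offer_cost / success_rate


# Win-back: $25 per attempt, fails 80% of the time (20% success).
winback = cost_per_retained(25.0, 0.20)

# Proactive offer: $5 per attempt; the 25% success rate is an assumption.
proactive = cost_per_retained(5.0, 0.25)

ratio = winback / proactive
```

Under these assumptions the win-back route costs about $125 per retained member against about $20 for the proactive offer, and the gap widens if the proactive success rate is higher than assumed.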

What is the cost of customer churn in retail?

Acquiring a new customer costs 5-7x more than retaining an existing one. For a loyalty program with 10M members where 15-20% lapse annually, each lapsed member represents $500-2,000 in lost annual revenue depending on the tier. A Platinum member lapsing costs 10x more than a Silver member. The total impact for a large retailer runs $150-400M annually in lost revenue, not counting the $10-15 per member annual cost of operating the loyalty program for inactive members.

Can churn models differentiate between price-sensitive and emotionally loyal customers?

Yes, if the model has access to promotional response data. A price-sensitive member who only shops during sales events needs a different retention strategy (exclusive early access) than an emotionally loyal member whose visits are declining due to poor in-store experience (personal outreach). Relational models surface these behavioral segments automatically because they connect purchase patterns to promotional response history, store visit data, and engagement channels.

Bottom line: Retain 25-35% of at-risk loyalty members with targeted interventions, recovering $15-30M in annual revenue from a 10M-member loyalty program.

Related use cases

Explore more retail & e-commerce use cases

Use Case #2: Product Recommendations

Use Case #6: Basket Analysis

Use Case #1: SKU-Level Demand Forecasting


Topics covered

retail customer churn prediction, loyalty program retention AI, customer attrition retail, e-commerce churn model, graph neural network retention, KumoRFM, relational deep learning retail, loyalty member lapse prediction, customer retention analytics, churn scoring retail

From a leadership team with proven experience

Vanja Josifovski

CEO and Co-Founder, ex-CTO Airbnb, ex-CTO Pinterest

Jure Leskovec

Co-Founder & Chief Scientist, Stanford Professor

Hema Raghavan

Co-Founder & Head of Engineering, ex-AI Lead, LinkedIn

One Platform. One Model. Infinite Predictions.

KumoRFM

Relational Foundation Model

Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.

For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.
