Referral Prediction
“Which customers will refer a new user in the next 30 days?”
Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.


A real-world example
Which customers will refer a new user in the next 30 days?
Referral programs are typically the lowest-cost acquisition channel, but most companies incentivize all customers equally. High-NPS customers with strong social connections refer at up to 10x the rate of average customers, yet referral nudges go out in blanket campaigns. The result: wasted incentive spend, referral fatigue among unlikely referrers, and missed opportunities with natural advocates. Companies need to identify who will refer, not just who is satisfied.
Quick answer
Referral prediction identifies which customers will refer new users within a defined time window. The best models go beyond NPS scores by learning from the referral graph itself: past referral behavior, purchase engagement, social connections, and what distinguishes active referrers from satisfied-but-passive customers. Targeting the top 20% of predicted referrers generates 4x more referrals per dollar spent.
Approaches compared
4 ways to solve this problem
1. NPS-Based Targeting
Target customers with NPS scores of 9-10 (Promoters) for referral campaigns. The most common approach, built into every NPS platform.
Best for
Companies with strong NPS survey response rates and well-correlated NPS-to-referral behavior.
Watch out for
NPS measures satisfaction, not referral propensity. Many Promoters (NPS 9-10) are satisfied but passive. They will never refer because they do not have the social connections or the inclination. Meanwhile, some customers with NPS 7-8 are prolific referrers because they are embedded in relevant professional networks.
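As a baseline, NPS-based targeting amounts to a one-line filter. A minimal sketch, using illustrative customer records that mirror the sample tables later on this page:

```python
# NPS-based targeting: select Promoters (NPS 9-10) for the referral campaign.
# Customer records are illustrative, mirroring the sample CUSTOMERS table.
customers = [
    {"customer_id": "CU101", "nps_score": 9},
    {"customer_id": "CU102", "nps_score": 8},
    {"customer_id": "CU103", "nps_score": 7},
    {"customer_id": "CU104", "nps_score": 10},
]

promoters = [c["customer_id"] for c in customers if c["nps_score"] >= 9]
print(promoters)  # ['CU101', 'CU104']
```

The filter is trivial to implement, which is exactly the problem: it encodes no information about referral propensity.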
2. RFM + NPS Segmentation
Combine recency, frequency, monetary value, and NPS into a composite referral score. Target customers who are both high-value and highly satisfied.
Best for
Teams that want a quick, multi-dimensional targeting approach without ML infrastructure.
Watch out for
Still misses the referral graph signals: has this customer referred before? Are their connections active referrers? What is their social network density? These structural signals are stronger predictors than any individual metric.
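A composite RFM + NPS score can be computed without any ML infrastructure. The sketch below uses the sample orders from this page; the weights, normalization constants, and reference date are illustrative assumptions, not a recommended configuration:

```python
from datetime import date

# Illustrative RFM + NPS composite score. Weights and normalization
# constants are assumptions for this sketch.
orders = [
    {"customer_id": "CU101", "amount": 340, "ts": date(2025, 10, 5)},
    {"customer_id": "CU101", "amount": 520, "ts": date(2025, 11, 1)},
    {"customer_id": "CU102", "amount": 280, "ts": date(2025, 10, 15)},
    {"customer_id": "CU104", "amount": 610, "ts": date(2025, 10, 25)},
    {"customer_id": "CU104", "amount": 445, "ts": date(2025, 11, 10)},
]
nps = {"CU101": 9, "CU102": 8, "CU103": 7, "CU104": 10}
today = date(2025, 11, 15)  # hypothetical scoring date

def composite_score(cid):
    mine = [o for o in orders if o["customer_id"] == cid]
    if not mine:
        return 0.0
    recency = min((today - o["ts"]).days for o in mine)  # days since last order
    frequency = len(mine)
    monetary = sum(o["amount"] for o in mine)
    # Normalize each component to roughly [0, 1], then weight equally.
    r = max(0.0, 1 - recency / 90)
    f = min(frequency / 5, 1.0)
    m = min(monetary / 1000, 1.0)
    n = nps[cid] / 10
    return round(0.25 * (r + f + m + n), 3)

scores = {cid: composite_score(cid) for cid in nps}
```

Note that a customer with no orders scores zero regardless of NPS, and nothing in the score reflects the referral graph.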
3. Binary Classification (XGBoost/Random Forest)
Train a model on features like NPS, tenure, purchase frequency, and campaign response rate. Predict referral probability per customer.
Best for
Teams with ML infrastructure and clean feature data. Good accuracy when the referral behavior correlates with individual-level features.
Watch out for
Cannot capture network effects. A customer connected to 3 active referrers is far more likely to refer than one connected to 0 referrers, even if their individual attributes are identical. The referral graph structure is invisible to flat models.
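A flat classifier of this kind can be sketched with scikit-learn's random forest (XGBoost is interchangeable here). The features and labels below are synthetic, generated only to illustrate the shape of the pipeline:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy flat-feature classifier over [nps, tenure_months, orders_90d].
# All data is synthetic, for illustration only.
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([
    rng.integers(0, 11, n),  # NPS score
    rng.integers(1, 48, n),  # tenure in months
    rng.poisson(2, n),       # purchases in last 90 days
])
# Synthetic label: referral more likely with high NPS and engagement.
p = 1 / (1 + np.exp(-(0.4 * X[:, 0] + 0.5 * X[:, 2] - 5)))
y = (rng.random(n) < p).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
proba = model.predict_proba([[10, 18, 2]])[0, 1]  # a high-NPS, engaged customer
```

The model sees each customer as an independent row; a "connected to 3 active referrers" signal simply has no column to live in.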
4. KumoRFM (Graph Neural Networks on Referral Graph)
Builds a graph connecting customers, referrals, and orders. The GNN learns how referral behavior depends on network position, prior referral history, and purchase engagement, capturing what distinguishes active referrers from satisfied-but-passive customers.
Best for
Companies with referral program data where network effects and social influence drive referral behavior.
Watch out for
Requires historical referral data with timestamps. If you have never run a referral program, there is no referral graph to learn from. Start with a basic program first, then model it.
Key metric: Targeting the top 20% of predicted referrers generates 4x more referrals per dollar spent. Network position (connections to active referrers) is 3.4x more predictive than NPS alone.
Why relational data changes the answer
Customer CU104 (Alex Kim, NPS 10, 18-month tenure) already referred 1 converted user in the last 60 days and made 2 purchases in the last 30 days. A flat model might score them highly based on these individual features. But the relational graph adds crucial context: CU104 is connected to 3 other active referrers. In the referral network, customers connected to active referrers are 3.4x more likely to refer themselves. This network propagation effect is the strongest referral signal, and it is completely invisible to any model that treats customers as independent rows.
Conversely, CU103 (Maria Lopez, NPS 7, 6-month tenure) has low individual scores and no referral connections. Even if you offered triple incentives, the probability of referral is 11%. The graph confirms this by showing that CU103's network position is isolated: no connected referrers, no connected customers in adjacent product categories. The combination of individual attributes and network position is what separates 93% prediction accuracy from 70%. On the SAP SALT benchmark, relational models achieve 91% accuracy vs 75% for single-table models. For referral prediction specifically, the network structure is often more predictive than any individual customer attribute.
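The "connected to active referrers" feature described above can be derived directly from a referrals edge list. A minimal sketch; the fourth edge is a hypothetical addition (not from the sample tables) so the graph has an interesting connection to count:

```python
from collections import defaultdict

# Sketch: derive the "connected to active referrers" feature from an
# edge list of (referrer_id, referee_id, status) tuples.
referrals = [
    ("CU101", "CU103", "converted"),
    ("CU102", "CU105", "pending"),
    ("CU104", "CU106", "converted"),
    ("CU101", "CU104", "converted"),  # hypothetical extra edge
]

# Undirected adjacency over the referral graph.
neighbors = defaultdict(set)
for referrer, referee, _ in referrals:
    neighbors[referrer].add(referee)
    neighbors[referee].add(referrer)

active_referrers = {r for r, _, status in referrals if status == "converted"}

def connected_referrers(cid):
    """Number of graph neighbors who are themselves active referrers."""
    return len(neighbors[cid] & (active_referrers - {cid}))
```

A GNN effectively learns features like this (and deeper, multi-hop variants) automatically, rather than requiring them to be hand-engineered.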
Predicting referrals with NPS alone is like predicting which restaurant guests will write online reviews based on their satisfaction score. Many delighted guests never write reviews. The best predictor is not how satisfied they are but whether they have a history of reviewing, whether their social circle writes reviews, and whether they are embedded in foodie communities. Satisfaction is necessary but not sufficient. The social graph is what predicts action.
How KumoRFM solves this
Relational intelligence for smarter acquisition
Kumo builds a graph connecting CUSTOMERS, REFERRALS, and ORDERS. The GNN learns that referral behavior depends on more than NPS alone — it captures patterns like 'customers who purchased 3+ times, have tenure above 12 months, and are connected to other active referrers.' By modeling the referral graph directly, Kumo identifies the structural and behavioral signals that distinguish referrers from satisfied-but-passive customers.
From data to predictions
See the full pipeline in action
Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.
Your data
The relational tables Kumo learns from
CUSTOMERS
| customer_id | name | nps_score | tenure_months |
|---|---|---|---|
| CU101 | Sarah Chen | 9 | 24 |
| CU102 | James Park | 8 | 36 |
| CU103 | Maria Lopez | 7 | 6 |
| CU104 | Alex Kim | 10 | 18 |
REFERRALS
| referral_id | referrer_id | referee_id | status | timestamp |
|---|---|---|---|---|
| R01 | CU101 | CU103 | converted | 2025-09-15 |
| R02 | CU102 | CU105 | pending | 2025-10-01 |
| R03 | CU104 | CU106 | converted | 2025-10-20 |
ORDERS
| order_id | customer_id | amount | timestamp |
|---|---|---|---|
| O901 | CU101 | $340 | 2025-10-05 |
| O902 | CU101 | $520 | 2025-11-01 |
| O903 | CU102 | $280 | 2025-10-15 |
| O904 | CU104 | $610 | 2025-10-25 |
| O905 | CU104 | $445 | 2025-11-10 |
Write your PQL query
Describe what to predict in 2–3 lines — Kumo handles the rest
```
PREDICT COUNT(REFERRALS.*, 0, 30, days) > 0
FOR EACH CUSTOMERS.CUSTOMER_ID
```
Prediction output
Every entity gets a score, updated continuously
| CUSTOMER_ID | TIMESTAMP | TARGET_PRED | TRUE_PROB |
|---|---|---|---|
| CU101 | 2025-11-01 | True | 0.85 |
| CU102 | 2025-11-01 | True | 0.72 |
| CU103 | 2025-11-01 | False | 0.11 |
| CU104 | 2025-11-01 | True | 0.93 |
Understand why
Every prediction includes feature attributions — no black boxes
Customer CU104 — Alex Kim
Predicted: True (93% probability)
Top contributing features
| Feature | Value | Attribution |
|---|---|---|
| Already referred 1 converted user in last 60 days | 1 referral | 31% |
| NPS score of 10 (promoter) | 10 | 26% |
| 2 purchases in last 30 days (high engagement) | 2 orders | 20% |
| 18-month tenure (established relationship) | 18 months | 15% |
| Connected to 3 other active referrers in graph | 3 connections | 8% |
Feature attributions are computed automatically for every prediction. No separate tooling required. Learn more about Kumo explainability
PQL Documentation
Learn the Predictive Query Language — SQL-like syntax for defining any prediction task in 2–3 lines.
Python SDK
Integrate Kumo predictions into your pipelines. Train, evaluate, and deploy models programmatically.
Explainability Docs
Understand feature attributions, model evaluation metrics, and how to build trust with stakeholders.
Frequently asked questions
Common questions about referral prediction
Why do NPS Promoters not always refer?
NPS measures satisfaction, not referral propensity. Many Promoters are satisfied but socially passive. They do not have relevant professional connections, they are not active on social media, or they simply do not think to refer. Referral prediction models identify the subset of Promoters who are both satisfied and structurally positioned to refer.
How much more effective is targeted referral marketing?
Targeting the top 20% of predicted referrers generates 4x more referrals per dollar spent than blanket campaigns. The savings come from not wasting incentives on satisfied-but-passive customers and concentrating resources on customers with high referral probability and strong network positions.
What makes a customer a 'super referrer'?
Super referrers share three traits: high product engagement (frequent purchases, deep usage), strong social connectivity (connected to other active users and referrers), and prior referral behavior (have referred before and seen their referrals convert). The graph model identifies this combination automatically.
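The three traits combine into a simple flag. A sketch with illustrative thresholds (the cutoffs are assumptions, not values from the model):

```python
# Sketch: flag "super referrers" as customers with all three traits.
# Thresholds are illustrative assumptions.
def is_super_referrer(orders_90d, referrer_neighbors, past_conversions):
    engaged = orders_90d >= 3             # high product engagement
    connected = referrer_neighbors >= 2   # strong social connectivity
    proven = past_conversions >= 1        # prior converted referrals
    return engaged and connected and proven

is_super_referrer(orders_90d=4, referrer_neighbors=3, past_conversions=2)  # True
```

In practice the graph model learns this combination (and softer versions of it) from the data rather than from fixed thresholds.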
Can referral prediction work for B2B companies?
Yes. B2B referral dynamics are even more graph-dependent because professional networks drive recommendations. The model learns from industry connections, partner relationships, and conference co-attendance patterns. B2B referrals have 3-5x higher LTV than other acquisition channels, making prediction especially valuable.
Bottom line: Targeting the top 20% of predicted referrers with personalized incentives generates 4x more referrals per dollar spent than blanket referral campaigns, turning your best customers into a scalable acquisition engine.
Related use cases
Explore more acquisition use cases
Topics covered
One Platform. One Model. Infinite Predictions.
KumoRFM
Relational Foundation Model
Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.
For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.




