What data is needed for player lifetime value prediction?

Kumo connects directly to your existing relational tables: PLAYERS, PURCHASES, SESSIONS, REFERRALS. No ETL or feature engineering required. Write a PQL query and get explainable predictions in minutes.

Use Case #3 · Regression · Player LTV

Player Lifetime Value Prediction

“What is each player's 90-day lifetime value?”

Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.

A real-world example

What is each player's 90-day lifetime value?

UA teams spending $50M+ a year often bid on 7-day LTV estimates that miss long-tail spenders by 40%. A game paying $5 per install that misattributes its high-LTV channels wastes $12M per year on the wrong ad networks. 90-day LTV is shaped not just by individual behavior but by referral chain quality, guild spending norms, and content engagement depth, none of which a simple regression on D7 revenue can capture.

Quick answer

Accurate player LTV prediction requires connecting purchase history, session engagement, referral chains, and guild spending patterns in a relational model. Traditional D7 revenue regression misses long-tail spenders because it ignores social spending norms and referral chain quality. Graph-based models produce reliable 90-day LTV estimates by Day 3, which is 27 days earlier than conventional approaches.

Approaches compared

4 ways to solve this problem

1. D7 revenue extrapolation

Multiply Day-7 revenue by a historical multiplier derived from cohort LTV curves to estimate 90-day or lifetime value.

Best for

Simple and fast. Adequate for mature games with stable monetization curves and large cohort samples.

Watch out for

Misses long-tail spenders who make their first purchase after Day 7. These late converters often become the highest-value players but are invisible to early extrapolation.
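
As a rough illustration of the extrapolation described above, here is a minimal Python sketch. The cohort values and function names are hypothetical, chosen only to show the mechanics: a multiplier is derived from a mature cohort and then applied to a new player's D7 revenue.

```python
# Hypothetical mature cohort: (d7_revenue, d90_revenue) per player, in USD.
mature_cohort = [
    (4.99, 24.98),
    (9.99, 19.98),
    (0.00, 0.00),
    (0.00, 14.99),  # late converter: first purchase came after Day 7
]


def cohort_multiplier(cohort):
    """Ratio of total D90 revenue to total D7 revenue for a mature cohort."""
    d7_total = sum(d7 for d7, _ in cohort)
    d90_total = sum(d90 for _, d90 in cohort)
    if d7_total == 0:
        raise ValueError("cohort has no D7 revenue to extrapolate from")
    return d90_total / d7_total


def extrapolate_d90(d7_revenue, multiplier):
    """Estimate D90 value by scaling a player's D7 revenue."""
    return d7_revenue * multiplier


multiplier = cohort_multiplier(mature_cohort)
estimate_for_spender = extrapolate_d90(4.99, multiplier)
estimate_for_non_spender = extrapolate_d90(0.00, multiplier)
```

Whatever the multiplier is, a player with no D7 revenue is always estimated at zero, which is the late-converter blind spot noted above.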

2. XGBoost regression on player features

Build a feature table with early behavioral signals (session count, levels completed, D1-D7 spend) and train a regression model.

Best for

Good baseline when you have strong feature engineering skills and enough labeled data from mature cohorts.

Watch out for

Cannot capture referral chain quality or social spending norms. Two players with identical D7 behavior can have 5x different LTV based on who referred them and what guild they joined.
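
The feature-table step can be sketched in plain Python as follows. This covers only the flattening, not the regressor itself; the player IDs, rows, and feature names are hypothetical and exist only to show how two players can collapse into the same row.

```python
from datetime import date

# Hypothetical raw logs for two players who installed on the same day.
installs = {"player_a": date(2025, 2, 1), "player_b": date(2025, 2, 1)}
purchases = [
    ("player_a", 4.99, date(2025, 2, 3)),
    ("player_b", 4.99, date(2025, 2, 4)),
]
sessions = [
    ("player_a", date(2025, 2, 1)),
    ("player_a", date(2025, 2, 2)),
    ("player_b", date(2025, 2, 1)),
    ("player_b", date(2025, 2, 3)),
]


def in_first_week(player_id, when):
    """True if the event falls in the 7 days starting at install."""
    return 0 <= (when - installs[player_id]).days < 7


def feature_row(player_id):
    """Flatten a player's first-week activity into one row of features."""
    spend = [
        amount
        for pid, amount, ts in purchases
        if pid == player_id and in_first_week(pid, ts)
    ]
    n_sessions = sum(
        1 for pid, ts in sessions if pid == player_id and in_first_week(pid, ts)
    )
    return {
        "d7_sessions": n_sessions,
        "d7_purchases": len(spend),
        "d7_spend": round(sum(spend), 2),
    }


rows = {pid: feature_row(pid) for pid in installs}
```

Both rows come out identical, so any model trained on this table alone has to give the two players the same prediction, whoever referred them and whichever guild they joined.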

3. Probabilistic models (BG/NBD, Pareto/NBD)

Bayesian models that estimate purchase frequency and dropout probability from transaction timing data.

Best for

Strong theoretical foundation for subscription and repeat-purchase modeling with clean transaction data.

Watch out for

Requires sufficient purchase history per player. Performs poorly for early-lifecycle prediction where most players have zero or one purchase.
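
To make the data requirement concrete, the sketch below computes the per-player summary these models are commonly fit on: frequency (repeat purchase days), recency (days between first and last purchase), and T (days from first purchase to the end of observation). The function name and example dates are illustrative assumptions, and fitting the model itself is out of scope here.

```python
from datetime import date


def rfm_summary(purchase_dates, observation_end):
    """Return (frequency, recency, T) in days, or None with no purchases.

    Assumes the common convention in which frequency counts repeat
    purchase days only, so a single purchase gives frequency 0.
    """
    if not purchase_dates:
        return None
    days = sorted(set(purchase_dates))
    first, last = days[0], days[-1]
    frequency = len(days) - 1
    recency = (last - first).days
    T = (observation_end - first).days
    return frequency, recency, T


end = date(2025, 3, 1)
repeat_buyer = rfm_summary([date(2025, 2, 8), date(2025, 2, 20)], end)
one_time_buyer = rfm_summary([date(2025, 2, 15)], end)
non_buyer = rfm_summary([], end)
```

A one-time buyer summarizes to frequency 0 and recency 0, and a non-buyer has no summary at all, so early in the lifecycle most players give the model almost nothing to distinguish them.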

4. KumoRFM (relational graph ML)

Connect players, purchases, sessions, and referral chains into a relational graph. The GNN learns spending trajectories, social spending norms, and referral chain influence automatically.

Best for

Best Day-3 accuracy. Captures the social spending contagion and referral quality signals that yield accurate estimates 27 days before D7 models converge.

Watch out for

Needs referral and social connection data for maximum benefit. If your game has no social features, the lift over XGBoost will be smaller.

Key metric: on the RelBench benchmark, relational models score 76.71 vs 62.44 for single-table baselines on player LTV prediction tasks.

Why relational data changes the answer

Player LTV is shaped by tables that never appear in a D7 regression: referral chains (who brought this player in, and how much do they spend?), guild membership (what is the spending norm in this social circle?), and progression depth (how much content has the player engaged with?). A flat feature table with 'D7_spend = $4.99' treats two players identically even when one was referred by a whale in an active guild and the other found the game through a low-quality ad network.

Relational models connect purchase data to referral chains and guild spending patterns, learning that players referred by high-spenders who join active guilds within 48 hours of install have 3.5x higher 90-day LTV. On UA budgets of $50M+, the accuracy gap between relational models and D7 extrapolation translates directly into capital allocation: the difference between doubling down on channels that produce high-LTV players and wasting budget on channels that produce players who look identical at D7 but never spend again.

Estimating a player's LTV from their first week of spending is like valuing a house by looking only at the property itself. You miss the neighborhood (guild spending norms), the school district (referral chain quality), and the market trajectory (social spending trends). A real estate appraiser who ignores comparable sales in the area will be wrong by a wide margin. Graph ML is the comp analysis for player value.

How KumoRFM solves this

Graph-learned player intelligence across your entire game ecosystem

Kumo connects players, purchases, sessions, and referral chains into a graph that captures spending contagion patterns. It learns that players referred by high-spenders who join active guilds within 48 hours of install have 3.5x higher 90-day LTV. The model tracks temporal spending trajectories and social spending norms across the network, producing accurate LTV estimates by Day 3 that traditional models cannot match until Day 30.

From data to predictions

See the full pipeline in action

Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.

Your data

The relational tables Kumo learns from

PLAYERS

player_id	install_date	source	country
PLR201	2025-02-01	Facebook Ads	US
PLR202	2025-02-10	Organic	UK
PLR203	2025-02-05	Google UAC	DE

PURCHASES

purchase_id	player_id	amount_usd	item_type	timestamp
PUR201	PLR201	4.99	Currency	2025-02-08
PUR202	PLR201	19.99	Bundle	2025-02-20
PUR203	PLR202	9.99	Battle Pass	2025-02-15

SESSIONS

session_id	player_id	date	duration_min	events
S201	PLR201	2025-03-01	55	142
S202	PLR202	2025-03-01	22	38
S203	PLR203	2025-02-28	8	12

REFERRALS

referral_id	referrer_id	referred_id	timestamp
REF01	PLR201	PLR203	2025-02-05
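
To show the kind of cross-table signal these keys make available, the Python sketch below follows the REFERRALS row back to the referrer's purchases using the sample rows above. It is a hand-written illustration of one relational feature, not a description of how Kumo computes anything internally; the variable names are assumptions.

```python
from collections import defaultdict

# Rows copied from the sample PURCHASES and REFERRALS tables
# (purchase_id, player_id, amount_usd) and (referral_id, referrer_id, referred_id).
purchases = [
    ("PUR201", "PLR201", 4.99),
    ("PUR202", "PLR201", 19.99),
    ("PUR203", "PLR202", 9.99),
]
referrals = [("REF01", "PLR201", "PLR203")]
players = ["PLR201", "PLR202", "PLR203"]

# Total spend per player from PURCHASES.
spend = defaultdict(float)
for _, player_id, amount in purchases:
    spend[player_id] += amount

# Map each referred player to the player who referred them.
referrer_of = {referred: referrer for _, referrer, referred in referrals}

# Attach the referrer's spend to each player; None if nobody referred them.
context = {}
for player_id in players:
    referrer = referrer_of.get(player_id)
    context[player_id] = {
        "own_spend": round(spend[player_id], 2),
        "referrer_spend": round(spend[referrer], 2) if referrer else None,
    }
```

PLR203 has no purchases of their own, but the referral link shows they were brought in by a player who has spent $24.98, context a single-table view of PLR203 would not contain.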

Write your PQL query

Describe what to predict in 2–3 lines — Kumo handles the rest

PQL

PREDICT SUM(PURCHASES.AMOUNT_USD, 0, 90, days)
FOR EACH PLAYERS.PLAYER_ID
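
For intuition about what this target measures, the Python sketch below computes the realized sum of purchase amounts in a 90-day window for each player, using the sample rows above. Two things are assumptions made for illustration: the window is anchored at install_date, and it includes day 0 but excludes day 90. Check the PQL documentation for the exact anchor and boundary semantics.

```python
from datetime import date

# Rows copied from the sample PLAYERS and PURCHASES tables.
install_date = {
    "PLR201": date(2025, 2, 1),
    "PLR202": date(2025, 2, 10),
    "PLR203": date(2025, 2, 5),
}
purchases = [
    ("PLR201", 4.99, date(2025, 2, 8)),
    ("PLR201", 19.99, date(2025, 2, 20)),
    ("PLR202", 9.99, date(2025, 2, 15)),
]


def window_spend(player_id, window_days=90):
    """Sum purchase amounts in the window starting at install (assumed anchor)."""
    anchor = install_date[player_id]
    total = sum(
        amount
        for pid, amount, ts in purchases
        if pid == player_id and 0 <= (ts - anchor).days < window_days
    )
    return round(total, 2)


labels = {pid: window_spend(pid) for pid in install_date}
```

These are observed sums over the rows available so far, not predictions. The model's job is to estimate the full-window value before the window has elapsed.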

Prediction output

Every entity gets a score, updated continuously

PLAYER_ID	SOURCE	D7_ACTUAL	PREDICTED_D90_LTV
PLR201	Facebook Ads	$4.99	$82.40
PLR202	Organic	$9.99	$31.20
PLR203	Google UAC	$0.00	$3.10
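
One way to read this output for UA decisions is to compare how acquisition sources rank on D7 revenue versus predicted 90-day LTV. The Python sketch below does that with the three sample rows above. With one player per source it is only a toy, and the helper name rank_sources is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Rows copied from the sample prediction output:
# (player_id, source, d7_actual, predicted_d90_ltv)
predictions = [
    ("PLR201", "Facebook Ads", 4.99, 82.40),
    ("PLR202", "Organic", 9.99, 31.20),
    ("PLR203", "Google UAC", 0.00, 3.10),
]


def rank_sources(rows, value_index):
    """Rank sources by the mean of one value column, highest first."""
    by_source = defaultdict(list)
    for row in rows:
        by_source[row[1]].append(row[value_index])
    averages = {source: mean(values) for source, values in by_source.items()}
    return sorted(averages, key=averages.get, reverse=True)


rank_by_d7 = rank_sources(predictions, 2)
rank_by_d90 = rank_sources(predictions, 3)
```

In this sample, Organic leads on D7 revenue while Facebook Ads leads on predicted 90-day LTV, which is the kind of reordering that changes where budget goes.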

Understand why

Every prediction includes feature attributions — no black boxes

Player PLR201 -- Facebook Ads, US, Day 28

Predicted 90-day LTV: $82.40

Top contributing features

Feature	Value	Attribution
Purchase velocity (first 14d)	2 purchases	28%
Session engagement depth	142 events/session	23%
Referral network spending	$45 avg in referral chain	20%
Guild spending norm	$8.50 ARPPU	17%
Content completion rate	78% of available	12%

Feature attributions are computed automatically for every prediction. No separate tooling required.

PQL Documentation

Learn the Predictive Query Language — SQL-like syntax for defining any prediction task in 2–3 lines.

Python SDK

Integrate Kumo predictions into your pipelines. Train, evaluate, and deploy models programmatically.

Explainability Docs

Understand feature attributions, model evaluation metrics, and how to build trust with stakeholders.

Frequently asked questions

Common questions about player lifetime value prediction

How do you predict player LTV in mobile games?

Connect purchase history, session data, referral chains, and social connections into a relational model. The strongest LTV signals come from social context: referral chain quality and guild spending norms predict 90-day value better than any individual behavioral metric. Graph models produce accurate estimates by Day 3 vs Day 30 for traditional approaches.

What is pLTV and why does it matter for UA?

pLTV (predicted lifetime value) estimates a player's future spending based on early behavioral signals. It matters because UA teams make bid decisions based on expected value. If your pLTV model is off by 40% on long-tail spenders, you are systematically under-bidding on your most valuable acquisition channels.

How early can you predict player LTV accurately?

With relational ML that includes social and referral data, reliable LTV estimates are possible by Day 3. Traditional D7 regression models need 30+ days to converge for non-obvious segments. The 27-day speed advantage lets UA teams reallocate budget within the same campaign flight.

What data improves LTV prediction accuracy?

Beyond the basics (sessions, purchases), the highest-impact additions are referral chain data and social connections. Players referred by high-LTV players who join active guilds have 3.5x higher value. Adding these two tables to a relational model can improve Day-3 accuracy by 35%.

How do you handle players with zero spend in LTV models?

Zero-spend players at Day 3 are not all equal. Relational models distinguish between a zero-spend player in a high-spending guild (likely late converter) and a zero-spend player with declining sessions (likely churn). The social and engagement context around the zero is more informative than the zero itself.

Bottom line: A game studio spending $50M on UA that improves Day-3 LTV prediction accuracy by 35% reallocates $12M from underperforming channels to high-LTV sources. Kumo captures referral chain quality and social spending norms that D7 regression models miss, delivering accurate LTV estimates 27 days earlier.

Related use cases

Explore more gaming use cases

Use Case #1: Player Churn

Use Case #2: IAP Prediction

Use Case #4: Matchmaking

Topics covered

player LTV prediction, lifetime value gaming AI, UA optimization ML, player value forecasting, LTV modeling mobile games, graph neural network LTV, KumoRFM player LTV, user acquisition ROI, pLTV prediction model

From a leadership team with proven experience

Vanja Josifovski

CEO and Co-Founder, ex-CTO Airbnb, ex-CTO Pinterest

Jure Leskovec

Co-Founder & Chief Scientist, Stanford Professor

Hema Raghavan

Co-Founder & Head of Engineering, ex-AI Lead, LinkedIn

One Platform. One Model. Infinite Predictions.

KumoRFM

Relational Foundation Model

Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.

For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.
