What data is needed for subscriber churn prediction?

Kumo connects directly to your existing relational tables: SUBSCRIBERS, PLANS, USAGE, TICKETS, NETWORK_EVENTS. No ETL or feature engineering required. Write a PQL query and get explainable predictions in minutes.

Use Case #1 · Binary Classification · Subscriber Churn

Subscriber Churn Prediction

“Which subscribers will port out?”

Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.

A real-world example

Which subscribers will port out?

Telecom carriers lose 1.5-2% of subscribers monthly. For a carrier with 30M subscribers at $55 ARPU, each percentage point of churn costs $198M annually. The cost of acquiring a replacement subscriber ($300-$500) is 6-10x the cost of retaining one. Traditional churn models built on billing data miss the network effects: when a subscriber's frequently called contacts switch carriers, that subscriber often follows within 60 days.
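
The $198M figure can be reproduced with simple arithmetic. The sketch below assumes (this is a reading of the sentence, not something the text states) that ARPU is monthly and that one percentage point of churn means 1% of the subscriber base, valued at twelve months of revenue.

```python
# Worked arithmetic for the churn-cost figure quoted above.
# Assumptions: ARPU is monthly, and one "percentage point" is 1% of the base.
SUBSCRIBERS = 30_000_000
MONTHLY_ARPU_USD = 55


def annual_revenue_per_churn_point(subscribers: int, monthly_arpu: int) -> int:
    """Annual revenue tied to 1% of the subscriber base."""
    lost_subscribers = subscribers // 100  # 1% of the base
    return lost_subscribers * monthly_arpu * 12


cost = annual_revenue_per_churn_point(SUBSCRIBERS, MONTHLY_ARPU_USD)
print(f"${cost / 1e6:.0f}M per year")  # $198M per year
```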

Quick answer

The most accurate way to predict telecom churn is to combine call-graph social signals, network quality events, support ticket patterns, and usage data in a relational ML model. The single strongest predictor is social contagion: when a subscriber's top-5 contacts port out, that subscriber typically follows within 60 days. On the RelBench benchmark, relational models score 76.71 vs 62.44 for single-table baselines.

Approaches compared

4 ways to solve this problem

1. Rules-based contract triggers

Flag subscribers approaching contract end dates, declining usage, or recent complaint calls for retention outreach.

Best for

Catches the obvious cases: out-of-contract subscribers with declining usage. Simple to implement with existing billing data.

Watch out for

Misses the 40% of churners who show no obvious billing red flags. A subscriber whose friends ported out last month will not show up in billing rules.
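
A rules-based trigger of this kind might look like the sketch below. The field names (`usage_change_pct`, `complaint_calls_30d`) and the thresholds are hypothetical, chosen only to illustrate the three rules described above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Subscriber:
    subscriber_id: str
    contract_end: date
    usage_change_pct: float   # hypothetical: month-over-month usage change
    complaint_calls_30d: int  # hypothetical: complaint calls in the last 30 days


def retention_flags(sub: Subscriber, today: date, window_days: int = 30) -> list[str]:
    """Return the names of the rules this subscriber trips (empty list = not flagged)."""
    flags = []
    days_left = (sub.contract_end - today).days
    if 0 <= days_left <= window_days:
        flags.append("contract_ending")
    if sub.usage_change_pct <= -20.0:  # illustrative threshold
        flags.append("declining_usage")
    if sub.complaint_calls_30d >= 1:
        flags.append("recent_complaint")
    return flags


today = date(2025, 3, 1)
sub = Subscriber("SUB002", date(2025, 3, 30), usage_change_pct=-5.0, complaint_calls_30d=1)
print(retention_flags(sub, today))  # ['contract_ending', 'recent_complaint']
```

Every rule reads only the subscriber's own record, which is why a subscriber whose contacts have left never trips one.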

2. XGBoost on billing and CRM features

Build a wide feature table from billing records, CRM notes, and plan details, then train a gradient-boosted classifier.

Best for

Solid baseline with good interpretability. Works well when combined with strong feature engineering from domain experts.

Watch out for

Treats each subscriber independently. Cannot capture social contagion: when a subscriber's calling circle migrates to a competitor, that signal lives in the call graph, not in any individual subscriber's billing data.
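
To make the limitation concrete, here is a minimal sketch of how a wide feature table is typically assembled before training a gradient-boosted model. The rows and column names are hypothetical, and the classifier itself is omitted.

```python
from collections import Counter

# Hypothetical sample rows, one dict per record.
subscribers = [
    {"subscriber_id": "SUB001", "plan": "Unlimited Plus", "tenure_months": 24},
    {"subscriber_id": "SUB002", "plan": "Basic 5GB", "tenure_months": 8},
]
tickets = [
    {"subscriber_id": "SUB002", "resolved": False},
    {"subscriber_id": "SUB002", "resolved": True},
    {"subscriber_id": "SUB001", "resolved": True},
]


def build_feature_rows(subscribers, tickets):
    """Collapse the child table into one row of aggregates per subscriber."""
    total = Counter(t["subscriber_id"] for t in tickets)
    unresolved = Counter(t["subscriber_id"] for t in tickets if not t["resolved"])
    return [
        {
            **s,
            "tickets_total": total[s["subscriber_id"]],
            "tickets_unresolved": unresolved[s["subscriber_id"]],
        }
        for s in subscribers
    ]


features = build_feature_rows(subscribers, tickets)
for row in features:
    print(row)
```

Each output row is computed from that subscriber's own records only. No column describes what the subscriber's contacts did, so the model has nothing to learn contagion from unless someone hand-engineers such a column.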

3. Survival analysis (Cox regression)

Model time-to-churn using tenure, plan type, and usage as covariates with a hazard function.

Best for

Good for understanding which factors accelerate or delay churn. Produces calibrated time-to-event estimates.

Watch out for

Assumes covariates are fixed or change slowly. Cannot incorporate the call graph, network quality events, or the temporal sequencing of complaints and outages.
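
For reference, the standard Cox proportional hazards model has the form below. Mapping the covariate vector $x$ to tenure, plan type, and usage follows the description above; the symbols themselves are the conventional ones, not taken from this page.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right),
\qquad
\frac{h(t \mid x_j + 1)}{h(t \mid x_j)} = \exp(\beta_j)
```

Here $h_0(t)$ is the baseline hazard shared by all subscribers and $\exp(\beta_j)$ is the hazard ratio for a one-unit increase in covariate $x_j$. In this basic form $x$ does not depend on $t$, which is the fixed-covariate assumption noted above.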

4. KumoRFM (relational graph ML)

Connect subscribers, plans, usage data, support tickets, and network events into a call-graph-enriched relational model. The GNN learns social contagion and multi-signal churn patterns automatically.

Best for

Highest accuracy. Captures the call-graph churn contagion signal that is invisible to all non-graph approaches, plus network quality and ticket escalation patterns.

Watch out for

Requires call detail records (CDRs) for the social graph. If CDR data is unavailable, the lift over XGBoost will be smaller.

Key metric: on the RelBench benchmark, relational models score 76.71 vs 62.44 for single-table baselines on subscriber churn prediction tasks.

Why relational data changes the answer

Telecom churn is a social event as much as an individual one. The signals live across call detail records (who the subscriber talks to, and whether those contacts are still on-network), network events (dropped calls in specific coverage areas), support tickets (unresolved complaints escalating in severity), and billing data (plan utilization approaching limits). A flat feature table with 'tenure = 8 months' and 'plan = Basic 5GB' misses that this subscriber's top five contacts ported out last month and they experienced seven dropped calls in their home area.

Relational models build a call graph where churn propagates through communication patterns. They learn that subscribers whose top contacts have ported, who experience repeated network quality events in their home area, and who have unresolved support tickets are 9x more likely to leave. On the RelBench benchmark, this multi-table approach scores 76.71 vs 62.44 for single-table baselines. For a 30M-subscriber carrier, that accuracy gap translates into tens of millions in avoided acquisition costs, because replacing a churned subscriber costs $300-$500 while retaining one costs $30-$50.

Predicting churn from billing data alone is like predicting neighborhood turnover by looking at mortgage payments. Payments are current, so everything looks fine. But you miss that four houses on the block just sold, the school ratings dropped, and the neighbors are complaining about construction noise. The subscriber's call graph is the neighborhood: when the people they talk to every day leave, they follow.

How KumoRFM solves this

Graph-learned network intelligence across your entire subscriber base

Kumo builds a call graph connecting subscribers through their communication patterns, overlaid with plan details, usage trends, support tickets, and network quality events. It learns that subscribers whose top-5 contacts have ported out, who experienced 3+ dropped calls in poor-coverage areas, and who called a competitor's store number are 9x more likely to churn. This social contagion signal is invisible to traditional feature-based models that treat each subscriber independently.
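
As an illustration of the call-graph signal only, the sketch below computes by hand how many of a subscriber's top-5 contacts have ported out. It is not how Kumo derives the signal (the text says the model learns it from the graph). The CDR layout `(caller_id, callee_id, minutes)` and the sample data are hypothetical.

```python
from collections import defaultdict


def top_contacts_ported(cdrs, ported, k=5):
    """For each caller, count how many of their k most-called contacts have ported out.

    cdrs: iterable of (caller_id, callee_id, minutes) tuples (hypothetical layout).
    ported: set of subscriber ids known to have ported out.
    Returns {caller_id: (ported_count, contacts_considered)}.
    """
    minutes = defaultdict(lambda: defaultdict(float))
    for caller, callee, mins in cdrs:
        minutes[caller][callee] += mins

    result = {}
    for caller, contacts in minutes.items():
        # Rank contacts by total minutes; break ties by id so output is deterministic.
        top = sorted(contacts, key=lambda c: (-contacts[c], c))[:k]
        result[caller] = (sum(c in ported for c in top), len(top))
    return result


cdrs = [
    ("SUB002", "A", 120), ("SUB002", "B", 90), ("SUB002", "C", 60),
    ("SUB002", "D", 45), ("SUB002", "E", 30), ("SUB002", "F", 5),
    ("SUB002", "A", 10),
]
ported = {"A", "C", "E", "F"}
print(top_contacts_ported(cdrs, ported))  # {'SUB002': (3, 5)}
```

This version counts outgoing calls only. A real call graph would likely also weigh incoming calls, texts, and recency.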

From data to predictions

See the full pipeline in action

Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.

Your data

The relational tables Kumo learns from

SUBSCRIBERS

subscriber_id	plan	tenure_months	contract_end
SUB001	Unlimited Plus	24	2025-04-15
SUB002	Basic 5GB	8	2025-03-30
SUB003	Family Share	36	2025-06-01

PLANS

plan_id	name	monthly_cost	data_gb	hotspot
PLN01	Unlimited Plus	$75	Unlimited	50GB
PLN02	Basic 5GB	$35	5	None
PLN03	Family Share	$120	Shared 30GB	15GB

USAGE

usage_id	subscriber_id	date	data_gb	calls_min	texts
U001	SUB001	2025-03-01	8.2	320	150
U002	SUB002	2025-03-01	4.8	45	280
U003	SUB003	2025-03-01	12.4	580	420

TICKETS

ticket_id	subscriber_id	category	created_date	resolved
T001	SUB002	Network quality	2025-02-20	N
T002	SUB002	Billing dispute	2025-02-25	Y
T003	SUB001	Plan inquiry	2025-03-01	Y

NETWORK_EVENTS

event_id	subscriber_id	type	timestamp	cell_tower
NE01	SUB002	Dropped call	2025-02-28	TWR_445
NE02	SUB002	No service	2025-03-01	TWR_445
NE03	SUB001	Normal	2025-03-01	TWR_102

Write your PQL query

Describe what to predict in 2–3 lines — Kumo handles the rest

PQL

PREDICT BOOL(SUBSCRIBERS.STATUS = 'Ported', 0, 30, days)
FOR EACH SUBSCRIBERS.SUBSCRIBER_ID
WHERE SUBSCRIBERS.CONTRACT_END <= '2025-06-01'
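
Conceptually, the query defines a binary target: does the subscriber port out within 30 days of the prediction time? The sketch below shows one way such a label could be computed for historical examples. It is an illustration, not Kumo's implementation; the `port_outs` mapping and the choice to treat the window as `(anchor, anchor + 30 days]` are assumptions.

```python
from datetime import date, timedelta


def churn_label(port_out_date, anchor, horizon_days=30):
    """True if the subscriber ported out in (anchor, anchor + horizon_days]."""
    if port_out_date is None:  # never ported out
        return False
    return anchor < port_out_date <= anchor + timedelta(days=horizon_days)


anchor = date(2025, 3, 1)  # prediction time
# Hypothetical port-out dates; None means the subscriber is still active.
port_outs = {"SUB001": None, "SUB002": date(2025, 3, 20), "SUB003": date(2025, 5, 2)}

labels = {sid: churn_label(d, anchor) for sid, d in port_outs.items()}
print(labels)  # {'SUB001': False, 'SUB002': True, 'SUB003': False}
```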

Prediction output

Every entity gets a score, updated continuously

SUBSCRIBER_ID	PLAN	TENURE	CHURN_30D_PROB
SUB001	Unlimited Plus	24mo	0.11
SUB002	Basic 5GB	8mo	0.86
SUB003	Family Share	36mo	0.04
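
One way to act on these scores is to contact only the subscribers for whom the expected replacement cost exceeds the cost of a retention offer. The sketch below uses the midpoints of the cost ranges quoted earlier ($300-$500 to replace, $30-$50 to retain) and makes the optimistic assumption that every intervention succeeds; both choices are illustrative.

```python
# Scores copied from the prediction output table.
scores = {"SUB001": 0.11, "SUB002": 0.86, "SUB003": 0.04}

# Assumptions: midpoints of the quoted cost ranges, and an intervention
# that always works (real save rates are lower, which raises the threshold).
REPLACEMENT_COST_USD = 400
RETENTION_COST_USD = 40


def outreach_list(scores, replacement_cost, retention_cost):
    """Subscribers whose expected replacement cost exceeds the retention cost, riskiest first."""
    worth_it = [s for s, p in scores.items() if p * replacement_cost > retention_cost]
    return sorted(worth_it, key=lambda s: scores[s], reverse=True)


targets = outreach_list(scores, REPLACEMENT_COST_USD, RETENTION_COST_USD)
print(targets)  # ['SUB002', 'SUB001']
```

Under these assumptions the break-even probability is 40 / 400 = 0.10, so SUB003 at 0.04 is left alone.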

Understand why

Every prediction includes feature attributions — no black boxes

Subscriber SUB002 -- Basic 5GB, 8-month tenure

Predicted: 86% port-out probability within 30 days

Top contributing features

Feature	Value	Attribution
Top-5 contacts ported (last 60d)	3 of 5	30%
Network quality events (last 30d)	7 events	24%
Open support tickets	1 unresolved	18%
Data usage vs plan limit	96% utilized	16%
Contract end proximity	29 days	12%

Feature attributions are computed automatically for every prediction. No separate tooling required.

PQL Documentation

Learn the Predictive Query Language — SQL-like syntax for defining any prediction task in 2–3 lines.

Python SDK

Integrate Kumo predictions into your pipelines. Train, evaluate, and deploy models programmatically.

Explainability Docs

Understand feature attributions, model evaluation metrics, and how to build trust with stakeholders.

Frequently asked questions

Common questions about subscriber churn prediction

What is the best ML model for telecom churn prediction?

Graph neural networks that incorporate the call graph outperform all non-graph approaches. On the RelBench benchmark, relational models score 76.71 vs 62.44 for single-table baselines. The critical signal is social contagion: when a subscriber's frequent contacts port to a competitor, that subscriber's churn probability increases 9x. No amount of billing-data feature engineering can capture this.

How does social contagion drive telecom churn?

When a subscriber's top-5 calling contacts port out, that subscriber follows within 60 days in the majority of cases. This is the single strongest churn predictor in telecom. Family plans accelerate it further: when one member of a family plan switches, the entire household follows. Call-graph analysis reveals these patterns before any individual behavioral signal appears.

What data do you need for telecom churn prediction?

At minimum: subscriber profiles, plan details, and usage history. The highest-impact addition is call detail records (CDRs) to build the social call graph. Also valuable: support ticket history, network quality events (dropped calls, coverage gaps), and MNP (mobile number portability) records for identifying when contacts have already ported.

How early can you predict telecom subscriber churn?

With call-graph analysis, reliable churn signals appear 60-90 days before port-out. The earliest indicator is contacts leaving the network. Traditional models based on billing data detect churn only 15-30 days out, often after the subscriber has already called to cancel.

What is the cost of subscriber churn in telecom?

For a carrier with 30M subscribers at $55 ARPU, each percentage point of monthly churn costs $198M annually. Replacing a churned subscriber costs $300-$500 in acquisition spend, while a targeted retention intervention costs $30-$50. Reducing monthly churn by 0.3% saves $71M per year.

Bottom line: A 30M-subscriber carrier that reduces monthly churn by 0.3% saves $71M per year in avoided acquisition costs. Kumo detects social contagion churn through the call graph, learning that when a subscriber's frequent contacts port out, that subscriber often follows within 60 days.

Topics covered

telecom churn prediction, subscriber churn AI, mobile carrier retention, port-out prediction, telecom retention model, graph neural network telecom, KumoRFM telecom churn, wireless churn analytics, MNP prediction model

From a leadership team with proven experience

Vanja Josifovski

CEO and Co-Founder, ex-CTO Airbnb, ex-CTO Pinterest

Jure Leskovec

Co-Founder & Chief Scientist, Stanford Professor

Hema Raghavan

Co-Founder & Head of Engineering, ex-AI Lead, LinkedIn

One Platform. One Model. Infinite Predictions.

KumoRFM

Relational Foundation Model

Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.

For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.
