
Subscriber Retention: GNN on Telecom Usage + Network Graphs

Acquiring a telecom subscriber costs $300-600, and monthly churn rates of 1-2% cost the industry billions. When one subscriber churns, their contacts are 3-5x more likely to follow. Here is how to model this contagion effect with GNNs.


TL;DR

  • Telecom retention is a social graph problem. The call/text network creates social anchoring: subscribers with many on-network contacts churn at 1/3 the rate. Churn is also contagious through the network.
  • GATConv on the communication graph learns which social connections drive retention. Subscribers in declining social clusters get higher churn scores before their individual usage changes.
  • On RelBench benchmarks, GNNs achieve 75.83 AUROC vs 62.44 for flat-table LightGBM. Social network features account for most of the improvement.
  • The communication graph is massive (millions of subscribers, billions of CDRs). Mini-batch training with NeighborLoader is essential.
  • KumoRFM predicts retention risk with one PQL query (76.71 AUROC zero-shot), automatically constructing the communication graph from CDR data.

The business problem

Telecom subscriber acquisition costs $300-600 per subscriber. Monthly churn rates of 1-2% mean carriers must constantly acquire new subscribers just to maintain their base. Reducing churn by 0.5% can save a major carrier $500M+ annually. The challenge: predicting churn early enough to intervene with retention offers that actually work.

The most powerful predictor of telecom retention is not individual usage but social anchoring. Subscribers who frequently communicate with other on-network subscribers stay 3x longer. When a subscriber's close contacts start churning, they become 3-5x more likely to follow. This contagion effect means churn is a graph problem.

Why flat ML fails

  • No social anchoring: A subscriber with 20 active on-network contacts has fundamentally different churn risk than one with 2, even if their usage metrics are identical.
  • No contagion modeling: Churn spreads through social clusters. When a family plan organizer switches, all members follow. Flat models cannot propagate this risk.
  • Usage alone is insufficient: Individual usage can look stable while the social network is deteriorating. By the time usage drops, the churn decision is already made.
  • No competitive intelligence: When contacts start calling from competitor numbers, it signals competitive pressure building in the social cluster.

The relational schema

schema.txt
Node types:
  Subscriber (id, plan, tenure, monthly_spend, device)
  Tower      (id, geo, congestion, coverage_score)
  Plan       (id, type, price, data_limit, family_flag)

Edge types:
  Subscriber --[calls]-->       Subscriber (minutes, frequency)
  Subscriber --[texts]-->       Subscriber (count, frequency)
  Subscriber --[connects_to]--> Tower      (signal_quality)
  Subscriber --[has_plan]-->    Plan
  Subscriber --[family_with]--> Subscriber

The call/text graph captures social anchoring. Tower connections capture coverage experience. Family edges capture shared plan dependencies.

PyG architecture: GATConv on communication graph

retention_model.py
import torch
import torch.nn.functional as F
from torch_geometric.nn import GATConv, HeteroConv, Linear

class RetentionGNN(torch.nn.Module):
    def __init__(self, hidden_dim=64, heads=4):
        super().__init__()
        self.subscriber_lin = Linear(-1, hidden_dim)
        self.tower_lin = Linear(-1, hidden_dim)
        self.plan_lin = Linear(-1, hidden_dim)

        # Message passing toward subscribers. The reverse edge types (rev_*)
        # let tower and plan features flow into subscriber embeddings; create
        # them with torch_geometric.transforms.ToUndirected(). Bipartite edge
        # types need add_self_loops=False, since self-loops are undefined
        # when source and destination node sets differ.
        self.conv1 = HeteroConv({
            ('subscriber', 'calls', 'subscriber'): GATConv(
                hidden_dim, hidden_dim // heads, heads=heads),
            ('subscriber', 'texts', 'subscriber'): GATConv(
                hidden_dim, hidden_dim // heads, heads=heads),
            ('tower', 'rev_connects_to', 'subscriber'): GATConv(
                hidden_dim, hidden_dim // heads, heads=heads,
                add_self_loops=False),
            ('plan', 'rev_has_plan', 'subscriber'): GATConv(
                hidden_dim, hidden_dim // heads, heads=heads,
                add_self_loops=False),
            ('subscriber', 'family_with', 'subscriber'): GATConv(
                hidden_dim, hidden_dim // heads, heads=heads),
        }, aggr='sum')

        self.conv2 = HeteroConv({
            ('subscriber', 'calls', 'subscriber'): GATConv(
                hidden_dim, hidden_dim // heads, heads=heads),
            ('subscriber', 'texts', 'subscriber'): GATConv(
                hidden_dim, hidden_dim // heads, heads=heads),
            ('subscriber', 'family_with', 'subscriber'): GATConv(
                hidden_dim, hidden_dim // heads, heads=heads),
        }, aggr='sum')

        self.classifier = Linear(hidden_dim, 1)

    def forward(self, x_dict, edge_index_dict):
        x_dict['subscriber'] = self.subscriber_lin(
            x_dict['subscriber'])
        x_dict['tower'] = self.tower_lin(x_dict['tower'])
        x_dict['plan'] = self.plan_lin(x_dict['plan'])

        x_dict = {k: F.elu(v) for k, v in
                  self.conv1(x_dict, edge_index_dict).items()}
        x_dict = self.conv2(x_dict, edge_index_dict)

        return torch.sigmoid(
            self.classifier(x_dict['subscriber']).squeeze(-1))

GATConv attention weights learn which communication relationships drive retention. Family members and frequent contacts get higher attention than occasional callers.

Expected performance

  • Usage-based rules: ~55 AUROC
  • LightGBM (flat-table): 62.44 AUROC
  • GNN (GATConv communication graph): 75.83 AUROC
  • KumoRFM (zero-shot): 76.71 AUROC

Or use KumoRFM in one line

KumoRFM PQL
PREDICT is_churned FOR subscriber
USING subscriber, call_record, plan, tower, device

One PQL query. KumoRFM constructs the communication graph from CDR data and predicts retention risk per subscriber.

Frequently asked questions

Why is the call/text graph so powerful for telecom retention?

The call graph captures social anchoring: subscribers who frequently call/text other subscribers on the same network have much higher retention. When one subscriber in a social cluster churns, their contacts are 3-5x more likely to follow. This contagion effect is invisible to individual subscriber models.

What makes telecom churn different from SaaS churn?

Telecom churn is driven by the communication network structure (who calls whom), plan economics (overpaying vs optimal plan), and coverage experience. The social anchoring effect is stronger than in most SaaS products because switching carriers disrupts shared family plans and established communication patterns.

How do you combine usage features with network features?

Usage features (data consumption, call minutes, plan utilization) become node attributes. Communication patterns become edges (call frequency, text count between subscribers). The GNN aggregates both: a subscriber's retention score depends on their own usage AND their social network's health.

Can GNNs detect subscribers being poached by competitors?

Yes. When a subscriber's contacts start churning to a specific competitor, the pattern propagates through the graph. The GNN can identify at-risk clusters where competitive pressure is building, enabling targeted retention offers before the subscriber actively considers switching.

How does KumoRFM handle telecom retention?

KumoRFM takes your telecom database (subscribers, call records, plans, devices, support tickets) and predicts retention risk with one PQL query. It automatically constructs the communication graph and captures social anchoring effects.

Learn more about graph ML

PyTorch Geometric is the open-source foundation for graph neural networks. Explore more layers, concepts, and production patterns.