Policyholder Churn Prediction
“Which policyholders will not renew?”
Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.
A real-world example
Which policyholders will not renew?
P&C insurers face 10-15% annual non-renewal rates, with acquisition costs of $400-$600 per new policyholder (J.D. Power). For an insurer with 5M policyholders, a 12% churn rate means 600K lost customers and $240-360M in replacement acquisition costs annually. Worse, profitable low-risk policyholders are the most likely to leave because competitors aggressively poach them with lower rates. The signals of impending non-renewal are scattered: rate-shopping behavior (quote requests from competitors), claim dissatisfaction, premium increase reactions, and life changes (moving, marriage, new vehicle) that trigger a review of coverage.
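The replacement-cost math above is easy to verify. This short sketch reproduces it using the policyholder count, churn rate, and acquisition-cost range quoted in this section:

```python
# Annual cost of non-renewal for a hypothetical 5M-policyholder insurer,
# using the churn rate and per-policyholder acquisition costs cited above.
policyholders = 5_000_000
churn_rate = 0.12                 # 12% annual non-renewal
cac_low, cac_high = 400, 600      # acquisition cost per new policyholder (USD)

lost_customers = int(policyholders * churn_rate)
replacement_cost_low = lost_customers * cac_low
replacement_cost_high = lost_customers * cac_high

print(lost_customers)             # 600000
print(replacement_cost_low)       # 240000000 -> $240M
print(replacement_cost_high)      # 360000000 -> $360M
```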
Quick answer
AI predicts which insurance policyholders will not renew by connecting rate-change history, service interactions, competitive market activity, and claims patterns into a relational graph. Unlike simple logistic regression on policyholder attributes, graph-based models detect that a low-loss-ratio customer who received a 12% rate increase and lives in a zip code with active competitor conquest campaigns has a 74% predicted probability of leaving. Early detection gives retention teams 60-90 days to intervene with targeted offers.
Approaches compared
4 ways to solve this problem
1. Rule-Based Retention Triggers
Flag policyholders based on hardcoded rules: rate increase above X%, more than Y service calls, or approaching renewal date without engagement. Simple to implement and transparent.
Best for
Quick wins when you have obvious churn triggers like large rate increases or recent claim denials.
Watch out for
Rules miss the interaction effects. A 12% rate increase might be fine for a price-insensitive customer but fatal for a low-loss-ratio policyholder in a competitive market. One-size-fits-all rules waste retention budget.
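A minimal sketch of what such rules look like in practice. The thresholds and field names are illustrative, not taken from any production system:

```python
def retention_flags(policyholder, rate_increase_threshold=0.10,
                    service_call_threshold=3, renewal_window_days=60):
    """Flag a policyholder with hardcoded retention-trigger rules.

    `policyholder` is a dict with illustrative fields; the thresholds
    are arbitrary examples, not tuned values.
    """
    flags = []
    if policyholder["rate_pct_change"] > rate_increase_threshold:
        flags.append("large_rate_increase")
    if policyholder["service_calls_90d"] >= service_call_threshold:
        flags.append("high_service_contact")
    if (policyholder["days_to_renewal"] <= renewal_window_days
            and not policyholder["recently_engaged"]):
        flags.append("unengaged_near_renewal")
    return flags

# A 12% increase trips the rate rule regardless of the customer's price
# sensitivity or competitive context -- the interaction blindness noted above.
print(retention_flags({"rate_pct_change": 0.12, "service_calls_90d": 1,
                       "days_to_renewal": 120, "recently_engaged": True}))
# ['large_rate_increase']
```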
2. Logistic Regression / Survival Models
Statistical models trained on policyholder features like tenure, premium, claims history, and rate-change history. Survival models add time-to-event analysis for more nuanced predictions.
Best for
Building interpretable churn scores when you need to explain the model to business stakeholders or regulators.
Watch out for
Cannot incorporate competitive market dynamics, service-interaction sentiment, or life-event signals without extensive manual feature engineering. Misses the relational context entirely.
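The core scoring mechanism is simple enough to sketch in pure Python. The coefficients below are hand-set for illustration only; a real model would fit them from historical renewal outcomes (for example with scikit-learn, or a survival package such as lifelines for time-to-event variants):

```python
import math

# Illustrative hand-set coefficients; a fitted model learns these from data.
COEFS = {"intercept": -2.0, "rate_pct_change": 8.0,
         "tenure_years": -0.15, "claims_denied": 1.2}

def churn_probability(features):
    """Logistic churn score: sigmoid of a linear combination of features."""
    z = COEFS["intercept"]
    for name, value in features.items():
        z += COEFS[name] * value
    return 1.0 / (1.0 + math.exp(-z))

low = churn_probability({"rate_pct_change": 0.03, "tenure_years": 10,
                         "claims_denied": 0})
high = churn_probability({"rate_pct_change": 0.12, "tenure_years": 2,
                          "claims_denied": 1})
assert low < high  # larger rate increase + shorter tenure -> higher score
```

Every input here is a per-policyholder attribute; nothing in the feature vector knows about competitor campaigns or service sentiment unless someone engineers those columns by hand, which is the limitation called out above.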
3. Gradient-Boosted Trees (XGBoost)
Tree-based models that capture non-linear interactions between policyholder attributes. Can handle mixed data types and missing values well.
Best for
Moderate accuracy improvement over logistic regression with reasonable engineering effort.
Watch out for
Still a flat-table approach. Cannot see that a competitor launched a conquest campaign in the policyholder's zip code, or that the policyholder's household has been shopping for quotes. These cross-table signals require manual joins and aggregations.
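The manual work a flat-table model forces is easy to see in code. This stdlib-only sketch, with toy data mirroring the example tables later on this page, hand-joins competitor-campaign and service-interaction records onto the policyholder row, exactly the feature engineering a relational model is meant to absorb:

```python
policyholders = [{"policyholder_id": "PH-6601", "zip_code": "90210",
                  "loss_ratio": 0.28}]
campaigns = [{"zip_code": "90210", "competitor": "Geico"},
             {"zip_code": "90210", "competitor": "Progressive"},
             {"zip_code": "10001", "competitor": None}]
interactions = [{"policyholder_id": "PH-6601", "sentiment": "Negative"},
                {"policyholder_id": "PH-6601", "sentiment": "Neutral"}]

def build_features(ph):
    """Hand-rolled join + aggregation to flatten cross-table signals
    into the single row a tree model can consume."""
    row = dict(ph)
    row["active_campaigns_in_zip"] = sum(
        1 for c in campaigns
        if c["zip_code"] == ph["zip_code"] and c["competitor"])
    row["negative_interactions"] = sum(
        1 for i in interactions
        if i["policyholder_id"] == ph["policyholder_id"]
        and i["sentiment"] == "Negative")
    return row

features = build_features(policyholders[0])
print(features["active_campaigns_in_zip"], features["negative_interactions"])
# 2 1
```

Every new signal means another join, another aggregation, and another column to maintain, and any interaction the engineer does not think to encode is invisible to the model.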
4. Relational Deep Learning (Kumo's Approach)
Connects policyholders to rate changes, service interactions, competitive market data, claims history, and life-event signals in a single graph. Learns churn patterns from the full relational context automatically.
Best for
Detecting churn risk 60-90 days before renewal by combining rate-sensitivity signals, competitive pressure, and service-interaction patterns that span multiple data sources.
Watch out for
Requires competitive market data and service-interaction logs to be connected to policyholder records. If these data sources are siloed in different systems, integration work comes first.
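Whatever modeling stack sits on top, the prerequisite is a linkable schema. A minimal sketch of the foreign-key relationships implied by the example tables on this page (the CLAIMS table and a `zip_code` column on POLICYHOLDERS are assumed here, since the prose references both):

```python
# Foreign-key map linking each table back to the policyholder entity.
# Table names mirror the example tables on this page; CLAIMS and the
# zip_code join column are illustrative assumptions.
SCHEMA = {
    "RATE_CHANGES":         {"links_to": "POLICYHOLDERS", "on": "policyholder_id"},
    "SERVICE_INTERACTIONS": {"links_to": "POLICYHOLDERS", "on": "policyholder_id"},
    "CLAIMS":               {"links_to": "POLICYHOLDERS", "on": "policyholder_id"},
    "COMPETITIVE_MARKET":   {"links_to": "POLICYHOLDERS", "on": "zip_code"},
}

def join_path(table):
    """Return the key used to walk from a table to the policyholder node."""
    return SCHEMA[table]["on"]

assert join_path("COMPETITIVE_MARKET") == "zip_code"
```

If any of these keys live in siloed systems with no shared identifier, that integration gap has to be closed before a relational model can walk the graph.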
Key metric: Multi-line policyholders retain at 90% vs. 70% for single-line (J.D. Power). Targeted retention of profitable at-risk customers saves $60-120M annually for a 5M-policyholder insurer.
Why relational data changes the answer
Flat churn models see each policyholder as an isolated row: premium, tenure, claims count, rate-change percentage. They can predict that policyholders with large rate increases churn more often. But they cannot see that Jennifer Adams received a 12% rate increase, called customer service twice with negative sentiment, lives in a zip code where two competitors just launched aggressive conquest campaigns, and has a loss ratio of 0.28 that makes her extremely attractive to those competitors. These signals come from four different tables, and their interaction effect is what drives the prediction. A 12% increase for a policyholder with no competitive alternatives is manageable. The same increase for Jennifer is a near-certain loss.
Relational learning captures these multi-table interaction patterns without manual feature engineering. The model walks from policyholder to rate changes, to service interactions, to competitive market activity in their zip code, to their claims history and risk profile. It learns that the combination of high rate increase + competitive pressure + low loss ratio + negative service sentiment predicts churn at 74% confidence, while any single factor alone would score below 40%. This precision matters because retention budgets are finite. Spending $200 on a rate adjustment for a profitable customer about to leave is a great investment. Spending the same $200 on a high-loss-ratio customer who was going to renew anyway is a waste.
Predicting churn from a flat policyholder table is like predicting whether a restaurant regular will stop coming by looking only at their visit frequency. You miss that the restaurant just raised prices 12%, a new competitor opened across the street with a grand-opening discount, and the regular complained to the manager about cold food last Tuesday. Every signal matters, and they compound. Relational churn models see the full picture: the price increase, the competitor, and the complaint, all connected to the same customer.
How KumoRFM solves this
Relational intelligence built for insurance data
Kumo connects policyholders to their policy details, claims history, billing patterns, service interactions, rate changes, and competitive market data. The model identifies that Policyholder PH-6601 received a 12% rate increase, called customer service twice with billing questions, and lives in a zip code where a competitor just launched an aggressive acquisition campaign. These signals predict non-renewal 60-90 days before the renewal date, giving retention teams time to offer proactive rate adjustments or coverage enhancements to keep profitable customers.
From data to predictions
See the full pipeline in action
Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.
Your data
The relational tables Kumo learns from
POLICYHOLDERS
| policyholder_id | name | policy_type | premium | tenure_years | loss_ratio |
|---|---|---|---|---|---|
| PH-6601 | Jennifer Adams | Home + Auto | $3,200 | 6.4 | 0.28 |
| PH-6602 | Mark Stevens | Auto Only | $1,800 | 2.1 | 0.65 |
| PH-6603 | Diana Lee | Home + Auto + Umbrella | $5,400 | 11.2 | 0.15 |
RATE_CHANGES
| policyholder_id | effective_date | old_premium | new_premium | pct_change |
|---|---|---|---|---|
| PH-6601 | 2025-07-01 | $2,860 | $3,200 | +11.9% |
| PH-6602 | 2025-08-01 | $1,720 | $1,800 | +4.7% |
| PH-6603 | 2025-06-01 | $5,200 | $5,400 | +3.8% |
SERVICE_INTERACTIONS
| policyholder_id | channel | type | sentiment | timestamp |
|---|---|---|---|---|
| PH-6601 | Phone | Billing Question | Negative | 2025-09-05 |
| PH-6601 | Phone | Coverage Question | Neutral | 2025-09-12 |
| PH-6603 | App | Document Request | Positive | 2025-09-10 |
COMPETITIVE_MARKET
| zip_code | competitor | campaign_type | avg_savings_offered | start_date |
|---|---|---|---|---|
| 90210 | Geico | Conquest | $400-$600 | 2025-08-15 |
| 90210 | Progressive | Switch & Save | $300-$500 | 2025-09-01 |
| 10001 | None Active | N/A | N/A | N/A |
Write your PQL query
Describe what to predict in 2–3 lines — Kumo handles the rest
PREDICT BOOL(POLICYHOLDERS.STATUS = 'non_renewed', 0, 90, days) FOR EACH POLICYHOLDERS.POLICYHOLDER_ID WHERE POLICYHOLDERS.STATUS = 'active'
Prediction output
Every entity gets a score, updated continuously
| POLICYHOLDER_ID | PREMIUM | CHURN_PROB | LOSS_RATIO | RETENTION_ACTION |
|---|---|---|---|---|
| PH-6601 | $3,200 | 0.74 | 0.28 | Proactive Rate Review |
| PH-6602 | $1,800 | 0.31 | 0.65 | Standard Renewal |
| PH-6603 | $5,400 | 0.08 | 0.15 | Loyalty Reward |
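A sketch of how a retention team might consume output like this (stdlib only; the cutoffs and ranking logic are illustrative, not part of the platform): target high-churn-probability, low-loss-ratio policyholders first, since they are the profitable customers competitors poach.

```python
# Toy rows mirroring the prediction-output table above.
predictions = [
    {"id": "PH-6601", "premium": 3200, "churn_prob": 0.74, "loss_ratio": 0.28},
    {"id": "PH-6602", "premium": 1800, "churn_prob": 0.31, "loss_ratio": 0.65},
    {"id": "PH-6603", "premium": 5400, "churn_prob": 0.08, "loss_ratio": 0.15},
]

def retention_targets(rows, churn_cutoff=0.5, loss_ratio_cutoff=0.5):
    """Profitable at-risk customers, ranked by expected premium at risk."""
    at_risk = [r for r in rows
               if r["churn_prob"] >= churn_cutoff
               and r["loss_ratio"] <= loss_ratio_cutoff]
    return sorted(at_risk, key=lambda r: r["churn_prob"] * r["premium"],
                  reverse=True)

print([r["id"] for r in retention_targets(predictions)])  # ['PH-6601']
```

Ranking by `churn_prob * premium` is one simple expected-value heuristic; a real retention program would also weigh intervention cost and uplift.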
Understand why
Every prediction includes feature attributions — no black boxes
Policyholder PH-6601 (Jennifer Adams)
Predicted: 74% probability of non-renewal
Top contributing features
Rate increase magnitude
+11.9%
28% attribution
Competitor conquest campaigns in zip
2 active
24% attribution
Negative service interactions
2 calls, 1 negative
21% attribution
Low loss ratio (attractive to competitors)
0.28
16% attribution
No bundling discount applied
Missing auto bundle
11% attribution
Feature attributions are computed automatically for every prediction, with no separate tooling required. Learn more about Kumo explainability.
PQL Documentation
Learn the Predictive Query Language — SQL-like syntax for defining any prediction task in 2–3 lines.
Python SDK
Integrate Kumo predictions into your pipelines. Train, evaluate, and deploy models programmatically.
Explainability Docs
Understand feature attributions, model evaluation metrics, and how to build trust with stakeholders.
Frequently asked questions
Common questions about policyholder churn prediction
How does AI predict policyholder churn in insurance?
AI predicts policyholder churn by analyzing connected signals across rate changes, service interactions, competitive market activity, claims history, and life events. Graph-based models detect that specific combinations of these factors (like a large rate increase plus active competitor campaigns in the policyholder's area) predict non-renewal 60-90 days before the renewal date.
What is the cost of policyholder churn for insurance companies?
Acquiring a new policyholder costs $400-$600 (J.D. Power). For an insurer with 5M policyholders and 12% annual churn, that is 600K lost customers and $240-360M in replacement costs per year. Worse, the most profitable low-risk policyholders are the ones most likely to leave because competitors target them aggressively.
How early can AI detect insurance customer churn?
Graph-based models can flag at-risk policyholders 60-90 days before renewal. The signals appear when rate increases are applied, when service interactions turn negative, when competitors launch local campaigns, or when life events (home purchase, new vehicle) trigger coverage shopping. Earlier detection means more time for effective retention interventions.
What retention strategies work best for at-risk insurance policyholders?
The most effective strategies are personalized: proactive rate reviews for price-sensitive customers, bundling offers for single-line policyholders, loyalty rewards for long-tenure customers, and coverage enhancements for customers who recently had claims. AI-driven targeting doubles conversion rates compared to untargeted campaigns because it matches the right intervention to the right customer.
Bottom line: Retain 25-35% of at-risk profitable policyholders with targeted rate adjustments, saving $60-120M in annual acquisition costs for a 5M-policyholder insurer.
Related use cases
Explore more insurance use cases
One Platform. One Model. Infinite Predictions.
KumoRFM
Relational Foundation Model
Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.
For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.