Claims Severity Prediction
“What will the total cost of this claim be?”
Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.


A real-world example
What will the total cost of this claim be?
Insurers set initial reserves based on adjuster experience and lookup tables, leading to 30-40% inaccuracy at First Notice of Loss (FNOL). Under-reserving creates balance-sheet surprises and regulatory issues. Over-reserving ties up $20-50B in unnecessary capital across the industry (AM Best). Adjusters spend 3-5 hours per claim on initial assessment, with complex claims taking 2-3 weeks to evaluate. A top-20 insurer processing 500K claims per year could save $200-400M annually in reserve accuracy improvements and $50-100M in faster claims handling.
Quick answer
AI predicts the total cost of an insurance claim at First Notice of Loss by connecting FNOL details with policy coverage, historical claim outcomes, provider cost data, and regional trends. Traditional adjuster estimates are off by 30-40% at FNOL. Graph-based models reduce that error to 10-15% within 48 hours by learning from the relational patterns across similar claims, providers, and geographic factors.
Approaches compared
4 ways to solve this problem
1. Adjuster Judgment + Lookup Tables
Adjusters set initial reserves based on experience, DRG-style severity tables, and manager guidelines. The industry default for decades. Heavily dependent on individual adjuster skill and workload.
Best for
Simple, low-complexity claims where the loss type is well-understood and comparable claims are plentiful.
Watch out for
Accuracy varies wildly by adjuster. Junior adjusters under-reserve, senior adjusters over-reserve defensively. Complex claims (BI, fire, multi-party) are consistently mis-estimated by 30-50%.
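A minimal sketch of what this approach boils down to: a static table keyed by peril and complexity, scaled by an adjuster's judgment multiplier. The table values, bands, and `adjuster_factor` parameter are invented for illustration, not drawn from any carrier's guidelines.

```python
# Illustrative sketch of approach 1: a severity lookup table plus an
# adjuster judgment multiplier. All numbers are invented.
BASE_RESERVE = {
    ("Fire", "simple"): 25_000,
    ("Fire", "complex"): 60_000,
    ("Auto BI", "simple"): 15_000,
    ("Auto BI", "complex"): 45_000,
    ("Water", "simple"): 8_000,
    ("Water", "complex"): 20_000,
}

def initial_reserve(peril: str, complexity: str, adjuster_factor: float = 1.0) -> int:
    """Look up the base reserve and apply the adjuster's judgment multiplier."""
    base = BASE_RESERVE[(peril, complexity)]
    return round(base * adjuster_factor)

print(initial_reserve("Fire", "complex", adjuster_factor=0.9))  # junior adjuster shading down
```

The watch-outs above fall straight out of this structure: the same claim gets a different reserve depending on who sets `adjuster_factor`, and the table has no entry for anything the bands don't capture.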
2. Regression Models on Claim Features
Linear or tree-based regression trained on claim-level features: peril type, initial estimate, claimant demographics, policy limits. A meaningful improvement over manual estimation.
Best for
High-volume, low-complexity lines (auto PD, basic property) where the claim-level features carry most of the predictive signal.
Watch out for
Cannot capture regional cost trends, provider pricing variation, or litigation probability signals that live in separate tables. Auto BI claims are notoriously hard to predict because severity depends on attorney involvement, medical provider costs, and jurisdiction.
3. Ensemble Models with External Data
Gradient-boosted trees enriched with external data feeds: medical cost indices, contractor rate databases, litigation scoring, and weather data. More features can mean better predictions.
Best for
Improving predictions for complex perils (fire, BI, CAT) where external data sources add signal beyond what is in the claim record.
Watch out for
Feature engineering becomes a bottleneck. Each new data source requires manual integration, join logic, and aggregation decisions. The model sees external data as flat features, not as connected relationships.
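The bottleneck is easy to see in code. Every external source needs hand-written join and aggregation logic before the model can use it as a flat column. The sketch below (with table contents mirroring the illustrative rows shown later on this page) is one such block; a production pipeline has dozens.

```python
# Sketch of the feature-engineering bottleneck in approach 3: each external
# source is manually joined and aggregated into flat columns. Values mirror
# the example tables on this page.
claim = {"claim_id": "CLM-9210", "peril": "Fire", "region": "West",
         "initial_estimate": 45_000}

historical = {("Fire", "West"): {"avg_final_cost": 58_200, "litigation_rate": 0.08}}
provider_rates = {("West", "General Contractor"): 185}

def build_flat_features(claim: dict) -> dict:
    """Hand-written join + aggregation logic: one block like this per source."""
    key = (claim["peril"], claim["region"])
    hist = historical.get(key, {})
    return {
        "initial_estimate": claim["initial_estimate"],
        "regional_avg_final_cost": hist.get("avg_final_cost"),
        "regional_litigation_rate": hist.get("litigation_rate"),
        "contractor_hourly_rate": provider_rates.get((claim["region"], "General Contractor")),
    }

print(build_flat_features(claim))
```

Every aggregation choice (average vs. median, region vs. zip code) is baked in by an engineer before training, and interactions between sources are lost unless someone thinks to encode them.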
4. Relational Deep Learning (Kumo's Approach)
Connects FNOL details to policy coverage, historical outcomes for similar claims, provider cost networks, regional trends, and litigation patterns in a single relational graph. Predictions update as new information arrives.
Best for
Accurate severity prediction at FNOL for all peril types, including complex BI and multi-party claims. The model learns provider-region-peril cost patterns automatically.
Watch out for
Initial predictions at FNOL are inherently uncertain for novel loss types. The model improves as claim details develop, converging to 10-15% error within 48 hours.
Key metric: Adjuster estimates are 30-40% inaccurate at FNOL. Graph-based models converge to 10-15% error within 48 hours, saving $200-400M annually for a top-20 insurer in reserve accuracy and handling costs.
Why relational data changes the answer
Flat severity models see each claim as a row: peril type, initial estimate, policy limits, claimant age. They can predict that fire claims average $58K in the West region. But they cannot see that this specific fire claim involves a property in a contractor-scarce area where repair costs are running 25% above regional averages, the policy has replacement-cost endorsement (uncapped), and similar claims with this peril-construction combination in this zip code have been trending upward at 12% year-over-year. These signals live across provider cost tables, policy endorsement records, and historical claims data. A flat model would need a data engineer to precompute dozens of aggregated features, and they would still miss the interaction between contractor availability, coverage type, and geographic trend.
Relational learning connects the claim directly to its relevant context. The model walks from the claim to the policy (coverage limits, endorsements), to the geographic region (contractor rates, material costs, trend), to historical claims with similar characteristics (peril, construction type, zip code), and to provider networks (which contractors are available and at what cost). This produces a severity estimate that accounts for the full situational context. For auto BI claims, the model connects the injury description to medical provider costs in the region, attorney involvement rates for that jurisdiction, and verdict patterns for similar cases. The result is a reserve that is accurate within 10-15% at 48 hours, compared to 30-40% error for manual or flat-model estimates.
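The "walk" described above can be sketched with a plain adjacency structure. The entity names mirror the example tables on this page and the edges are illustrative; the actual model learns over such a graph with neural message passing rather than an explicit traversal, but the sketch shows what context becomes reachable from a single claim.

```python
# Minimal sketch of the relational view: the claim is a node whose context is
# reached by walking foreign-key edges, not by precomputing flat columns.
graph = {
    "CLM-9210": ["POL-4425", "region:West", "peril:Fire"],
    "POL-4425": ["endorsement:Replacement Cost"],
    "region:West": ["provider:General Contractor@185/hr", "trend:+12% YoY"],
    "peril:Fire": ["hist:avg $58.2K, 8% litigation"],
}

def gather_context(node: str, depth: int = 2) -> set[str]:
    """Breadth-limited walk collecting every entity within `depth` hops."""
    frontier, seen = {node}, set()
    for _ in range(depth):
        frontier = {n for cur in frontier for n in graph.get(cur, [])} - seen
        seen |= frontier
    return seen

print(sorted(gather_context("CLM-9210")))
```

Two hops already connect the claim to its endorsement, local contractor rates, the regional trend, and the historical outcome distribution, which is exactly the context a flat feature row cannot carry without manual engineering.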
Estimating claim severity from a flat table is like a contractor quoting a renovation based only on square footage and room count. They miss that the house has asbestos behind the walls, material prices in the area jumped 20% this quarter, and the only licensed plumber in town is booked for six weeks. The actual cost depends on relationships between the property, the local market, and the available labor pool. Relational severity models see all of those connected factors at once.
How KumoRFM solves this
Relational intelligence built for insurance data
Kumo connects FNOL details, policy coverage, claimant history, provider networks, geographic risk factors, and historical claim outcomes into a relational graph. At the moment a claim is filed, the model predicts that Claim CLM-9210 (Property Fire) will cost $52,400 based on the property's construction type, local contractor rates, the severity of recent fires in the area, and the claimant's coverage limits. The prediction updates as new information arrives (adjuster photos, repair estimates, medical reports), converging to within 10-15% of final cost within 48 hours.
From data to predictions
See the full pipeline in action
Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.
Your data
The relational tables Kumo learns from
CLAIMS
| claim_id | policy_id | peril | initial_estimate | fnol_date | description |
|---|---|---|---|---|---|
| CLM-9210 | POL-4425 | Fire | $45,000 | 2025-09-10 | Kitchen fire, partial structure damage |
| CLM-9215 | POL-4432 | Auto BI | $25,000 | 2025-09-12 | Rear-end collision, neck injury |
| CLM-9220 | POL-4440 | Water | $12,000 | 2025-09-14 | Pipe burst, basement flooding |
POLICY_DETAILS
| policy_id | coverage_limit | deductible | property_value | endorsements |
|---|---|---|---|---|
| POL-4425 | $500,000 | $2,500 | $510,000 | Replacement Cost |
| POL-4432 | $100,000/$300,000 | $500 | N/A | UM/UIM |
| POL-4440 | $350,000 | $1,000 | $380,000 | Water Backup |
HISTORICAL_CLAIMS
| peril | region | avg_final_cost | median_duration_days | litigation_rate |
|---|---|---|---|---|
| Fire | West | $58,200 | 45 | 8% |
| Auto BI | Northeast | $32,400 | 120 | 22% |
| Water | Midwest | $14,800 | 21 | 3% |
PROVIDER_COSTS
| region | provider_type | avg_rate | availability | quality_score |
|---|---|---|---|---|
| West | General Contractor | $185/hr | Low (backlog) | 4.2/5 |
| Northeast | Chiropractor | $120/visit | High | 3.8/5 |
| Midwest | Plumber | $95/hr | Medium | 4.0/5 |
Write your PQL query
Describe what to predict in 2–3 lines — Kumo handles the rest
PREDICT SUM(CLAIMS.FINAL_PAID, 0, 90, days) FOR EACH CLAIMS.CLAIM_ID WHERE CLAIMS.STATUS = 'open'
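Reading the query clause by clause (the annotations are marginal notes, not PQL syntax, and the 90-day window is chosen here for illustration):

```sql
PREDICT SUM(CLAIMS.FINAL_PAID, 0, 90, days)  -- target: total paid over the prediction window
FOR EACH CLAIMS.CLAIM_ID                     -- one prediction per claim entity
WHERE CLAIMS.STATUS = 'open'                 -- filter: only open claims are scored
```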
Prediction output
Every entity gets a score, updated continuously
| CLAIM_ID | PERIL | INITIAL_EST | KUMO_PREDICTED | CONFIDENCE | TRIAGE_TIER |
|---|---|---|---|---|---|
| CLM-9210 | Fire | $45,000 | $52,400 | High | Senior Adjuster |
| CLM-9215 | Auto BI | $25,000 | $38,700 | Medium | Litigation Watch |
| CLM-9220 | Water | $12,000 | $11,200 | High | Fast-Track |
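Downstream, the triage tier is typically a thin rule layer on top of the model output. A sketch of what that routing could look like is below; the thresholds, tier names, and `litigation_flag` parameter are assumptions for illustration, not Kumo defaults.

```python
# Illustrative triage routing on top of predicted severity. Thresholds and
# tier names are invented assumptions.
def triage(predicted_cost: float, confidence: str, litigation_flag: bool = False) -> str:
    """Route a claim to a handling tier from model outputs."""
    if litigation_flag:
        return "Litigation Watch"
    if predicted_cost >= 50_000:
        return "Senior Adjuster"
    if predicted_cost <= 15_000 and confidence == "High":
        return "Fast-Track"
    return "Standard Queue"

print(triage(52_400, "High"))                           # Senior Adjuster
print(triage(11_200, "High"))                           # Fast-Track
print(triage(38_700, "Medium", litigation_flag=True))   # Litigation Watch
```

Because the rules consume a prediction rather than raw claim fields, retraining the model improves routing without touching the routing logic.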
Understand why
Every prediction includes feature attributions — no black boxes
Claim CLM-9215 (Auto BI, rear-end collision)
Predicted: $38,700 total cost (vs $25K initial estimate)
Top contributing features
| Feature | Signal | Attribution |
|---|---|---|
| Injury type and litigation rate | Neck injury, 22% litigation rate | 28% |
| Regional medical cost trends | +12% YoY (Northeast) | 24% |
| Claimant attorney involvement signal | Likely | 21% |
| Similar claim outcome distribution | $32.4K median | 16% |
| Policy coverage limits | $100K/$300K | 11% |
Feature attributions are computed automatically for every prediction. No separate tooling required. Learn more about Kumo explainability
PQL Documentation
Learn the Predictive Query Language — SQL-like syntax for defining any prediction task in 2–3 lines.
Python SDK
Integrate Kumo predictions into your pipelines. Train, evaluate, and deploy models programmatically.
Explainability Docs
Understand feature attributions, model evaluation metrics, and how to build trust with stakeholders.
Frequently asked questions
Common questions about claims severity prediction
How accurate is AI at predicting insurance claim costs?
Graph-based AI models predict claim severity within 10-15% of final cost within 48 hours of FNOL, compared to 30-40% error for traditional adjuster estimates. Accuracy improves as claim details develop (adjuster photos, repair estimates, medical reports). For high-frequency, low-complexity claims, accuracy reaches 5-8% error.
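Error figures like these are typically measured as mean absolute percentage error (MAPE) of the reserve against the final paid amount. A sketch with invented numbers:

```python
# MAPE of initial reserves against final paid amounts. Numbers are
# invented for illustration.
def mape(predicted: list[float], actual: list[float]) -> float:
    """Mean absolute percentage error relative to actual outcomes."""
    return sum(abs(p - a) / a for p, a in zip(predicted, actual)) / len(actual)

adjuster_estimates = [45_000, 25_000, 12_000]
final_paid         = [52_400, 38_700, 11_200]

print(round(mape(adjuster_estimates, final_paid), 3))
```

The same metric applies identically to model predictions, which is what makes the adjuster-vs-model comparison in the answer above apples-to-apples.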
What is FNOL triage and how does AI improve it?
First Notice of Loss (FNOL) triage is the process of routing new claims to the right handler based on complexity and expected cost. AI improves triage by predicting severity and complexity at the moment the claim is filed, routing simple claims to fast-track processing and flagging complex claims for senior adjusters or litigation watch. This cuts average handling time by 60%.
How does claims severity prediction affect insurance reserves?
Inaccurate reserves create two problems: under-reserving leads to balance-sheet surprises and regulatory scrutiny, while over-reserving ties up capital that could be invested or returned to policyholders. AI-driven severity prediction improves reserve accuracy by 30-40%, freeing $20-50B in unnecessary capital industry-wide (AM Best estimates).
Can AI predict litigation risk on insurance claims?
Yes. Graph-based models connect claim characteristics with jurisdiction-specific litigation rates, attorney involvement patterns, and historical verdict data. For auto BI claims, the model identifies that neck injuries in certain jurisdictions with specific medical providers have 3-4x the litigation rate. Early litigation flagging allows insurers to set appropriate reserves and assign specialized handlers.
Bottom line: Improve reserve accuracy by 30-40% at FNOL and triage claims 60% faster, saving $200-400M annually in capital efficiency and claims handling costs for a top-20 insurer.
One Platform. One Model. Infinite Predictions.
KumoRFM
Relational Foundation Model
Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.
For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.




