How is AI used in insurance?

AI is used across the insurance value chain: underwriting (risk assessment and pricing), claims management (triage, settlement prediction, subrogation identification), fraud detection ($80B+ annual problem in the US alone), customer retention (lapse prediction and intervention), and distribution (lead scoring and cross-sell). The highest-ROI applications involve prediction tasks on relational data: policies, claims, claimants, providers, adjusters, and agents form a rich graph that carries predictive signals flat models miss.

How does graph-based AI improve insurance fraud detection?

Insurance fraud rings involve coordinated actors: claimants, providers, attorneys, and adjusters who work together across multiple claims. Flat models scoring individual claims miss these coordinated patterns. Graph-based AI analyzes the full claim-provider-claimant-attorney network, detecting clusters of connected entities with unusual patterns: the same provider treating claimants who all use the same attorney, claimants filing similar injuries at the same time through different agents, or providers billing for treatments that overlap in ways consistent with staged accidents.

What is the cost of insurance fraud in the United States?

The Coalition Against Insurance Fraud estimates that insurance fraud costs $80 billion annually in the US across all lines (property, casualty, health, auto, workers' comp). The FBI estimates that the average US family pays $400-700 per year in increased premiums due to fraud. For insurers, fraud accounts for 5-10% of total claims cost, making it the single largest source of unnecessary loss.

How can AI improve insurance underwriting?

Traditional underwriting uses application data, credit scores, and actuarial tables. Graph-based AI adds relational signals: the risk profiles of similar policyholders, claims patterns of providers in the applicant's network, geographic clustering of loss events, and correlation patterns between policy features and claims outcomes. This enables more precise risk segmentation, reducing adverse selection by 10-20% and improving loss ratios by 2-5 points.

What is claims triage and how does AI help?

Claims triage is the process of classifying incoming claims by complexity and routing them to appropriate handlers. Simple claims can be auto-adjudicated. Complex claims need senior adjusters. Potentially fraudulent claims need SIU review. AI models predict claim complexity, expected settlement amount, litigation likelihood, and fraud probability at the time of first notice of loss, enabling immediate routing that reduces cycle times by 30-50% and improves customer satisfaction for straightforward claims.

AI in Insurance: Claims, Underwriting, and Fraud Detection | Kumo.ai

Insurance fraud costs the US economy $80 billion annually, according to the Coalition Against Insurance Fraud. That is $80 billion in fictitious claims, inflated injuries, staged accidents, and provider billing schemes that flow straight through to higher premiums for everyone else. The FBI estimates the average American family pays $400-700 per year in excess premiums to cover it.

Every major insurer runs fraud detection models. They flag suspicious claims based on features like claim amount, injury type, time since policy inception, and claimant history. These models catch some fraud. They miss the fraud that costs the most: organized rings where providers, claimants, attorneys, and sometimes adjusters coordinate across dozens of claims.

The difference between catching individual fraudulent claims and catching fraud rings is the difference between recovering thousands and recovering millions. And the only way to see a ring is to see the graph.

claims — sample auto insurance data

claim_id	policy_id	claimant	injury_type	amount	provider	attorney
CL-501	POL-220	J. Martinez	Whiplash	$32,000	Dr. A. Shah	Law Office Chen
CL-502	POL-318	R. Thompson	Whiplash	$28,500	Dr. A. Shah	Law Office Chen
CL-503	POL-445	K. Williams	Soft tissue	$35,200	Dr. B. Patel	Law Office Chen
CL-504	POL-112	M. Garcia	Whiplash	$31,800	Dr. A. Shah	Law Office Chen
CL-505	POL-667	D. Brown	Back pain	$29,400	Dr. B. Patel	Self

Claims CL-501 through CL-504 share the same attorney, and three of them (CL-501, CL-502, CL-504) share the same provider. All four involve soft-tissue injuries filed within a 60-day window. Each is individually plausible. As a network, they look like a fraud ring.
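The shared-entity counts in the sample table can be checked with a few lines of Python. This is a minimal sketch: the rows are copied from the table above, the helper name shared_counts is hypothetical, and treating "Self" as "no attorney" is an assumption.

```python
from collections import Counter

# Rows copied from the sample claims table (amounts and policy IDs omitted).
claims = [
    {"claim_id": "CL-501", "provider": "Dr. A. Shah", "attorney": "Law Office Chen"},
    {"claim_id": "CL-502", "provider": "Dr. A. Shah", "attorney": "Law Office Chen"},
    {"claim_id": "CL-503", "provider": "Dr. B. Patel", "attorney": "Law Office Chen"},
    {"claim_id": "CL-504", "provider": "Dr. A. Shah", "attorney": "Law Office Chen"},
    {"claim_id": "CL-505", "provider": "Dr. B. Patel", "attorney": "Self"},
]

def shared_counts(rows, column, ignore=("Self",)):
    """Count claims per value in `column`, keeping values shared by 2+ claims."""
    counts = Counter(r[column] for r in rows if r[column] not in ignore)
    return {value: n for value, n in counts.items() if n >= 2}

attorney_counts = shared_counts(claims, "attorney")
provider_counts = shared_counts(claims, "provider")
print(attorney_counts)  # {'Law Office Chen': 4}
print(provider_counts)  # {'Dr. A. Shah': 3, 'Dr. B. Patel': 2}
```

Counting alone is not a fraud signal, since a busy law office or clinic will legitimately appear on many claims. It is only the first step toward looking at how the shared entities combine.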

claims_triage — AI-predicted outcomes at FNOL

claim_id	predicted_settlement	complexity	fraud_prob	litigation_risk	recommended_action
CL-501	$31,200	High	0.89	0.78	Route to SIU
CL-502	$27,800	High	0.86	0.72	Route to SIU
CL-503	$34,100	High	0.91	0.81	Route to SIU
CL-505	$18,200	Low	0.06	0.12	Auto-adjudicate

Graph-based triage identifies the ring members immediately and routes them to SIU, while auto-adjudicating the legitimate claim.

Why insurance data is naturally relational

An insurance company's data model typically spans 20-40 tables. Policies. Policyholders. Claims. Claimants (who may differ from policyholders). Injuries and diagnoses. Treatments. Providers (doctors, hospitals, repair shops). Adjusters. Agents. Attorneys. Payments. Reserves. Reinsurance contracts. Geographic risk zones.

Every claim connects to a web of entities. A single auto accident claim might involve: the policyholder, 2 claimants, 3 medical providers, 1 auto repair shop, 1 attorney, 1 adjuster, 8 treatment events, and 12 payment transactions. That is 29 entities across 8 tables, connected through foreign keys that define who-treated-whom, who-represents-whom, and who-paid-whom.

Traditional fraud models collapse this structure into a single row: "Claim #12345: amount $45,000, injury type whiplash, time since inception 89 days, claimant has 2 prior claims." The entire relational context is lost.

Fraud rings: the invisible threat

A fraud ring in auto insurance might work like this. A group of 20 people stage low-speed collisions in parking lots. Each files a separate claim with a different insurer. They all visit the same 3 medical providers, who bill for extensive treatment of soft-tissue injuries that cannot be verified by imaging. The same attorney represents all 20 claimants. The same tow company handles all the vehicles.

Each individual claim looks plausible. The amounts are within normal ranges. The injuries are consistent with the accidents. The providers are licensed. A flat model scoring each claim independently rates them as medium-risk at worst.

In the graph, the pattern is unmistakable. Twenty claimants connected to the same three providers, the same attorney, and the same tow company, with claims filed within a 60-day window. The hub-and-spoke topology screams organized fraud. But you can only see it if you look at the relationships.
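One simple way to surface this kind of structure is to group claims that are linked, directly or transitively, through a shared entity. The sketch below does that with a union-find over a handful of made-up claims. The claim IDs, entity codes, and field names are all hypothetical, and this is an illustration of the idea, not the method any particular product uses.

```python
from collections import defaultdict

def find(parent, x):
    # Union-find lookup with path halving.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def claim_clusters(claims, entity_fields):
    """Group claims connected through any shared entity value."""
    parent = {c["claim_id"]: c["claim_id"] for c in claims}
    first_seen = {}  # (field, value) -> first claim_id that referenced it
    for c in claims:
        for field in entity_fields:
            key = (field, c[field])
            if key in first_seen:
                root_a = find(parent, c["claim_id"])
                root_b = find(parent, first_seen[key])
                parent[root_a] = root_b  # merge the two groups
            else:
                first_seen[key] = c["claim_id"]
    groups = defaultdict(list)
    for c in claims:
        groups[find(parent, c["claim_id"])].append(c["claim_id"])
    return sorted(groups.values(), key=len, reverse=True)

# Hypothetical claims: four share an attorney and tow company, one is unrelated.
sample = [
    {"claim_id": "A1", "provider": "P1", "attorney": "T1", "tow": "W1"},
    {"claim_id": "A2", "provider": "P2", "attorney": "T1", "tow": "W1"},
    {"claim_id": "A3", "provider": "P3", "attorney": "T1", "tow": "W1"},
    {"claim_id": "A4", "provider": "P1", "attorney": "T1", "tow": "W1"},
    {"claim_id": "B1", "provider": "P9", "attorney": "T7", "tow": "W5"},
]
clusters = claim_clusters(sample, ("provider", "attorney", "tow"))
print(clusters)  # [['A1', 'A2', 'A3', 'A4'], ['B1']]
```

A cluster on its own only says the claims are connected. In practice it would be combined with timing, injury type, and the rarity of each shared entity before anything is flagged.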

Provider fraud in health insurance

Health insurance fraud follows similar relational patterns. A provider billing for services not rendered, upcoding procedures, or unbundling bundled services creates anomalies that are visible in the provider-patient-treatment-diagnosis graph.

A provider with 200 patients who all received the same expensive diagnostic test within 30 days is suspicious. A provider whose patients are referred exclusively by two other providers, and whose billing volume spiked 300% in 6 months, is more suspicious. A provider whose patient overlap with known fraudulent providers exceeds statistical norms is a strong lead.
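The patient-overlap signal can be made concrete with a set-similarity measure. The sketch below uses Jaccard similarity, which is one reasonable choice and an assumption here, since the text does not name a specific measure. Provider names, patient IDs, and the helper names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity: size of intersection over size of union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def max_overlap_with_flagged(patients_by_provider, provider, flagged):
    """Highest patient overlap between `provider` and any already-flagged provider."""
    own = patients_by_provider[provider]
    return max(
        (jaccard(own, patients_by_provider[f]) for f in flagged if f != provider),
        default=0.0,
    )

# Hypothetical patient sets per provider.
patients_by_provider = {
    "suspect": {"p1", "p2", "p3", "p4"},
    "flagged_1": {"p2", "p3", "p4", "p5"},
    "clean": {"p8", "p9"},
}
flagged = {"flagged_1"}

overlap_suspect = max_overlap_with_flagged(patients_by_provider, "suspect", flagged)
overlap_clean = max_overlap_with_flagged(patients_by_provider, "clean", flagged)
print(overlap_suspect, overlap_clean)  # 0.6 0.0
```

What counts as exceeding "statistical norms" depends on the book of business. A threshold would have to be calibrated against the overlap distribution among ordinary providers in the same specialty and region.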

The National Health Care Anti-Fraud Association estimates that health care fraud costs $68 billion annually in the US, representing 3-10% of total health care spending. Graph-based detection can identify suspicious provider networks 2-3x faster than rule-based systems while reducing false positive rates by 40-50%.

Underwriting: precision risk pricing

Underwriting has historically relied on actuarial tables: age, location, vehicle type, driving record, credit score. These features are predictive, but they treat each applicant as an independent data point. Graph-based underwriting adds the relational dimension.

applicants — identical actuarial profiles

applicant	age	vehicle	zip_code	credit_score	driving_record
Applicant A	35	2023 Honda Accord	97201	720	Clean (5 years)
Applicant B	34	2023 Honda Accord	97205	715	Clean (6 years)

Virtually identical profiles. Traditional underwriting assigns them the same risk tier and premiums within $20/year of each other.

geographic_risk_graph — what the relational model sees

metric	Applicant A (zip 97201)	Applicant B (zip 97205)
Auto theft claims (5-year)	2 claims	47 claims
Collision claims (5-year)	8 claims	31 claims
Avg claim severity	$4,200	$11,800
Nearby repair shop fraud rate	1%	12%
Similar policyholder loss ratio	52%	89%

Applicant B's zip code has roughly 23x the auto theft claims and nearly 4x the collision claims of Applicant A's. Policyholders with similar profiles in that zip have an 89% loss ratio versus 52%. The expected loss differs by 3-5x despite near-identical flat features.
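The per-metric ratios follow from the table by simple division. The sketch below reproduces them from the figures shown. The dictionary keys are illustrative names, and the code does not attempt to derive the combined expected-loss multiple, which depends on how the signals are weighted.

```python
# (Applicant A, Applicant B) values copied from the geographic risk table.
metrics = {
    "auto_theft_claims_5y": (2, 47),
    "collision_claims_5y": (8, 31),
    "avg_claim_severity_usd": (4200, 11800),
    "repair_shop_fraud_rate_pct": (1, 12),
    "similar_policyholder_loss_ratio_pct": (52, 89),
}

# Ratio of B to A for each metric, rounded to one decimal place.
ratios = {name: round(b / a, 1) for name, (a, b) in metrics.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}x")
```

Run as written, this gives 23.5x for theft claims, 3.9x for collision claims, 2.8x for average severity, 12.0x for nearby repair shop fraud rate, and 1.7x for loss ratio.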

Beyond geography, relational underwriting considers: the claims history of similar policyholders (not just the applicant), the risk profiles of providers in the applicant's likely treatment network, correlation patterns between policy features and claims outcomes across the existing book, and economic indicators from connected entities (employers, industries, regions).

Traditional insurance AI

Scores individual claims with flat features
Underwriting uses actuarial tables and credit scores
Fraud detection catches individual bad actors
Claims triage based on simple rules
5-10% of claims cost lost to undetected fraud

Graph-based insurance AI

Analyzes claim-provider-claimant-attorney network
Underwriting uses relational risk signals
Fraud detection catches organized rings
Claims triage predicts complexity, fraud, and litigation
40-60% more fraud detected with fewer false positives

PQL Query

PREDICT fraud_ring_probability
FOR EACH claims.claim_id
WHERE claims.filed_date > '2025-01-01'

One query scores every new claim against the full claim-provider-claimant-attorney network. The model detects ring structures, shared entities, and coordinated filing patterns.

Output

claim_id	fraud_ring_prob	ring_size	shared_entities	action
CL-501	0.89	4	Provider + Attorney + timing	SIU investigation
CL-502	0.86	4	Provider + Attorney + timing	SIU investigation
CL-503	0.91	4	Attorney + timing + injury type	SIU investigation
CL-504	0.88	4	Provider + Attorney + timing	SIU investigation
CL-505	0.06	0	No network anomalies	Auto-adjudicate

Claims management: speed and accuracy

The claims process is where insurers deliver on their promise. Speed matters: customers who wait more than 30 days for claims resolution are 2.5x more likely to switch carriers at renewal. Accuracy matters: underpayment leads to litigation, overpayment erodes margins.

Intelligent triage

When a claim arrives at first notice of loss (FNOL), graph-based models can predict: expected settlement amount (to within 15-20% of the final figure at FNOL), claim complexity (simple, moderate, complex), litigation likelihood, fraud probability, and subrogation potential. These predictions enable immediate routing: simple claims go to auto-adjudication or junior adjusters, complex claims go to senior adjusters, suspicious claims go to SIU.
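The routing step that sits on top of these predictions can be sketched as a small rule function. The thresholds, the handling of moderate claims, and the class and function names below are assumptions made for illustration. An insurer would set them from its own loss and capacity data.

```python
from dataclasses import dataclass

@dataclass
class TriagePrediction:
    claim_id: str
    complexity: str        # "Low", "Moderate", or "High"
    fraud_prob: float      # model output in [0, 1]
    litigation_risk: float # model output in [0, 1]

FRAUD_THRESHOLD = 0.80       # hypothetical cut-off for SIU referral
LITIGATION_THRESHOLD = 0.50  # hypothetical cut-off for senior handling

def route(p: TriagePrediction) -> str:
    """Map model predictions at FNOL to a handling queue."""
    if p.fraud_prob >= FRAUD_THRESHOLD:
        return "Route to SIU"
    if p.complexity == "High" or p.litigation_risk >= LITIGATION_THRESHOLD:
        return "Senior adjuster"
    if p.complexity == "Moderate":
        return "Junior adjuster"
    return "Auto-adjudicate"

# Two rows from the claims_triage sample table.
cl_501 = TriagePrediction("CL-501", "High", 0.89, 0.78)
cl_505 = TriagePrediction("CL-505", "Low", 0.06, 0.12)
print(route(cl_501))  # Route to SIU
print(route(cl_505))  # Auto-adjudicate
```

Checking fraud probability first means a suspicious claim is never auto-adjudicated just because it also looks simple.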

Insurers implementing AI-driven triage report 30-50% reductions in average cycle time for straightforward claims and 15-25% reductions in total claims cost through better reserve accuracy and faster settlement.

Settlement prediction

Predicting the final settlement amount at FNOL determines reserve accuracy, which directly affects financial reporting and reinsurance costs. Traditional models use claim characteristics (injury type, vehicle damage, jurisdiction). Graph-based models add the outcomes of similar claims with the same providers, attorneys, and adjusters, providing 20-30% more accurate reserve estimates.
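As a rough illustration of the kind of relational signal described here, the sketch below computes the mean final settlement of closed claims that share a provider or attorney with a new claim. It is a hand-built feature over made-up data. The field names and values are hypothetical, and a graph-based model would learn such signals rather than have them coded by hand.

```python
from statistics import mean

def related_settlement_mean(new_claim, closed_claims, fields=("provider", "attorney")):
    """Mean settlement of closed claims sharing at least one entity with new_claim.

    Returns None when no closed claim shares a provider or attorney.
    """
    related = [
        c["settlement"]
        for c in closed_claims
        if any(c[f] == new_claim[f] for f in fields)
    ]
    return mean(related) if related else None

# Hypothetical closed claims with known final settlements.
closed_claims = [
    {"provider": "P1", "attorney": "T1", "settlement": 30000},
    {"provider": "P2", "attorney": "T1", "settlement": 34000},
    {"provider": "P9", "attorney": "T7", "settlement": 12000},
]
new_claim = {"provider": "P1", "attorney": "T1"}

related_mean = related_settlement_mean(new_claim, closed_claims)
print(related_mean)  # 32000
```

A reserve model could take this value as one input alongside injury type, vehicle damage, and jurisdiction.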

The foundation model approach

Building separate models for fraud detection, underwriting optimization, claims triage, settlement prediction, and retention requires five separate ML pipelines, five separate feature engineering efforts, and five separate maintenance budgets. At a mid-size insurer, this represents $3M-8M annually in ML team and infrastructure costs.

KumoRFM connects directly to the insurer's data warehouse, understands the full policy-claim-provider-claimant schema, and answers any prediction question without task-specific engineering. The same model that detects fraud also predicts settlement amounts, scores underwriting risk, and identifies lapse-prone policyholders.

On the RelBench benchmark, KumoRFM zero-shot achieves 76.71 AUROC across classification tasks, outperforming supervised GNNs trained specifically on each task. Fine-tuning pushes accuracy to 81.14 AUROC.
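For readers less familiar with the metric, AUROC is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The figures above appear to be reported on a 0-100 scale, which is an assumption on our part. The sketch below computes AUROC on a 0-1 scale by direct pairwise comparison over made-up labels and scores.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison; ties between a positive and a negative count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = fraud) and model scores.
labels = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.1]
example_auroc = auroc(labels, scores)
print(round(example_auroc, 4))  # 0.8333
```

The pairwise form is quadratic in the number of examples, so it suits small checks like this one. Production evaluation would normally use a rank-based implementation.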

For insurers, where the combined ratio determines profitability and every point of loss ratio improvement flows directly to the bottom line, the ability to predict better and faster across all lines of business simultaneously is not a technology upgrade. It is a competitive advantage that compounds every quarter.

Key Takeaways

1. Insurance fraud costs $80 billion annually in the US. Organized fraud rings account for $5-8 billion in auto insurance alone and operate for years because individual claims pass traditional detection thresholds.
2. Graph-based detection identifies ring structures (shared providers, attorneys, tow companies, timing clusters) in days rather than the 12-18 months typical of manual SIU investigation.
3. Claims triage powered by graph AI predicts settlement amount, complexity, fraud probability, and litigation risk at FNOL, reducing cycle times by 30-50% and total claims cost by 15-25%.
4. Underwriting models that incorporate relational risk signals (geographic claim density, provider network quality, policyholder similarity) improve loss ratios by 2-5 points.
5. One foundation model serves fraud detection, claims triage, settlement prediction, underwriting, and retention from a single platform. For insurers where combined ratio determines profitability, this is a competitive inflection point.
