Regression · Risk Scoring

Underwriting Risk Assessment

What is the true risk for this applicant?



A real-world example

Insurers lose $15-30B annually from adverse selection (underpricing risky applicants) and over-conservatism (rejecting profitable applicants), according to Deloitte. Traditional underwriting models use 15-30 rating factors from the application and third-party data, but miss relational signals: an applicant's neighborhood claims history, the correlation between their vehicle type and local theft rates, or the interaction between their occupation and commute pattern. A 5-point improvement in loss ratio on a $10B book translates to $500M in annual savings.

Quick answer

Graph-based AI improves underwriting risk assessment by connecting applicant profiles with property records, geographic risk factors, claims history, and neighborhood patterns. Traditional GLMs use 15-30 rating factors and miss relational signals like how a zip code's pipe-burst claims correlate with a specific building vintage. Relational models produce risk scores that are 30-40% more predictive, improving loss ratios by 5-10 points on a given book.

Approaches compared

4 ways to solve this problem

1. Generalized Linear Models (GLMs)

The actuarial standard for decades. GLMs use rating factors (age, credit, territory, vehicle type) with multiplicative factor structures. Well-understood by regulators and easy to file.

Best for

Regulatory compliance and rate filings where interpretability is required by state DOIs.

Watch out for

GLMs assume independence between rating factors. They miss interactions like 'older roof + wildfire zone + wood frame' that compound risk non-linearly. Adding interaction terms manually is slow and limited.
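To make the independence assumption concrete, here is a toy numeric sketch (every relativity below is invented for illustration, not a filed rate): a multiplicative GLM can only ever charge the product of each factor's standalone relativity, even when the combined profile compounds risk well beyond that product.

```python
# Toy multiplicative rating structure: every factor contributes an
# independent relativity. All numbers are invented for illustration.
base_rate = 1000.0
relativities = {"old_roof": 1.15, "wildfire_zone": 1.25, "wood_frame": 1.10}

glm_premium = base_rate
for r in relativities.values():
    glm_premium *= r

# The GLM charges the product of standalone relativities, even when
# the three factors together compound risk well beyond that product.
print(round(glm_premium, 2))  # 1581.25
```

If observed losses for the combined "older roof + wildfire zone + wood frame" profile run 2x this level, the filed structure has no term that can express it without a manually added interaction factor.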

2. Gradient-Boosted Trees (XGBoost / LightGBM)

Tree-based models that capture non-linear interactions between rating factors. Used as a supplement to GLMs, often feeding into the GLM structure as an additional factor.

Best for

Capturing non-linear interactions between known rating factors. Good improvement over pure GLMs with modest engineering effort.

Watch out for

Still operates on flat feature vectors. Cannot see neighborhood-level patterns, property-record correlations, or claims-history network effects without heavy manual feature engineering.
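For contrast, this is the kind of manual cross-table feature engineering a flat-feature model depends on (tiny synthetic tables with invented values, loosely mirroring the schema shown later on this page): every neighborhood-level signal must be hand-aggregated and joined back onto the applicant row before the tree model can see it.

```python
# Sketch of manual cross-table feature engineering for a flat-feature model.
# Tables and values are tiny synthetic stand-ins, invented for illustration.
import pandas as pd

applicants = pd.DataFrame({
    "applicant_id": ["APP-3301", "APP-3302"],
    "zip_code": ["90210", "10001"],
})
claims = pd.DataFrame({
    "zip_code": ["90210", "90210", "10001"],
    "paid": [8200, 12000, 3000],
})

# Hand-built neighborhood signal: average paid claim per zip code,
# aggregated and then joined back onto each applicant row.
zip_avg = (claims.groupby("zip_code", as_index=False)["paid"].mean()
           .rename(columns={"paid": "zip_avg_paid"}))
features = applicants.merge(zip_avg, on="zip_code", how="left")
print(features)
```

Each such feature is one a data scientist had to think of in advance; combinations nobody anticipated never make it into the feature vector.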

3. Third-Party Enrichment Scores

Vendors like LexisNexis, Verisk, and TransUnion provide pre-built risk scores that incorporate credit, claims history, and property data. Drop-in scores that require minimal modeling work.

Best for

Quick lift on top of existing models when you need incremental accuracy without building new infrastructure.

Watch out for

Black-box scores that every competitor also buys. No differentiation, and you cannot customize them for your specific book mix or geographic footprint.

4. Relational Deep Learning (Kumo's Approach)

Connects applicant data with property records, geographic risk, neighborhood claims patterns, and historical outcomes in a single relational graph. Learns cross-table interactions automatically without manual feature engineering.

Best for

Finding hidden risk signals that span multiple data sources: zip-code weather trends + roof age + construction type + neighborhood claims density.

Watch out for

Regulatory acceptance varies by state. Some DOIs require rate-factor transparency that may need a GLM wrapper around the relational model's output.

Key metric: SAP benchmarked relational models at 91% accuracy vs. 75% for XGBoost on structured prediction tasks. A 5-point loss ratio improvement on a $10B book saves $500M annually.

Why relational data changes the answer

Traditional underwriting models flatten each applicant into a single row of 15-30 features. They can see that Sarah Mitchell is 42, has A-tier credit, and lives in 90210. But they cannot see that her specific zip code had a 3x spike in wildfire claims last year, her roof is 18 years old (past the typical replacement threshold for her construction type), and homes in her neighborhood with similar characteristics filed water-damage claims at 2x the regional average. These signals live in different tables: property records, geographic risk databases, and historical claims. A flat model needs a data scientist to manually engineer each cross-table feature, and they inevitably miss combinations they did not think to create.

Relational learning connects these tables directly. The model walks from applicant to property to zip code to historical claims, discovering risk patterns across the full data graph. It finds that the combination of aging roof + wildfire zone + wood-frame construction + rising neighborhood claims density creates a risk profile that is 2x what the individual factors suggest in isolation. This is not theoretical: SAP benchmarked relational models at 91% accuracy versus 75% for XGBoost on structured prediction tasks. The result is sharper risk segmentation that prices accurately for both high-risk applicants (rate up or decline) and low-risk applicants (write more at competitive rates).
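The "walk" from applicant to property to zip code can be pictured with a toy dictionary-based graph (all IDs and values invented for illustration). The key difference is that a hand-coded traversal like this follows fixed hops, whereas a relational model learns which hops carry signal.

```python
# Toy illustration of walking a relational graph from an applicant node
# out to property and neighborhood signals. IDs and values are invented.
applicants = {"APP-3301": {"property_id": "P-1", "zip_code": "90210"}}
properties = {"P-1": {"roof_age": 18, "construction": "wood frame"}}
zip_risk = {"90210": {"weather_risk": "Wildfire: High", "claims_per_1000": 42}}

def applicant_context(applicant_id: str) -> dict:
    """Gather cross-table context by following foreign keys."""
    node = applicants[applicant_id]
    return {
        **properties[node["property_id"]],  # applicant -> property
        **zip_risk[node["zip_code"]],       # applicant -> zip code
    }

print(applicant_context("APP-3301"))
```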

Underwriting from a flat applicant table is like evaluating a house for purchase using only the listing sheet. You see bedrooms, square footage, and asking price. But you cannot see that the house next door just had a foundation crack, the neighborhood's water main is 60 years old, and the previous owner filed two undisclosed insurance claims. Relational underwriting is the equivalent of hiring a local inspector who knows the neighborhood's history, the builder's track record, and every claim on the block.

How KumoRFM solves this

Relational intelligence built for insurance data

Kumo connects applicant profiles, claims history, policy data, geographic risk factors, vehicle databases, and external data sources into a relational graph. The model discovers that Applicant APP-3301 lives in a zip code where pipe-burst claims spiked 3x last winter, has a roof older than 15 years (from property records), and owns a breed of dog associated with 2.5x liability claim frequency. These cross-table signals produce a risk score that is 30-40% more predictive than traditional generalized linear models (GLMs), catching both overpriced low-risk and underpriced high-risk applicants.

From data to predictions

See the full pipeline in action

Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.

Step 1: Your data

The relational tables Kumo learns from

APPLICANTS

applicant_id | name           | age | occupation        | credit_tier | zip_code
APP-3301     | Sarah Mitchell | 42  | Teacher           | A           | 90210
APP-3302     | David Kim      | 35  | Software Engineer | A+          | 10001
APP-3303     | Robert Brown   | 58  | Contractor        | B           | 33101

PROPERTY_DATA

applicant_id | property_type | year_built | roof_age | sqft  | replacement_cost
APP-3301     | Single Family | 1998       | 18 years | 2,400 | $420,000
APP-3302     | Condo         | 2019       | 6 years  | 1,100 | $280,000
APP-3303     | Single Family | 1985       | 12 years | 3,200 | $510,000

ZIP_CODE_RISK

zip_code | weather_risk    | theft_index | claims_per_1000 | trend
90210    | Wildfire: High  | Low         | 42              | Increasing
10001    | Flood: Medium   | High        | 38              | Stable
33101    | Hurricane: High | Medium      | 55              | Increasing

CLAIMS_HISTORY

applicant_id | prior_claims_5yr | total_paid | largest_claim | claim_type
APP-3301     | 1                | $8,200     | $8,200        | Water Damage
APP-3302     | 0                | $0         | $0            | N/A
APP-3303     | 3                | $45,000    | $22,000       | Wind Damage
Step 2: Write your PQL query

Describe what to predict in 2–3 lines — Kumo handles the rest

PQL
PREDICT SUM(CLAIMS_HISTORY.TOTAL_PAID, 0, 12, months)
FOR EACH APPLICANTS.APPLICANT_ID
Step 3: Prediction output

Every entity gets a score, updated continuously

APPLICANT_ID | NAME           | KUMO_RISK_SCORE | TRADITIONAL_SCORE | EXPECTED_LOSS | RECOMMENDATION
APP-3301     | Sarah Mitchell | 0.72            | 0.35              | $14,200       | Rate Up 40%
APP-3302     | David Kim      | 0.18            | 0.22              | $2,800        | Rate Down 15%
APP-3303     | Robert Brown   | 0.85            | 0.68              | $28,500       | Decline or Restrict
Step 4: Understand why

Every prediction includes feature attributions — no black boxes

Applicant APP-3301 (Sarah Mitchell)

Predicted: Risk score 0.72, expected loss $14,200

Top contributing features

Feature                                  | Value           | Attribution
Roof age exceeding replacement threshold | 18 years        | 27%
Zip code wildfire risk increasing        | High, +15% YoY  | 24%
Prior water damage claim                 | $8,200 in 5yr   | 20%
Property age and construction type       | 1998 wood frame | 17%
Neighborhood claims density              | 42 per 1,000    | 12%

Feature attributions are computed automatically for every prediction; no separate tooling is required. Learn more about Kumo explainability.

Frequently asked questions

Common questions about underwriting risk assessment

How does AI improve insurance underwriting accuracy?

AI improves underwriting accuracy by incorporating signals from connected data sources that traditional rating factors miss. Instead of evaluating each applicant in isolation, graph-based models connect applicant profiles with property records, geographic risk trends, neighborhood claims patterns, and historical outcomes to produce risk scores that are 30-40% more predictive than GLMs alone.

What is the ROI of AI underwriting in insurance?

A 5-point improvement in loss ratio on a $10B book translates to $500M in annual savings. Additionally, more accurate risk segmentation allows insurers to write 8-12% more business at profitable rates by identifying low-risk applicants that traditional models over-price.
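The headline arithmetic, spelled out: one loss-ratio "point" is one percentage point of earned premium, so 5 points on a $10B book is 5% of $10B.

```python
# ROI arithmetic: loss-ratio points are percentage points of earned premium.
book_premium = 10_000_000_000   # $10B book of earned premium
improvement_points = 5          # 5-point loss ratio improvement
annual_savings = book_premium * improvement_points / 100
print(f"${annual_savings / 1e6:,.0f}M")  # $500M
```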

Can AI underwriting models satisfy state insurance regulators?

Yes, but implementation matters. Most state DOIs require rate-factor transparency. The practical approach is to use the relational model's output as an additional rating factor within a filed GLM structure. This gives you the predictive power of graph-based learning with the interpretability regulators require.
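A minimal sketch of that wrapper pattern (bucket boundaries and relativities are invented for illustration, not actual filed values): the continuous relational score is discretized into a small set of filable relativities, and the GLM multiplies the result in like any other rating factor.

```python
# Sketch of the GLM-wrapper pattern: a relational model's 0-1 score is
# discretized into a filable rate relativity. All numbers are invented
# for illustration, not actual filed values.
def score_to_relativity(score: float) -> float:
    """Map a 0-1 relational risk score to a discrete rating factor."""
    if score < 0.3:
        return 0.90   # preferred tier
    if score < 0.6:
        return 1.00   # standard tier
    return 1.35       # surcharged tier

base_premium = 1200.0
premium = base_premium * score_to_relativity(0.72)  # e.g. a 0.72 risk score
print(premium)  # 1620.0
```

The buckets, not the underlying model, are what gets filed, which keeps the rate structure transparent to the regulator while the score itself stays model-driven.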

How does graph-based underwriting differ from traditional actuarial models?

Traditional actuarial models (GLMs) use multiplicative rating factors that assume independence between variables. Graph-based models learn non-linear interactions across connected data: property age interacting with neighborhood claims trends, construction type interacting with local contractor costs, and applicant behavior interacting with competitive market dynamics. These cross-table patterns are invisible to GLMs.

Bottom line: Improve loss ratios by 5-10 points while writing 8-12% more profitable business, translating to $500M+ in annual savings on a $10B book.

Topics covered

underwriting risk AI · insurance underwriting model · risk assessment machine learning · loss ratio improvement · graph neural network underwriting · KumoRFM · relational deep learning insurance · predictive underwriting · insurance risk scoring · actuarial AI model

One Platform. One Model. Infinite Predictions.

KumoRFM

Relational Foundation Model

Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.

For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.

Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.