Underwriting Risk Assessment
“What is the true risk for this applicant?”

A real-world example
What is the true risk for this applicant?
Insurers lose $15-30B annually from adverse selection (underpricing risky applicants) and over-conservatism (rejecting profitable applicants), according to Deloitte. Traditional underwriting models use 15-30 rating factors from the application and third-party data, but miss relational signals: an applicant's neighborhood claims history, the correlation between their vehicle type and local theft rates, or the interaction between their occupation and commute pattern. A 5-point improvement in loss ratio on a $10B book translates to $500M in annual savings.
Quick answer
Graph-based AI improves underwriting risk assessment by connecting applicant profiles with property records, geographic risk factors, claims history, and neighborhood patterns. Traditional GLMs use 15-30 rating factors and miss relational signals like how a zip code's pipe-burst claims correlate with a specific building vintage. Relational models produce risk scores that are 30-40% more predictive, improving loss ratios by 5-10 points on a given book.
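The loss-ratio arithmetic behind these figures is straightforward; a minimal sketch (the $10B book and 5-point improvement are the illustrative figures from the text, not real book data):

```python
# Back-of-the-envelope savings from a loss-ratio improvement.
# Figures are the illustrative ones from this page, not real book data.

def annual_savings(earned_premium: float, loss_ratio_improvement_pts: float) -> float:
    """Savings = earned premium x improvement expressed as a fraction."""
    return earned_premium * (loss_ratio_improvement_pts / 100)

book = 10_000_000_000  # $10B earned premium
print(annual_savings(book, 5))   # 5-point improvement -> 500000000.0 ($500M)
print(annual_savings(book, 10))  # 10-point improvement -> 1000000000.0 ($1B)
```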
Approaches compared
4 ways to solve this problem
1. Generalized Linear Models (GLMs)
The actuarial standard for decades. GLMs use rating factors (age, credit, territory, vehicle type) with multiplicative factor structures. Well-understood by regulators and easy to file.
Best for
Regulatory compliance and rate filings where interpretability is required by state DOIs.
Watch out for
GLMs assume independence between rating factors. They miss interactions like 'older roof + wildfire zone + wood frame' that compound risk non-linearly. Adding interaction terms manually is slow and limited.
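To make the multiplicative structure concrete, here is a minimal sketch of how a GLM-style rating plan turns factors into a premium. The base rate and relativities are invented for illustration; real filed factors come from the actuarial rate filing.

```python
# Minimal multiplicative rating plan, GLM-style.
# Base rate and relativities are invented for illustration only.

BASE_RATE = 1_200.0  # hypothetical annual base premium

RELATIVITIES = {
    "age_42": 0.95,
    "credit_tier_A": 0.85,
    "territory_90210": 1.30,
    "roof_age_18": 1.20,
}

def rated_premium(factors):
    premium = BASE_RATE
    for f in factors:
        premium *= RELATIVITIES[f]  # each factor multiplies independently
    return round(premium, 2)

# Note: there is no term for "old roof AND wildfire territory AND wood frame" --
# any compounding interaction must be filed as an explicit extra factor.
print(rated_premium(["age_42", "credit_tier_A", "territory_90210", "roof_age_18"]))
```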
2. Gradient-Boosted Trees (XGBoost / LightGBM)
Tree-based models that capture non-linear interactions between rating factors. Used as a supplement to GLMs, often feeding into the GLM structure as an additional factor.
Best for
Capturing non-linear interactions between known rating factors. Good improvement over pure GLMs with modest engineering effort.
Watch out for
Still operates on flat feature vectors. Cannot see neighborhood-level patterns, property-record correlations, or claims-history network effects without heavy manual feature engineering.
3. Third-Party Enrichment Scores
Vendors like LexisNexis, Verisk, and TransUnion provide pre-built risk scores that incorporate credit, claims history, and property data. Drop-in scores that require minimal modeling work.
Best for
Quick lift on top of existing models when you need incremental accuracy without building new infrastructure.
Watch out for
Black-box scores that every competitor also buys. No differentiation, and you cannot customize them for your specific book mix or geographic footprint.
4. Relational Deep Learning (Kumo's Approach)
Connects applicant data with property records, geographic risk, neighborhood claims patterns, and historical outcomes in a single relational graph. Learns cross-table interactions automatically without manual feature engineering.
Best for
Finding hidden risk signals that span multiple data sources: zip-code weather trends + roof age + construction type + neighborhood claims density.
Watch out for
Regulatory acceptance varies by state. Some DOIs require rate-factor transparency that may need a GLM wrapper around the relational model's output.
Key metric: SAP benchmarked relational models at 91% accuracy vs. 75% for XGBoost on structured prediction tasks. A 5-point loss ratio improvement on a $10B book saves $500M annually.
Why relational data changes the answer
Traditional underwriting models flatten each applicant into a single row of 15-30 features. They can see that Sarah Mitchell is 42, has A-tier credit, and lives in 90210. But they cannot see that her specific zip code had a 3x spike in wildfire claims last year, her roof is 18 years old (past the typical replacement threshold for her construction type), and homes in her neighborhood with similar characteristics filed water-damage claims at 2x the regional average. These signals live in different tables: property records, geographic risk databases, and historical claims. A flat model needs a data scientist to manually engineer each cross-table feature, and they inevitably miss combinations they did not think to create.
Relational learning connects these tables directly. The model walks from applicant to property to zip code to historical claims, discovering risk patterns across the full data graph. It finds that the combination of aging roof + wildfire zone + wood-frame construction + rising neighborhood claims density creates a risk profile that is 2x what the individual factors suggest in isolation. This is not theoretical: SAP benchmarked relational models at 91% accuracy versus 75% for XGBoost on structured prediction tasks. The result is sharper risk segmentation that prices accurately for both high-risk applicants (rate up or decline) and low-risk applicants (write more at competitive rates).
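The walk from applicant to property to zip code can be pictured as a few joins over the sample tables shown later on this page; a toy sketch (table contents abridged, and the compounding rule is invented purely to illustrate how joint signals can exceed individual ones):

```python
# Toy relational walk: applicant -> property -> zip-code risk.
# Rows mirror the sample tables on this page; the compounding check is
# invented for illustration, not an actual model.

applicants = {"APP-3301": {"zip_code": "90210"}}
properties = {"APP-3301": {"roof_age": 18, "construction": "wood frame"}}
zip_risk = {"90210": {"weather_risk": "Wildfire: High",
                      "claims_per_1000": 42, "trend": "Increasing"}}

def relational_context(app_id):
    """Join the three tables into one cross-table view of the applicant."""
    app = applicants[app_id]
    return {**app, **properties[app_id], **zip_risk[app["zip_code"]]}

ctx = relational_context("APP-3301")
compound_risk = (
    ctx["roof_age"] > 15
    and ctx["weather_risk"].startswith("Wildfire")
    and ctx["construction"] == "wood frame"
    and ctx["trend"] == "Increasing"
)
print("compounding risk pattern:", compound_risk)  # True for APP-3301
```

A flat feature vector would need each of these joins hand-engineered in advance; a relational model traverses them automatically.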
Underwriting from a flat applicant table is like evaluating a house for purchase using only the listing sheet. You see bedrooms, square footage, and asking price. But you cannot see that the house next door just had a foundation crack, the neighborhood's water main is 60 years old, and the previous owner filed two undisclosed insurance claims. Relational underwriting is the equivalent of hiring a local inspector who knows the neighborhood's history, the builder's track record, and every claim on the block.
How KumoRFM solves this
Relational intelligence built for insurance data
Kumo connects applicant profiles, claims history, policy data, geographic risk factors, vehicle databases, and external data sources into a relational graph. The model discovers that Applicant APP-3301 lives in a zip code where pipe-burst claims spiked 3x last winter, has a roof older than 15 years (from property records), and owns a breed of dog associated with 2.5x liability claim frequency. These cross-table signals produce a risk score that is 30-40% more predictive than traditional generalized linear models (GLMs), catching both overpriced low-risk and underpriced high-risk applicants.
From data to predictions
See the full pipeline in action
Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.
Your data
The relational tables Kumo learns from
APPLICANTS
| applicant_id | name | age | occupation | credit_tier | zip_code |
|---|---|---|---|---|---|
| APP-3301 | Sarah Mitchell | 42 | Teacher | A | 90210 |
| APP-3302 | David Kim | 35 | Software Engineer | A+ | 10001 |
| APP-3303 | Robert Brown | 58 | Contractor | B | 33101 |
PROPERTY_DATA
| applicant_id | property_type | year_built | roof_age | sqft | replacement_cost |
|---|---|---|---|---|---|
| APP-3301 | Single Family | 1998 | 18 years | 2,400 | $420,000 |
| APP-3302 | Condo | 2019 | 6 years | 1,100 | $280,000 |
| APP-3303 | Single Family | 1985 | 12 years | 3,200 | $510,000 |
ZIP_CODE_RISK
| zip_code | weather_risk | theft_index | claims_per_1000 | trend |
|---|---|---|---|---|
| 90210 | Wildfire: High | Low | 42 | Increasing |
| 10001 | Flood: Medium | High | 38 | Stable |
| 33101 | Hurricane: High | Medium | 55 | Increasing |
CLAIMS_HISTORY
| applicant_id | prior_claims_5yr | total_paid | largest_claim | claim_type |
|---|---|---|---|---|
| APP-3301 | 1 | $8,200 | $8,200 | Water Damage |
| APP-3302 | 0 | $0 | $0 | N/A |
| APP-3303 | 3 | $45,000 | $22,000 | Wind Damage |
Write your PQL query
Describe what to predict in 2–3 lines — Kumo handles the rest
PREDICT SUM(CLAIMS.TOTAL_PAID, 0, 12, months) FOR EACH APPLICANTS.APPLICANT_ID
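The query defines the prediction target: for every applicant, the sum of claims paid over the next 0 to 12 months. In plain Python, the label the model learns to predict would be computed roughly like this (the event-level claims rows below are invented for illustration; the aggregated CLAIMS_HISTORY table on this page is a rollup of rows like these, and the real computation happens inside Kumo):

```python
from datetime import date

# Hypothetical event-level claims rows; invented for illustration.
claims = [
    {"applicant_id": "APP-3301", "claim_date": date(2024, 3, 10), "total_paid": 8_200},
    {"applicant_id": "APP-3303", "claim_date": date(2024, 7, 2), "total_paid": 22_000},
    {"applicant_id": "APP-3303", "claim_date": date(2025, 6, 15), "total_paid": 4_500},
]

def target(applicant_id, anchor, horizon_months=12):
    """SUM(total_paid, 0, 12, months): claims paid in [anchor, anchor + horizon).

    Works for day-1 anchors; real month arithmetic needs care with month ends.
    """
    end = date(anchor.year + (anchor.month - 1 + horizon_months) // 12,
               (anchor.month - 1 + horizon_months) % 12 + 1, anchor.day)
    return sum(c["total_paid"] for c in claims
               if c["applicant_id"] == applicant_id
               and anchor <= c["claim_date"] < end)

print(target("APP-3303", date(2024, 1, 1)))  # only the claim inside the window counts
```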
Prediction output
Every entity gets a score, updated continuously
| APPLICANT_ID | NAME | KUMO_RISK_SCORE | TRADITIONAL_SCORE | EXPECTED_LOSS | RECOMMENDATION |
|---|---|---|---|---|---|
| APP-3301 | Sarah Mitchell | 0.72 | 0.35 | $14,200 | Rate Up 40% |
| APP-3302 | David Kim | 0.18 | 0.22 | $2,800 | Rate Down 15% |
| APP-3303 | Robert Brown | 0.85 | 0.68 | $28,500 | Decline or Restrict |
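One way to read the output table: the risk score drives the underwriting action. The thresholds below are invented for illustration; real cutoffs come from the carrier's underwriting guidelines, not from Kumo.

```python
# Map a risk score to an underwriting action.
# Thresholds are invented for illustration, not any carrier's actual rules.

def recommend(risk_score: float) -> str:
    if risk_score >= 0.80:
        return "Decline or Restrict"
    if risk_score >= 0.50:
        return "Rate Up"
    if risk_score <= 0.20:
        return "Rate Down"
    return "Standard"

for app_id, score in [("APP-3301", 0.72), ("APP-3302", 0.18), ("APP-3303", 0.85)]:
    print(app_id, recommend(score))
```

With these cutoffs, the three sample applicants land on the same actions shown in the table above.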
Understand why
Every prediction includes feature attributions — no black boxes
Applicant APP-3301 (Sarah Mitchell)
Predicted: Risk score 0.72, expected loss $14,200
Top contributing features
| Feature | Value | Attribution |
|---|---|---|
| Roof age exceeding replacement threshold | 18 years | 27% |
| Zip code wildfire risk increasing | High, +15% YoY | 24% |
| Prior water damage claim | $8,200 in 5yr | 20% |
| Property age and construction type | 1998, wood frame | 17% |
| Neighborhood claims density | 42 per 1,000 | 12% |
Feature attributions are computed automatically for every prediction. No separate tooling required. Learn more about Kumo explainability
PQL Documentation
Learn the Predictive Query Language — SQL-like syntax for defining any prediction task in 2–3 lines.
Python SDK
Integrate Kumo predictions into your pipelines. Train, evaluate, and deploy models programmatically.
Explainability Docs
Understand feature attributions, model evaluation metrics, and how to build trust with stakeholders.
Frequently asked questions
Common questions about underwriting risk assessment
How does AI improve insurance underwriting accuracy?
AI improves underwriting accuracy by incorporating signals from connected data sources that traditional rating factors miss. Instead of evaluating each applicant in isolation, graph-based models connect applicant profiles with property records, geographic risk trends, neighborhood claims patterns, and historical outcomes to produce risk scores that are 30-40% more predictive than GLMs alone.
What is the ROI of AI underwriting in insurance?
A 5-point improvement in loss ratio on a $10B book translates to $500M in annual savings. Additionally, more accurate risk segmentation allows insurers to write 8-12% more business at profitable rates by identifying low-risk applicants that traditional models over-price.
Can AI underwriting models satisfy state insurance regulators?
Yes, but implementation matters. Most state DOIs require rate-factor transparency. The practical approach is to use the relational model's output as an additional rating factor within a filed GLM structure. This gives you the predictive power of graph-based learning with the interpretability regulators require.
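Concretely, "relational score as an additional rating factor" can look like one extra banded relativity in the filed multiplicative plan; a sketch with invented band cutoffs and relativities:

```python
# Sketch: fold a relational model's score into a filed GLM structure as one
# additional banded rating factor. All numbers are invented for illustration;
# real bands and relativities must be filed with the state DOI.

SCORE_BANDS = [      # (exclusive upper bound of band, filed relativity)
    (0.20, 0.90),    # low relational risk -> discount
    (0.50, 1.00),    # neutral
    (0.80, 1.15),    # elevated
    (1.01, 1.35),    # high relational risk -> surcharge
]

def relational_relativity(score: float) -> float:
    for upper, rel in SCORE_BANDS:
        if score < upper:
            return rel
    return SCORE_BANDS[-1][1]

def premium(glm_premium: float, relational_score: float) -> float:
    """Filed GLM premium times the banded relational-score relativity."""
    return round(glm_premium * relational_relativity(relational_score), 2)

print(premium(1_500.0, 0.72))  # elevated band -> 1500 * 1.15 = 1725.0
```

Regulators see a discrete, filed factor table; the relational model only influences which band an applicant falls into.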
How does graph-based underwriting differ from traditional actuarial models?
Traditional actuarial models (GLMs) use multiplicative rating factors that assume independence between variables. Graph-based models learn non-linear interactions across connected data: property age interacting with neighborhood claims trends, construction type interacting with local contractor costs, and applicant behavior interacting with competitive market dynamics. These cross-table patterns are invisible to GLMs.
Bottom line: Improve loss ratios by 5-10 points while writing 8-12% more profitable business, translating to $500M+ in annual savings on a $10B book.
Related use cases
Explore more insurance use cases
Topics covered
One Platform. One Model. Infinite Predictions.
KumoRFM
Relational Foundation Model
Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.
For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.
Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.




