Use Case #2 · Classification · Credit Risk

Credit Risk Scoring

“Which borrowers will default within 12 months?”

Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.

A real-world example

Which borrowers will default within 12 months?

US banks charge off $50B+ in consumer loans annually. Traditional scorecards rely on bureau scores and static application data, missing dynamic behavioral signals like changing payment patterns across multiple credit lines, rising utilization on revolving accounts, and transaction velocity shifts. A mid-size lender with a $20B portfolio estimated that a 10% improvement in default prediction would save $80-120M per year in charge-off losses while approving 5% more creditworthy borrowers currently declined by blunt FICO cutoffs.
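To put the lender's estimate in proportion, the quoted savings can be expressed relative to the size of the portfolio. The short sketch below uses only the figures from the paragraph above; the variable and function names are illustrative.

```python
# Figures quoted in the example above.
portfolio_usd = 20e9        # $20B loan portfolio
savings_low_usd = 80e6      # low end of the estimated annual savings
savings_high_usd = 120e6    # high end of the estimated annual savings


def basis_points(amount_usd: float, base_usd: float) -> float:
    """Express an amount as basis points (1 bp = 0.01%) of a base amount."""
    return amount_usd / base_usd * 10_000


low_bps = basis_points(savings_low_usd, portfolio_usd)
high_bps = basis_points(savings_high_usd, portfolio_usd)
print(f"Estimated savings: {low_bps:.0f}-{high_bps:.0f} bps of the portfolio per year")
```

In other words, the estimate corresponds to roughly 0.4% to 0.6% of the portfolio balance per year.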

Quick answer

The most accurate credit risk models go beyond FICO and application data by incorporating cross-table behavioral signals: payment patterns across multiple credit lines, rising utilization trends, cash-advance frequency, and transaction velocity shifts. Relational ML models that read these connected tables outperform single-table logistic regression and gradient-boosted models. On the RelBench benchmark, relational approaches score 76.71 vs 62.44 for flat-table baselines.

Approaches compared

4 ways to solve this problem

1. Traditional scorecards (logistic regression on bureau data)

Use FICO score, DTI ratio, employment tenure, and application data in a logistic regression model.

Best for

Regulatory familiarity, interpretability, and fast deployment. Still the baseline at most banks.

Watch out for

Static snapshot. A 740 FICO borrower can be 6 months from default if their behavioral trajectory is deteriorating across multiple accounts.
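A minimal sketch of what such a scorecard computes, assuming a plain logistic link over application fields. The intercept and coefficients are hypothetical values chosen for illustration; a production scorecard is fitted to historical outcomes and validated before use.

```python
import math

# Hypothetical parameters for illustration only (not fitted to any data).
INTERCEPT = 4.0
COEFFICIENTS = {
    "fico_score": -0.012,       # higher FICO lowers the log-odds of default
    "dti_ratio": 3.0,           # higher debt-to-income raises it
    "employment_years": -0.05,  # longer tenure lowers it
}


def scorecard_pd(applicant: dict) -> float:
    """Return a probability of default from a logistic (sigmoid) link."""
    z = INTERCEPT + sum(COEFFICIENTS[k] * applicant[k] for k in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-z))


snapshot = {"fico_score": 740, "dti_ratio": 0.34, "employment_years": 6.2}
print(f"Scorecard PD: {scorecard_pd(snapshot):.4f}")
```

Because the inputs are a snapshot, two borrowers with identical application fields receive identical scores, however differently their payment behavior has evolved since origination.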

2. XGBoost on enriched flat table

Add engineered features (payment consistency ratio, utilization trend, cash advance count) to bureau data and train a gradient-boosted model.

Best for

Meaningful lift over logistic regression. Good when you have a strong data science team to build and maintain features.

Watch out for

Feature engineering is manual, slow, and brittle. Cross-account patterns (rising utilization on Card A while making minimum payments on HELOC B) require custom joins that break when schemas change.
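To make the manual-join problem concrete, here is a rough sketch of hand-built feature engineering using rows modeled on the sample tables further down this page. The feature names (`on_time_ratio`, `total_utilization`) and the helper function are assumptions for illustration, not a prescribed feature set.

```python
from collections import defaultdict

# Rows modeled on the sample LOANS, PAYMENTS, and CREDIT_LINES tables on this page.
loans = [
    {"loan_id": "L-8001", "borrower_id": "B-2041"},
    {"loan_id": "L-8002", "borrower_id": "B-2087"},
]
payments = [
    {"loan_id": "L-8001", "days_late": 0},
    {"loan_id": "L-8001", "days_late": 12},
    {"loan_id": "L-8002", "days_late": 0},
]
credit_lines = [
    {"borrower_id": "B-2041", "limit": 15_000, "balance": 13_200},
    {"borrower_id": "B-2041", "limit": 50_000, "balance": 42_000},
    {"borrower_id": "B-2087", "limit": 8_000, "balance": 3_100},
]


def build_flat_features(loans, payments, credit_lines):
    """Collapse three tables into one feature row per borrower."""
    # Join 1: payments -> loans -> borrower (hard-coded to today's key names).
    loan_owner = {loan["loan_id"]: loan["borrower_id"] for loan in loans}
    on_time, total = defaultdict(int), defaultdict(int)
    for p in payments:
        borrower = loan_owner[p["loan_id"]]
        total[borrower] += 1
        on_time[borrower] += int(p["days_late"] == 0)

    # Join 2: credit lines -> borrower, aggregated across all lines.
    limit, balance = defaultdict(float), defaultdict(float)
    for line in credit_lines:
        limit[line["borrower_id"]] += line["limit"]
        balance[line["borrower_id"]] += line["balance"]

    features = {}
    for borrower in sorted(set(total) | set(limit)):
        features[borrower] = {
            "on_time_ratio": on_time[borrower] / total[borrower] if total[borrower] else None,
            "total_utilization": balance[borrower] / limit[borrower] if limit[borrower] else None,
        }
    return features


features = build_flat_features(loans, payments, credit_lines)
for borrower, row in features.items():
    print(borrower, row)
```

Every join here depends on specific column names, and the aggregation discards when each payment happened, which is the brittleness and information loss described above.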

3. Graph-based risk scoring

Build a borrower-account-payment graph and compute network features (payment centrality, default contagion risk) for a downstream model.

Best for

Captures second-order risk: borrowers connected to other defaulting borrowers carry higher risk than their individual features suggest.

Watch out for

Graph features are typically batch-computed and stale by the time they reach the model. Temporal patterns (when payment behavior changed) are lost.
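As one illustration of a batch-computed network feature, the sketch below links borrowers who share an account and measures the share of a borrower's neighbors that have defaulted. The edges, the defaulted set, and the feature definition are all hypothetical; the page does not specify how borrowers are connected or which network features are used.

```python
from collections import defaultdict

# Hypothetical (borrower_id, account_id) edges; a shared account links two borrowers.
edges = [
    ("B-1", "A-10"),
    ("B-2", "A-10"),
    ("B-2", "A-11"),
    ("B-3", "A-11"),
    ("B-4", "A-12"),
]
defaulted = {"B-1"}  # hypothetical set of borrowers already in default


def neighbor_default_rate(edges, defaulted):
    """For each borrower, the fraction of directly linked borrowers in default."""
    members = defaultdict(set)
    for borrower, account in edges:
        members[account].add(borrower)

    neighbors = defaultdict(set)
    for group in members.values():
        for borrower in group:
            neighbors[borrower] |= group - {borrower}

    borrowers = sorted({borrower for borrower, _ in edges})
    return {
        b: (len(neighbors[b] & defaulted) / len(neighbors[b]) if neighbors[b] else 0.0)
        for b in borrowers
    }


rates = neighbor_default_rate(edges, defaulted)
print(rates)
```

The result is a single static number per borrower. It says nothing about when a neighbor defaulted, which is the temporal information this approach tends to lose.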

4. KumoRFM (relational graph ML)

Connect loans, payments, credit lines, transactions, and bureau data in Kumo. Write a PQL query to predict default probability. The GNN learns temporal cross-table distress signals automatically.

Best for

Catches the 740-FICO borrower whose payment consistency is declining across three revolving accounts, whose cash advances spiked 300%, and who switched to minimum-only payments. These multi-table trajectories are invisible to flat models.

Watch out for

Requires normalized relational data with clean foreign keys. Model explanations need translation for regulators accustomed to scorecard factor codes.

Key metric: on the RelBench benchmark, relational models score 76.71 vs 62.44 for single-table baselines on credit default prediction tasks.

Why relational data changes the answer

Credit distress is a multi-table phenomenon. A borrower does not default on one loan in isolation. The warning signs show up as rising utilization on revolving credit lines, a shift to minimum-only payments on the HELOC, late payments emerging on the auto loan, and increasing cash-advance frequency on credit cards. Each table holds one piece of the puzzle, and no amount of feature engineering on a single flat table captures the temporal convergence of these signals.

Relational models read the full payment-account-borrower graph and learn sequences like 'minimum-only payments for 5 of 6 months on credit cards, followed by a 12-day late payment on the auto loan, while total cross-account balance grew by $18K.' On RelBench, this approach scores 76.71 vs 62.44 for single-table baselines. For the $20B portfolio in the example above, the lender estimated that a 10% improvement in default prediction would avoid $80-120M a year in charge-off losses.

Judging credit risk from a FICO score alone is like evaluating a company's health from its stock price. The stock might look fine today, but if you read the quarterly earnings (payment history), balance sheet (utilization across all accounts), and cash flow statement (transaction patterns), you would see the distress signals months before the stock price drops. Relational ML reads all the financial statements together.

How KumoRFM solves this

Relational intelligence built for banking and financial data

Kumo connects loan applications, payment histories, account balances, transaction patterns, and bureau data into a unified relational graph. The model discovers that Borrower B-2041 has a 740 FICO but declining payment consistency across three revolving accounts, rising cash-advance frequency, and a new pattern of minimum-only payments. These cross-table behavioral signals produce a more accurate probability-of-default score than any single-table logistic regression, surfacing risk 3-6 months before a traditional model would flag it.

From data to predictions

See the full pipeline in action

Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.

Your data

The relational tables Kumo learns from

BORROWERS

borrower_id	fico_score	income	employment_years	dti_ratio
B-2041	740	$95,000	6.2	0.34
B-2087	680	$62,000	3.1	0.41
B-2103	790	$142,000	14.5	0.22

LOANS

loan_id	borrower_id	type	principal	rate	origination_date
L-8001	B-2041	Auto	$32,000	5.9%	2024-03-15
L-8002	B-2087	Personal	$15,000	11.2%	2024-07-22
L-8003	B-2103	Mortgage	$450,000	6.5%	2023-11-01

PAYMENTS

payment_id	loan_id	amount	days_late	timestamp
P-001	L-8001	$542.18	0	2025-08-01
P-002	L-8001	$542.18	12	2025-09-01
P-003	L-8002	$310.00	0	2025-09-01

CREDIT_LINES

line_id	borrower_id	type	limit	balance	min_payment_only
CL-01	B-2041	Credit Card	$15,000	$13,200	True
CL-02	B-2041	HELOC	$50,000	$42,000	False
CL-03	B-2087	Credit Card	$8,000	$3,100	False
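Since this approach depends on clean foreign keys between tables, a quick integrity check before connecting the data can be useful. The sketch below runs such a check over the key columns of the sample rows above; the helper name `orphaned_keys` is an assumption for illustration and is not part of any Kumo API.

```python
# Key columns from the sample tables above (other columns omitted for brevity).
borrowers = [{"borrower_id": "B-2041"}, {"borrower_id": "B-2087"}, {"borrower_id": "B-2103"}]
loans = [
    {"loan_id": "L-8001", "borrower_id": "B-2041"},
    {"loan_id": "L-8002", "borrower_id": "B-2087"},
    {"loan_id": "L-8003", "borrower_id": "B-2103"},
]
payments = [
    {"payment_id": "P-001", "loan_id": "L-8001"},
    {"payment_id": "P-002", "loan_id": "L-8001"},
    {"payment_id": "P-003", "loan_id": "L-8002"},
]
credit_lines = [
    {"line_id": "CL-01", "borrower_id": "B-2041"},
    {"line_id": "CL-02", "borrower_id": "B-2041"},
    {"line_id": "CL-03", "borrower_id": "B-2087"},
]


def orphaned_keys(child_rows, fk, parent_rows, pk):
    """Return foreign-key values in child_rows that have no matching parent row."""
    parent_keys = {row[pk] for row in parent_rows}
    return sorted({row[fk] for row in child_rows if row[fk] not in parent_keys})


checks = {
    "LOANS.borrower_id -> BORROWERS": orphaned_keys(loans, "borrower_id", borrowers, "borrower_id"),
    "PAYMENTS.loan_id -> LOANS": orphaned_keys(payments, "loan_id", loans, "loan_id"),
    "CREDIT_LINES.borrower_id -> BORROWERS": orphaned_keys(credit_lines, "borrower_id", borrowers, "borrower_id"),
}
for name, orphans in checks.items():
    print(name, "OK" if not orphans else f"orphans: {orphans}")
```

On the sample rows every check passes. In practice, orphaned keys are worth resolving first because a row without a parent cannot be linked into the relational graph.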

Write your PQL query

Describe what to predict in 2–3 lines — Kumo handles the rest

PQL

PREDICT BOOL(LOANS.STATUS = 'default', 0, 12, months)
FOR EACH BORROWERS.BORROWER_ID
WHERE LOANS.STATUS = 'active'
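To make the prediction target concrete, the sketch below shows one way a "default within 12 months" label could be derived for a single borrower from a list of default dates. It is an interpretation of the query's intent, not Kumo's implementation: the window boundaries (after the anchor date, up to and including the end date), the `default_dates` input, and both helper functions are assumptions.

```python
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping the day in shorter months."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    first = date(year, month, 1)
    next_first = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    days_in_month = (next_first - first).days
    return date(year, month, min(d.day, days_in_month))


def default_label(default_dates, anchor: date, horizon_months: int = 12) -> bool:
    """True if any default falls in (anchor, anchor + horizon_months]."""
    end = add_months(anchor, horizon_months)
    return any(anchor < d <= end for d in default_dates)


anchor = date(2025, 9, 1)
print(default_label([date(2026, 3, 1)], anchor))   # default inside the window
print(default_label([date(2026, 10, 1)], anchor))  # default after the window
```

Defaults that happened before the anchor date do not count toward the label, which keeps the target strictly forward-looking.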

Prediction output

Every entity gets a score, updated continuously

BORROWER_ID	FICO	KUMO_PD_SCORE	TRADITIONAL_PD	RISK_BAND
B-2041	740	0.38	0.06	Elevated
B-2087	680	0.22	0.18	Moderate
B-2103	790	0.03	0.02	Low
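A small sketch of how scores like these might be consumed downstream: mapping each score to a risk band and flagging borrowers whose relational score diverges sharply from the traditional one. The band cutoffs and the 2x divergence threshold are hypothetical, chosen only so the sample rows land in the bands shown above; the page does not state the real cutoffs.

```python
# Hypothetical cutoffs: (exclusive upper bound, band label), checked in order.
BAND_CUTOFFS = [(0.10, "Low"), (0.30, "Moderate")]

# (borrower_id, kumo_pd_score, traditional_pd) from the sample output above.
rows = [
    ("B-2041", 0.38, 0.06),
    ("B-2087", 0.22, 0.18),
    ("B-2103", 0.03, 0.02),
]


def risk_band(pd_score: float) -> str:
    """Map a probability of default to a band using the hypothetical cutoffs."""
    for upper, label in BAND_CUTOFFS:
        if pd_score < upper:
            return label
    return "Elevated"


def divergent_borrowers(rows, min_ratio: float = 2.0):
    """Borrowers whose relational score is at least min_ratio times the traditional one."""
    return [b for b, kumo_pd, trad_pd in rows if trad_pd > 0 and kumo_pd / trad_pd >= min_ratio]


for borrower, kumo_pd, _ in rows:
    print(borrower, risk_band(kumo_pd))
print("Review first:", divergent_borrowers(rows))
```

Under these assumptions only B-2041 is flagged, since its relational score is more than six times its traditional score.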

Understand why

Every prediction includes feature attributions — no black boxes

Borrower B-2041 (FICO 740)

Predicted: 38% probability of default within 12 months

Top contributing features

Feature	Value	Attribution
Credit card utilization trend	88% and rising	29%
Minimum-only payment pattern	5 of 6 months	24%
Late payment emergence (auto loan)	12 days	20%
Cash advance frequency increase	+300%	16%
Cross-account balance growth	+$18K in 6mo	11%
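Attributions like these can be turned into short, ranked reason statements of the kind reviewers and regulators are used to reading. The sketch below does this for the B-2041 values listed above; the statement wording and the `top_reasons` helper are illustrative assumptions, not an output format defined by Kumo.

```python
# (feature, observed value, attribution in percent) for borrower B-2041, as listed above.
attributions = [
    ("Credit card utilization trend", "88% and rising", 29),
    ("Minimum-only payment pattern", "5 of 6 months", 24),
    ("Late payment emergence (auto loan)", "12 days", 20),
    ("Cash advance frequency increase", "+300%", 16),
    ("Cross-account balance growth", "+$18K in 6mo", 11),
]


def top_reasons(attributions, n: int = 3):
    """Return the n largest contributors as plain-language reason statements."""
    ranked = sorted(attributions, key=lambda row: row[2], reverse=True)
    return [f"{name} ({value}): {share}% of the score" for name, value, share in ranked[:n]]


total_share = sum(share for _, _, share in attributions)
print(f"Attributions listed: {total_share}% of the score")
for reason in top_reasons(attributions):
    print("-", reason)
```

The five listed attributions account for the full score, so the top three reasons here cover 73% of it.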

Feature attributions are computed automatically for every prediction. No separate tooling required.

PQL Documentation

Learn the Predictive Query Language — SQL-like syntax for defining any prediction task in 2–3 lines.

Python SDK

Integrate Kumo predictions into your pipelines. Train, evaluate, and deploy models programmatically.

Explainability Docs

Understand feature attributions, model evaluation metrics, and how to build trust with stakeholders.

Frequently asked questions

Common questions about credit risk scoring

What is the best ML model for credit risk scoring?

For banks with rich transactional data, relational graph ML models outperform traditional scorecards and flat-table XGBoost. They capture cross-account behavioral trajectories (payment patterns, utilization trends, cash advance frequency) that static bureau-based models miss. RelBench benchmarks show a 76.71 vs 62.44 scoring gap between relational and single-table approaches.

Can ML credit risk models replace FICO scores?

Not replace, but significantly augment. FICO is a useful starting point, but it is a static snapshot that does not reflect recent behavioral deterioration. ML models that incorporate payment trajectory, utilization trends, and cross-account signals catch high-FICO borrowers heading toward default 3-6 months before traditional models flag them.

How do you explain ML credit risk models to regulators?

Modern relational ML models provide feature attribution scores showing which signals drove each prediction (e.g., 'credit card utilization trend contributed 29% to this risk score'). These explanations map to familiar risk factors regulators already understand. The key is demonstrating that the model's reasoning aligns with known credit-risk drivers, not that it uses a specific algorithm.

What data do you need for an ML credit risk model?

Bureau data and application fields are the starting point. The real lift comes from adding payment histories across all credit lines, transaction-level spending patterns, balance trajectories over time, and behavioral signals like cash advance frequency. More connected tables with clear foreign keys mean more predictive signal.

How much can ML reduce credit losses?

In the example above, a mid-size lender with a $20B portfolio estimated that a 10% improvement in default prediction would reduce annual charge-offs by $80-120M, driven by catching early distress signals 3-6 months sooner. The lender also expected to approve 5% more creditworthy borrowers currently declined by blunt FICO cutoffs, adding origination volume without increasing risk.

Bottom line: Catch high-FICO borrowers showing early distress signals and reduce charge-off losses by $80-120M annually on a $20B portfolio while safely approving 5% more good borrowers.

Topics covered

credit risk AI, loan default prediction, borrower risk modeling, graph neural network credit, KumoRFM, credit scoring machine learning, probability of default, relational deep learning lending, credit risk analytics, PD model banking

From a leadership team with proven experience

Vanja Josifovski

CEO and Co-Founder, ex-CTO Airbnb, ex-CTO Pinterest

Jure Leskovec

Co-Founder & Chief Scientist, Stanford Professor

Hema Raghavan

Co-Founder & Head of Engineering, ex-AI Lead, LinkedIn

One Platform. One Model. Infinite Predictions.

KumoRFM

Relational Foundation Model

Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.

For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.
