#8 · Classification · Collections

Collections Optimization

“Which delinquent accounts will self-cure vs need intervention?”

Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.

A real-world example

Which delinquent accounts will self-cure vs need intervention?

Banks spend $2-4B annually on collections operations, yet 40-60% of early-stage delinquent accounts self-cure without any intervention (McKinsey). Treating every 30-day-late account with the same urgency wastes collector time on accounts that would have paid anyway while under-prioritizing accounts heading toward charge-off. A bank with 200K delinquent accounts per month needs to know which 80K genuinely need a call, which 40K need a workout plan, and which 80K will resolve on their own.

Quick answer

The most effective collections models predict which delinquent accounts will self-cure without intervention vs which need immediate outreach. By connecting payment history, income signals, cross-account utilization, and prior collections outcomes, relational ML models route collectors to the 40% of accounts that genuinely need a call, instead of treating all 30-day-late accounts the same. This cuts collections costs while reducing charge-off rates by 15-20%.

Approaches compared

4 ways to solve this problem

1. Days-past-due waterfall

Route all accounts by delinquency bucket: 30-day gets a letter, 60-day gets a call, 90-day gets escalated. Same treatment within each bucket.

Best for

Simple, operationally straightforward, and compliant. The default approach at most banks.

Watch out for

Wastes 40-60% of collector time on accounts that would have self-cured anyway (McKinsey). Meanwhile, the accounts heading toward charge-off get the same generic treatment as those that just forgot a payment.
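As a rough illustration of the waterfall described above, the routing rule can be written in a few lines. This is a minimal sketch: the function name, the action labels, and the exact bucket boundaries (30, 60, and 90 days treated as inclusive lower bounds) are assumptions, not any bank's actual policy.

```python
def waterfall_action(days_past_due: int) -> str:
    """Pick a treatment from the delinquency bucket alone (assumed boundaries)."""
    if days_past_due >= 90:
        return "escalate"
    if days_past_due >= 60:
        return "call"
    if days_past_due >= 30:
        return "letter"
    return "none"


# Days past due for three accounts; nothing else about the borrower is used.
accounts = {"L-8002": 32, "L-8045": 45, "L-8067": 31}
actions = {account_id: waterfall_action(dpd) for account_id, dpd in accounts.items()}
print(actions)
```

All three accounts land in the same bucket and receive the same treatment, which is exactly the limitation noted above: the rule cannot tell a likely self-cure from an account heading toward charge-off.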

2. Scorecard-based prioritization

Score each delinquent account on static features (balance, DPD, FICO, product type) to rank-order the call list.

Best for

Better than uniform treatment. Focuses collectors on higher-balance, lower-FICO accounts first.

Watch out for

Static features miss behavioral trajectory. An account with declining direct deposits, rising utilization on other cards, and a prior workout history is much less likely to self-cure than the FICO score alone suggests.
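A points-based scorecard of this kind might look like the sketch below. Everything in it is hypothetical: the point weights, the balance and FICO cutoffs, and the three sample accounts (including their FICO scores) are made up for illustration and are not taken from any real scorecard.

```python
from dataclasses import dataclass


@dataclass
class Account:
    account_id: str
    balance: float
    days_past_due: int
    fico: int
    product: str


# Hypothetical points per product type.
PRODUCT_POINTS = {"Credit Card": 10, "Personal Loan": 15, "Auto Loan": 5}


def scorecard_points(a: Account) -> int:
    """Sum hypothetical points over static features; higher means call sooner."""
    points = 0
    points += 30 if a.balance >= 20_000 else 15 if a.balance >= 10_000 else 5
    points += 25 if a.days_past_due >= 60 else 10
    points += 30 if a.fico < 620 else 15 if a.fico < 700 else 0
    points += PRODUCT_POINTS.get(a.product, 0)
    return points


# Illustrative accounts only.
accounts = [
    Account("X-1", 6_400, 31, 720, "Credit Card"),
    Account("X-2", 22_800, 45, 600, "Auto Loan"),
    Account("X-3", 14_200, 32, 680, "Personal Loan"),
]

# Rank-order the call list from highest to lowest score.
call_list = sorted(accounts, key=scorecard_points, reverse=True)
for a in call_list:
    print(a.account_id, scorecard_points(a))
```

The ranking depends only on a snapshot of each account, so two borrowers with the same balance and FICO score get the same priority even if one has stable income and the other does not.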

3. XGBoost on enriched features

Add behavioral features (payment consistency, deposit trend, utilization trajectory) to the scorecard inputs and train a gradient-boosted self-cure classifier.

Best for

Meaningful lift over scorecards. Handles nonlinear interactions between income stability and utilization.

Watch out for

Each delinquent account is scored in isolation. Cross-account signals (borrower has 2 other delinquencies, income declining across all deposit accounts) require manual joins that are brittle and slow to update.
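The behavioral features mentioned for this approach have to be engineered by hand before a gradient-boosted model can use them. The sketch below shows one possible way to compute three of them with the standard library. The feature definitions, function names, and input series are assumptions chosen for illustration; the classifier training step itself is not shown.

```python
from statistics import mean


def payment_consistency(payments):
    """Share of months paid in full with zero days late.

    payments: list of (amount_due, amount_paid, days_late) tuples.
    """
    on_time = [paid >= due and late == 0 for due, paid, late in payments]
    return sum(on_time) / len(on_time)


def relative_trend(series):
    """Relative change from the mean of the first half to the mean of the second half."""
    half = len(series) // 2
    early, recent = mean(series[:half]), mean(series[half:])
    return (recent - early) / early


# Illustrative monthly history for one account (oldest first).
payments = [(310, 310, 0), (310, 310, 3), (310, 310, 0), (310, 155, 0)]
deposits = [4000, 4000, 3600, 3200]
utilization = [0.40, 0.50, 0.60, 0.75]

features = {
    "payment_consistency": payment_consistency(payments),
    "deposit_trend": relative_trend(deposits),
    "utilization_trend": relative_trend(utilization),
}
print(features)
```

Each of these features covers a single account. Adding cross-account signals means writing and maintaining further joins of the same kind, which is the maintenance burden described above.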

4. KumoRFM (relational graph ML)

Connect delinquent accounts to payment histories, borrower income signals, cross-account balances, and prior collections outcomes. A graph neural network (GNN) learns self-cure patterns from the full relational context.

Best for

Separates the 'forgot to pay' accounts (stable deposits, low utilization, prior self-cure history) from the 'heading toward charge-off' accounts (declining income, 87% utilization, multiple delinquencies). Routes collectors where they matter most.

Watch out for

Requires payment-level and cross-account data, not just the delinquent account in isolation. Data integration across lending products can be a challenge at siloed institutions.

Key metric: 40-60% of early-stage delinquent accounts self-cure without intervention (McKinsey). Relational ML identifies which ones, cutting collector workload by 40-60%.

Why relational data changes the answer

Self-cure prediction depends on signals spread across multiple systems. Payment history shows whether this borrower has caught up before and how quickly. Deposit account data shows whether income is stable or declining. Credit line data shows total utilization across all accounts. Prior collections records show how this borrower responded to previous intervention. No single table tells you whether an account will self-cure.

Relational models read the full borrower-account-payment graph and learn patterns like 'stable direct deposits + low cross-account utilization + prior self-cure in 12 days = 82% self-cure probability' vs 'declining deposits + 87% total utilization + 2 other delinquencies + prior workout = 11% self-cure probability.' This separation lets banks focus 100% of collector effort on the accounts where intervention actually changes the outcome, cutting costs while reducing charge-offs by 15-20%.

Treating every 30-day-late account the same is like an emergency room triaging every patient with the same urgency. The person with a sprained ankle will heal on their own. The person with chest pains needs immediate attention. Self-cure prediction is triage: it tells you which accounts need the doctor and which ones just need an ice pack.

How KumoRFM solves this

Relational intelligence built for banking and financial data

Kumo connects delinquent accounts to their full payment history, transaction patterns, employment signals, other credit lines, and prior collections outcomes. The model learns that Account L-8002 missed a payment but has consistent direct deposits, no balance growth on other lines, and a history of catching up within 15 days. Meanwhile, Account L-8045 shows declining income signals, rising utilization across all cards, and a pattern of minimum-only payments. Kumo routes collectors to the 40% of accounts that truly need intervention.

From data to predictions

See the full pipeline in action

Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.

Your data

The relational tables Kumo learns from

DELINQUENT_ACCOUNTS

account_id	borrower_id	product	days_past_due	balance_owed
L-8002	B-2087	Personal Loan	32	$14,200
L-8045	B-2120	Auto Loan	45	$22,800
L-8067	B-2155	Credit Card	31	$6,400

PAYMENT_HISTORY

account_id	month	amount_due	amount_paid	days_late
L-8002	2025-07	$310	$310	0
L-8002	2025-08	$310	$310	3
L-8045	2025-07	$485	$485	0

BORROWER_SIGNALS

borrower_id	direct_deposit_trend	total_utilization	other_delinquencies
B-2087	Stable	38%	0
B-2120	Declining -15%	87%	2
B-2155	Stable	52%	0

PRIOR_COLLECTIONS

borrower_id	prior_delinquency	outcome	days_to_resolve
B-2087	2024-03	Self-cured	12
B-2120	2024-11	Workout plan	90
B-2155	None	N/A	N/A
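To make the relationships between the four tables above concrete, the sketch below loads the sample rows into plain Python structures and assembles a per-account view by following `account_id` and `borrower_id`. This is a hand-written join for illustration only, with assumed container choices and a hypothetical helper name; it is not how Kumo ingests or models the data.

```python
delinquent_accounts = [
    {"account_id": "L-8002", "borrower_id": "B-2087", "product": "Personal Loan",
     "days_past_due": 32, "balance_owed": 14200},
    {"account_id": "L-8045", "borrower_id": "B-2120", "product": "Auto Loan",
     "days_past_due": 45, "balance_owed": 22800},
    {"account_id": "L-8067", "borrower_id": "B-2155", "product": "Credit Card",
     "days_past_due": 31, "balance_owed": 6400},
]

payment_history = [
    {"account_id": "L-8002", "month": "2025-07", "amount_due": 310, "amount_paid": 310, "days_late": 0},
    {"account_id": "L-8002", "month": "2025-08", "amount_due": 310, "amount_paid": 310, "days_late": 3},
    {"account_id": "L-8045", "month": "2025-07", "amount_due": 485, "amount_paid": 485, "days_late": 0},
]

# Keyed by borrower_id; utilization stored as a fraction.
borrower_signals = {
    "B-2087": {"direct_deposit_trend": "Stable", "total_utilization": 0.38, "other_delinquencies": 0},
    "B-2120": {"direct_deposit_trend": "Declining -15%", "total_utilization": 0.87, "other_delinquencies": 2},
    "B-2155": {"direct_deposit_trend": "Stable", "total_utilization": 0.52, "other_delinquencies": 0},
}

# None marks a borrower with no prior delinquency on record.
prior_collections = {
    "B-2087": {"prior_delinquency": "2024-03", "outcome": "Self-cured", "days_to_resolve": 12},
    "B-2120": {"prior_delinquency": "2024-11", "outcome": "Workout plan", "days_to_resolve": 90},
    "B-2155": None,
}


def borrower_context(account):
    """Attach payments, borrower signals, and prior collections to one account row."""
    borrower_id = account["borrower_id"]
    payments = [p for p in payment_history if p["account_id"] == account["account_id"]]
    return {
        **account,
        "payments": payments,
        "signals": borrower_signals[borrower_id],
        "prior": prior_collections.get(borrower_id),
    }


contexts = {a["account_id"]: borrower_context(a) for a in delinquent_accounts}
print(contexts["L-8045"]["signals"], contexts["L-8045"]["prior"])
```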

Write your PQL query

Describe what to predict in 2–3 lines — Kumo handles the rest

PQL

PREDICT BOOL(DELINQUENT_ACCOUNTS.STATUS = 'self_cured', 0, 30, days)
FOR EACH DELINQUENT_ACCOUNTS.ACCOUNT_ID
WHERE DELINQUENT_ACCOUNTS.DAYS_PAST_DUE > 30
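One plausible reading of the query above is: for each account more than 30 days past due, predict whether its status becomes self-cured within the next 30 days. The Python sketch below spells out that reading as a label function. The window boundaries, the function names, and the idea of a single recorded cure date are assumptions made for illustration; the authoritative semantics are those in Kumo's PQL documentation.

```python
from datetime import date, timedelta
from typing import Optional


def in_scope(days_past_due: int) -> bool:
    """Mirror the WHERE clause: only accounts more than 30 days past due."""
    return days_past_due > 30


def self_cure_label(anchor: date, cure_date: Optional[date], horizon_days: int = 30) -> bool:
    """True if the account self-cured in the window (anchor, anchor + horizon_days].

    cure_date is None when no self-cure was recorded (assumed representation).
    """
    if cure_date is None:
        return False
    return anchor < cure_date <= anchor + timedelta(days=horizon_days)


anchor = date(2025, 9, 1)
print(self_cure_label(anchor, date(2025, 9, 13)))   # cured inside the window
print(self_cure_label(anchor, date(2025, 10, 15)))  # cured after the window
print(self_cure_label(anchor, None))                # never cured
```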

Prediction output

Every entity gets a score, updated continuously

ACCOUNT_ID	BORROWER	SELF_CURE_PROB	RECOMMENDED_ACTION	PRIORITY
L-8002	B-2087	0.82	Monitor Only	Low
L-8067	B-2155	0.61	Soft Reminder	Medium
L-8045	B-2120	0.11	Collector Outreach	Critical
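Turning a self-cure probability into a recommended action is a thresholding step. The sketch below reproduces the three rows above using cutoffs of 0.75 and 0.40. Those cutoffs are assumptions chosen only so the example matches the table; the page does not state how the actions are derived, and in practice the thresholds would be tuned to collector capacity and risk appetite.

```python
from typing import Tuple


def recommend(self_cure_prob: float) -> Tuple[str, str]:
    """Map a self-cure probability to (recommended_action, priority) using assumed cutoffs."""
    if self_cure_prob >= 0.75:
        return ("Monitor Only", "Low")
    if self_cure_prob >= 0.40:
        return ("Soft Reminder", "Medium")
    return ("Collector Outreach", "Critical")


scores = {"L-8002": 0.82, "L-8067": 0.61, "L-8045": 0.11}
routing = {account_id: recommend(prob) for account_id, prob in scores.items()}
for account_id, (action, priority) in routing.items():
    print(account_id, action, priority)
```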

Understand why

Every prediction includes feature attributions — no black boxes

Account L-8045 (Auto Loan, B-2120)

Predicted: 11% self-cure probability (needs intervention)

Top contributing features

Feature	Value	Attribution
Income signal declining	-15% deposits	28%
Cross-account utilization	87% total	25%
Multiple concurrent delinquencies	2 other	21%
Prior collections required workout	90 days	15%
Days past due trajectory	Worsening	11%
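A quick sanity check on attributions like these is to confirm that they sum to 100% and to see how much of the prediction the top few features explain. The sketch below does this for the five attributions listed for account L-8045; the variable names are illustrative, and it assumes the percentages are meant to be additive shares of a single prediction.

```python
# Attribution percentages for account L-8045, as listed above.
attributions = {
    "Income signal declining": 28,
    "Cross-account utilization": 25,
    "Multiple concurrent delinquencies": 21,
    "Prior collections required workout": 15,
    "Days past due trajectory": 11,
}

total = sum(attributions.values())

# Rank features from largest to smallest share.
ranked = sorted(attributions.items(), key=lambda item: item[1], reverse=True)
top3_share = sum(share for _, share in ranked[:3])

print(f"total = {total}%, top 3 features = {top3_share}%")
```

Under that reading, the three borrower-level signals (income, utilization, other delinquencies) account for 74% of the prediction, more than the account's own delinquency trajectory.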

Feature attributions are computed automatically for every prediction. No separate tooling required.

Frequently asked questions

Common questions about collections optimization

How do you predict which delinquent accounts will self-cure?

Connect payment history, income signals (direct deposit trends), cross-account utilization, and prior collections outcomes into a relational model. The strongest self-cure predictors are stable income, low cross-account utilization, and a history of catching up quickly on prior delinquencies. Accounts with all three signals self-cure 80%+ of the time without any collector intervention.

What percentage of delinquent accounts self-cure?

40-60% of early-stage delinquent accounts (30-60 days past due) self-cure without intervention (McKinsey). The challenge is identifying which ones. Without a predictive model, banks either call everyone (wasting 40-60% of collector time) or apply static rules that miss the behavioral signals distinguishing self-cures from charge-offs.

How can ML reduce collections costs?

By routing collectors only to the 40% of delinquent accounts that genuinely need intervention. This reduces call volume by 40-60% while maintaining or improving recovery rates. For a bank with 200K delinquent accounts per month, that means 80-120K fewer unnecessary calls per month. Applied across the industry, the estimated savings are $800M-$1.2B.
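The call-volume arithmetic in this answer can be checked directly. The sketch below uses only the figures quoted on this page (200K delinquent accounts per month and a 40-60% reduction in calls); the variable names are illustrative, and it assumes one call per delinquent account as the baseline.

```python
monthly_delinquent = 200_000

# Reduction range quoted above, as whole percentages to keep the arithmetic exact.
low_pct, high_pct = 40, 60

fewer_calls_low = monthly_delinquent * low_pct // 100
fewer_calls_high = monthly_delinquent * high_pct // 100

# Calls still made each month under the assumed one-call-per-account baseline.
remaining_calls_high = monthly_delinquent - fewer_calls_low
remaining_calls_low = monthly_delinquent - fewer_calls_high

print(f"fewer calls per month: {fewer_calls_low:,} to {fewer_calls_high:,}")
print(f"calls still made:      {remaining_calls_low:,} to {remaining_calls_high:,}")
```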

What data do you need for a collections optimization model?

Delinquent account details (DPD, balance, product type), full payment history (consistency, timing, amount patterns), borrower income signals (direct deposit trends), cross-account data (utilization on other products), and prior collections outcomes (self-cured, workout, charge-off, days to resolve). The more cross-account context, the better the self-cure prediction.

How do you balance collections efficiency with customer experience?

Accurate self-cure prediction improves both. Customers who would self-cure are not bothered by unnecessary collection calls (better experience). Customers heading toward charge-off get intervention sooner (better outcome). The model does not reduce effort overall. It redirects effort from accounts that do not need it to accounts that do.

Bottom line: Focus collector effort on the 40% of delinquent accounts that genuinely need intervention, reducing collections costs by $800M-$1.2B industry-wide while cutting charge-off rates by 15-20%.

Topics covered

collections optimization AI, self-cure prediction banking, delinquency management, collections prioritization, graph neural network collections, KumoRFM, loan recovery prediction, relational deep learning collections, charge-off prevention, collections strategy AI

One Platform. One Model. Infinite Predictions.

KumoRFM

Relational Foundation Model

Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.

For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.
