Binary Classification · Readmission Risk

Readmission Prediction

Which patients will be readmitted within 30 days?


A real-world example

Which patients will be readmitted within 30 days?

CMS penalizes hospitals up to 3% of Medicare reimbursements for excess readmissions. For a 400-bed hospital processing 25,000 discharges per year, each avoidable readmission costs $15,000 on average. A 15% reduction in 30-day readmissions saves $5.6M annually in penalties and direct care costs. Traditional LACE scores miss the cross-patient signals hidden in shared providers, medication overlaps, and procedure histories.

Quick answer

Graph-based AI predicts 30-day hospital readmissions by connecting patient records, encounter histories, diagnoses, procedures, and medications into a relational network. Unlike LACE scores that evaluate each patient in isolation, relational models detect cross-patient patterns: patients sharing the same attending physician with specific procedure-diagnosis combinations have correlated readmission risk. A 400-bed hospital reducing readmissions by 15% saves $5.6M annually in CMS penalties and direct care costs.

Approaches compared

4 ways to solve this problem

1. LACE Index

A validated scoring tool using four variables: Length of stay, Acuity of admission, Comorbidities (Charlson score), and Emergency department visits in prior 6 months. The most widely used readmission risk tool in US hospitals.

Best for

Quick risk stratification at discharge when you need a simple, interpretable score that nurses and case managers can act on immediately.

Watch out for

LACE uses only four variables and ignores medication complexity, provider patterns, and cross-patient signals. Its discriminative ability (C-statistic 0.68-0.72) is only modestly better than chance, leaving wide error margins for individual patient decisions.
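Because LACE is just four point-scored variables, the whole index fits in a few lines. A minimal sketch of the calculation, using the commonly cited point values (verify against your institution's adopted version before any clinical use):

```python
def lace_score(los_days: int, emergent: bool, charlson: int, ed_visits_6mo: int) -> int:
    """Compute the LACE index (0-19). Higher = greater 30-day readmission risk."""
    # L: length of stay in days
    if los_days < 1:
        l_pts = 0
    elif los_days <= 3:
        l_pts = los_days          # 1, 2, or 3 points
    elif los_days <= 6:
        l_pts = 4
    elif los_days <= 13:
        l_pts = 5
    else:
        l_pts = 7
    # A: acuity -- emergent/urgent admission scores 3 points
    a_pts = 3 if emergent else 0
    # C: Charlson comorbidity index, capped at 5 points
    c_pts = charlson if charlson <= 3 else 5
    # E: ED visits in the prior 6 months, capped at 4 points
    e_pts = min(ed_visits_6mo, 4)
    return l_pts + a_pts + c_pts + e_pts

# Example: 7-day emergent stay, Charlson 3, two recent ED visits
print(lace_score(7, True, 3, 2))  # 5 + 3 + 3 + 2 = 13 (high risk, typically >= 10)
```

The simplicity is the appeal and the limitation: every input describes the patient in isolation, which is exactly the blind spot discussed above.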

2. Logistic Regression on EHR Features

Custom models trained on 20-50 EHR features: demographics, diagnoses, lab values, prior utilization. Built by hospital analytics teams using their own data.

Best for

Hospitals with mature analytics teams that can extract and maintain EHR feature pipelines. Better than LACE because it uses more features and is trained on local data.

Watch out for

Feature engineering from EHR data is slow and fragile. Each new predictor requires clinical validation, ETL pipeline changes, and model retraining. The models still treat each patient as an independent row.
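The "one row per patient" shape of this approach can be sketched without any real EHR access. Everything below is synthetic stand-in data (pure NumPy, no modeling library assumed); the point is the structure, not the numbers:

```python
# Illustrative sketch: a flat per-patient readmission model, fit by
# gradient descent on synthetic stand-in features.
import numpy as np

rng = np.random.default_rng(0)
n = 2000
# Synthetic EHR-style features: age, prior inpatient visits (90d),
# abnormal-lab flag, active medication count (all hypothetical)
X = np.column_stack([
    rng.normal(65, 12, n),
    rng.poisson(1.0, n).astype(float),
    rng.integers(0, 2, n).astype(float),
    rng.poisson(4.0, n).astype(float),
])
true_logit = 0.03 * X[:, 0] + 0.5 * X[:, 1] + 0.3 * X[:, 3] - 5.0
y = (rng.random(n) < 1 / (1 + np.exp(-true_logit))).astype(float)

# Standardize, add an intercept, then fit logistic regression
Xs = (X - X.mean(0)) / X.std(0)
Xs = np.hstack([np.ones((n, 1)), Xs])
w = np.zeros(Xs.shape[1])
for _ in range(2000):
    p = 1 / (1 + np.exp(-Xs @ w))
    w -= 0.1 * Xs.T @ (p - y) / n              # mean log-loss gradient step

# C-statistic: fraction of (readmitted, not-readmitted) pairs ranked correctly
scores = Xs @ w
pos, neg = scores[y == 1], scores[y == 0]
c_stat = (pos[:, None] > neg[None, :]).mean()
print(f"in-sample C-statistic: {c_stat:.2f}")
```

Note what the design forces: every predictor must first be flattened into a column of that matrix, which is precisely the slow, fragile feature-engineering work described above, and each patient is still scored independently of every other row.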

3. Commercial Risk Scores (Epic, Cerner)

EHR vendor-provided readmission risk scores embedded in the clinical workflow. Trained on multi-site data with periodic updates. Integrated into discharge planning dashboards.

Best for

Hospitals that want a turnkey solution integrated into their existing EHR without building custom models.

Watch out for

Black-box scores trained on general populations may not reflect your specific patient mix, provider patterns, or community factors. You cannot customize them for your highest-value use cases.

4. Graph Neural Networks (Kumo's Approach)

Builds a heterogeneous graph across patients, encounters, diagnoses, procedures, medications, and providers. Learns cross-patient readmission patterns from the full relational structure of the EHR.

Best for

Detecting network-level risk signals: medication interaction patterns across the patient population, provider-specific readmission correlations, and procedure-diagnosis combinations that predict complications.

Watch out for

Requires connected EHR data across multiple tables. If your hospital's data is trapped in disconnected clinical systems, the integration work comes first.

Key metric: LACE scores achieve a C-statistic of 0.68-0.72. Graph-based models exceed 0.82 by capturing cross-patient signals. A 15% readmission reduction saves $5.6M annually for a 400-bed hospital.

Why relational data changes the answer

Flat readmission models score each patient based on their own clinical data: age, diagnoses, lab results, prior visits. They can identify that an 81-year-old with ESRD and 4 recent inpatient encounters is high risk. But they cannot see that this patient's attending physician has a 34% readmission rate across all their patients (versus an 18% hospital average), that the specific combination of hemodialysis plus the 7 active medications this patient takes has a known interaction pattern, or that other patients discharged from the same unit in the same week had elevated readmission rates suggesting a systemic issue (perhaps staffing or discharge-process related). These cross-patient signals live in the relationships between patients, providers, medications, and encounters.

Relational learning connects these entities directly. The model walks from patient to their encounters, to the providers on those encounters, to other patients seen by those providers, to the medication patterns across those patient cohorts. It discovers that shared-provider readmission rate is the third most predictive feature for Patient P1003, contributing 18% of the prediction attribution. This is a signal that no patient-level model can capture because it requires looking across the patient network. The result is a readmission score that accounts for both individual clinical risk and systemic network factors, producing a C-statistic above 0.82 compared to 0.68-0.72 for LACE.
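The provider hop described above can be made concrete in a few lines. The field names (`attending_provider_id`, `readmitted_30d`) are hypothetical and not part of the sample schema shown later; a real pipeline would join actual EHR tables:

```python
# Sketch of one cross-patient signal: each patient's "shared-provider
# readmission rate" -- the readmission rate among the OTHER patients
# seen by the same attending provider.
from collections import defaultdict

# (encounter_id, patient_id, attending_provider_id, readmitted_30d)
encounters = [
    ("E1", "P1", "DR_A", True),
    ("E2", "P2", "DR_A", True),
    ("E3", "P3", "DR_A", False),
    ("E4", "P4", "DR_B", False),
    ("E5", "P5", "DR_B", False),
]

# Aggregate outcomes per provider...
totals = defaultdict(lambda: [0, 0])            # provider -> [readmits, encounters]
for _, _, provider, readmitted in encounters:
    totals[provider][0] += int(readmitted)
    totals[provider][1] += 1

# ...then walk back to each patient, excluding the patient's own
# encounter (leave-one-out avoids label leakage).
def shared_provider_rate(patient_id: str) -> float:
    for _, pid, provider, readmitted in encounters:
        if pid == patient_id:
            r, n = totals[provider]
            return (r - int(readmitted)) / (n - 1) if n > 1 else 0.0
    raise KeyError(patient_id)

print(shared_provider_rate("P3"))  # DR_A's other patients: 2/2 readmitted -> 1.0
```

A flat model never sees this number unless someone hand-builds it as a feature; a relational model learns signals of this shape, and many deeper multi-hop variants, directly from the graph.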

Predicting readmissions from individual patient records is like predicting which students will fail a class by looking only at their GPA. You miss that the professor has a 40% failure rate across all sections, the textbook was recently changed to a harder edition, and students in the Monday section consistently underperform Tuesday students. The individual student matters, but the network they are embedded in matters just as much. Relational readmission models see both the student and the classroom.

How KumoRFM solves this

Graph-learned clinical intelligence across your entire patient network

Kumo builds a heterogeneous graph across patients, encounters, diagnoses, procedures, and medications. It learns that patients sharing the same attending physician with specific procedure-diagnosis combinations have correlated readmission risk. The model captures medication interaction patterns across the patient network, not just individual patient features. One PQL query replaces months of manual feature engineering across siloed EHR tables.

From data to predictions

See the full pipeline in action

Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.

Step 1: Your data

The relational tables Kumo learns from

PATIENTS

patient_id | age | gender | insurance
P1001      | 72  | M      | Medicare
P1002      | 58  | F      | Commercial
P1003      | 81  | M      | Medicare

ENCOUNTERS

encounter_id | patient_id | admit_date | discharge_date | type
E5001        | P1001      | 2025-02-10 | 2025-02-16     | Inpatient
E5002        | P1002      | 2025-02-20 | 2025-02-23     | Inpatient
E5003        | P1003      | 2025-02-25 | 2025-03-02     | Inpatient

DIAGNOSES

diagnosis_id | encounter_id | icd10_code | description
D001         | E5001        | I50.9      | Heart failure, unspecified
D002         | E5002        | J44.1      | COPD with acute exacerbation
D003         | E5003        | N18.6      | End-stage renal disease

PROCEDURES

procedure_id | encounter_id | cpt_code | description
PR001        | E5001        | 93306    | Echocardiogram
PR002        | E5002        | 94640    | Nebulizer treatment
PR003        | E5003        | 90935    | Hemodialysis

MEDICATIONS

rx_id | patient_id | drug_name         | start_date | active
RX01  | P1001      | Furosemide 40mg   | 2025-02-10 | Y
RX02  | P1002      | Albuterol inhaler | 2025-02-20 | Y
RX03  | P1003      | Epoetin alfa      | 2025-01-15 | Y
Step 2: Write your PQL query

Describe what to predict in 2–3 lines — Kumo handles the rest

PQL
PREDICT COUNT(ENCOUNTERS.*, 0, 30, days) > 0
FOR EACH PATIENTS.PATIENT_ID
WHERE ENCOUNTERS.TYPE = 'Inpatient'
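The query's target reads as: for each patient, will any inpatient encounter begin in the 30 days after discharge? A plain-Python sketch of that label logic (dates are illustrative; Kumo constructs these labels automatically, per patient and per anchor date):

```python
# Rough plain-Python equivalent of the PQL target: for each patient,
# does any new admission start within 30 days of a given discharge?
from datetime import date, timedelta

# (patient_id, admit_date, discharge_date) -- mirrors the ENCOUNTERS sample
encounters = [
    ("P1001", date(2025, 2, 10), date(2025, 2, 16)),
    ("P1001", date(2025, 3, 5), date(2025, 3, 9)),   # readmission
    ("P1002", date(2025, 2, 20), date(2025, 2, 23)),
]

def readmit_30d(patient_id: str, discharge: date) -> bool:
    """True if the patient has another admission in (discharge, discharge + 30d]."""
    window_end = discharge + timedelta(days=30)
    return any(
        pid == patient_id and discharge < admit <= window_end
        for pid, admit, _ in encounters
    )

print(readmit_30d("P1001", date(2025, 2, 16)))  # True: readmitted Mar 5
print(readmit_30d("P1002", date(2025, 2, 23)))  # False
```

Hand-rolling this correctly across millions of encounters, with no leakage across the time boundary, is a large part of what the two-line query replaces.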
Step 3: Prediction output

Every entity gets a score, updated continuously

PATIENT_ID | DISCHARGE_DATE | READMIT_30D | PROBABILITY
P1001      | 2025-02-16     | True        | 0.74
P1002      | 2025-02-23     | False       | 0.18
P1003      | 2025-03-02     | True        | 0.89
Step 4: Understand why

Every prediction includes feature attributions — no black boxes

Patient P1003: 81-year-old male, ESRD

Predicted: True (89% readmission probability)

Top contributing features

Feature | Value | Attribution
Number of inpatient encounters (last 90d) | 4 visits | 31%
Active high-risk medication count | 7 drugs | 24%
Shared-provider readmission rate | 34% | 18%
Diagnosis complexity (HCC score) | 3.2 | 15%
Days between last two admissions | 12 days | 12%

Feature attributions are computed automatically for every prediction. No separate tooling required. Learn more about Kumo explainability

Frequently asked questions

Common questions about readmission prediction

How does AI predict hospital readmissions?

AI predicts readmissions by analyzing connected patient data: diagnoses, procedures, medications, provider patterns, and encounter history. Graph-based models go beyond individual patient features to detect cross-patient signals like shared-provider readmission rates and medication interaction patterns. This produces more accurate risk scores (C-statistic above 0.82) than traditional tools like LACE (0.68-0.72).

What is the CMS Hospital Readmissions Reduction Program penalty?

CMS penalizes hospitals up to 3% of Medicare reimbursements for excess 30-day readmissions for targeted conditions (heart failure, pneumonia, COPD, hip/knee replacement, coronary artery bypass graft, and acute MI). For a large hospital, penalties can reach $3-5M annually. Reducing readmissions by even 10-15% can eliminate the penalty entirely.

What is the LACE score and how accurate is it?

LACE is a readmission risk score using four variables: Length of stay, Acuity (emergency vs. elective), Comorbidities (Charlson score), and Emergency visits in prior 6 months. Its C-statistic ranges from 0.68-0.72, meaning it correctly ranks a readmitted patient above a non-readmitted patient only 68-72% of the time. Graph-based models improve this to 82%+ by incorporating medication, provider, and cross-patient signals.
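That pairwise-ranking reading of the C-statistic can be checked directly. The risk scores below are synthetic illustrations, not real model output:

```python
# C-statistic as pairwise ranking: the fraction of (readmitted,
# not-readmitted) patient pairs where the readmitted patient received
# the higher risk score. Ties count as half credit.
readmitted_scores = [0.89, 0.74, 0.61]          # patients who were readmitted
not_readmitted_scores = [0.18, 0.42, 0.55, 0.70]

pairs = [(r, n) for r in readmitted_scores for n in not_readmitted_scores]
correct = sum(r > n for r, n in pairs) + 0.5 * sum(r == n for r, n in pairs)
c_stat = correct / len(pairs)
print(f"C-statistic: {c_stat:.2f}")  # 11 of 12 pairs ranked correctly -> 0.92
```

On this framing, moving from 0.70 to 0.82 means correctly ranking roughly 12 more patient pairs out of every 100, which compounds quickly when discharge teams can only intervene on a limited daily list.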

How much does a hospital readmission cost?

The average cost of a 30-day readmission is $15,000 in direct care costs. On top of that, CMS penalties for excess readmissions can reach 3% of total Medicare reimbursements. For a 400-bed hospital processing 25,000 discharges per year, a 15% reduction in readmissions saves $5.6M annually from the combined effect of avoided care costs and eliminated penalties.
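The $5.6M figure follows from back-of-envelope arithmetic on direct care costs alone (CMS penalty relief would add to it). The ~10% baseline 30-day readmission rate below is an assumption needed to make the quoted numbers line up; the discharge volume and per-readmission cost come from the figures above:

```python
discharges_per_year = 25_000
baseline_readmit_rate = 0.10          # assumed baseline 30-day readmission rate
cost_per_readmission = 15_000         # average direct care cost (USD)
reduction = 0.15                      # 15% relative reduction

baseline_readmits = discharges_per_year * baseline_readmit_rate   # 2,500/yr
avoided = baseline_readmits * reduction                           # 375/yr
savings = avoided * cost_per_readmission
print(f"annual savings: ${savings:,.0f}")  # -> annual savings: $5,625,000
```

A hospital with a higher baseline rate, or one currently paying the full 3% penalty, would see proportionally larger savings.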

Bottom line: A 400-bed hospital reducing 30-day readmissions by 15% saves $5.6M per year in CMS penalties and direct care costs. Kumo learns cross-patient signals from shared providers, medication overlaps, and procedure patterns that LACE scores miss entirely.

Topics covered

hospital readmission prediction · 30-day readmission AI · CMS readmission penalty · patient readmission model · EHR predictive analytics · graph neural network healthcare · KumoRFM readmission · relational deep learning clinical · readmission risk scoring

One Platform. One Model. Infinite Predictions.

KumoRFM

Relational Foundation Model

Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.

For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.

Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.