Provider Fraud Detection
“Which healthcare providers are submitting suspicious claims?”
Book a demo and get a free trial of the full platform: data science agent, fine-tune capabilities, and forward-deployed engineer support.
A real-world example
Which healthcare providers are submitting suspicious claims?
Healthcare fraud accounts for an estimated 3-10% of total health spending, or $68-230B annually in the US (National Health Care Anti-Fraud Association). Provider-driven fraud (upcoding, unbundling, phantom billing, unnecessary procedures) is the largest category, yet most schemes are detected only through retrospective audits 12-24 months after billing occurs. By then, the insurer has already paid out millions: a single fraudulent medical practice can bill $5-15M before detection. Special investigations unit (SIU) teams can audit only 1-2% of providers annually, so targeting accuracy is critical.
How KumoRFM solves this
Relational intelligence built for insurance data
Kumo connects providers, claims, patients, referral networks, procedure codes, and billing patterns into a relational graph. In the example below, the model detects that Provider PRV-501 bills high-complexity procedures at roughly 3x the peer rate, receives 92% of its referrals from a single source and shares 88% of its patients with that referrer, and has a billing-code distribution that deviates significantly from peer providers in the same specialty and region. These graph-based signals surface suspicious providers 6-12 months earlier than traditional audit triggers.
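One of the signals named above, referral concentration, is simple to state precisely: it is the share of a provider's incoming referrals that come from its single largest referrer. The sketch below computes it from a hypothetical list of raw referral records. The record layout, the placeholder names for the other referrers, and the counts (23 of 25 referrals from one source, chosen to reproduce the 92% figure) are illustrative assumptions, not Kumo's internal feature pipeline.

```python
from collections import Counter

# Hypothetical raw referral records for PRV-501: one entry per referred patient,
# naming the referring clinician. Counts are chosen to reproduce the 92% figure.
referrals = ["Dr. R. Martinez"] * 23 + ["Other referrer A", "Other referrer B"]


def referral_concentration(referrers):
    """Return (top_referrer, share of all referrals coming from that referrer)."""
    if not referrers:
        raise ValueError("no referrals recorded")
    top_referrer, top_count = Counter(referrers).most_common(1)[0]
    return top_referrer, top_count / len(referrers)


top_referrer, concentration = referral_concentration(referrals)
print(f"{top_referrer}: {concentration:.0%}")  # Dr. R. Martinez: 92%
```

A hand-built feature like this captures only one hop of the referral network; the page's claim is that the relational model picks up such patterns without them being engineered by hand.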
From data to predictions
See the full pipeline in action
Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.
Your data
The relational tables Kumo learns from
PROVIDERS
| provider_id | name | specialty | region | years_in_network |
|---|---|---|---|---|
| PRV-501 | MedPro Spine Clinic | Orthopedics | Southeast | 4.2 |
| PRV-502 | City General Radiology | Radiology | Northeast | 12.8 |
| PRV-503 | Sunrise Physical Therapy | PT/Rehab | West | 7.5 |
BILLING_PATTERNS
| provider_id | avg_claim_amount | high_complexity_rate | claims_per_patient | vs_peer_avg |
|---|---|---|---|---|
| PRV-501 | $4,800 | 78% | 8.4 | 3.2x |
| PRV-502 | $1,200 | 32% | 3.1 | 1.1x |
| PRV-503 | $850 | 15% | 12.2 | 1.8x |
REFERRAL_NETWORK
| provider_id | top_referrer | referral_concentration | patient_overlap_pct |
|---|---|---|---|
| PRV-501 | Dr. R. Martinez | 92% | 88% |
| PRV-502 | Multiple (15+) | 12% | 8% |
| PRV-503 | Dr. K. Patel | 65% | 52% |
PROCEDURE_ANALYSIS
| provider_id | top_code | frequency | peer_frequency | upcoding_signal |
|---|---|---|---|---|
| PRV-501 | 99214 (Moderate) | 12% | 45% | Low usage (possible upcoding to 99215) |
| PRV-501 | 99215 (High) | 68% | 22% | 3.1x peer norm |
| PRV-503 | 97110 (Therapeutic) | 45% | 38% | 1.2x peer norm |
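The upcoding_signal column can be read as a ratio of a provider's code frequency to the peer frequency for the same code. A minimal sketch, assuming that reading (the function name `peer_ratio` is hypothetical) and using the sample rows above:

```python
# Rows copied from the PROCEDURE_ANALYSIS sample: (provider, code, provider %, peer %).
rows = [
    ("PRV-501", "99214", 12, 45),
    ("PRV-501", "99215", 68, 22),
    ("PRV-503", "97110", 45, 38),
]


def peer_ratio(frequency_pct, peer_frequency_pct):
    """Ratio of a provider's usage of a code to the peer-group usage."""
    return frequency_pct / peer_frequency_pct


# Round to one decimal place to match the precision used in the table.
ratios = {(prv, code): round(peer_ratio(freq, peer), 1) for prv, code, freq, peer in rows}
for (prv, code), ratio in ratios.items():
    print(f"{prv} {code}: {ratio}x peer frequency")
```

A ratio well above 1 on the higher-paying code combined with a ratio well below 1 on the adjacent lower code (99215 and 99214 for PRV-501) is the pattern the table flags as possible upcoding.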
Write your PQL query
Describe what to predict in 2–3 lines — Kumo handles the rest
PREDICT BOOL(PROVIDERS.FRAUD_CONFIRMED = 'True', 0, 12, months)
FOR EACH PROVIDERS.PROVIDER_ID
WHERE BILLING_PATTERNS.VS_PEER_AVG > 1.5
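To make the WHERE clause concrete, the sketch below applies the same threshold to the BILLING_PATTERNS sample rows in plain Python. It assumes VS_PEER_AVG is stored as a plain number (3.2 rather than a display string), and it only mimics the filter step, not the prediction itself.

```python
# vs_peer_avg values from the BILLING_PATTERNS sample, stored as plain numbers
# (an assumption about the column type, made for illustration).
billing_patterns = {"PRV-501": 3.2, "PRV-502": 1.1, "PRV-503": 1.8}

THRESHOLD = 1.5  # same cutoff as the WHERE clause

# Keep only providers whose peer ratio exceeds the threshold.
in_scope = sorted(prv for prv, ratio in billing_patterns.items() if ratio > THRESHOLD)
print(in_scope)  # ['PRV-501', 'PRV-503']
```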
Prediction output
Every entity gets a score, updated continuously
| PROVIDER_ID | SPECIALTY | FRAUD_SCORE | EST_OVERPAYMENT | SIU_PRIORITY |
|---|---|---|---|---|
| PRV-501 | Orthopedics | 0.89 | $2.4M/yr | Critical |
| PRV-503 | PT/Rehab | 0.52 | $420K/yr | High |
| PRV-502 | Radiology | 0.08 | $0 | Low |
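Because SIU capacity is limited, the scores are most useful as a ranking. The sketch below sorts the sample output by FRAUD_SCORE and keeps the top entries up to a hypothetical `audit_capacity`. The capacity value and the choice to rank on score alone (rather than, say, score weighted by estimated overpayment) are assumptions for illustration.

```python
# Sample prediction output: (provider_id, fraud_score, estimated overpayment in USD/yr).
predictions = [
    ("PRV-501", 0.89, 2_400_000),
    ("PRV-503", 0.52, 420_000),
    ("PRV-502", 0.08, 0),
]


def audit_queue(rows, audit_capacity):
    """Return the provider IDs with the highest fraud scores, up to audit_capacity."""
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)
    return [provider_id for provider_id, _score, _overpayment in ranked[:audit_capacity]]


queue = audit_queue(predictions, audit_capacity=2)  # hypothetical capacity
print(queue)  # ['PRV-501', 'PRV-503']
```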
Understand why
Every prediction includes feature attributions — no black boxes
Provider PRV-501 (MedPro Spine Clinic)
Predicted: 89% fraud probability, est. $2.4M/yr overpayment
Top contributing features
| Feature | Observed value | Attribution |
|---|---|---|
| High-complexity code rate (99215) | 68% vs 22% peer | 28% |
| Referral concentration from single source | 92% | 25% |
| Claims per patient far above peer | 8.4 vs 2.6 | 21% |
| Patient overlap with referring provider | 88% | 15% |
| Average claim amount anomaly | $4,800 vs $1,500 | 11% |
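The attribution percentages listed above sum to 100%, so they can be read as shares of the explanation for this one prediction. A short check using the five values as given (the variable names are illustrative):

```python
# Attribution shares (percent) for PRV-501, as listed above.
attributions = {
    "High-complexity code rate (99215)": 28,
    "Referral concentration from single source": 25,
    "Claims per patient far above peer": 21,
    "Patient overlap with referring provider": 15,
    "Average claim amount anomaly": 11,
}

total = sum(attributions.values())
# Combined share of the three largest contributors.
top_three = sum(sorted(attributions.values(), reverse=True)[:3])

print(f"total: {total}%, top three features: {top_three}%")  # total: 100%, top three features: 74%
```

Read this way, the three largest signals (code mix, referral concentration, and claims per patient) account for about three quarters of the score for this provider.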
Feature attributions are computed automatically for every prediction. No separate tooling required. Learn more about Kumo explainability
PQL Documentation
Learn the Predictive Query Language — SQL-like syntax for defining any prediction task in 2–3 lines.
Python SDK
Integrate Kumo predictions into your pipelines. Train, evaluate, and deploy models programmatically.
Explainability Docs
Understand feature attributions, model evaluation metrics, and how to build trust with stakeholders.
Bottom line: Detect fraudulent providers 6-12 months earlier and recover $100-300M in annual overpayments for a top-10 health insurer while focusing SIU resources on the highest-impact investigations.
One Platform. One Model. Predict Instantly.
KumoRFM
Relational Foundation Model
Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.
For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Data Science Agent for 30%+ higher accuracy than traditional models.