The business problem
The Coalition Against Insurance Fraud estimates that fraud costs US insurers $80 billion annually. This cost is passed directly to consumers through higher premiums. The most damaging fraud involves organized rings: networks of claimants, providers, and professionals who submit coordinated false claims. Individual claim analysis flags obvious outliers but misses these sophisticated operations.
Why flat ML fails
- No ring detection: Organized fraud rings involve 5-50 participants filing claims that individually look normal. The pattern is only visible in the network connections between claims.
- No provider context: A doctor with 100 patients is normal. A doctor whose patients all share the same lawyer and body shop is suspicious. Flat models see patient counts, not connection patterns.
- Relationship type matters: The claimant-doctor relationship carries different fraud signal than claimant-witness or claim-adjuster relationships. Flat models treat all connections equally.
- Temporal patterns: Fraud rings ramp up gradually, filing increasingly brazen claims over time. The temporal evolution of the network is a strong signal that flat features miss.
The relational schema
Node types:

```
Claim     (id, amount, type, date, status)
Claimant  (id, age, policy_tenure, claim_history)
Provider  (id, specialty, license_date, avg_billing)
Policy    (id, type, premium, coverage_limit)
```

Edge types:

```
Claim    --[filed_by]-->    Claimant
Claim    --[treated_by]-->  Provider
Claim    --[under]-->       Policy
Claimant --[referred_by]--> Provider
Provider --[co_billed]-->   Provider  (shared_claims_count)
```

Four node types and five edge types. The co_billed edges between providers surface ring structures: providers who repeatedly appear on the same claims.
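One way to encode this schema for a relation-aware GNN is a single global `edge_index` plus an integer `edge_type` per edge. A minimal sketch with toy node ids; the relation-to-integer ordering is an assumption (any consistent ordering works, as long as it matches the model's `num_relations=5`):

```python
import torch

# Assumed mapping from the five relation names to integer ids.
REL = {"filed_by": 0, "treated_by": 1, "under": 2,
       "referred_by": 3, "co_billed": 4}

# Toy graph with global node indices: claim 0 filed by claimant 3,
# treated by provider 5, under policy 7; provider 5 co-billed with 6.
edges = [
    (0, 3, "filed_by"),
    (0, 5, "treated_by"),
    (0, 7, "under"),
    (3, 6, "referred_by"),
    (5, 6, "co_billed"),
]

edge_index = torch.tensor([[s for s, d, r in edges],
                           [d for s, d, r in edges]], dtype=torch.long)
edge_type = torch.tensor([REL[r] for s, d, r in edges], dtype=torch.long)

print(edge_index.shape, edge_type.shape)  # torch.Size([2, 5]) torch.Size([5])
```

These two tensors are exactly the `edge_index` and `edge_type` arguments an RGCN-style forward pass expects.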
PyG architecture: RGCNConv for claims networks
```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import RGCNConv, Linear


class ClaimsFraudGNN(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim=64, num_relations=5):
        super().__init__()
        self.lin = Linear(in_dim, hidden_dim)
        # RGCNConv: separate weights per edge type; basis decomposition
        # shares 4 basis matrices across the 5 relations to cut parameters
        self.conv1 = RGCNConv(
            hidden_dim, hidden_dim, num_relations=num_relations,
            num_bases=4)
        self.conv2 = RGCNConv(
            hidden_dim, hidden_dim, num_relations=num_relations,
            num_bases=4)
        self.classifier = torch.nn.Sequential(
            Linear(hidden_dim, 32),
            torch.nn.ReLU(),
            Linear(32, 1),
        )

    def forward(self, x, edge_index, edge_type):
        x = F.relu(self.lin(x))
        x = F.relu(self.conv1(x, edge_index, edge_type))
        x = self.conv2(x, edge_index, edge_type)
        return torch.sigmoid(self.classifier(x).squeeze(-1))
```
```python
# Training with focal loss for class imbalance
def focal_loss(pred, target, gamma=2.0, alpha=0.75):
    bce = F.binary_cross_entropy(pred, target, reduction='none')
    pt = torch.where(target == 1, pred, 1 - pred)
    # alpha weights the positive class, (1 - alpha) the negative class
    alpha_t = alpha * target + (1 - alpha) * (1 - target)
    weight = alpha_t * (1 - pt) ** gamma
    return (weight * bce).mean()
```

RGCNConv with basis decomposition: separate weight matrices per relationship type let the model learn that claimant-provider connections carry a different fraud signal than provider-provider connections.
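Focal loss down-weights confident, easy predictions so the rare fraud positives dominate the gradient. A quick behavioral check (the loss is restated with per-class alpha weighting so the snippet runs on its own; the probabilities are illustrative):

```python
import torch
import torch.nn.functional as F


def focal_loss(pred, target, gamma=2.0, alpha=0.75):
    bce = F.binary_cross_entropy(pred, target, reduction='none')
    pt = torch.where(target == 1, pred, 1 - pred)
    alpha_t = alpha * target + (1 - alpha) * (1 - target)
    weight = alpha_t * (1 - pt) ** gamma
    return (weight * bce).mean()


# An easy, correctly-scored positive (p=0.95) contributes far less
# to the loss than a hard, missed positive (p=0.10).
easy = focal_loss(torch.tensor([0.95]), torch.tensor([1.0]))
hard = focal_loss(torch.tensor([0.10]), torch.tensor([1.0]))
print(easy.item() < hard.item())  # True
```

With `gamma=0` the weighting collapses to plain alpha-balanced BCE; larger `gamma` pushes the loss harder toward misclassified examples.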
Expected performance
- Rule-based system: ~45 AUROC (high false positive rate)
- LightGBM (flat-table): 62.44 AUROC
- GNN (RGCNConv): 75.83 AUROC
- KumoRFM (zero-shot): 76.71 AUROC
Or use KumoRFM in one line
```
PREDICT is_fraudulent FOR claim
USING claim, claimant, provider, policy
```

One PQL query. KumoRFM constructs the claims network, detects ring patterns, and outputs fraud probabilities with explainable attributions.