Global fraud losses reached $485 billion in 2023, according to Nasdaq's Global Financial Crime Report. Payment fraud alone accounts for $32 billion in card-not-present losses. And these numbers are growing: digital transaction volumes are increasing 15% year-over-year while fraud techniques become more sophisticated.
The technology used to detect fraud has evolved through three distinct eras. Each era solved a category of fraud that the previous one could not see. Understanding this evolution is critical, because most enterprises are still stuck in era two while fraudsters have moved to era-three tactics.
Era 1: Rules engines (1990s-2010s)
The first generation of fraud detection used hand-written rules. If-then logic encoded known fraud patterns:
- If transaction amount > $5,000 and country is not home country, flag
- If more than 3 transactions within 10 minutes, flag
- If card-not-present and shipping address differs from billing, flag
- If new account and transaction exceeds $1,000 within first 24 hours, flag
These rules work for the specific patterns they encode. A fraudster using a stolen card for a $10,000 purchase from an unusual country will get caught. The problem is that rules only detect patterns someone has already seen and thought to codify.
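The four rules above can be sketched as a small Python function. This is a minimal illustration, not a production rules engine: the `Transaction` fields and the rule names are hypothetical, chosen only to mirror the bullets.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    amount: float                   # transaction amount in USD
    country: str                    # country where the transaction originated
    home_country: str               # cardholder's home country
    card_present: bool              # False for card-not-present payments
    shipping_matches_billing: bool
    txns_last_10_min: int           # this card's transactions in the past 10 minutes, including this one
    account_age_hours: float        # hours since the account was opened


def apply_rules(txn: Transaction) -> list[str]:
    """Return the name of every rule the transaction trips."""
    flags = []
    if txn.amount > 5_000 and txn.country != txn.home_country:
        flags.append("large_foreign_amount")
    if txn.txns_last_10_min > 3:
        flags.append("velocity")
    if not txn.card_present and not txn.shipping_matches_billing:
        flags.append("cnp_address_mismatch")
    if txn.account_age_hours <= 24 and txn.amount > 1_000:
        flags.append("new_account_large_amount")
    return flags


# A $10,000 purchase from outside the home country trips the first rule.
stolen_card = Transaction(
    amount=10_000, country="FR", home_country="US", card_present=True,
    shipping_matches_billing=True, txns_last_10_min=1, account_age_hours=5_000,
)
# The same purchase at $4,999 trips nothing.
just_under = Transaction(
    amount=4_999, country="FR", home_country="US", card_present=True,
    shipping_matches_billing=True, txns_last_10_min=1, account_age_hours=5_000,
)

print(apply_rules(stolen_card))  # ['large_foreign_amount']
print(apply_rules(just_under))   # []
```

The second example shows the core limitation: the function only knows the thresholds someone wrote into it.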
The limitations
Rules-based systems have three structural weaknesses:
- Reactive, not proactive. Every rule was written in response to a fraud pattern that already succeeded. There is always a lag between a new technique and the rule that catches it.
- High false positive rates. Of the transactions that simple threshold rules flag, 90-95% are legitimate. A bank processing 10 million transactions per day might flag 500,000, of which 475,000 are false positives. Each manual review costs $15-25, so wasted investigation runs into the millions of dollars.
- Brittle under adaptation. Fraudsters test rules by probing limits. If the threshold is $5,000, they run transactions at $4,999. If velocity is checked over 10 minutes, they space transactions 11 minutes apart. Rules are easy to reverse-engineer and circumvent.
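The velocity example is easy to reproduce. The sketch below assumes a rule that flags a transaction when more than 3 fall inside a trailing 10-minute window; the function name and the sample timestamps are illustrative.

```python
from datetime import datetime, timedelta


def velocity_flags(timestamps, window=timedelta(minutes=10), limit=3):
    """For each transaction, flag it if more than `limit` transactions
    (including itself) fall inside the trailing window."""
    flags = []
    for i, ts in enumerate(timestamps):
        recent = [t for t in timestamps[: i + 1] if ts - t <= window]
        flags.append(len(recent) > limit)
    return flags


start = datetime(2025, 10, 1, 9, 0)

# Five transactions two minutes apart: the fourth and fifth are flagged.
burst = [start + timedelta(minutes=2 * i) for i in range(5)]
print(velocity_flags(burst))   # [False, False, False, True, True]

# The same five transactions spaced 11 minutes apart: nothing is flagged.
spaced = [start + timedelta(minutes=11 * i) for i in range(5)]
print(velocity_flags(spaced))  # [False, False, False, False, False]
```

Widening the window only moves the boundary. A fraudster who learns the new value can space transactions just past it again.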
Here is what suspicious transaction data looks like in practice. The fraud is invisible at the transaction level but obvious in the graph.
accounts
| account_id | holder | type | opened | branch |
|---|---|---|---|---|
| BA-1001 | Greenfield LLC | Business Checking | 2024-08-14 | Miami |
| BA-1002 | Sandra Keyes | Personal Checking | 2025-01-22 | Miami |
| BA-1003 | Oceanview Trading | Business Checking | 2025-02-03 | Tampa |
| BA-1004 | Marcus Avery | Personal Savings | 2022-06-10 | Atlanta |
Four accounts across three branches. Nothing suspicious in isolation.
transactions
| txn_id | from_account | to_account | amount | date | type |
|---|---|---|---|---|---|
| TX-7701 | BA-1004 | BA-1001 | $9,800 | 2025-10-01 | Wire |
| TX-7702 | BA-1001 | BA-1002 | $9,700 | 2025-10-02 | Wire |
| TX-7703 | BA-1002 | BA-1003 | $9,500 | 2025-10-03 | Wire |
| TX-7704 | BA-1003 | Offshore Corp | $9,200 | 2025-10-04 | Intl Wire |
| TX-7705 | BA-1004 | BA-1001 | $9,900 | 2025-10-08 | Wire |
| TX-7706 | BA-1001 | BA-1002 | $9,850 | 2025-10-09 | Wire |
Rows TX-7701 through TX-7704 form a layering chain. Funds flow BA-1004 to BA-1001 to BA-1002 to BA-1003 to offshore, each just below the $10K reporting threshold. Each individual wire looks routine. The 4-hop path reveals money laundering. TX-7705 and TX-7706 repeat the first two hops a week later.
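A short traversal is enough to recover this chain from the rows in the transactions table. The sketch below is illustrative only: the $9,000 floor for "just below the threshold" and the 3-day maximum gap between hops are assumptions, not values from a real compliance system.

```python
from collections import defaultdict
from datetime import date

REPORTING_THRESHOLD = 10_000
NEAR_THRESHOLD = 9_000  # assumed floor for "just below the threshold"

# (txn_id, from_account, to_account, amount, date), copied from the table
transactions = [
    ("TX-7701", "BA-1004", "BA-1001", 9_800, date(2025, 10, 1)),
    ("TX-7702", "BA-1001", "BA-1002", 9_700, date(2025, 10, 2)),
    ("TX-7703", "BA-1002", "BA-1003", 9_500, date(2025, 10, 3)),
    ("TX-7704", "BA-1003", "Offshore Corp", 9_200, date(2025, 10, 4)),
    ("TX-7705", "BA-1004", "BA-1001", 9_900, date(2025, 10, 8)),
    ("TX-7706", "BA-1001", "BA-1002", 9_850, date(2025, 10, 9)),
]

# Index near-threshold wires by sending account.
outgoing = defaultdict(list)
for txn in transactions:
    if NEAR_THRESHOLD <= txn[3] < REPORTING_THRESHOLD:
        outgoing[txn[1]].append(txn)


def longest_chain(txn, max_gap_days=3):
    """Follow the money forward: the next wire must leave the receiving
    account after this one, within max_gap_days (an assumed limit)."""
    _, _, receiver, _, day = txn
    best = [txn]
    for nxt in outgoing[receiver]:
        gap = (nxt[4] - day).days
        if 0 < gap <= max_gap_days:
            candidate = [txn] + longest_chain(nxt, max_gap_days)
            if len(candidate) > len(best):
                best = candidate
    return best


chains = [longest_chain(t) for txns in list(outgoing.values()) for t in txns]
longest = max(chains, key=len)
path = [longest[0][1]] + [t[2] for t in longest]
print(" -> ".join(path))
# BA-1004 -> BA-1001 -> BA-1002 -> BA-1003 -> Offshore Corp
```

The traversal needs the `from_account` and `to_account` columns as edges. A model that treats each row independently never follows them.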
shared_attributes
| attribute | value | accounts |
|---|---|---|
| Phone number | (305) 555-0147 | BA-1001, BA-1002 |
| IP address | 198.51.100.42 | BA-1001, BA-1002, BA-1003 |
| Registered agent | CorpServ Inc | BA-1001, BA-1003 |
The IP address row is the key signal: three 'independent' accounts (BA-1001, BA-1002, BA-1003) share the same IP address. Combined with the transaction chain, this reveals a single actor operating a laundering network.
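Grouping accounts by shared attributes is a connected-components problem. The following sketch uses a small union-find over the shared_attributes table. It is one simple way to surface such a cluster, not a description of any particular vendor's method.

```python
# (attribute, value, accounts), copied from the shared_attributes table
shared_attributes = [
    ("Phone number", "(305) 555-0147", ["BA-1001", "BA-1002"]),
    ("IP address", "198.51.100.42", ["BA-1001", "BA-1002", "BA-1003"]),
    ("Registered agent", "CorpServ Inc", ["BA-1001", "BA-1003"]),
]
accounts = ["BA-1001", "BA-1002", "BA-1003", "BA-1004"]

# Union-find: every account starts as its own cluster.
parent = {account: account for account in accounts}


def find(account):
    """Return the representative of the account's cluster."""
    while parent[account] != account:
        parent[account] = parent[parent[account]]  # path halving
        account = parent[account]
    return account


def union(a, b):
    """Merge the clusters containing a and b."""
    parent[find(a)] = find(b)


# Any two accounts that share an attribute value end up in the same cluster.
for _, _, linked in shared_attributes:
    for other in linked[1:]:
        union(linked[0], other)

clusters = {}
for account in accounts:
    clusters.setdefault(find(account), []).append(account)

rings = [sorted(members) for members in clusters.values() if len(members) > 1]
print(rings)  # [['BA-1001', 'BA-1002', 'BA-1003']]
```

BA-1004 shares no attribute with the others, so it stays in its own cluster even though it is the source of the funds. Linking it to the ring takes the transaction edges as well.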
Era 2: Tree-based ML (2010s-2020s)
The second era replaced hand-written rules with statistical models, primarily gradient boosted trees (XGBoost, LightGBM). Instead of codifying known patterns, these models learn patterns from historical transaction data.
How it works
A data scientist engineers features from transaction data:
- Transaction amount, currency, merchant category
- Time since last transaction, transaction frequency (1h, 24h, 7d)
- Distance from last transaction location
- Ratio of current amount to average amount for this customer
- Device fingerprint, IP geolocation, browser metadata
- Historical fraud rate for this merchant, this BIN, this country pair
These features go into a flat table (one row per transaction), and a gradient boosted tree learns the statistical boundary between fraud and non-fraud. The model can discover non-linear combinations that no human would write as a rule: transactions that are individually unremarkable but collectively anomalous given a customer's history.
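To make the flat table concrete, here is a minimal sketch of how a few of the listed features could be computed for one transaction. The function name, the feature names and the sample history are hypothetical. Training the gradient boosted tree on the resulting rows is left out.

```python
from datetime import datetime, timedelta
from statistics import mean


def build_features(history, current):
    """Build one flat feature row for the current transaction.

    history: list of (timestamp, amount) for the customer's earlier transactions
    current: (timestamp, amount) for the transaction being scored
    """
    ts, amount = current
    last_ts = max((t for t, _ in history), default=None)
    avg_amount = mean(a for _, a in history) if history else None

    def count_within(window):
        return sum(1 for t, _ in history if ts - t <= window)

    return {
        "amount": amount,
        "seconds_since_last_txn": (ts - last_ts).total_seconds() if last_ts else None,
        "txn_count_1h": count_within(timedelta(hours=1)),
        "txn_count_24h": count_within(timedelta(hours=24)),
        "txn_count_7d": count_within(timedelta(days=7)),
        "amount_vs_avg": amount / avg_amount if avg_amount else None,
    }


now = datetime(2025, 10, 1, 12, 0)
history = [
    (now - timedelta(days=3), 40.0),
    (now - timedelta(hours=5), 60.0),
    (now - timedelta(minutes=30), 50.0),
]
row = build_features(history, (now, 300.0))
print(row["amount_vs_avg"])  # 6.0
print(row["txn_count_24h"])  # 2
```

Every value in the row describes a single customer. Nothing in it refers to another account, device or counterparty.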
What it improved
Tree-based models improved detection rates from 50-60% to 70-80% and reduced false positive rates by 30-50% compared to rules alone. They can detect novel patterns (not just codified ones) and adapt to new fraud techniques through retraining.
What it still misses
Tree-based models analyze each transaction (or each customer) in isolation. Here is what the era-2 model actually sees for the laundering chain shown above.
flat_feature_table (what XGBoost sees per transaction)
| txn_id | amount | txn_type | sender_age_days | sender_txn_count_30d | amount_vs_avg |
|---|---|---|---|---|---|
| TX-7701 | $9,800 | Wire | 1,209 | 3 | 1.2x |
| TX-7702 | $9,700 | Wire | 414 | 2 | 1.1x |
| TX-7703 | $9,500 | Wire | 254 | 2 | 1.0x |
| TX-7704 | $9,200 | Intl Wire | 243 | 1 | 1.0x |
Each transaction looks normal in isolation: amounts are just below $10K but within 1.2x of the sender's average. Transaction counts are low. The flat table gives no indication that these four wires form a chain (BA-1004 to BA-1001 to BA-1002 to BA-1003 to offshore), or that three of the four accounts share an IP address.
This is the critical blind spot. Modern fraud is organized. A synthetic identity fraud ring creates 50 fake accounts over 6 months, builds credit on each, then maxes them all out in a coordinated burst. Each individual transaction looks normal. Each individual account looks normal. The fraud is only visible in the relational structure: these accounts share devices, phone numbers, addresses, or behavioral patterns that connect them.
A tree-based model that processes one row per transaction cannot see these connections. It is analyzing pixels when the fraud is in the picture.
| Transaction-level ML | Graph-based ML |
|---|---|
| One row per transaction or customer | Entities and relationships as a graph |
| Features engineered from single entity | Patterns learned across the full network |
| Cannot see cross-entity relationships | Detects shared devices, addresses, behaviors |
| Misses organized fraud rings | Catches coordinated fraud and synthetic IDs |
| 70-80% detection rate | 85-95% detection rate on organized fraud |
Era 3: Graph-based ML (2020s-present)
The third era represents the fraud detection problem as a graph. Accounts, transactions, devices, IP addresses, phone numbers, addresses, and merchants become nodes. Edges represent relationships: "sent money to," "shares device with," "same billing address," "logged in from same IP."
Graph neural networks process this structure by passing messages along edges, learning which relational patterns distinguish fraud from legitimate activity.
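The message-passing idea can be illustrated without a deep learning library. The sketch below is deliberately simplified: a real GNN learns weighted transformations of feature vectors, while this version only averages a single number per node with fixed 50/50 mixing. The edges and the seed label on BA-1003 are hypothetical.

```python
from collections import defaultdict

# Hypothetical undirected edges: money transfers plus a shared IP node.
edges = [
    ("BA-1004", "BA-1001"),
    ("BA-1001", "BA-1002"),
    ("BA-1002", "BA-1003"),
    ("BA-1001", "ip:198.51.100.42"),
    ("BA-1002", "ip:198.51.100.42"),
    ("BA-1003", "ip:198.51.100.42"),
]

neighbors = defaultdict(set)
for a, b in edges:
    neighbors[a].add(b)
    neighbors[b].add(a)


def message_passing_round(state):
    """Each node mixes its own value with the mean of its neighbors' values."""
    new_state = {}
    for node, nbrs in neighbors.items():
        incoming = sum(state[n] for n in nbrs) / len(nbrs)
        new_state[node] = 0.5 * state[node] + 0.5 * incoming
    return new_state


# Assumption for illustration: BA-1003 is the only node with a known fraud label.
state = {node: 0.0 for node in neighbors}
state["BA-1003"] = 1.0

for _ in range(2):
    state = message_passing_round(state)

# BA-1001 is two hops from BA-1003 (via the shared IP), so two rounds reach it.
# BA-1004 is three hops away and is still untouched.
for node in sorted(state):
    print(f"{node}: {state[node]:.3f}")
```

Each additional round lets information travel one more hop, which is why stacked message-passing layers can pick up multi-hop structures such as rings and chains.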
What graphs reveal
Graph-based fraud detection finds patterns that are invisible at the transaction level:
- Fraud rings. A cluster of accounts that share devices, IP addresses, or phone numbers and exhibit coordinated behavior. Each account looks independent. The graph shows they are connected.
- Synthetic identities. Fake identities built from combinations of real and fabricated information. Graph analysis reveals that a "new" identity shares a phone number with a known fraud account, or that the Social Security number was issued recently and is connected to multiple applications.
- Money laundering paths. Funds flowing through multiple accounts in patterns designed to obscure the origin. Graph traversal reveals the full path even when individual transfers look routine.
- Account takeover chains. A compromised account is used to compromise other accounts through shared credentials or social engineering. The graph shows the propagation pattern.
Production results
Graph-based approaches have shown significant improvements in production fraud detection systems:
- PayPal reported detecting 40% more fraud using graph-based approaches compared to transaction-level models
- Capital One's graph-based system reduced false positives by 50% while maintaining detection rates
- Stripe's Radar uses network-level signals to block $35 billion in fraud annually across its platform
The feature engineering problem in fraud
Fraud detection has a particularly severe version of the feature engineering bottleneck. Transaction databases are complex: accounts, transactions, devices, sessions, merchants, IP addresses, phone numbers, and address histories. A typical fraud detection schema has 8 to 15 interconnected tables.
Building features from this schema is painstaking. A data scientist must decide which tables to join, which aggregations to compute, what time windows to use, and which entity-level features matter. The Stanford RelBench study measured 12.3 hours and 878 lines of code per prediction task on simpler schemas.
For fraud, the problem is worse because the features need to be computed in real time. A fraud decision happens in 50-100 milliseconds. Features like "number of transactions from this device in the last hour" must be computed on the fly, which requires low-latency feature serving infrastructure on top of the engineering effort.
And the features go stale. Fraud patterns change quarterly as attackers adapt. A feature set that works in January may be ineffective by April. This means continuous re-engineering, not a one-time investment.
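A feature like "number of transactions from this device in the last hour" is typically served from an in-memory sliding window. The following is a minimal single-process sketch under two assumptions: timestamps are Unix seconds, and events for a given device arrive in time order. The class and method names are hypothetical, and a production system would back this with a low-latency store.

```python
from collections import defaultdict, deque


class DeviceVelocityCounter:
    """Counts transactions per device over a sliding time window."""

    def __init__(self, window_seconds=3600):
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # device_id -> timestamps, oldest first

    def record_and_count(self, device_id, timestamp):
        """Record a transaction and return the device's count inside the
        window, including the transaction just recorded."""
        events = self._events[device_id]
        events.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop timestamps that have aged out of the window.
        while events[0] <= cutoff:
            events.popleft()
        return len(events)


counter = DeviceVelocityCounter()
print(counter.record_and_count("dev-1", 0))     # 1
print(counter.record_and_count("dev-1", 1800))  # 2
print(counter.record_and_count("dev-1", 3599))  # 3
print(counter.record_and_count("dev-1", 3600))  # 3 (the event at t=0 has aged out)
print(counter.record_and_count("dev-2", 3600))  # 1
```

Each timestamp is appended once and removed at most once, so the per-transaction cost stays small enough for a 50-100 millisecond decision budget. The remaining engineering work lies in keeping this state consistent across servers.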
How foundation models change fraud detection
A relational foundation model like KumoRFM addresses all three of these challenges simultaneously.
No feature engineering
KumoRFM reads the raw transaction database directly, representing it as a temporal heterogeneous graph. No manual feature engineering is needed. The model discovers which patterns across accounts, transactions, devices, and merchants are predictive of fraud.
Graph-native architecture
Because the model represents data as a graph, it naturally captures the relational patterns that define organized fraud. Fraud rings, synthetic identity clusters, and money laundering paths are visible in the graph structure without anyone building explicit graph features.
Pre-trained pattern recognition
KumoRFM has been trained on thousands of diverse relational databases. It has seen fraud-like patterns (anomalous graph topology, velocity spikes, cross-entity propagation) across many different domains. This pre-training means it can detect fraud patterns zero-shot, without task-specific training data, which is critical for catching new fraud techniques before you have labeled examples.
PQL Query
PREDICT transactions.is_suspicious FOR EACH transactions.txn_id
The model traverses the transaction graph: from each wire to the sending and receiving accounts, to shared attributes (IP, phone, registered agent), to other accounts in the cluster. It discovers the layering chain and shared-IP ring without anyone defining these as features.
Output
| txn_id | fraud_score | top_signal |
|---|---|---|
| TX-7701 | 0.89 | Source of layering chain, shared IP cluster |
| TX-7702 | 0.93 | Mid-chain transfer, shared phone with BA-1001 |
| TX-7703 | 0.95 | Chain continues, 3 shared-IP accounts |
| TX-7704 | 0.97 | Offshore destination, end of layering chain |
Adaptation speed
When fraud patterns shift, a traditional model requires re-engineering features, retraining, and redeploying. A foundation model can be fine-tuned on new data in minutes rather than rebuilt from scratch over weeks. This compresses the response time to new fraud techniques from months to hours.
Choosing the right era for your organization
Not every organization needs graph-based fraud detection today. The right approach depends on the type of fraud you face and the complexity of your data.
Rules engines are sufficient when
- Fraud patterns are well-known and stable
- Transaction volumes are low enough for high false positive rates to be manageable
- Regulatory requirements mandate explainable, auditable rules
Tree-based ML is the right step up when
- You have historical labeled fraud data for training
- Fraud patterns are evolving and rules cannot keep up
- False positive rates are too high and need statistical optimization
Graph-based approaches are necessary when
- You face organized fraud (rings, synthetic identities, coordinated attacks)
- Your fraud losses are concentrated in network-level patterns, not individual anomalies
- You need to detect money laundering, account takeover chains, or collusion
- Your current models miss fraud that is only visible through entity relationships
The $485 billion question
The fraud detection industry is in a transition. Most enterprises are running era-two systems (tree-based ML on engineered features) while facing era-three threats (organized networks, synthetic identities, coordinated attacks). The technology to close this gap exists. Graph-based approaches detect 40-85% more organized fraud than transaction-level models.
The barrier has been implementation complexity. Building a graph-based fraud system from scratch requires graph database infrastructure, GNN training pipelines, and real-time graph computation. This is a multi-year, multi-million-dollar engineering project.
Foundation models change the economics. Instead of building graph infrastructure from scratch, you connect your transaction database and get graph-based fraud scores. The model has already learned the graph patterns. Your team spends time on fraud strategy, not graph engineering.
At $485 billion in annual losses, even small improvements in detection rates translate to enormous value. Moving from era-two to era-three detection is not a marginal upgrade. It is a structural one.