When someone asks "how does Coinbase detect fraud?" or "how does Chime prevent unauthorized transactions?" the real answer is never a single algorithm. It is an architecture. Modern fintechs build layered systems where each layer catches a different fraud type, and the layers work together to balance detection rates against false positive rates against customer experience.
This matters if you are building or upgrading a fraud detection system, because the biggest mistake teams make is over-investing in one layer (usually supervised ML) while leaving other layers completely unaddressed. The fraud you miss is usually the fraud your architecture cannot structurally detect, not the fraud your model scored incorrectly.
The five layers of fintech fraud defense
Every serious fintech fraud stack has these five layers. The specifics vary, but the architecture is consistent across Coinbase, Chime, Stripe, Square, Revolut, and most well-funded neobanks and payment processors.
Layer 1: Rules engine
This layer combines velocity checks (for example, a maximum of 5 transactions per hour from a single device), blocklists (known fraudulent IPs, device fingerprints, email domains), hard constraints (transaction limits, geo restrictions), and sanctions screening. Rules are deterministic: they always fire the same way. They catch obvious fraud instantly and enforce regulatory requirements.
- Best for: Known patterns, repeat offenders, sanctions screening, and regulatory requirements that demand deterministic outputs.
- Watch out for: Cannot generalize. Every new fraud pattern requires a new rule, and sophisticated fraudsters learn the rules and stay just below the thresholds.
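A minimal sketch of what this layer looks like in code, assuming a velocity limit and an IP blocklist. All names, thresholds, and blocklist entries here are illustrative, not any vendor's actual configuration:

```python
from collections import deque

class VelocityRule:
    """Deterministic velocity check: block a device after max_txns
    transactions inside a sliding window of window_s seconds."""
    def __init__(self, max_txns=5, window_s=3600):
        self.max_txns = max_txns
        self.window_s = window_s
        self.history = {}  # device_id -> deque of transaction timestamps

    def allow(self, device_id, now):
        q = self.history.setdefault(device_id, deque())
        while q and now - q[0] > self.window_s:
            q.popleft()              # drop timestamps outside the window
        if len(q) >= self.max_txns:
            return False             # velocity limit hit: always fires the same way
        q.append(now)
        return True

BLOCKLISTED_IPS = {"203.0.113.7"}    # illustrative blocklist entry

def rules_decision(txn, velocity_rule):
    """Layer-1 decision: hard block on blocklist or velocity breach."""
    if txn["ip"] in BLOCKLISTED_IPS:
        return "block"
    if not velocity_rule.allow(txn["device_id"], txn["ts"]):
        return "block"
    return "pass"
```

Because the logic is deterministic, the same input always produces the same output, which is exactly what regulators expect from sanctions and limit enforcement.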
Layer 2: Supervised ML
Typically XGBoost or gradient boosted trees trained on transaction features: amount, time of day, merchant category, device type, velocity aggregates, geographic distance from home location. The model scores every transaction in real time (sub-100ms) with a fraud probability. Transactions above the threshold get blocked or sent to review. Supervised ML generalizes better than rules because it learns patterns from labeled data rather than matching explicit scenarios.
- Best for: Real-time scoring of individual transactions. Catches card-not-present fraud, amount anomalies, and merchant-category patterns.
- Watch out for: Flat table input. Cannot see connections between entities. Coordinated fraud across multiple accounts is invisible to per-transaction scoring.
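The flat feature vector such a model consumes can be sketched as below. A hand-weighted logistic scorer stands in for the trained XGBoost model, and every feature name and weight is an assumption for illustration:

```python
import math

def txn_features(txn, profile):
    """Flat per-transaction feature vector (illustrative names)."""
    return {
        "amount_ratio": txn["amount"] / max(profile["avg_amount"], 1.0),
        "hour_is_night": 1.0 if txn["hour"] < 6 else 0.0,
        "txns_last_hour": float(txn["txns_last_hour"]),   # velocity aggregate
        "km_from_home": txn["km_from_home"],              # geographic distance
    }

# Stand-in for a trained gradient-boosted model: hand-set weights + sigmoid.
WEIGHTS = {"amount_ratio": 0.8, "hour_is_night": 1.2,
           "txns_last_hour": 0.4, "km_from_home": 0.002}
BIAS = -4.0

def fraud_score(txn, profile):
    """Fraud probability in [0, 1] for a single transaction."""
    z = BIAS + sum(WEIGHTS[k] * v for k, v in txn_features(txn, profile).items())
    return 1.0 / (1.0 + math.exp(-z))
```

Note what the model never sees: any connection between this account and other accounts. That blind spot is the reason layer 4 exists.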
Layer 3: Behavioral analytics
Session-level patterns that go beyond transaction features. How quickly does the user navigate the app? What is their typical login-to-transaction time? Do they usually transfer to this recipient? Device fingerprinting identifies devices even across browser resets and VPN changes. Typing cadence, swipe patterns, and navigation paths create a behavioral biometric that is hard to replicate.
- Best for: Account takeover detection where credentials are correct but behavior is wrong. Bot detection and credential-stuffing attacks.
- Watch out for: Per-session view. Cannot detect fraud rings where each individual session looks normal but the network of accounts is coordinated.
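One common behavioral signal, login-to-transaction time, can be baselined per account with a simple z-score. This is a sketch of the idea, not any vendor's implementation; real systems combine many such signals:

```python
import statistics

def session_anomaly(baseline_secs, current_secs):
    """Z-score of the current login-to-transaction time against the
    account's own historical baseline. High values suggest the session
    is being driven by someone (or something) other than the owner."""
    mu = statistics.mean(baseline_secs)
    sd = statistics.pstdev(baseline_secs) or 1.0  # guard against zero variance
    return abs(current_secs - mu) / sd
```

A user who normally takes 45-60 seconds between login and transfer, suddenly completing one in 3 seconds, produces a large deviation: the classic signature of a credential-stuffing script with valid credentials but wrong behavior.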
Layer 4: Graph-based detection
This layer analyzes the network of connections between accounts, devices, addresses, and transaction counterparties. It catches fraud that lives in relationships: 20 accounts sharing 3 devices, money mule chains forwarding funds through 5 intermediaries, coordinated account openings from the same IP block. Graph-based detection finds the organized fraud that layers 1-3 miss because their per-transaction or per-session view cannot see coordination across multiple entities. This is where KumoRFM fits.
- Best for: Fraud rings, money mule chains, coordinated account takeovers, and any organized fraud that spans multiple entities and accounts.
- Watch out for: Higher latency than layers 1-2 (typically under 500ms vs under 100ms). For instant-decision products, graph scores may need to run asynchronously and feed into risk thresholds.
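The simplest version of the shared-device signal is a connectivity problem: group accounts that transitively share devices. A union-find sketch over (account, device) login pairs, purely illustrative:

```python
from collections import defaultdict

def device_clusters(logins):
    """Group accounts that transitively share devices.
    logins: iterable of (account_id, device_id) pairs."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    # Accounts and devices live in one graph; logins are the edges.
    for acct, dev in logins:
        union(("acct", acct), ("dev", dev))

    clusters = defaultdict(set)
    for node in list(parent):
        kind, ident = node
        if kind == "acct":
            clusters[find(node)].add(ident)
    # Only multi-account clusters are interesting as ring candidates.
    return [c for c in clusters.values() if len(c) > 1]
```

A cluster of 20 accounts resolving to 3 devices is the structural signal layers 1-3 cannot express, because none of them ever look at more than one account at a time.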
Layer 5: Human-in-the-loop review
Transactions in the uncertain zone between clear-approve and clear-block get routed to human analysts. Good fraud teams use this layer not just for decisioning but as a feedback loop: analyst decisions become training labels for the ML models, and patterns that analysts catch repeatedly get encoded as rules or model features. The goal is to shrink this layer over time as the automated layers improve.
- Best for: Edge cases, novel fraud patterns, and building training data to improve automated layers over time.
- Watch out for: Scale-limited. If more than 5-10% of transactions reach this layer, your automated layers need improvement. Human review is expensive and slow.
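The routing into this layer is usually a three-way threshold on the combined risk score. Band boundaries below are illustrative:

```python
def route(score, approve_below=0.10, block_above=0.90):
    """Three-way routing: only the uncertain middle band reaches
    human analysts; the bands tighten as automated layers improve."""
    if score < approve_below:
        return "approve"
    if score > block_above:
        return "block"
    return "review"
```

Tracking the fraction of traffic landing in "review" is the health metric: if it creeps above 5-10%, the automated layers upstream need work.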
| layer | what_it_does | fraud_types_caught | fraud_types_missed | latency |
|---|---|---|---|---|
| 1. Rules engine | Velocity checks, blocklists, hard constraints, sanctions | Known patterns, repeat offenders, sanctions violations | Novel fraud, pattern variations, organized rings | <10ms |
| 2. Supervised ML | Real-time transaction scoring on tabular features | Individually anomalous transactions, card-not-present fraud | Coordinated fraud, shared-device rings, money mule chains | <100ms |
| 3. Behavioral analytics | Session patterns, device fingerprinting, biometric signals | Account takeover, bot attacks, credential stuffing | Fraud from legitimate devices, coordinated ring activity | <100ms |
| 4. Graph-based detection | Network analysis of entity relationships and connections | Fraud rings, money mules, coordinated attacks, shared-device clusters | Individual anomalous transactions (better caught by layer 2) | <500ms |
| 5. Human review | Analyst decisioning + model feedback loop | Edge cases, novel patterns, complex scenarios | Scale-limited - cannot review every transaction | Minutes to hours |
Each layer catches fraud types that the others miss. The architecture is layered precisely because no single approach catches everything.
What Coinbase does differently
Coinbase operates at the intersection of traditional financial fraud and crypto-native fraud. Their fraud stack handles both fiat transactions (bank deposits, card purchases) and cryptocurrency transactions (sends, swaps, DeFi interactions), each with different signal profiles.
Three things stand out about Coinbase's approach:
- Sequence features from on-chain history. For crypto transactions, Coinbase builds features from the blockchain itself: wallet age, transaction frequency patterns, interaction history with known high-risk protocols (mixers, bridges that have been exploited, sanctioned addresses). A withdrawal to a wallet that has only existed for 2 hours and received funds from a known mixer carries different risk than a withdrawal to a wallet with 3 years of normal DeFi activity. These sequence features are crypto-specific signals that traditional fraud models never see.
- Blockchain address risk scoring. Every destination address gets a risk score based on its on-chain history and connections. This is essentially a graph problem on the blockchain: how many hops away is this address from known bad actors? Has it received funds from sanctioned wallets? Does it have the transaction pattern of a personal wallet, an exchange, or a mixing service? Coinbase integrates with blockchain analytics providers like Chainalysis and also builds proprietary scoring.
- Multi-model architecture. Coinbase does not run one fraud model. They run multiple specialized models: one for fiat deposit fraud, one for crypto send risk, one for account takeover, one for new account fraud. Each model uses different features and different thresholds because the fraud patterns and acceptable false positive rates differ by transaction type. The models feed into a unified decisioning layer that combines scores with rules and routes to human review when needed.
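A unified decisioning layer of the kind described can be sketched as below. The transaction types, thresholds, and routing logic are assumptions based on the description above, not Coinbase's actual system:

```python
def decide(txn_type, scores, rules_block, thresholds):
    """Pick the specialized model for this transaction type, combine
    with the rules verdict, and route uncertain cases to review.
    thresholds maps txn_type -> (review_floor, block_floor), because
    acceptable false positive rates differ by product."""
    if rules_block:
        return "block"                 # rules are a hard floor
    score = scores[txn_type]           # e.g. 'fiat_deposit', 'crypto_send'
    review_floor, block_floor = thresholds[txn_type]
    if score >= block_floor:
        return "block"
    if score >= review_floor:
        return "review"
    return "approve"
```

The key design choice is per-type thresholds: a blocked crypto send is irreversible money saved, while a blocked fiat deposit is mostly customer friction, so the two tolerate very different false positive rates.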
What Chime does differently
Chime serves a different customer base than Coinbase. Their users are primarily underbanked consumers who may have thin credit files, limited banking history, and less conventional income patterns. This creates a unique fraud detection challenge: the signals that traditional banks use to establish trust (long credit history, stable income, existing banking relationships) are often unavailable.
Three things stand out about Chime's approach:
- Real-time ML scoring for instant products. Chime's SpotMe (overdraft coverage) and pay-anyone features require fraud decisions in milliseconds. You cannot hold a transaction for manual review when the product promise is instant money movement. Chime runs real-time ML models that score every transaction against the account's behavioral baseline: is this amount typical? Is this recipient in the user's usual transfer pattern? Is this device consistent with their history?
- Behavioral signals over identity signals. Because their customer base has thinner identity histories, Chime relies more on behavioral analytics: how users interact with the app over time, spending pattern consistency, peer transaction networks (who sends money to whom regularly). A user who has been depositing their paycheck biweekly for 8 months and sending rent to the same recipient monthly has built a strong behavioral baseline. A sudden $2,000 transfer to a new recipient at 3 AM triggers an anomaly score based on behavioral deviation, not just transaction features.
- Peer transaction network analysis. Chime's pay-anyone feature creates a natural transaction graph between users. Who sends money to whom, how often, and in what amounts. This peer network contains fraud signals: newly opened accounts that immediately receive transfers from multiple established accounts (potential money mule pattern), clusters of accounts that only transact with each other (potential fraud ring), accounts that receive funds and immediately transfer everything out (pass-through behavior).
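One of those mule signals, a newly opened account immediately receiving transfers from multiple established accounts, reduces to a simple heuristic over the transfer graph. This is one illustrative rule with assumed thresholds, not Chime's actual logic:

```python
from collections import defaultdict

def mule_candidates(transfers, account_age_days, min_senders=3, max_age=7):
    """Flag young accounts receiving transfers from several established
    accounts. transfers: (sender, recipient, amount) tuples;
    account_age_days: account_id -> age in days. Thresholds are assumptions."""
    senders = defaultdict(set)
    for src, dst, _amount in transfers:
        senders[dst].add(src)
    return [
        acct for acct, srcs in senders.items()
        if account_age_days[acct] <= max_age          # freshly opened
        and len(srcs) >= min_senders                  # fan-in from many senders
        and all(account_age_days[s] > 90 for s in srcs)  # senders look established
    ]
```

Each individual transfer in this pattern looks unremarkable; only the fan-in structure across the peer network makes it suspicious.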
Traditional bank vs fintech fraud stack
The difference is not just technology. It is architectural philosophy. Banks started with rules and added ML as an overlay. Fintechs started with ML and use rules as a floor.
| dimension | traditional_bank | modern_fintech |
|---|---|---|
| Primary detection | Rule-based (NICE Actimize, SAS AML) | ML-first (XGBoost, custom models) |
| Model update cycle | Quarterly to annually | Weekly to daily |
| Decision speed | Minutes to days (batch + manual review heavy) | Milliseconds (real-time scoring, minimal manual review) |
| False positive rate | 90%+ on rule-based alerts | 20-40% with ML scoring |
| Graph-based detection | Rare - some have Quantexa or similar for investigation | Growing - layer 4 adoption increasing as organized fraud grows |
| Behavioral analytics | Limited - session monitoring for online banking | Deep - device fingerprinting, typing patterns, navigation analysis |
| Customer friction | High - frequent blocks, slow resolution | Low - step-up verification instead of hard blocks |
| Feedback loop speed | Slow - analyst labels take weeks to reach models | Fast - automated label pipelines, rapid retraining |
| Organized fraud detection | Weak - rules miss coordination, no native graph analysis | Improving - graph-based layer catching rings and mule chains |
| Tech stack ownership | Vendor-dependent (long integration cycles) | In-house or API-first (rapid iteration) |
Fintechs move faster at every layer. But both face the same structural gap: organized fraud requires graph-based detection that most stacks still lack.
Where graph ML fits: catching what the other layers miss
Layers 1-3 are good at catching fraud from individual bad actors: stolen cards, compromised accounts, bot-driven attacks. They struggle with organized fraud because each layer evaluates transactions or sessions individually. They cannot see coordination across multiple entities.
Here is what organized fraud looks like in practice, and why it requires a graph-based approach:
- Fraud rings on shared devices. A ring of 20 accounts controlled by the same group, all logging in from 3 physical devices. Each account individually passes behavioral checks because the fraudsters have learned to mimic normal behavior. Layer 2 (ML) scores each transaction as low risk because the amounts and patterns look normal. Layer 3 (behavioral) does not flag anything because the session patterns are realistic. But the graph reveals that 20 accounts sharing 3 devices is not normal. That cluster structure is the fraud signal.
- Money mule chains. Stolen funds move through a chain of accounts: A sends to B, B sends to C, C sends to D, D withdraws or converts to crypto. Each individual transfer is below thresholds and between apparently unrelated accounts. The chain is only visible in the transaction graph: rapid sequential transfers through newly opened accounts with no prior relationship to the sender.
- Coordinated account takeover. Credentials from a data breach are sold in batches. The buyer tests and takes over accounts in bulk. Individually, each account takeover might trigger behavioral anomaly detection (Layer 3). But the coordination, with 50 accounts taken over within the same 24-hour window using the same credential-testing patterns, is only visible in the graph. The graph shows the temporal cluster and the shared behavioral fingerprint across the batch.
Fintech fraud stack without graph layer
- Rules catch known patterns and repeat offenders
- Supervised ML scores individual transactions accurately
- Behavioral analytics catches account takeover from anomalous sessions
- Blind to fraud rings sharing devices across 20+ accounts
- Cannot trace money mule chains through 4-7 intermediary accounts
- Misses coordinated attacks where individual transactions look normal
- A significant share of fraud losses comes from organized ring attacks these layers cannot see
Fintech fraud stack with KumoRFM at layer 4
- Layers 1-3 continue handling individual fraud types
- KumoRFM reads the full account-device-transaction-address graph natively
- Detects fraud rings by identifying shared-device clusters and behavioral similarity
- Traces money mule chains 6-7 hops deep through intermediary accounts
- Identifies coordinated attacks from temporal and network patterns
- Reduces false positives by 40-60% through network context
- No graph construction, no feature engineering - reads raw relational tables directly
PQL Query
PREDICT is_fraud FOR EACH transactions.transaction_id WHERE transactions.created_at > '2026-03-01'
One PQL query adds graph-based detection to your existing fraud stack. KumoRFM reads raw accounts, transactions, devices, and address tables from your data warehouse and discovers both single-transaction and network-based fraud patterns automatically.
Output
| transaction_id | fraud_prob_kumo | fraud_prob_layer2_ml | what_kumo_sees |
|---|---|---|---|
| TXN-91001 | 0.93 | 0.88 | Both flag - stolen card, high-amount anomaly (tabular signal) |
| TXN-91002 | 0.89 | 0.15 | Fraud ring: account shares 2 devices with 18 other accounts opened in same week |
| TXN-91003 | 0.85 | 0.22 | Money mule chain: 5-hop transfer path from compromised account to crypto exchange |
| TXN-91004 | 0.04 | 0.38 | Layer 2 flags velocity spike but graph shows legitimate payroll batch pattern |
The benchmark evidence
Fintech fraud data lives in multiple related tables: accounts, transactions, devices, sessions, addresses, merchants. The benchmarks that test multi-table prediction directly measure the capability that matters for fintech fraud detection.
SAP SALT benchmark
| approach | accuracy | what_it_means_for_fintech_fraud |
|---|---|---|
| LLM + AutoML | 63% | Generates features from table descriptions. No relational pattern discovery. |
| PhD Data Scientist + XGBoost | 75% | Expert hand-crafts cross-table features. Captures some relational signal but limited depth. |
| KumoRFM (zero-shot) | 91% | Reads relational tables directly. Discovers multi-hop fraud patterns automatically. |
SAP SALT benchmark: KumoRFM outperforms expert-tuned XGBoost by 16 percentage points. For fintech fraud, this gap means catching organized attacks that flat-table models structurally miss.
RelBench benchmark
| approach | AUROC | feature_engineering_time |
|---|---|---|
| LightGBM + manual features | 62.44 | 12.3 hours per task |
| KumoRFM zero-shot | 76.71 | ~1 second |
| KumoRFM fine-tuned | 81.14 | Minutes |
RelBench benchmark across 7 databases and 30 tasks: KumoRFM zero-shot outperforms manually engineered LightGBM by 14+ AUROC points.
How to add graph-based detection to your existing stack
You do not need to rebuild your fraud system to get graph-based detection. The practical path is additive:
- Keep layers 1-3 running. Your rules engine, supervised ML, and behavioral analytics are catching individual fraud effectively. Do not break what works.
- Connect KumoRFM to your data warehouse. Point it at your accounts, transactions, devices, and address tables. No ETL, no graph database, no feature engineering. KumoRFM reads the relational tables directly.
- Run graph-based scoring in parallel. Generate fraud probability scores that incorporate network context. These scores complement your existing layer 2 ML scores, not replace them.
- Combine scores in your decisioning layer. Use layer 2 scores for individual transaction risk and layer 4 scores for network-based risk. Weight them based on fraud type: individual card fraud leans on layer 2, organized ring activity leans on layer 4.
- Feed layer 5 analyst decisions back to all models. Human review labels improve both your supervised ML and graph-based detection over time. The feedback loop makes every layer better.
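The score-combination step can start as simply as a weighted blend of the layer 2 and layer 4 probabilities. The weights below are illustrative; in practice they would be tuned per fraud type, leaning toward layer 2 for individual card fraud and layer 4 for ring activity:

```python
def combined_risk(layer2_score, layer4_score, weights):
    """Blend per-transaction (layer 2) and network-context (layer 4)
    fraud probabilities. weights = (w2, w4), assumed to sum to 1.0."""
    w2, w4 = weights
    return w2 * layer2_score + w4 * layer4_score
```

For a ring transaction like TXN-91002 in the table above (layer 2 at 0.15, graph score at 0.89), a network-weighted blend of (0.3, 0.7) lands at roughly 0.67, high enough to route to review even though the per-transaction model alone would have approved it.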