A transaction graph represents financial activity as a directed temporal graph where accounts are nodes and transactions are timestamped directed edges. Account A sending $500 to account B on March 1 becomes a directed edge from A to B with features (amount=$500, date=Mar 1, channel=wire). When millions of these edges are assembled, structural patterns emerge that reveal fraud, money laundering, and credit risk.
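As a minimal sketch of this representation, the A-to-B example can be stored as a timestamped directed edge in an adjacency structure. The `Transaction` and `TransactionGraph` names are hypothetical, chosen only for illustration, and the year in the date is arbitrary because the example above gives none.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Transaction:
    """One directed, timestamped edge: sender -> receiver."""
    sender: str
    receiver: str
    amount: float
    day: date
    channel: str


class TransactionGraph:
    """Adjacency lists keyed by account, one list per direction."""

    def __init__(self):
        self.out_edges = defaultdict(list)  # account -> transactions it sent
        self.in_edges = defaultdict(list)   # account -> transactions it received

    def add(self, tx):
        self.out_edges[tx.sender].append(tx)
        self.in_edges[tx.receiver].append(tx)


graph = TransactionGraph()
# The year is an arbitrary placeholder for the "March 1" example.
graph.add(Transaction("A", "B", 500.0, date(2024, 3, 1), "wire"))
```

Keeping both outgoing and incoming lists makes it cheap to ask who an account paid and who paid it, which the structural patterns below rely on.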
Why graphs transform fraud detection
Traditional fraud detection uses transaction-level features: amount, time, location, merchant category. This misses structural patterns:
- Money laundering cycles: A → B → C → A. Money flows in a circle to obscure its origin. Invisible in a single transaction record, obvious in a graph.
- Mule networks: A central fraudster distributes stolen funds across many accounts that then withdraw cash. The star topology is a clear graph signature.
- Account takeover: A legitimate account suddenly transacts with counterparties it has never interacted with. The graph neighborhood changes dramatically.
- Coordinated fraud: Multiple accounts making similar transactions to the same merchants in the same time window. Dense temporal subgraphs.
Graph construction
A production transaction graph is heterogeneous and temporal:
- Node types: accounts, merchants, devices, IP addresses, card numbers
- Edge types: transactions (directed, temporal), shared-device (account-device), shared-merchant (account-merchant)
- Node features: account age, average balance, transaction frequency, KYC status
- Edge features: amount, timestamp, channel (wire, ACH, card), currency
Multiple edge types provide different signals. Direct transactions carry the strongest fraud signal. Shared-device edges reveal accounts controlled by the same person. Shared-merchant edges provide weaker but useful context.
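One simple way to hold such a graph is a list of typed edges, from which derived relations like "accounts sharing a device" can be computed. This is a sketch only: the edge-type names (`transacts`, `uses_device`, `pays_merchant`), the `acct:`/`device:` prefixes, and the `accounts_sharing` helper are illustrative assumptions, not a fixed schema.

```python
from collections import defaultdict
from itertools import combinations

# Each edge is (source, edge_type, target); node type is encoded in the prefix.
typed_edges = [
    ("acct:A", "transacts", "acct:B"),
    ("acct:A", "uses_device", "device:d1"),
    ("acct:C", "uses_device", "device:d1"),
    ("acct:B", "pays_merchant", "merchant:m1"),
]


def accounts_sharing(typed_edges, edge_type):
    """Return pairs of accounts linked to the same target via edge_type."""
    by_target = defaultdict(set)
    for src, etype, dst in typed_edges:
        if etype == edge_type:
            by_target[dst].add(src)
    pairs = set()
    for accounts in by_target.values():
        pairs.update(combinations(sorted(accounts), 2))
    return pairs


shared_device_pairs = accounts_sharing(typed_edges, "uses_device")
```

Here accounts A and C never transact with each other, yet the shared device links them, which is the kind of signal a transaction-only view would miss.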
Temporal integrity
Transaction graphs are inherently temporal, and temporal integrity is non-negotiable:
- Temporal sampling: When scoring a transaction at time T, the graph neural network (GNN) can see only transactions that occurred before T. Including later transactions leaks the outcome.
- Temporal splits: Train on January-March, test on April. Never use random splits, which mix future and past.
- Consequence feature removal: Remove features created by the fraud investigation process (account freeze, chargeback) from training data. These features exist only after fraud has been suspected or confirmed, so they leak the label.
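The three rules above can be sketched as small filters over transaction records. The dictionary layout, the function names, and the feature names in `CONSEQUENCE_FEATURES` are assumptions made for illustration; the dates use an arbitrary year.

```python
from datetime import datetime

# Hypothetical names for features produced by the investigation process.
CONSEQUENCE_FEATURES = {"account_frozen", "chargeback_filed"}


def temporal_split(transactions, cutoff):
    """Train on everything before the cutoff, test on everything from it on."""
    train = [tx for tx in transactions if tx["timestamp"] < cutoff]
    test = [tx for tx in transactions if tx["timestamp"] >= cutoff]
    return train, test


def visible_history(transactions, account, score_time):
    """Transactions touching the account that happened strictly before score_time."""
    return [
        tx for tx in transactions
        if tx["timestamp"] < score_time
        and account in (tx["sender"], tx["receiver"])
    ]


def strip_consequence_features(features):
    """Drop features that only exist because fraud was already investigated."""
    return {k: v for k, v in features.items() if k not in CONSEQUENCE_FEATURES}


transactions = [
    {"sender": "A", "receiver": "B", "timestamp": datetime(2024, 2, 10)},
    {"sender": "B", "receiver": "C", "timestamp": datetime(2024, 3, 20)},
    {"sender": "A", "receiver": "C", "timestamp": datetime(2024, 4, 5)},
]
train, test = temporal_split(transactions, cutoff=datetime(2024, 4, 1))
history = visible_history(transactions, "A", score_time=datetime(2024, 4, 5))
clean = strip_consequence_features({"account_age_days": 400, "account_frozen": True})
```

Note that `visible_history` uses a strict comparison, so the transaction being scored is never part of its own context.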
Real-time inference
Production fraud detection requires scoring each transaction in real time (under 100 ms). This means the GNN cannot re-process the entire graph for each new transaction. Instead:
- Maintain pre-computed node embeddings for all accounts
- When a new transaction arrives, fetch embeddings for sender, receiver, and their neighbors
- Run a lightweight GNN forward pass on the local subgraph
- Update the sender and receiver embeddings with the new transaction information
This incremental approach processes each transaction in 10-50 ms while maintaining graph context that accumulates over millions of historical transactions.
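The four steps above can be sketched with a small in-memory cache. This is a toy stand-in under stated assumptions, not a real GNN: mean aggregation over cached neighbor embeddings replaces the forward pass, a linear layer with a sigmoid replaces the scoring head, and an exponential moving average replaces the learned embedding update. `EmbeddingStore`, `score_and_update`, `DIM`, and the update rate are all hypothetical.

```python
import math
from collections import defaultdict

DIM = 4  # embedding size, chosen arbitrarily for the sketch


class EmbeddingStore:
    """Pre-computed embeddings plus a neighbor index, both kept in memory."""

    def __init__(self, dim=DIM):
        self.dim = dim
        self.embeddings = defaultdict(lambda: [0.0] * dim)
        self.neighbors = defaultdict(set)

    def local_context(self, account):
        """Mean of the account's embedding and its neighbors' embeddings."""
        vecs = [self.embeddings[account]]
        vecs += [self.embeddings[n] for n in self.neighbors[account]]
        return [sum(v[i] for v in vecs) / len(vecs) for i in range(self.dim)]

    def update(self, account, message, rate=0.1):
        """Blend new transaction information into the cached embedding."""
        old = self.embeddings[account]
        self.embeddings[account] = [
            (1 - rate) * o + rate * m for o, m in zip(old, message)
        ]


def score_and_update(store, sender, receiver, tx_features, weights):
    # Steps 2 and 3: fetch local context and run a lightweight scoring pass.
    x = store.local_context(sender) + store.local_context(receiver) + tx_features
    logit = sum(w * v for w, v in zip(weights, x))
    score = 1.0 / (1.0 + math.exp(-logit))
    # Step 4: fold the new transaction back into the cached state.
    store.neighbors[sender].add(receiver)
    store.neighbors[receiver].add(sender)
    store.update(sender, tx_features)
    store.update(receiver, tx_features)
    return score


store = EmbeddingStore()
weights = [0.0] * (3 * DIM)  # untrained placeholder weights
score = score_and_update(store, "A", "B", [1.0, 0.0, 0.0, 0.0], weights)
```

The context is read before the cache is updated, so each score reflects only the state of the graph prior to the transaction being scored.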
Results at scale
Transaction graph GNNs deployed at financial institutions report:
- 15-25% reduction in false positive rates at the same fraud catch rate
- Detection of organized fraud rings that flat-table models miss entirely
- Earlier detection: graph signals appear 2-3 days before individual account anomalies
- $50-200M annually in prevented fraud losses at major banks