What is a transaction graph?

A transaction graph represents financial activity as a directed temporal graph. Accounts (or entities like merchants, devices) are nodes. Transactions are directed edges from sender to receiver, with features like amount, timestamp, channel, and currency. The graph is temporal: edges have timestamps, and the structure evolves over time.

Why are transaction graphs effective for fraud detection?

Fraud patterns are fundamentally structural: money laundering involves circular flows, account takeover involves unusual transaction partners, and organized fraud involves coordinated accounts. These patterns are invisible in flat transaction tables but clearly visible as graph motifs (cycles, stars, dense subgraphs).

How large are production transaction graphs?

A major bank processes 10-50 million transactions per day, producing a graph with 100M+ nodes and billions of edges over a 90-day window. Real-time fraud detection requires processing each new transaction within milliseconds, necessitating efficient incremental GNN inference.
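The edge count follows from simple arithmetic. The check below assumes that each transaction contributes exactly one edge and that the daily volume stays constant across the window; both are simplifications for illustration.

```python
# Back-of-the-envelope edge count for a 90-day window.
# Assumption: one transaction = one edge, constant daily volume.
DAYS = 90
low_per_day, high_per_day = 10_000_000, 50_000_000

low_edges = low_per_day * DAYS    # 900 million
high_edges = high_per_day * DAYS  # 4.5 billion

print(f"{low_edges:,} to {high_edges:,} edges")
```

So the 90-day graph holds roughly one to several billion transaction edges, before counting any auxiliary edge types.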

Transaction Graphs: Financial Transactions as Directed Temporal Graphs | Kumo.ai

A transaction graph represents financial activity as a directed temporal graph where accounts are nodes and transactions are timestamped directed edges. Account A sending $500 to account B on March 1 becomes a directed edge from A to B with features (amount=$500, date=Mar 1, channel=wire). When millions of these edges are assembled, structural patterns emerge that reveal fraud, money laundering, and credit risk.
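As a minimal sketch of this representation, the snippet below stores transactions as timestamped directed edges in two adjacency lists. The class and field names (`Transaction`, `TransactionGraph`, `out_edges`, `in_edges`) are hypothetical, and the year 2024 and the second transaction are made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Transaction:
    """One directed, timestamped edge from sender to receiver."""
    sender: str
    receiver: str
    amount: float
    timestamp: datetime
    channel: str


class TransactionGraph:
    """Directed temporal multigraph: accounts are nodes, transactions are edges."""

    def __init__(self):
        self.out_edges = defaultdict(list)  # sender -> outgoing transactions
        self.in_edges = defaultdict(list)   # receiver -> incoming transactions

    def add(self, tx: Transaction) -> None:
        self.out_edges[tx.sender].append(tx)
        self.in_edges[tx.receiver].append(tx)

    def nodes(self) -> set:
        return set(self.out_edges) | set(self.in_edges)


graph = TransactionGraph()
# The example from the text: A sends $500 to B by wire on March 1 (year assumed).
graph.add(Transaction("A", "B", 500.0, datetime(2024, 3, 1), "wire"))
# A second, invented edge so the graph has more than one hop.
graph.add(Transaction("B", "C", 480.0, datetime(2024, 3, 2), "ach"))
```

Keeping both outgoing and incoming lists makes it cheap to look up either side of an account's neighborhood, which later sections rely on.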

Why graphs transform fraud detection

Traditional fraud detection uses transaction-level features: amount, time, location, merchant category. This misses structural patterns:

Money laundering cycles: A → B → C → A. Money flows in a circle to obscure its origin. Invisible in a single transaction record, obvious in a graph.
Mule networks: A central fraudster distributes stolen funds across many accounts that then withdraw cash. The star topology is a clear graph signature.
Account takeover: A legitimate account suddenly transacts with counterparties it has never interacted with. The graph neighborhood changes dramatically.
Coordinated fraud: Multiple accounts making similar transactions to the same merchants in the same time window. Dense temporal subgraphs.
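Two of the patterns above can be illustrated with plain graph traversal. The sketch below is not a GNN and not a production detector: it searches a toy edge list for three-hop cycles whose timestamps increase along the path, and for accounts that fan out to many distinct receivers. The function names, the integer timestamps, and the `min_receivers` threshold are assumptions for illustration.

```python
from collections import defaultdict

# (sender, receiver, timestamp) edges; timestamps are illustrative integers.
edges = [
    ("A", "B", 1), ("B", "C", 2), ("C", "A", 3),      # laundering cycle
    ("F", "M1", 4), ("F", "M2", 4), ("F", "M3", 5),   # fan-out to mule accounts
]

out_adj = defaultdict(list)
for src, dst, ts in edges:
    out_adj[src].append((dst, ts))


def temporal_triangles(out_adj):
    """Find cycles a -> b -> c -> a whose timestamps strictly increase."""
    found = set()
    for a in list(out_adj):
        for b, t1 in out_adj[a]:
            for c, t2 in out_adj.get(b, []):
                if t2 <= t1 or c in (a, b):
                    continue
                for d, t3 in out_adj.get(c, []):
                    if d == a and t3 > t2:
                        found.add((a, b, c))
    return found


def fan_out_hubs(out_adj, min_receivers=3):
    """Accounts that send to at least `min_receivers` distinct counterparties."""
    return {
        src for src, outs in out_adj.items()
        if len({dst for dst, _ in outs}) >= min_receivers
    }


print(temporal_triangles(out_adj))
print(fan_out_hubs(out_adj))
```

Requiring increasing timestamps means each cycle is reported once, starting from the account that sent the first transfer. Hand-written motif searches like this do not scale to every pattern, which is part of the motivation for learning over the graph instead.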

Graph construction

A production transaction graph is heterogeneous and temporal:

Node types: accounts, merchants, devices, IP addresses, card numbers
Edge types: transactions (directed, temporal), shared-device (account-device), shared-merchant (account-merchant)
Node features: account age, average balance, transaction frequency, KYC status
Edge features: amount, timestamp, channel (wire, ACH, card), currency

Multiple edge types provide different signals. Direct transactions carry the strongest fraud signal. Shared-device edges reveal accounts controlled by the same person. Shared-merchant edges provide weaker but useful context.
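One way to picture the heterogeneous structure is as typed node and edge stores. The sketch below uses plain dictionaries; the type names (`transacts`, `uses_device`), identifiers, and feature values are hypothetical, and a real system would more likely use a graph library or database.

```python
from collections import defaultdict

# Typed nodes: (node_type, node_id) -> feature dict (values are made up).
nodes = {
    ("account", "acc_1"): {"account_age_days": 900, "kyc_verified": True},
    ("account", "acc_2"): {"account_age_days": 3, "kyc_verified": False},
    ("account", "acc_3"): {"account_age_days": 5, "kyc_verified": False},
    ("device", "dev_9"): {},
    ("device", "dev_2"): {},
}

# Typed edges: edge_type -> list of (source, destination, features).
edges = defaultdict(list)
edges["transacts"].append((
    ("account", "acc_1"), ("account", "acc_2"),
    {"amount": 500.0, "timestamp": 1709251200, "channel": "wire", "currency": "USD"},
))
edges["uses_device"].append((("account", "acc_1"), ("device", "dev_2"), {}))
edges["uses_device"].append((("account", "acc_2"), ("device", "dev_9"), {}))
edges["uses_device"].append((("account", "acc_3"), ("device", "dev_9"), {}))


def accounts_sharing_devices(edges):
    """Group accounts by device and keep devices used by more than one account."""
    by_device = defaultdict(set)
    for account, device, _ in edges["uses_device"]:
        by_device[device].add(account)
    return {dev: accts for dev, accts in by_device.items() if len(accts) > 1}


shared = accounts_sharing_devices(edges)
```

Here `shared` links `acc_2` and `acc_3` through `dev_9` even though they never transact with each other, which is the kind of signal shared-device edges contribute.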

Temporal integrity

Transaction graphs are inherently temporal, and temporal integrity is non-negotiable:

Temporal sampling: When scoring a transaction at time T, the GNN can only see transactions that occurred before T. Future transactions leak the outcome.
Temporal splits: Train on January-March, test on April. Never use random splits, which mix future and past.
Consequence feature removal: Remove features created by the fraud investigation process (account freeze, chargeback) from training data.
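The three rules above can be sketched on a small list of transaction records. The field names (`ts`, `chargeback`, `account_frozen`), the sample records, and the helper functions are assumptions for illustration, not a specific library's API.

```python
from datetime import datetime

transactions = [
    {"id": 1, "sender": "A", "receiver": "B", "ts": datetime(2024, 1, 15), "amount": 120.0, "chargeback": False},
    {"id": 2, "sender": "B", "receiver": "C", "ts": datetime(2024, 2, 20), "amount": 950.0, "chargeback": True},
    {"id": 3, "sender": "C", "receiver": "A", "ts": datetime(2024, 3, 30), "amount": 900.0, "chargeback": False},
    {"id": 4, "sender": "A", "receiver": "D", "ts": datetime(2024, 4, 5), "amount": 60.0, "chargeback": False},
]

# Fields written by the investigation process after the fact.
CONSEQUENCE_FEATURES = {"chargeback", "account_frozen"}


def visible_history(transactions, score_time):
    """Temporal sampling: keep only transactions strictly before score_time."""
    return [t for t in transactions if t["ts"] < score_time]


def temporal_split(transactions, cutoff):
    """Temporal split: train on everything before cutoff, test on the rest."""
    train = [t for t in transactions if t["ts"] < cutoff]
    test = [t for t in transactions if t["ts"] >= cutoff]
    return train, test


def strip_consequences(tx):
    """Drop features that only exist because fraud was already investigated."""
    return {k: v for k, v in tx.items() if k not in CONSEQUENCE_FEATURES}


train, test = temporal_split(transactions, datetime(2024, 4, 1))
train = [strip_consequences(t) for t in train]
history_for_tx3 = visible_history(transactions, transactions[2]["ts"])
```

With these records, the training set covers January through March, the test set holds the April transaction, and scoring transaction 3 sees only transactions 1 and 2.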

Real-time inference

Production fraud detection requires scoring each transaction in real time (under 100 ms). At that latency budget, the GNN cannot re-process the entire graph for each new transaction. Instead:

Maintain pre-computed node embeddings for all accounts
When a new transaction arrives, fetch embeddings for sender, receiver, and their neighbors
Run a lightweight GNN forward pass on the local subgraph
Update the sender and receiver embeddings with the new transaction information

This incremental approach processes each transaction in 10-50ms while maintaining graph context that accumulates over millions of historical transactions.
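The four steps above can be sketched as follows. This is a toy stand-in, not a trained model: the embedding values, `WEIGHTS`, the amount coefficient, the mean aggregation, and the moving-average update are all placeholder assumptions chosen so the example runs with the standard library only.

```python
import math

DIM = 4
# Step 1: pre-computed node embeddings (placeholder values).
embeddings = {
    "A": [0.1, 0.2, 0.0, 0.3],
    "B": [0.4, 0.1, 0.2, 0.0],
    "C": [0.0, 0.5, 0.1, 0.1],
}
neighbors = {"A": ["C"], "B": ["C"], "C": ["A", "B"]}
WEIGHTS = [0.5, -0.3, 0.8, 0.1]  # placeholder scoring weights, not trained


def mean_vec(vectors):
    """Element-wise mean; zero vector when there are no inputs."""
    if not vectors:
        return [0.0] * DIM
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def local_representation(node):
    """Steps 2-3: fetch stored embeddings and run one round of mean aggregation."""
    own = embeddings.get(node, [0.0] * DIM)
    nbr = mean_vec([embeddings[n] for n in neighbors.get(node, []) if n in embeddings])
    return [0.5 * o + 0.5 * n for o, n in zip(own, nbr)]


def score_and_update(sender, receiver, amount, decay=0.9):
    h_s = local_representation(sender)
    h_r = local_representation(receiver)
    # Toy score: weighted difference of the endpoints plus an amount term.
    logit = sum(w * (s - r) for w, s, r in zip(WEIGHTS, h_s, h_r)) + 0.001 * amount
    score = 1.0 / (1.0 + math.exp(-logit))
    # Step 4: blend the new local view into the stored embeddings.
    for node, h in ((sender, h_s), (receiver, h_r)):
        old = embeddings.get(node, [0.0] * DIM)
        embeddings[node] = [decay * o + (1 - decay) * x for o, x in zip(old, h)]
    # Record the new edge so later transactions see it.
    for a, b in ((sender, receiver), (receiver, sender)):
        if b not in neighbors.setdefault(a, []):
            neighbors[a].append(b)
    return score


score = score_and_update("A", "B", 500.0)
```

Each call touches only the two endpoints and their stored neighbors, so the cost per transaction depends on local neighborhood size rather than on the size of the whole graph.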

Results at scale

Transaction graph GNNs deployed at financial institutions report:

15-25% reduction in false positive rates at the same fraud catch rate
Detection of organized fraud rings that flat-table models miss entirely
Earlier detection: graph signals appear 2-3 days before individual account anomalies
Recovery of $50-200M annually in prevented fraud at major banks

Key Takeaways

1. Transaction graphs represent accounts as nodes and transactions as directed temporal edges. Fraud patterns (laundering cycles, mule networks, coordinated fraud) are structural and visible only in graphs.
2. GNNs add 10-30% fraud detection lift over flat-table models. The improvement comes from neighborhood context: who you transact with, and who they transact with.
3. Temporal integrity is non-negotiable. Temporal sampling, temporal splits, and consequence feature removal prevent the most common source of inflated fraud detection metrics.
4. Real-time inference requires incremental processing: pre-computed embeddings, local subgraph extraction, lightweight forward pass. Full graph re-computation per transaction is infeasible.
5. Production impact at major banks: 15-25% false positive reduction, earlier detection by 2-3 days, and $50-200M in annual prevented fraud.
