
Anti-Money Laundering: Cycle Detection on Transaction Graphs

Global money laundering exceeds $2 trillion annually. Rule-based AML systems generate 95%+ false positives. Here is how to build a GNN that detects laundering typologies (cycles, layering, and fan-out) while cutting false positives by 50-70%.


TL;DR

  • Money laundering is a graph pattern problem. Layering, cycling, and fan-out are structural signatures in the transaction graph that single-transaction rules cannot detect.
  • Multi-hop GNNs with GATConv detect cycles naturally: 3 layers let each node see if its own funds have returned, the defining signature of circular laundering.
  • On RelBench benchmarks, GNNs achieve 75.83 AUROC vs 62.44 for flat-table LightGBM. More importantly, they reduce false positives by 50-70%, saving thousands of analyst hours.
  • The PyG model is ~35 lines, but production AML systems need real-time graph updates, regulatory reporting integration, and audit trails.
  • KumoRFM detects laundering patterns with one PQL query (76.71 AUROC zero-shot), automatically constructing temporal transaction graphs.

The business problem

The UN estimates that $800 billion to $2 trillion is laundered globally each year. Banks spend $30+ billion annually on AML compliance, yet current systems detect less than 1% of illicit flows. The core problem: rule-based systems flag individual transactions (large amounts, unusual timing) but miss the structural patterns that define laundering.

A $9,500 cash deposit (just under the $10,000 reporting threshold) is easy to flag with a rule. But what about 50 transfers of $5,000 each, flowing through a chain of shell companies and returning to the originator minus a 2% fee? Each individual transaction looks normal. The pattern is only visible in the graph.
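To make this concrete, here is a minimal sketch of the round-trip pattern using networkx (account names and amounts are invented for illustration). No per-transaction rule fires on any of these transfers, yet the cycle is immediately visible once the transfers are treated as a graph:

```python
import networkx as nx

# Toy layering chain: funds hop through shells and return to the
# originator minus a small fee. Each transfer alone is unremarkable.
G = nx.DiGraph()
transfers = [("originator", "shell_a", 5000),
             ("shell_a", "shell_b", 4950),
             ("shell_b", "originator", 4900)]
for src, dst, amount in transfers:
    G.add_edge(src, dst, amount=amount)

# A per-edge amount threshold sees nothing; graph analysis finds the loop.
cycles = list(nx.simple_cycles(G))
print(cycles)  # one cycle through all three accounts
```

A GNN detects the same structure implicitly through message passing, without enumerating cycles explicitly.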

Why flat ML fails

  • No cycle detection: Circular transaction chains (A to B to C to A) are the hallmark of laundering. Flat models see individual transactions, not chains.
  • No layering detection: Funds split across multiple intermediaries and recombine. This fan-out/fan-in pattern requires multi-hop graph analysis.
  • Excessive false positives: Rule-based systems generate 95%+ false positives because they lack context. A high-value transfer between long-time business partners is normal; the same amount to a newly created shell company is suspicious. Context requires the graph.
  • Slow adaptation: Launderers adapt to rules. Graph patterns are harder to evade because the fundamental need to move and recombine funds creates structural signatures.

The relational schema

schema.txt
Node types:
  Account     (id, type, creation_date, country, kyc_level)
  Entity      (id, type, incorporation_date, industry)
  Transaction (id, amount, currency, timestamp, channel)

Edge types:
  Account     --[owned_by]-->     Entity
  Account     --[transfers_to]--> Account  (amount, timestamp)
  Entity      --[controls]-->     Entity   (ownership_pct)
  Transaction --[from]-->         Account
  Transaction --[to]-->           Account

The account-entity-transaction graph captures ownership chains and money flows. Cycles in transfers_to edges signal potential laundering.

PyG architecture: GATConv for cycle-aware AML

aml_model.py
import torch
import torch.nn.functional as F
from torch_geometric.nn import GATConv, HeteroConv, Linear

class AMLGNN(torch.nn.Module):
    def __init__(self, hidden_dim=64, heads=4):
        super().__init__()
        self.account_lin = Linear(-1, hidden_dim)
        self.entity_lin = Linear(-1, hidden_dim)

        # 3 layers for cycle detection (3-hop paths)
        self.convs = torch.nn.ModuleList()
        for _ in range(3):
            conv = HeteroConv({
                ('account', 'transfers_to', 'account'): GATConv(
                    hidden_dim, hidden_dim // heads, heads=heads),
                # bipartite edge type: pass (src, dst) input sizes and
                # disable self-loops, which are undefined across two
                # different node types
                ('account', 'owned_by', 'entity'): GATConv(
                    (hidden_dim, hidden_dim), hidden_dim // heads,
                    heads=heads, add_self_loops=False),
                ('entity', 'controls', 'entity'): GATConv(
                    hidden_dim, hidden_dim // heads, heads=heads),
            }, aggr='sum')
            self.convs.append(conv)

        self.classifier = torch.nn.Sequential(
            Linear(hidden_dim, 32),
            torch.nn.ReLU(),
            Linear(32, 1),
        )

    def forward(self, x_dict, edge_index_dict):
        x_dict['account'] = self.account_lin(x_dict['account'])
        x_dict['entity'] = self.entity_lin(x_dict['entity'])

        for conv in self.convs:
            x_dict = {k: F.elu(v) for k, v in
                      conv(x_dict, edge_index_dict).items()}

        # per-account suspicion score in [0, 1]; pair with BCELoss
        # (or drop the sigmoid and train with BCEWithLogitsLoss for
        # better numerical stability)
        return torch.sigmoid(
            self.classifier(x_dict['account']).squeeze(-1))

3-layer GATConv enables cycle detection: each account sees its 3-hop neighborhood, including paths that circle back to itself. Attention weights identify which transaction paths are most suspicious.

Training considerations

  • Label scarcity: Confirmed laundering cases are rare and often discovered months later. Use suspicious activity reports (SARs) as weak labels and supplement with synthetic laundering patterns.
  • Temporal ordering: Transaction order matters. Use temporal edge features and ensure the model only sees transactions before the prediction timestamp.
  • Graph snapshots: Build daily or weekly graph snapshots. Laundering patterns evolve, and the model should see the graph at the time of each prediction.
  • False positive optimization: Optimize for precision at high recall thresholds. Compliance teams need manageable alert volumes, not maximum recall.
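The temporal-ordering point above amounts to masking edges by timestamp before each prediction. A minimal sketch (timestamps and the cutoff are arbitrary illustrative values):

```python
import torch

# edge timestamps aligned with edge_index columns (assumed unix seconds)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 0, 0]])
edge_time = torch.tensor([100, 200, 300, 400])

cutoff = 350  # prediction timestamp: later transfers must stay hidden
mask = edge_time < cutoff
edge_index = edge_index[:, mask]

print(edge_index.shape[1])  # 3 edges survive the cutoff
```

Applying the same cutoff per snapshot gives the daily or weekly graphs described above, each free of future leakage.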

Expected performance

  • Rule-based system: ~40 AUROC (high recall, 95%+ false positive rate)
  • LightGBM (flat-table): 62.44 AUROC
  • GNN (3-layer GATConv): 75.83 AUROC
  • KumoRFM (zero-shot): 76.71 AUROC

Or use KumoRFM in one line

KumoRFM PQL
PREDICT is_suspicious FOR account
USING account, entity, transaction

One PQL query. KumoRFM constructs the temporal transaction graph, detects cycle and layering patterns automatically, and outputs suspicion scores per account.

KumoRFM replaces graph construction, cycle-aware architecture design, and training with a single query. It achieves 76.71 AUROC zero-shot while providing the prediction explanations needed for SAR filing and regulatory review.

Frequently asked questions

Why are GNNs especially effective for AML?

Money laundering is fundamentally a graph pattern: funds flow through layered shell companies, mule accounts, and circular transaction chains. These patterns are invisible to single-transaction analysis but obvious when you examine the graph structure. GNNs can detect cycles, fan-out patterns, and rapid succession transfers that define laundering typologies.

How do you detect cycles in a transaction graph with PyG?

Use multi-hop message passing (2-3 GNN layers) so each node receives information about its extended neighborhood. If money flows A to B to C to A, a 3-layer GNN at node A will see its own features propagated back, signaling a cycle. You can also add explicit cycle count features as node attributes.
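The "explicit cycle count features" mentioned above can be computed directly from the adjacency matrix: a nonzero diagonal entry of A^3 means that node lies on a 3-step cycle, which is exactly the path a 3-layer GNN propagates back to the node. A toy check (graph invented for illustration):

```python
import torch

# toy transfer graph: A(0) -> B(1) -> C(2) -> A(0), plus D(3) -> A(0)
A = torch.zeros(4, 4)
for src, dst in [(0, 1), (1, 2), (2, 0), (3, 0)]:
    A[src, dst] = 1.0

# diag(A^3)[i] counts length-3 walks from node i back to itself
A3 = torch.linalg.matrix_power(A, 3)
on_cycle = torch.diagonal(A3) > 0
print(on_cycle.tolist())  # [True, True, True, False]
```

Nodes A, B, and C sit on the cycle; D feeds into it but never gets its funds back, so its diagonal entry stays zero.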

What is the false positive rate challenge in AML?

Traditional rule-based AML systems generate 95%+ false positives, creating massive alert backlogs that compliance teams cannot review. GNNs reduce false positives by 50-70% by incorporating network context: a high-value transaction looks suspicious alone but normal when you see the sender and receiver have a long transaction history.

How do you handle the temporal aspect of money laundering?

Laundering transactions often happen in rapid succession (minutes to hours). Encode timestamps as edge attributes and use temporal attention or time-decay weighting so the model gives higher weight to recent transactions. Temporal windowing also helps: build separate graphs per time window and compare patterns across windows.
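One simple form of the time-decay weighting mentioned above is an exponential kernel over edge age, exp(-Δt/τ), used as an edge weight or attention bias. The timestamps and decay constant τ below are illustrative assumptions:

```python
import torch

# time-decay edge weights: recent transfers count more
now = 1_000.0
edge_time = torch.tensor([100.0, 600.0, 990.0])
tau = 300.0  # decay constant, same unit as the timestamps (assumed)

weight = torch.exp(-(now - edge_time) / tau)
# the oldest transfer decays toward zero, the newest stays near 1.0
```

These weights can be passed as `edge_weight`/`edge_attr` to convolution layers that support them, so rapid-succession transfer bursts dominate the message passing.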

Can KumoRFM detect money laundering patterns?

Yes. KumoRFM constructs a temporal heterogeneous graph from your transaction database and identifies suspicious patterns with a single PQL query. It captures cycle patterns, fan-out structures, and rapid succession transfers automatically through its multi-hop graph transformer architecture.

Learn more about graph ML

PyTorch Geometric is the open-source foundation for graph neural networks. Explore more layers, concepts, and production patterns.