The business problem
The UN estimates that $800 billion to $2 trillion is laundered globally each year. Banks spend $30+ billion annually on AML compliance, yet current systems detect less than 1% of illicit flows. The core problem: rule-based systems flag individual transactions (large amounts, unusual timing) but miss the structural patterns that define laundering.
A $9,500 cash deposit (just under the $10,000 reporting threshold) is easy for a rule to flag. But what about 50 transfers of $5,000 each flowing through a chain of shell companies and returning to the originator minus a 2% fee? Each individual transaction looks normal. The pattern is only visible in the graph.
Why flat ML fails
- No cycle detection: Circular transaction chains (A to B to C to A) are the hallmark of laundering. Flat models see individual transactions, not chains.
- No layering detection: Funds split across multiple intermediaries and recombine. This fan-out/fan-in pattern requires multi-hop graph analysis.
- Excessive false positives: Rule-based systems generate 95%+ false positives because they lack context. A high-value transfer between long-time business partners is normal; the same amount to a newly created shell company is suspicious. Context requires the graph.
- Slow adaptation: Launderers adapt to rules. Graph patterns are harder to evade because the fundamental need to move and recombine funds creates structural signatures.
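The round-trip chain described above can be made concrete with a small sketch. This uses hypothetical account IDs and a plain-Python DFS (no graph library), purely to illustrate that the cycle, not any single transfer, is the signal:

```python
# Toy transfer graph: each edge is an individually unremarkable transfer.
transfers = {
    "A": ["B"],   # originator sends to shell B
    "B": ["C"],   # B layers through C
    "C": ["A"],   # funds return to A minus a fee -> a 3-hop cycle
    "D": ["E"],   # normal one-way payment
    "E": [],
}

def find_cycle_from(start, graph, max_hops=4):
    """DFS for a path that returns to `start` within max_hops transfers."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        for nxt in graph.get(node, []):
            if nxt == start and len(path) > 1:
                return path + [start]
            if nxt not in path and len(path) < max_hops:
                stack.append((nxt, path + [nxt]))
    return None

print(find_cycle_from("A", transfers))  # ['A', 'B', 'C', 'A']
print(find_cycle_from("D", transfers))  # None
```

No single edge in the cycle is anomalous; only the traversal exposes it, which is exactly the structural signature flat models cannot see.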
The relational schema
Node types:

```
Account (id, type, creation_date, country, kyc_level)
Entity (id, type, incorporation_date, industry)
Transaction (id, amount, currency, timestamp, channel)
```

Edge types:

```
Account --[owned_by]--> Entity
Account --[transfers_to]--> Account  (amount, timestamp)
Entity --[controls]--> Entity  (ownership_pct)
Transaction --[from]--> Account
Transaction --[to]--> Account
```

The account-entity-transaction graph captures ownership chains and money flows. Cycles in transfers_to edges signal potential laundering.
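One way to picture how this schema becomes model input: heterogeneous GNN libraries such as PyG key edge stores by (source_type, relation, destination_type) triples. A stdlib-only sketch with toy integer IDs (the relation triples mirror the edge types above; the IDs are invented):

```python
# Edge stores keyed by (source_type, relation, destination_type),
# mirroring the schema's edge types. IDs are toy examples.
edge_index_dict = {
    ("account", "transfers_to", "account"): [(0, 1), (1, 2), (2, 0)],  # a cycle
    ("account", "owned_by", "entity"):      [(0, 0), (1, 0), (2, 1)],
    ("entity",  "controls",  "entity"):     [(0, 1)],
}

def out_degree(edges, src):
    """Number of edges of one relation leaving node `src`."""
    return sum(1 for s, _ in edges if s == src)

# Account 0 makes one transfer and is owned by entity 0.
transfers = edge_index_dict[("account", "transfers_to", "account")]
print(out_degree(transfers, 0))  # 1
```

Keeping each relation in its own store is what lets the model learn different message functions for ownership edges versus money-flow edges.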
PyG architecture: GATConv for cycle-aware AML
```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GATConv, HeteroConv, Linear


class AMLGNN(torch.nn.Module):
    def __init__(self, hidden_dim=64, heads=4):
        super().__init__()
        # Lazy input layers: in_channels=-1 is inferred on first forward pass
        self.account_lin = Linear(-1, hidden_dim)
        self.entity_lin = Linear(-1, hidden_dim)

        # 3 layers for cycle detection (3-hop paths)
        self.convs = torch.nn.ModuleList()
        for _ in range(3):
            conv = HeteroConv({
                ('account', 'transfers_to', 'account'): GATConv(
                    hidden_dim, hidden_dim // heads, heads=heads),
                # Bipartite edge type: self-loops are undefined between
                # distinct node types, so disable them.
                ('account', 'owned_by', 'entity'): GATConv(
                    (hidden_dim, hidden_dim), hidden_dim // heads,
                    heads=heads, add_self_loops=False),
                ('entity', 'controls', 'entity'): GATConv(
                    hidden_dim, hidden_dim // heads, heads=heads),
            }, aggr='sum')
            self.convs.append(conv)

        self.classifier = torch.nn.Sequential(
            Linear(hidden_dim, 32),
            torch.nn.ReLU(),
            Linear(32, 1),
        )

    def forward(self, x_dict, edge_index_dict):
        x_dict['account'] = self.account_lin(x_dict['account'])
        x_dict['entity'] = self.entity_lin(x_dict['entity'])
        for conv in self.convs:
            x_dict = {k: F.elu(v)
                      for k, v in conv(x_dict, edge_index_dict).items()}
        return torch.sigmoid(
            self.classifier(x_dict['account']).squeeze(-1))
```

The 3-layer GATConv stack enables cycle detection: each account sees its 3-hop neighborhood, including paths that circle back to itself. Attention weights identify which transaction paths are most suspicious.
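The link between layer count and cycle length can be checked directly: each message-passing layer propagates information one hop, and an account sits on a directed 3-cycle exactly when the cube of the adjacency matrix has a nonzero diagonal entry for it. A stdlib-only sketch on a toy 4-account graph (no PyG required):

```python
def matmul(A, B):
    """Multiply two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Directed transfers among 4 accounts: 0->1->2->0 is a cycle, 3 is a sink.
A = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [1, 0, 0, 1],
     [0, 0, 0, 0]]

A3 = matmul(matmul(A, A), A)
# A3[i][i] counts directed 3-hop walks from i back to i, i.e. 3-cycles
# through i -- exactly the round trips a 3-layer model's receptive
# field can cover.
on_cycle = [i for i in range(4) if A3[i][i] > 0]
print(on_cycle)  # [0, 1, 2]
```

This is also why a 2-layer model would miss the pattern: A squared has a zero diagonal here, so no account's 2-hop neighborhood contains its own return path.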
Training considerations
- Label scarcity: Confirmed laundering cases are rare and often discovered months later. Use suspicious activity reports (SARs) as weak labels and supplement with synthetic laundering patterns.
- Temporal ordering: Transaction order matters. Use temporal edge features and ensure the model only sees transactions before the prediction timestamp.
- Graph snapshots: Build daily or weekly graph snapshots. Laundering patterns evolve, and the model should see the graph at the time of each prediction.
- False positive optimization: Optimize for precision at high recall thresholds. Compliance teams need manageable alert volumes, not maximum recall.
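The last point can be sketched as a small evaluation helper: sweep the score threshold down until a target recall is reached, then report precision there, which is the number compliance teams experience as alert quality. The scores and labels below are invented for illustration:

```python
def precision_at_recall(scores, labels, target_recall):
    """Precision at the highest threshold achieving >= target_recall."""
    pairs = sorted(zip(scores, labels), reverse=True)  # highest score first
    total_pos = sum(labels)
    tp = fp = 0
    for score, label in pairs:
        tp += label
        fp += 1 - label
        if tp / total_pos >= target_recall:
            return tp / (tp + fp)
    return 0.0

scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels = [1,    0,    1,    0,    0,    1,    0,    0]  # 3 confirmed cases
print(precision_at_recall(scores, labels, 0.66))  # 2 of top 3 alerts are true
```

Optimizing this quantity directly, rather than raw recall, keeps the alert queue reviewable.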
Expected performance
- Rule-based system: ~40 AUROC (high recall, 95%+ false positive rate)
- LightGBM (flat-table): 62.44 AUROC
- GNN (3-layer GATConv): 75.83 AUROC
- KumoRFM (zero-shot): 76.71 AUROC
Or use KumoRFM in one line
```
PREDICT is_suspicious FOR account
USING account, entity, transaction
```

One PQL query. KumoRFM constructs the temporal transaction graph, detects cycle and layering patterns automatically, and outputs suspicion scores per account.
KumoRFM replaces graph construction, cycle-aware architecture design, and training with a single query. It achieves 76.71 AUROC zero-shot while providing the prediction explanations needed for SAR filing and regulatory review.