
Fraud Detection with Graphs: How Graph Structure Reveals Fraud Rings

Tabular fraud models score each transaction independently. Graph models see the network: shared devices, common beneficiaries, coordinated timing. That structural context is exactly what fraud rings try to hide.


TL;DR

  • Fraud rings are invisible to tabular models because each row looks normal in isolation. Graph structure reveals coordinated behavior: shared devices, overlapping beneficiaries, and synchronized timing across accounts.
  • A fraud detection graph has accounts, devices, IPs, merchants, and transactions as nodes. Edges represent relationships (uses, sends-to, transacts-at). GNNs propagate information across these connections.
  • After 2-3 layers of message passing, each account embedding encodes its full transactional neighborhood: who it transacts with, what devices those counterparties use, and whether those devices appear in known fraud clusters.
  • GraphSAGE with neighbor sampling is the production standard for fraud graphs with millions of nodes. GAT adds interpretability by showing which edges contributed most to a fraud score.
  • Graph-based fraud detection improves AUROC by 5-15 points over tabular baselines. The biggest gains come from detecting coordinated fraud that no single-row model can see.

Graph neural networks detect fraud that tabular models structurally cannot see. A gradient-boosted tree scores each transaction or account as an independent row. It can learn that high-amount transactions at 3 AM are risky. But it cannot learn that five accounts sharing the same device fingerprint all sent money to the same beneficiary within 10 minutes. That pattern exists in the connections between rows, not in any single row.

When you represent transactions, accounts, devices, and merchants as a graph, fraud rings become visible structural patterns: dense clusters of nodes with unusual connectivity. GNNs learn to recognize these patterns automatically.

Why tabular models fail on coordinated fraud

Consider a money laundering ring with 8 accounts. Each account individually looks normal: moderate balances, reasonable transaction amounts, accounts aged 6+ months. A tabular model scores each account at low risk.

But in the graph, the pattern is obvious. These 8 accounts form a near-complete subgraph: they transact almost exclusively with each other, share 2 device fingerprints, and all received their initial deposits from the same source account within 48 hours. The graph structure screams anomaly. The tabular features whisper normalcy.
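The contrast can be made concrete with graph density. A toy networkx sketch, with an invented 8-account clique embedded in sparse normal activity (all accounts and edges hypothetical):

```python
import networkx as nx

# Toy graph: 8 ring accounts transacting almost exclusively with each other,
# plus a sparse background of normal accounts (all data is made up)
G = nx.Graph()
ring = list(range(8))
G.add_edges_from((i, j) for i in ring for j in ring if i < j)  # near-complete clique
G.add_edges_from([(8, 9), (9, 10), (10, 11), (8, 12)])          # ordinary sparse activity

# Density = fraction of possible edges that actually exist
ring_density = nx.density(G.subgraph(ring))
background_density = nx.density(G.subgraph([8, 9, 10, 11, 12]))
print(ring_density, background_density)  # 1.0 vs 0.4
```

The ring's induced subgraph is maximally dense while the background is sparse, and no per-account feature reflects that difference.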

Building the fraud graph

A production fraud graph is heterogeneous, meaning it has multiple node types and edge types:

  • Node types: Account, Device, IP Address, Merchant, Transaction
  • Edge types: account-uses-device, account-has-IP, account-sends-to-account, transaction-at-merchant, account-initiates-transaction
  • Node features: Account (age, balance, country), Device (OS, fingerprint hash), Transaction (amount, timestamp, channel)
  • Edge features: Timestamp, amount, frequency of connection

Why heterogeneous graphs matter

Different entity types carry different fraud signals. A device shared by 50 accounts is suspicious. A merchant with 90% chargebacks is suspicious. An IP address used from two countries simultaneously is suspicious. Heterogeneous GNNs (using message passing with type-specific transformations) learn separate patterns for each entity type while allowing information to flow across types.

How GNNs detect fraud rings

The detection mechanism is message passing across the transaction graph:

Layer 1: direct connections

Each account node aggregates features from its direct neighbors: its devices, its transactions, its counterparties. After layer 1, the account embedding encodes: “I use 2 devices, made 47 transactions this month, and transact with 12 unique counterparties.”

Layer 2: two-hop neighborhood

Now each account sees its counterparties' neighborhoods. The embedding encodes: “My counterparties share 3 devices with each other, 6 of my 12 counterparties also transact with the same 2 merchant accounts, and 4 of them received initial funding from the same source.” This is the fraud ring signal. It is invisible at one hop.
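The two-hop effect can be seen in a pure-PyTorch toy: mean aggregation over an adjacency matrix mixes cluster members' information together after two layers (graph and features below are made up):

```python
import torch

# Toy graph of 5 accounts: 0-2 form a dense cluster, 3-4 are a separate pair
A = torch.tensor([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
], dtype=torch.float)

# Row-normalize so each "layer" takes the mean of neighbor features
A_norm = A / A.sum(dim=1, keepdim=True)

x = torch.eye(5)     # one-hot features: each account starts knowing only itself

h1 = A_norm @ x      # layer 1: each account sees its direct neighbors
h2 = A_norm @ h1     # layer 2: two-hop information arrives

# Account 0's embedding now carries mass from the whole 0-2 cluster,
# and none at all from the disconnected 3-4 pair
print(h2[0])
```

Real GNN layers add learned weight matrices and nonlinearities around this aggregation, but the locality argument is the same: the cluster signal only appears once two hops are mixed.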

fraud_detection_gnn.py
import torch
from torch_geometric.nn import SAGEConv, to_hetero
from torch_geometric.data import HeteroData

class FraudGNN(torch.nn.Module):
    def __init__(self, hidden_channels):
        super().__init__()
        # Lazy (-1, -1) input sizes adapt to each node type's feature width
        self.conv1 = SAGEConv((-1, -1), hidden_channels)
        self.conv2 = SAGEConv((-1, -1), hidden_channels)
        self.classifier = torch.nn.Linear(hidden_channels, 1)

    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index).relu()
        x = self.conv2(x, edge_index)
        return self.classifier(x)  # per-node fraud logit

# Convert to a heterogeneous model automatically;
# `data` is the HeteroData fraud graph built elsewhere
model = FraudGNN(hidden_channels=64)
model = to_hetero(model, data.metadata(), aggr='sum')

PyG's to_hetero() converts a homogeneous GNN into a heterogeneous one that handles multiple node and edge types automatically.

Production considerations

Deploying graph-based fraud detection at scale requires solving three challenges:

  • Scale: Production fraud graphs have hundreds of millions of nodes. GraphSAGE with mini-batching and neighbor sampling trains on subgraphs, not the full graph.
  • Latency: Real-time scoring requires sub-100ms inference. Pre-compute neighbor embeddings and only run the final layers on the local subgraph at inference time.
  • Temporal integrity: The model must not use future information. Edges must be filtered by timestamp so that at prediction time, only past transactions are visible. This is where temporal heterogeneous graphs become essential.
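In PyG, mini-batch neighbor sampling is provided by loaders such as NeighborLoader; the core fan-out idea can be sketched in plain PyTorch (function name and fan-out below are hypothetical):

```python
import torch

def sample_neighbors(edge_index, seeds, fanout):
    """Uniformly sample up to `fanout` incoming neighbors per seed node (toy sketch)."""
    src, dst = edge_index
    out_src, out_dst = [], []
    for s in seeds.tolist():
        neigh = src[dst == s]                       # all in-neighbors of s
        if neigh.numel() > fanout:                  # cap the fan-out
            neigh = neigh[torch.randperm(neigh.numel())[:fanout]]
        out_src.append(neigh)
        out_dst.append(torch.full((neigh.numel(),), s, dtype=torch.long))
    return torch.stack([torch.cat(out_src), torch.cat(out_dst)])

# Toy graph: node 0 has 6 in-neighbors; keep at most 3 of them
edge_index = torch.tensor([[1, 2, 3, 4, 5, 6, 2], [0, 0, 0, 0, 0, 0, 1]])
sampled = sample_neighbors(edge_index, seeds=torch.tensor([0]), fanout=3)
print(sampled.shape)  # torch.Size([2, 3])
```

Applying this recursively per layer is what bounds memory: each mini-batch touches a fixed-size subgraph regardless of how large the full fraud graph is.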

Graph signals tabular models cannot capture

The following patterns exist only in graph structure:

  • Ring topology: A circular flow of funds (A sends to B, B to C, ..., Z back to A) that launders money through apparent diversity of counterparties.
  • Device sharing clusters: Multiple “independent” accounts controlled from the same small set of devices.
  • Rapid fan-out: A single deposit split and forwarded through many intermediate accounts before consolidation. The graph shows the tree structure; tabular data shows individual transfers.
  • Behavioral mimicry with structural anomaly: Each account mimics normal behavior (amounts, timing, frequency) but the subgraph connectivity pattern is abnormal.
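Ring topology, for instance, surfaces as a directed cycle in the transfer graph. A networkx sketch with an invented 5-account ring and some ordinary one-way transfers:

```python
import networkx as nx

# Toy directed transfer graph: accounts A..E form a laundering cycle,
# F and G make ordinary one-way transfers (all data is made up)
G = nx.DiGraph()
ring = ['A', 'B', 'C', 'D', 'E']
G.add_edges_from(zip(ring, ring[1:] + ring[:1]))  # A->B->C->D->E->A
G.add_edges_from([('F', 'G'), ('G', 'A')])

# Circular fund flows appear as directed cycles
cycles = list(nx.simple_cycles(G))
print(cycles)  # only the 5-account ring
```

Explicit cycle enumeration does not scale to production graphs, which is precisely why learned structural detectors like GNNs are used instead, but it illustrates what the structure encodes.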

Frequently asked questions

Why do tabular models miss fraud rings?

Tabular models see each transaction or account as an independent row. They cannot see that five accounts share the same device fingerprint, IP address, and beneficiary. A graph model connects these entities as nodes and edges, making the ring structure visible in a single forward pass.

How do you build a fraud detection graph?

Nodes represent entities: accounts, devices, IP addresses, merchants, and transactions. Edges represent relationships: account-uses-device, account-sends-to-account, transaction-at-merchant. Node features include account age, transaction velocity, and amount statistics. The resulting heterogeneous graph captures the full ecosystem of interactions.

What GNN layers work best for fraud detection?

GraphSAGE (SAGEConv) with neighbor sampling is the most common production choice because fraud graphs are massive and SAGEConv scales via mini-batching. GAT (GATConv) is used when you need interpretable attention weights showing which connections contributed most to a fraud score. GIN (GINConv) maximizes expressiveness for distinguishing subtle structural differences between fraud and legitimate patterns.

How does graph-based fraud detection handle real-time scoring?

At inference time, when a new transaction arrives, you extract its local subgraph (the account, its recent transactions, connected devices, counterparties) and run a forward pass through the trained GNN. This takes milliseconds. The key is precomputing neighbor embeddings and only updating the local neighborhood, not the entire graph.

What accuracy improvements do graphs provide over tabular models?

In published benchmarks, graph-based fraud detection typically improves AUROC by 5-15 percentage points over gradient-boosted trees on the same data. On the RelBench fraud-related tasks, GNN-based models achieve 75+ AUROC versus 62 for flat-table LightGBM. The improvement is largest when fraud is coordinated across multiple entities.

Learn more about graph ML

PyTorch Geometric is the open-source foundation for graph neural networks. Explore more layers, concepts, and production patterns.