
GNN Expressiveness: What Graph Structures Can a GNN Distinguish?

Expressiveness is the theoretical ceiling on what a GNN can learn. Standard message passing GNNs are bounded by the 1-Weisfeiler-Leman test. Understanding this bound tells you exactly what your GNN can and cannot capture.


TL;DR

  • Expressiveness measures which non-isomorphic graphs a GNN can distinguish. Higher expressiveness means the model can capture more structural patterns.
  • Standard message passing GNNs are upper-bounded by the 1-WL graph isomorphism test. Some structurally different graphs (e.g., a 6-cycle vs. two triangles) produce identical GNN outputs.
  • GINConv achieves maximum expressiveness among standard MP-GNNs by using sum aggregation and an MLP update. GCNConv and SAGEConv are strictly less expressive due to mean/max aggregation.
  • For enterprise data, expressiveness determines whether the model can distinguish fraud rings from star patterns, or tight customer clusters from loose communities.
  • To go beyond 1-WL: use positional encodings, structural encodings, subgraph GNNs, or graph transformers. KumoRFM's graph transformer architecture exceeds 1-WL expressiveness.

GNN expressiveness defines the set of non-isomorphic graph structures that a graph neural network can assign different representations to. A maximally expressive GNN would assign different outputs to every pair of structurally different graphs. Standard message passing GNNs fall short of this goal: they are provably bounded by the 1-Weisfeiler-Leman (1-WL) graph isomorphism test, meaning certain graphs that a human can easily distinguish produce identical GNN outputs.

Why it matters for enterprise data

Enterprise relational databases contain structural patterns that carry predictive signal. Fraud rings form cycles in transaction graphs. Legitimate accounts form star patterns around merchants. Tightly connected customer clusters behave differently from loosely linked communities.

If the GNN cannot distinguish these structures, it cannot learn the patterns. A model with 1-WL expressiveness might confuse a fraud ring (6 accounts in a cycle) with two legitimate triangles (6 accounts in two separate groups of 3). Higher expressiveness directly translates to the ability to capture more nuanced relational patterns.

The 1-WL expressiveness bound

The Weisfeiler-Leman test works by iteratively coloring nodes based on their neighborhood:

  1. Assign all nodes the same initial color
  2. For each node, collect the multiset of its neighbors' colors
  3. Hash (node's color, neighbor multiset) to produce a new color
  4. Repeat until colors stabilize

If two graphs end up with different color histograms, they are distinguishable. If they end up with the same histogram, 1-WL cannot tell them apart.
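The refinement loop is compact enough to sketch directly. Below is a minimal, self-contained implementation; the function name wl_histogram and the adjacency-dict input format are illustrative choices, not a library API. It keeps the full signature as the color so that labels stay comparable across graphs, and it distinguishes a 3-node path from a triangle:

wl_refinement.py
from collections import Counter

def wl_histogram(adj, rounds=3):
    # adj: dict mapping each node to a list of its neighbors
    colors = {v: 0 for v in adj}  # step 1: every node gets the same initial color
    for _ in range(rounds):
        # steps 2-3: new color = (own color, sorted multiset of neighbor colors);
        # using the signature itself as the color keeps labels consistent across graphs
        colors = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
                  for v in adj}
    return Counter(colors.values())  # step 4 (truncated): histogram after `rounds` iterations

path = {0: [1], 1: [0, 2], 2: [1]}            # 3-node path: degrees 1, 2, 1
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}  # 3-node triangle: degrees 2, 2, 2
print(wl_histogram(path) == wl_histogram(triangle))  # False -> distinguishable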

Message passing GNNs follow exactly the same logic: aggregate neighbor features (the multiset), combine with self features, and transform. A message passing GNN can distinguish two graphs only if 1-WL can; a maximally expressive one, such as GIN, distinguishes exactly the graphs that 1-WL distinguishes.

Concrete example: which structures are indistinguishable?

Two well-known graph pairs that 1-WL (and standard GNNs) cannot distinguish:

  • 6-cycle vs. two triangles: Six nodes forming one ring vs. six nodes forming two separate triangles. Both have 6 nodes, 6 edges, and every node has degree 2. 1-WL assigns the same colors to both. A GNN produces identical graph-level representations.
  • Decalin vs. bicyclopentyl: Two molecular graphs with the same node degrees and local structure but different global topology. Relevant for molecular property prediction.
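To see the failure case end to end, here is a small sketch in PyTorch Geometric (the graph construction, constant features, and pooling choices are ours for illustration): a two-layer GIN with identical node features produces identical graph embeddings for the 6-cycle and the two triangles.

cycle_vs_triangles.py
import torch
from torch.nn import Sequential, Linear, ReLU
from torch_geometric.nn import GINConv, global_add_pool

# 6-cycle over nodes 0..5 (edges stored in both directions)
cycle = torch.tensor([[0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 0],
                      [1, 0, 2, 1, 3, 2, 4, 3, 5, 4, 0, 5]])
# Two triangles: nodes 0..2 and nodes 3..5
triangles = torch.tensor([[0, 1, 1, 2, 2, 0, 3, 4, 4, 5, 5, 3],
                          [1, 0, 2, 1, 0, 2, 4, 3, 5, 4, 3, 5]])

x = torch.ones(6, 8)                      # identical constant node features
batch = torch.zeros(6, dtype=torch.long)  # all six nodes form one graph
conv = GINConv(Sequential(Linear(8, 16), ReLU(), Linear(16, 16)))

def graph_embedding(edge_index):
    h = conv(x, edge_index)   # every node aggregates two identical neighbors...
    h = conv(h, edge_index)   # ...so node states stay identical in both graphs
    return global_add_pool(h, batch)

print(torch.allclose(graph_embedding(cycle), graph_embedding(triangles)))  # True

Even the maximally expressive GIN cannot separate the pair, because every node in both graphs receives the same color under 1-WL.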

Expressiveness hierarchy of GNN layers

expressiveness_comparison.py
import torch
from torch_geometric.nn import GCNConv, SAGEConv, GINConv
from torch.nn import Sequential, Linear, ReLU

# LEAST EXPRESSIVE: GCNConv (degree-normalized mean aggregation)
# - Cannot distinguish multisets with same mean
# - {1, 1, 1} and {0, 1, 2} share the same mean (1.0) -> indistinguishable
# - Normalized aggregation loses information about multiset composition
gcn = GCNConv(16, 32)

# MODERATE: SAGEConv (mean or max aggregation + concat)
# - Slightly more expressive than GCN due to separate self/neighbor transforms
sage = SAGEConv(16, 32, aggr='mean')

# MOST EXPRESSIVE (within 1-WL): GINConv (sum + MLP)
# - Sum aggregation preserves multiset composition
# - MLP provides injective mapping (key for expressiveness)
# - Matches 1-WL exactly
gin_nn = Sequential(Linear(16, 32), ReLU(), Linear(32, 32))
gin = GINConv(gin_nn)

# BEYOND 1-WL: requires structural/positional encodings
# or graph transformers

GINConv is the only standard message passing layer that matches the 1-WL expressiveness bound. GCN and SAGE are strictly less expressive.
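The aggregation difference is easy to see on raw tensors. A minimal illustration (the feature values are arbitrary): mean aggregation cannot tell one neighbor from two identical neighbors, while sum can.

aggregation_demo.py
import torch

one = torch.tensor([[1., 2.]])            # neighbor multiset {x}
two = torch.tensor([[1., 2.], [1., 2.]])  # neighbor multiset {x, x}

print(one.mean(dim=0), two.mean(dim=0))  # tensor([1., 2.]) both times: mean drops multiplicity
print(one.sum(dim=0), two.sum(dim=0))    # tensor([1., 2.]) vs. tensor([2., 4.]): sum keeps it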

Going beyond 1-WL

Three practical approaches to exceed the 1-WL expressiveness bound:

  • Positional encodings: Add Laplacian eigenvectors or random walk features to node features before message passing. This gives nodes identifiers based on their position in the graph (see the sketch after this list).
  • Structural encodings: Encode local topology (triangle counts, cycle memberships, subgraph patterns) as additional node features.
  • Graph transformers: Global attention over all nodes with positional encodings provides expressiveness beyond 1-WL while maintaining practical scalability.
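For the first approach, PyG ships ready-made transforms such as T.AddRandomWalkPE and T.AddLaplacianEigenvectorPE. A minimal sketch, where the 6-cycle and the attribute name 'pe' are illustrative choices:

positional_encoding_demo.py
import torch
import torch_geometric.transforms as T
from torch_geometric.data import Data

# 6-cycle with constant features -- featureless from 1-WL's point of view
edge_index = torch.tensor([[0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 0],
                           [1, 0, 2, 1, 3, 2, 4, 3, 5, 4, 0, 5]])
data = Data(x=torch.ones(6, 1), edge_index=edge_index)

# Attach 4-step random walk return probabilities as extra node features
data = T.AddRandomWalkPE(walk_length=4, attr_name='pe')(data)
data.x = torch.cat([data.x, data.pe], dim=-1)  # nodes now carry structural signal
print(data.x.shape)  # torch.Size([6, 5])

On the 6-cycle vs. two-triangles pair, these features alone break the tie: a length-3 walk can return to its start inside a triangle but never on a 6-cycle.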

Limitations and what comes next

  1. Higher expressiveness is not always better: More expressive models have larger hypothesis spaces, which can lead to overfitting on small datasets. GCNConv outperforms GINConv on some tasks because its inductive bias (degree normalization) is a better fit for the data.
  2. k-WL GNNs are expensive: Higher-order GNNs that match k-WL for k > 1 operate on tuples of nodes, leading to O(n^k) complexity. Not practical for large enterprise graphs.
  3. Practical vs. theoretical: The 1-WL limitation is worst-case. On many real enterprise tasks, the structures that 1-WL cannot distinguish rarely occur, and 1-WL-bounded GNNs perform well.

KumoRFM's Relational Graph Transformer uses structural encodings combined with global attention, achieving expressiveness beyond 1-WL while remaining scalable to enterprise-scale relational databases.

Frequently asked questions

What is GNN expressiveness?

GNN expressiveness refers to the ability of a graph neural network to distinguish between non-isomorphic graphs (or assign different representations to structurally different nodes). A more expressive GNN can tell apart graph structures that a less expressive one treats as identical. Standard message passing GNNs are upper-bounded by the 1-WL (Weisfeiler-Leman) graph isomorphism test.

What is the 1-WL expressiveness bound?

The 1-Weisfeiler-Leman (1-WL) test is a classical algorithm for testing graph isomorphism. It iteratively colors nodes based on their neighborhood multisets. Standard message passing GNNs can distinguish exactly the same set of graphs as 1-WL. This means there exist non-isomorphic graphs that look structurally different to a human but produce identical GNN outputs. For example, a 6-node cycle and two 3-node triangles can be indistinguishable to 1-WL-bounded GNNs.

Why does expressiveness matter for enterprise data?

In enterprise graphs, certain structural patterns carry important signals. A fraud ring (circular transaction pattern) looks different from a legitimate star pattern (one hub with many spokes). If the GNN cannot distinguish these structures, it cannot learn fraud-specific patterns. Higher expressiveness means the model can capture more nuanced structural features in relational data.

Which GNN layers are most expressive?

GINConv (Graph Isomorphism Network) achieves maximum expressiveness among standard message passing GNNs, matching the 1-WL test exactly. It uses sum aggregation and an injective update function (MLP). GCNConv and SAGEConv are strictly less expressive because mean/max aggregation loses information about multiset composition. To go beyond 1-WL, you need higher-order GNNs, subgraph GNNs, or graph transformers.

How do you go beyond 1-WL expressiveness?

Three approaches: (1) Higher-order GNNs that operate on k-tuples of nodes instead of individual nodes (k-WL). (2) Subgraph GNNs that extract node-specific subgraphs and process them independently. (3) Positional/structural encodings that inject additional structural information (random walk profiles, Laplacian eigenvectors) into node features before message passing.

Learn more about graph ML

PyTorch Geometric is the open-source foundation for graph neural networks. Explore more layers, concepts, and production patterns.