Graph isomorphism is the foundational concept that defines what GNNs can and cannot distinguish. Two graphs are isomorphic when they have identical structure, meaning there is a node-to-node mapping that preserves all edges. If a GNN produces the same output for two non-isomorphic graphs, it has lost structural information. The Weisfeiler-Leman test defines the theoretical boundary of what standard message passing can distinguish.
The isomorphism problem
Given two graphs, are they isomorphic? This sounds simple for small graphs but is surprisingly hard: there is no known polynomial-time algorithm for general graphs. For GNNs, the question becomes: can this architecture produce different outputs for two non-isomorphic graphs? If not, the model cannot leverage the structural difference for prediction.
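To make the cost concrete, here is a brute-force check for tiny graphs (`are_isomorphic` is an illustrative helper, not a practical algorithm): it tries all n! node relabelings, which is exactly the search a polynomial-time algorithm would have to avoid.

```python
from itertools import permutations

def are_isomorphic(n, edges_a, edges_b):
    """Brute-force isomorphism test for two undirected graphs on n nodes.

    Tries every one of the n! node permutations, so it is only
    feasible for tiny graphs.
    """
    ea = {frozenset(e) for e in edges_a}
    eb = {frozenset(e) for e in edges_b}
    if len(ea) != len(eb):
        return False
    return any({frozenset((p[u], p[v])) for u, v in ea} == eb
               for p in permutations(range(n)))

# A 4-cycle vs. a triangle with a pendant node: same node and edge
# counts, but no relabeling maps one onto the other.
c4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
tri_pendant = [(0, 1), (1, 2), (2, 0), (0, 3)]
```

Edges are stored as frozensets so that undirected edges compare equal regardless of endpoint order.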
The Weisfeiler-Leman test
The 1-WL test is an iterative color refinement algorithm:
- Initialize: assign every node the same color (or use its degree as the initial color)
- Iterate: each node's new color = hash(own color, sorted multiset of neighbors' colors)
- Repeat until the partition of nodes into color classes stabilizes
- Compare: if the multisets of final colors differ between two graphs, they are definitely non-isomorphic
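The steps above can be sketched in a few lines of Python; `wl_colors` and the degree-based initialization are illustrative choices, and Python's built-in `hash` stands in for an injective recoloring function:

```python
from collections import Counter

def wl_colors(adj):
    """1-WL color refinement. adj maps each node to its neighbor list.

    Returns the multiset of stable colors; differing multisets prove
    two graphs non-isomorphic, identical multisets prove nothing.
    """
    # Initialize: use each node's degree as its starting color.
    colors = {v: len(nbrs) for v, nbrs in adj.items()}
    while True:
        # Iterate: new color = hash(own color, sorted neighbor colors).
        colors_new = {v: hash((colors[v], tuple(sorted(colors[u] for u in adj[v]))))
                      for v in adj}
        # Repeat until the number of color classes stops growing.
        if len(set(colors_new.values())) == len(set(colors.values())):
            return Counter(colors_new.values())
        colors = colors_new

# Compare: a 4-node path and a 4-node star get different color multisets.
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star4 = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
```

The stabilization check relies on refinement only ever splitting color classes: once the class count stops growing, the partition is fixed.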
If the 1-WL test declares two graphs “possibly isomorphic,” they may still be non-isomorphic: differing color multisets are definitive proof of non-isomorphism, but identical multisets prove nothing. The test is one-sided, and it fails to distinguish many non-isomorphic pairs, notably regular graphs, where every node has the same degree and refinement never produces new color classes.
GIN: achieving the ceiling
Most GNN layers do not reach the 1-WL ceiling. GCN uses degree-normalized, mean-like aggregation, which discards the size of the neighbor multiset: it cannot distinguish a node with 3 neighbors that all have feature 1 from a node with a single neighbor with feature 1. GraphSAGE with max aggregation discards even more, keeping only the set of distinct values.
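That lossiness is easy to see numerically; a minimal sketch with NumPy arrays standing in for neighbor feature multisets:

```python
import numpy as np

# Neighbor feature multisets for two different nodes.
three_ones = np.array([1.0, 1.0, 1.0])  # three neighbors, each with feature 1
one_one = np.array([1.0])               # a single neighbor with feature 1

# Mean aggregation collapses both multisets to the same message...
mean_a, mean_b = three_ones.mean(), one_one.mean()
# ...while sum aggregation keeps the multiplicity.
sum_a, sum_b = three_ones.sum(), one_one.sum()
```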
GIN (Graph Isomorphism Network) is designed to match the 1-WL ceiling:
- Sum aggregation: preserves the full multiset of neighbor features, including multiplicity. Among the standard aggregators (sum, mean, max), sum is the only one that is injective over multisets.
- MLP update: an MLP (not a linear layer) ensures the update function is injective. This preserves the distinction between different aggregated messages.
- Epsilon parameter: a learnable weight on the self-loop ensures the node's own features can be weighted differently from the aggregated neighbors.
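The three ingredients combine into a single update, h'_v = MLP((1 + ε) · h_v + Σ_{u ∈ N(v)} h_u). A one-layer sketch in NumPy, assuming a dense 0/1 adjacency matrix and a 2-layer ReLU MLP (all names and shapes illustrative):

```python
import numpy as np

def gin_layer(h, adj, w1, b1, w2, b2, eps=0.0):
    """One GIN layer: h'_v = MLP((1 + eps) * h_v + sum of neighbor features).

    h: (N, d) node features; adj: (N, N) dense 0/1 adjacency matrix.
    w1, b1, w2, b2: parameters of a 2-layer MLP with ReLU; eps: the
    self-loop weight (a learnable scalar in the real model, a float here).
    """
    # Sum aggregation keeps the full neighbor multiset, including multiplicity.
    agg = (1.0 + eps) * h + adj @ h
    # An MLP (not a single linear map) keeps distinct aggregates distinct.
    hidden = np.maximum(agg @ w1 + b1, 0.0)  # ReLU
    return hidden @ w2 + b2

# Tiny 4-node path graph with random features and weights.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
h = rng.normal(size=(4, 3))
w1, b1 = rng.normal(size=(3, 8)), np.zeros(8)
w2, b2 = rng.normal(size=(8, 3)), np.zeros(3)
out = gin_layer(h, adj, w1, b1, w2, b2, eps=0.1)
```

Stacking several such layers and summing the node embeddings into a graph readout gives the full architecture.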
Why expressiveness matters in practice
Consider two molecular graphs (carbon skeletons only) that are clearly non-isomorphic:
- Molecule A: cyclohexane (a single 6-membered ring)
- Molecule B: a pair of cyclopropanes (two separate 3-membered rings)
Both have six atoms and six bonds, and both are 2-regular (every atom has degree 2), so 1-WL color refinement never splits the nodes into distinct classes. The 1-WL test, and therefore any standard message-passing GNN, cannot distinguish them: a GCN would assign both the same graph embedding, even though their chemical properties are completely different.
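This failure can be checked directly; a compact sketch, running hashed color refinement on the two carbon skeletons (a single 6-cycle versus two disjoint 3-cycles, so that both graphs are 2-regular):

```python
from collections import Counter

def wl_multiset(adj, rounds=3):
    """A few rounds of 1-WL refinement; returns the multiset of colors."""
    colors = {v: len(nbrs) for v, nbrs in adj.items()}
    for _ in range(rounds):
        colors = {v: hash((colors[v], tuple(sorted(colors[u] for u in adj[v]))))
                  for v in adj}
    return Counter(colors.values())

# Cyclohexane's carbon skeleton: a single 6-cycle.
ring6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
# A pair of cyclopropanes: two disjoint 3-cycles.
two_rings3 = {0: [1, 2], 1: [0, 2], 2: [0, 1],
              3: [4, 5], 4: [3, 5], 5: [3, 4]}

# Every node in both graphs has degree 2, so refinement never creates
# a new color: the final multisets are identical.
```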
For drug discovery, this indistinguishability is a real limitation. Graph transformers and higher-order GNNs overcome it.
Beyond 1-WL: higher-order expressiveness
Several approaches exceed the 1-WL ceiling:
- k-WL and k-FWL: operate on k-tuples of nodes instead of individual nodes. Strictly more expressive for larger k, but the O(N^k) time and memory cost grows exponentially with k.
- Graph transformers: global attention can capture structural patterns that local message passing misses. Structural positional encodings (Laplacian eigenvectors) provide information beyond what 1-WL computes.
- Subgraph GNNs: process collections of subgraphs around each node, providing additional structural information.
- Random features: adding random node features breaks symmetries that 1-WL preserves, increasing expressiveness at the cost of determinism.
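As a concrete instance of positional encodings carrying beyond-1-WL information, the Laplacian spectra of the single 6-ring and the pair of 3-rings differ even though 1-WL cannot separate them. A sketch (`laplacian_spectrum` and `cycle` are illustrative helpers):

```python
import numpy as np

def laplacian_spectrum(adj):
    """Sorted eigenvalues of the combinatorial Laplacian L = D - A."""
    lap = np.diag(adj.sum(axis=1)) - adj
    return np.sort(np.linalg.eigvalsh(lap))

def cycle(n):
    """Dense adjacency matrix of an n-cycle."""
    a = np.zeros((n, n))
    for i in range(n):
        a[i, (i + 1) % n] = a[(i + 1) % n, i] = 1.0
    return a

# One 6-cycle vs. two disjoint 3-cycles: both 2-regular, so 1-WL sees
# them as identical, but their Laplacian spectra differ.
c6 = cycle(6)
two_c3 = np.zeros((6, 6))
two_c3[:3, :3] = cycle(3)
two_c3[3:, 3:] = cycle(3)
```

Feeding a few of the corresponding eigenvectors to each node as extra input features is exactly what Laplacian positional encodings do.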