An equivariant GNN produces outputs that transform in a predictable, consistent way when the input undergoes a geometric transformation (rotation, translation, or reflection). If you rotate a molecule by 90 degrees, an equivariant GNN's predicted force vectors rotate by exactly 90 degrees, while its predicted energy (a scalar) stays the same. This symmetry is guaranteed by the architecture, not learned from data.
Invariance vs equivariance
Physical predictions have different symmetry requirements:
- Invariant (unchanged under transformation): energy, binding affinity, toxicity, solubility. Rotating a molecule does not change its energy.
- Equivariant (transforms consistently): forces, dipole moments, velocity fields. Rotating a molecule rotates its force vectors by the same amount.
A standard GNN that uses absolute 3D coordinates as features violates both requirements: the same molecule in a different orientation gets different predictions. Data augmentation (training on randomly rotated copies) helps, but it wastes model capacity learning a symmetry that the architecture could guarantee for free.
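A minimal sketch of the failure mode, using a fixed random linear readout on flattened absolute coordinates as a stand-in for a coordinate-based model (the functions and weights here are illustrative, not any particular architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a coordinate-based model: a fixed linear readout on
# flattened absolute positions of a 3-atom "molecule".
W = rng.normal(size=(1, 9))

def predict_from_coords(x):        # x: (3 atoms, 3) absolute positions
    return float(W @ x.reshape(-1))

def predict_from_distances(x):     # invariant featurization
    d = np.linalg.norm(x[:, None] - x[None, :], axis=-1)
    return float(d.sum())          # any function of distances is invariant

x = rng.normal(size=(3, 3))
theta = np.pi / 2                  # rotate the molecule by 90 degrees
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])

# The coordinate-based prediction changes under rotation...
coords_changed = abs(predict_from_coords(x) - predict_from_coords(x @ R.T)) > 1e-6
# ...while the distance-based prediction does not.
dist_unchanged = np.isclose(predict_from_distances(x),
                            predict_from_distances(x @ R.T))
```

The same molecule, merely reoriented, yields a different output from the coordinate-based readout, which is exactly what data augmentation tries (imperfectly) to train away.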
Building blocks of equivariant GNNs
Relative positions
The foundation of equivariance: use relative positions (r_ij = x_j - x_i) instead of absolute coordinates. Relative positions are equivariant: rotating all atoms rotates all relative positions consistently. Distances (||r_ij||) are invariant.
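These three properties (relative positions are translation-invariant and rotation-equivariant, distances are fully invariant) can be checked directly; a small numpy sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(5, 3))                  # absolute atom positions

# Any rotation works; here, a rotation about the z-axis.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])

r = x[None, :, :] - x[:, None, :]            # r_ij = x_j - x_i, shape (5, 5, 3)
x_rot = x @ R.T
r_rot = x_rot[None, :, :] - x_rot[:, None, :]

# Equivariance: rotating all atoms rotates every relative position.
equivariant = np.allclose(r_rot, r @ R.T)

# Invariance: pairwise distances are untouched by the rotation.
invariant = np.allclose(np.linalg.norm(r_rot, axis=-1),
                        np.linalg.norm(r, axis=-1))

# Translation invariance: shifting all atoms leaves r_ij unchanged.
t = rng.normal(size=3)
translation_ok = np.allclose((x + t)[None, :, :] - (x + t)[:, None, :], r)
```

Because r_ij already absorbs translations, an architecture built on relative positions only needs to handle rotations (and, if desired, reflections) explicitly.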
Scalar and vector channels
Equivariant GNNs maintain two types of features per node:
- Scalar features: Invariant under rotation (energy-like quantities). Standard neural network operations apply.
- Vector features: Equivariant under rotation (force-like quantities). Only equivariant operations (dot products, cross products, scaling) are allowed.
```python
import torch

# Equivariant message passing (simplified PaiNN-style).
# `mlp` and `psi` are learned MLPs; both act only on invariant inputs.
def equivariant_message(s_j, v_j, r_ij):
    """
    s_j:  scalar features of neighbor j (invariant)
    v_j:  vector features of neighbor j (equivariant)
    r_ij: relative position vector (equivariant)
    """
    d_ij = torch.norm(r_ij, dim=-1, keepdim=True)   # distance (invariant)
    r_hat = r_ij / (d_ij + 1e-8)                    # unit direction (equivariant)

    # Scalar message: gated by a function of the distance (invariant).
    phi = mlp(d_ij)                                 # radial filter
    s_msg = phi * s_j

    # Vector message: scale existing vectors and inject the direction r_hat,
    # both weighted by invariant quantities.
    v_msg = phi.unsqueeze(-1) * v_j + psi(s_j).unsqueeze(-1) * r_hat
    return s_msg, v_msg
```

The key constraint: vector features can only be combined through equivariant operations (scaling by invariants, vector addition, dot and cross products). MLPs operate only on the scalar channels.
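This constraint can be verified numerically. The sketch below rebuilds the vector-message construction in numpy with fixed random stand-ins for the learned networks (`phi`, `psi` here are arbitrary functions of invariants, not the real trained MLPs) and checks that rotating the inputs rotates the output:

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-ins for the learned networks: any function of invariant inputs
# preserves equivariance of the construction.
W_phi = rng.normal(size=(1, 4))
W_psi = rng.normal(size=(4, 4))
phi = lambda d: np.tanh(d @ W_phi)          # distance (1,) -> gates (4,)
psi = lambda s: np.tanh(s @ W_psi)          # scalars (4,) -> gates (4,)

def vector_message(s_j, v_j, r_ij):
    """v_j: (4, 3) vector features; r_ij: (3,) relative position."""
    d = np.linalg.norm(r_ij, keepdims=True)
    r_hat = r_ij / (d + 1e-8)
    return phi(d)[:, None] * v_j + psi(s_j)[:, None] * r_hat

s_j = rng.normal(size=4)
v_j = rng.normal(size=(4, 3))
r_ij = rng.normal(size=3)

theta = 1.1
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])

# Rotating the inputs (each vector feature and r_ij) rotates the output:
msg = vector_message(s_j, v_j, r_ij)
msg_rot = vector_message(s_j, v_j @ R.T, R @ r_ij)
equivariant = np.allclose(msg_rot, msg @ R.T)
```

The check passes because the distance (and hence `phi`, `psi`) is unchanged by the rotation, while `v_j` and `r_hat` both rotate, so the whole message rotates with them.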
Architecture progression
Each generation adds more geometric information:
- SchNet: Uses only pairwise distances. Invariant but cannot distinguish mirror images (chirality).
- DimeNet: Adds bond angles between triplets of atoms. Invariant with richer geometric context.
- PaiNN: Maintains vector features alongside scalars. Full SE(3) equivariance for force predictions.
- MACE: Uses higher-order tensors (spherical harmonics). Captures complex multi-body interactions with maximum expressiveness.
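SchNet's chirality limitation is easy to demonstrate with a toy configuration: a molecule and its mirror image share every pairwise distance, so any distance-only model assigns them identical predictions, while a signed volume (which an equivariant model can represent) tells them apart. A sketch with four hypothetical atom positions:

```python
import numpy as np

# Four points in a chiral arrangement, and their reflection through
# the xy-plane (positions are illustrative, not a real molecule).
x = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.3, 0.3, 1.0]])
x_mirror = x * np.array([1.0, 1.0, -1.0])

# Reflection is an isometry: all pairwise distances agree, so a
# distance-only (SchNet-style) model cannot separate the two.
d = np.linalg.norm(x[:, None] - x[None, :], axis=-1)
d_mirror = np.linalg.norm(x_mirror[:, None] - x_mirror[None, :], axis=-1)
same_distances = np.allclose(d, d_mirror)

# A signed volume (determinant of edge vectors from atom 0) flips sign
# under reflection, distinguishing the two enantiomers.
signed_volume = lambda p: np.linalg.det(p[1:] - p[0])
opposite_handedness = np.isclose(signed_volume(x), -signed_volume(x_mirror))
```

Angle-based models such as DimeNet recover some of this context, and vector- or tensor-valued features (PaiNN, MACE) carry the orientation information directly.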
Results on molecular benchmarks
On MD17 (molecular dynamics) and QM9 (quantum chemistry):
- Non-equivariant GNN: 45 meV/Å MAE on force prediction
- SchNet (invariant, distances only): 28 meV/Å MAE
- PaiNN (equivariant, vectors): 15 meV/Å MAE
- MACE (equivariant, higher-order tensors): 8 meV/Å MAE
Each step of added geometric information yields substantial improvement. Equivariance is not optional for molecular applications; it is the difference between useful and useless predictions.