An equivariant GNN produces outputs that transform in a predictable, consistent way when the input undergoes a geometric transformation (rotation, translation, or reflection). If you rotate a molecule by 90 degrees, an equivariant GNN's predicted force vectors rotate by exactly 90 degrees, while its predicted energy (a scalar) stays the same. This symmetry is guaranteed by the architecture, not learned from data.
Invariance vs equivariance
Physical predictions have different symmetry requirements:
- Invariant (unchanged under transformation): energy, binding affinity, toxicity, solubility. Rotating a molecule does not change its energy.
- Equivariant (transforms consistently): forces, dipole moments, velocity fields. Rotating a molecule rotates its force vectors by the same amount.
A standard GNN that uses absolute 3D coordinates as features violates both requirements: the same molecule in a different orientation gets different predictions. Data augmentation (training on randomly rotated copies) helps, but it wastes model capacity learning a symmetry that the architecture could guarantee for free.
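A minimal sketch of the failure mode, using a fixed random linear readout on flattened absolute coordinates as a stand-in for a coordinate-based model (the functions and weights here are illustrative, not any particular architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a coordinate-based model: a fixed linear readout on
# flattened absolute positions of a 3-atom "molecule".
W = rng.normal(size=(1, 9))

def predict_from_coords(x):        # x: (3 atoms, 3) absolute positions
    return float(W @ x.reshape(-1))

def predict_from_distances(x):     # invariant featurization
    d = np.linalg.norm(x[:, None] - x[None, :], axis=-1)
    return float(d.sum())          # any function of distances is invariant

x = rng.normal(size=(3, 3))
theta = np.pi / 2                  # rotate the molecule by 90 degrees
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])

# The coordinate-based prediction changes under rotation...
coords_changed = abs(predict_from_coords(x) - predict_from_coords(x @ R.T)) > 1e-6
# ...while the distance-based prediction does not.
dist_unchanged = np.isclose(predict_from_distances(x),
                            predict_from_distances(x @ R.T))
```

The same molecule, merely reoriented, yields a different output from the coordinate-based readout, which is exactly what data augmentation tries (imperfectly) to train away.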
Building blocks of equivariant GNNs
Relative positions
The foundation of equivariance: use relative positions (r_ij = x_j - x_i) instead of absolute coordinates. Relative positions are equivariant: rotating all atoms rotates all relative positions consistently. Distances (||r_ij||) are invariant.
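These three properties (relative positions are translation-invariant and rotation-equivariant, distances are fully invariant) can be checked directly; a small numpy sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(5, 3))                  # absolute atom positions

# Any rotation works; here, a rotation about the z-axis.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])

r = x[None, :, :] - x[:, None, :]            # r_ij = x_j - x_i, shape (5, 5, 3)
x_rot = x @ R.T
r_rot = x_rot[None, :, :] - x_rot[:, None, :]

# Equivariance: rotating all atoms rotates every relative position.
equivariant = np.allclose(r_rot, r @ R.T)

# Invariance: pairwise distances are untouched by the rotation.
invariant = np.allclose(np.linalg.norm(r_rot, axis=-1),
                        np.linalg.norm(r, axis=-1))

# Translation invariance: shifting all atoms leaves r_ij unchanged.
t = rng.normal(size=3)
translation_ok = np.allclose((x + t)[None, :, :] - (x + t)[:, None, :], r)
```

Because r_ij already absorbs translations, an architecture built on relative positions only needs to handle rotations (and, if desired, reflections) explicitly.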
Scalar and vector channels
Equivariant GNNs maintain two types of features per node:
- Scalar features: Invariant under rotation (energy-like quantities). Standard neural network operations apply.
- Vector features: Equivariant under rotation (force-like quantities). Only equivariant operations (dot products, cross products, scaling) are allowed.
```python
import torch

# Equivariant message passing (simplified PaiNN-style).
# `mlp` and `psi` are learned MLPs; both act only on invariant inputs.
def equivariant_message(s_j, v_j, r_ij):
    """
    s_j:  scalar features of neighbor j (invariant)
    v_j:  vector features of neighbor j (equivariant)
    r_ij: relative position vector (equivariant)
    """
    d_ij = torch.norm(r_ij, dim=-1, keepdim=True)   # distance (invariant)
    r_hat = r_ij / (d_ij + 1e-8)                    # unit direction (equivariant)

    # Scalar message: gated by a function of the distance (invariant).
    phi = mlp(d_ij)                                 # radial filter
    s_msg = phi * s_j

    # Vector message: scale existing vectors and inject the direction r_hat,
    # both weighted by invariant quantities.
    v_msg = phi.unsqueeze(-1) * v_j + psi(s_j).unsqueeze(-1) * r_hat
    return s_msg, v_msg
```

The key constraint: vector features can only be combined through equivariant operations (scaling by invariants, vector addition, dot and cross products). MLPs operate only on the scalar channels.
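This constraint can be verified numerically. The sketch below rebuilds the vector-message construction in numpy with fixed random stand-ins for the learned networks (`phi`, `psi` here are arbitrary functions of invariants, not the real trained MLPs) and checks that rotating the inputs rotates the output:

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-ins for the learned networks: any function of invariant inputs
# preserves equivariance of the construction.
W_phi = rng.normal(size=(1, 4))
W_psi = rng.normal(size=(4, 4))
phi = lambda d: np.tanh(d @ W_phi)          # distance (1,) -> gates (4,)
psi = lambda s: np.tanh(s @ W_psi)          # scalars (4,) -> gates (4,)

def vector_message(s_j, v_j, r_ij):
    """v_j: (4, 3) vector features; r_ij: (3,) relative position."""
    d = np.linalg.norm(r_ij, keepdims=True)
    r_hat = r_ij / (d + 1e-8)
    return phi(d)[:, None] * v_j + psi(s_j)[:, None] * r_hat

s_j = rng.normal(size=4)
v_j = rng.normal(size=(4, 3))
r_ij = rng.normal(size=3)

theta = 1.1
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])

# Rotating the inputs (each vector feature and r_ij) rotates the output:
msg = vector_message(s_j, v_j, r_ij)
msg_rot = vector_message(s_j, v_j @ R.T, R @ r_ij)
equivariant = np.allclose(msg_rot, msg @ R.T)
```

The check passes because the distance (and hence `phi`, `psi`) is unchanged by the rotation, while `v_j` and `r_hat` both rotate, so the whole message rotates with them.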
Architecture progression
Each generation adds more geometric information:
- SchNet: Uses only pairwise distances. Invariant but cannot distinguish mirror images (chirality).
- DimeNet: Adds bond angles between triplets of atoms. Invariant with richer geometric context.
- PaiNN: Maintains vector features alongside scalars. Full SE(3) equivariance for force predictions.
- MACE: Uses higher-order tensors (spherical harmonics). Captures complex multi-body interactions with maximum expressiveness.
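SchNet's chirality limitation is easy to demonstrate with a toy configuration: a molecule and its mirror image share every pairwise distance, so any distance-only model assigns them identical predictions, while a signed volume (which an equivariant model can represent) tells them apart. A sketch with four hypothetical atom positions:

```python
import numpy as np

# Four points in a chiral arrangement, and their reflection through
# the xy-plane (positions are illustrative, not a real molecule).
x = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.3, 0.3, 1.0]])
x_mirror = x * np.array([1.0, 1.0, -1.0])

# Reflection is an isometry: all pairwise distances agree, so a
# distance-only (SchNet-style) model cannot separate the two.
d = np.linalg.norm(x[:, None] - x[None, :], axis=-1)
d_mirror = np.linalg.norm(x_mirror[:, None] - x_mirror[None, :], axis=-1)
same_distances = np.allclose(d, d_mirror)

# A signed volume (determinant of edge vectors from atom 0) flips sign
# under reflection, distinguishing the two enantiomers.
signed_volume = lambda p: np.linalg.det(p[1:] - p[0])
opposite_handedness = np.isclose(signed_volume(x), -signed_volume(x_mirror))
```

Angle-based models such as DimeNet recover some of this context, and vector- or tensor-valued features (PaiNN, MACE) carry the orientation information directly.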
Results on molecular benchmarks
On MD17 (molecular dynamics) and QM9 (quantum chemistry):
- Non-equivariant GNN: 45 meV/Å MAE on force prediction
- SchNet (invariant, distances only): 28 meV/Å MAE
- PaiNN (equivariant, vectors): 15 meV/Å MAE
- MACE (equivariant, higher-order tensors): 8 meV/Å MAE
Each step of added geometric information yields substantial improvement. Equivariance is not optional for molecular applications; it is the difference between useful and useless predictions.