Graph convolution can be defined in two ways: in the spectral domain (using the graph Laplacian's eigenvectors) or in the spatial domain (using direct neighborhood operations). Spectral methods are mathematically rigorous, drawing on graph signal processing theory. Spatial methods are practical, scalable, and the basis of virtually all modern GNN architectures. The two are connected: GCN was derived as a first-order approximation of spectral convolution, and that approximation simplifies to a spatial operation.
Spectral approach: graph Fourier transform
The graph Laplacian L = D - A (degree matrix minus adjacency matrix) of an undirected graph is a symmetric, positive semi-definite matrix whose eigenvectors form an orthonormal basis for signals on the graph. Just as the Fourier transform decomposes a time-domain signal into frequency components, the graph Fourier transform decomposes a graph signal (node features) into components along the Laplacian eigenvectors.
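The construction can be sketched in a few lines of numpy. The 4-node path graph here is an illustrative example, not from the text; `eigh` works because L is symmetric:

```python
import numpy as np

# Illustrative example: a 4-node path graph 0-1-2-3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
D = np.diag(A.sum(axis=1))      # degree matrix
L = D - A                       # combinatorial graph Laplacian

# Eigendecomposition: eigenvalues play the role of frequencies,
# eigenvectors the role of Fourier modes. eigh handles symmetric L.
eigvals, U = np.linalg.eigh(L)

x = np.array([1.0, 2.0, 3.0, 4.0])  # a signal on the nodes
x_hat = U.T @ x                  # graph Fourier transform
x_rec = U @ x_hat                # inverse transform recovers x
```

Because U is orthonormal, the inverse transform is just U applied to the coefficients; the smallest eigenvalue is 0, whose eigenvector is the constant signal.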
- Low-frequency eigenvectors: smooth signals that vary slowly across the graph (nearby nodes have similar values)
- High-frequency eigenvectors: sharp signals that vary rapidly (neighboring nodes have different values)
Spectral convolution applies a learnable filter in this eigenspace: multiply each frequency component by a learned coefficient, then inverse-transform back to node space. This is analogous to frequency-domain filtering in signal processing.
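A minimal sketch of that filtering step, continuing the hypothetical 4-node path graph. In a learnable layer each filter coefficient would be trained; here a fixed heat-kernel-style low-pass filter (an assumption for illustration) damps the high-frequency components:

```python
import numpy as np

# Hypothetical example: low-pass spectral filtering on a 4-node path graph.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
eigvals, U = np.linalg.eigh(L)

x = np.array([1.0, -1.0, 1.0, -1.0])   # rapidly varying (high-frequency) signal

# A spectral filter multiplies each frequency component by g(lambda_i).
# Here g is a fixed low-pass filter; a spectral GNN would learn it.
g = np.exp(-eigvals)                    # damps large eigenvalues (high frequencies)
y = U @ np.diag(g) @ U.T @ x            # filter in eigenspace, transform back
```

One way to see the effect: the quadratic form x.T @ L @ x measures how much a signal varies across edges, and it shrinks after low-pass filtering.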
Limitations of spectral methods
- Expensive: eigendecomposition costs O(N^3). Infeasible for graphs with more than ~10K nodes.
- Graph-specific: eigenvectors are unique to each graph. A filter learned on graph A cannot be applied to graph B (different eigenvectors).
- Full-graph required: the Laplacian is defined for the entire graph. No mini-batching or sampling.
Spatial approach: neighborhood aggregation
Spatial convolution is message passing: each node aggregates features from its local neighborhood, applies a shared transformation, and updates its representation. No eigendecomposition, no frequency domain, no graph-specific filters.
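A minimal sketch of one such layer in numpy, using GCN-style symmetric normalization with self-loops (the function name and toy sizes are assumptions for illustration):

```python
import numpy as np

# One spatial (message-passing) layer, GCN-style: each node averages its
# neighbors' features with symmetric degree normalization, then applies a
# shared weight matrix W and a ReLU nonlinearity.
def gcn_layer(A, X, W):
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # D^{-1/2}
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0)

# Toy usage: 4 nodes, 3 input features, 2 output features.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = rng.normal(size=(4, 3))
W = rng.normal(size=(3, 2))
H = gcn_layer(A, X, W)   # new node representations, shape (4, 2)
```

The same W applies to every node and every graph, which is exactly what makes the layer inductive: nothing in it depends on this particular adjacency matrix.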
- Scalable: one layer's computation scales with the number of edges, O(|E|), not with the cube of the node count
- Inductive: the same weights work on any graph (shared transformation, not graph-specific filter)
- Mini-batch friendly: can sample subgraphs and train on mini-batches
- Heterogeneous: naturally supports multiple node and edge types
Why spatial methods won
The shift from spectral to spatial happened because of four practical advantages:
- Scale: real-world graphs have millions of nodes. Eigendecomposition is infeasible at that scale. Neighborhood aggregation is trivially parallelizable.
- Transfer: enterprise applications need models that work on new data. A recommendation model trained on user-item graph A should work on graph B. Spatial methods transfer; spectral methods do not.
- Mini-batching: spatial methods support neighbor sampling (GraphSAGE) and subgraph batching. Spectral methods require the full graph.
- Flexibility: spatial methods naturally incorporate edge features, directed edges, heterogeneous types, and temporal dynamics. Spectral methods would need substantial modifications for each.
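The mini-batching point can be made concrete with a hedged sketch of GraphSAGE-style neighbor sampling (function name and toy graph are assumptions, not GraphSAGE's actual API): for a mini-batch of target nodes, sample at most k neighbors each, so one training step touches a small subgraph rather than the full adjacency matrix.

```python
import numpy as np

# Sketch of neighbor sampling for mini-batch training: cap each batch
# node's neighborhood at k sampled neighbors (without replacement).
def sample_neighbors(adj_list, batch, k, rng):
    sampled = {}
    for node in batch:
        nbrs = adj_list[node]
        if len(nbrs) <= k:
            sampled[node] = list(nbrs)              # keep all neighbors
        else:
            sampled[node] = list(rng.choice(nbrs, size=k, replace=False))
    return sampled

# Toy adjacency list (hypothetical graph).
adj_list = {0: [1, 2, 3], 1: [0], 2: [0, 3], 3: [0, 2]}
rng = np.random.default_rng(0)
fanout = sample_neighbors(adj_list, batch=[0, 2], k=2, rng=rng)
```

Nothing here requires the full graph in memory at once, which is precisely what a spectral filter cannot offer.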
Where spectral theory still matters
Spectral graph theory remains important even though implementation is spatial:
- Positional encodings: Laplacian eigenvectors provide structural positional information for graph transformers (analogous to positional encodings in language transformers)
- Over-smoothing analysis: spectral theory explains over-smoothing as low-pass filtering that eliminates high-frequency (discriminative) components
- Expressiveness theory: spectral properties of the graph (eigenvalue distribution) determine what GNNs can and cannot distinguish
- Graph wavelets: multi-scale spectral analysis provides richer structural features than simple neighborhood aggregation
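The positional-encoding use case can be sketched as follows: take the eigenvectors of the normalized Laplacian for the k smallest nontrivial eigenvalues as a k-dimensional structural code per node. The function name and the choice of normalized Laplacian are illustrative assumptions; published variants differ in details such as randomizing eigenvector signs during training.

```python
import numpy as np

# Laplacian eigenvector positional encodings: each node gets the entries
# of the k lowest-frequency nontrivial eigenvectors as structural features.
def laplacian_pe(A, k):
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L_sym = np.eye(A.shape[0]) - D_inv_sqrt @ A @ D_inv_sqrt  # normalized Laplacian
    eigvals, U = np.linalg.eigh(L_sym)      # ascending eigenvalues
    return U[:, 1:k + 1]                    # skip the trivial eigenvalue-0 mode

A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
pe = laplacian_pe(A, k=2)   # shape (4, 2): 2-dim positional code per node
```

This is a one-time O(N^3) preprocessing step rather than a per-layer cost, which is why it remains practical even though spectral convolution itself does not.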