Original Paper
Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering
Defferrard et al., NeurIPS 2016
What ChebConv does
ChebConv defines a learnable filter in the spectral domain of the graph using Chebyshev polynomials:
- Compute Chebyshev polynomial basis up to order K using the graph Laplacian
- Learn coefficients for each polynomial basis (the filter)
- Apply the polynomial filter to node features
The polynomial recurrence (T_k(x) = 2x*T_{k-1}(x) - T_{k-2}(x)) means this is computed efficiently using only sparse matrix multiplications, without explicit eigendecomposition.
The math (simplified)
# Chebyshev polynomial filter
H' = Σ_{k=0}^{K} theta_k · T_k(L_norm) · X
Where:
T_k = Chebyshev polynomial of order k
L_norm = normalized graph Laplacian, rescaled as 2L/lambda_max - I so its eigenvalues lie in [-1, 1]
theta_k = learnable filter coefficients
K = filter order (receptive field = K hops)
Recurrence (no eigendecomposition needed):
T_0(L) = I
T_1(L) = L
T_k(L) = 2 · L · T_{k-1}(L) - T_{k-2}(L)
GCNConv is this with K=1 and theta_0 = -theta_1.
Each Chebyshev basis T_k captures k-hop information. The learnable coefficients theta_k control how much weight each hop gets.
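The recurrence can be checked numerically against the spectral definition T_k(L_norm) = U · diag(T_k(lambda)) · U^T. A minimal NumPy sketch on a toy 4-node path graph (the graph is illustrative; real implementations use sparse matrix products instead of dense ones):

```python
import numpy as np

# Toy graph: 4-node path. A is the adjacency matrix.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
deg = A.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
L = np.eye(4) - D_inv_sqrt @ A @ D_inv_sqrt   # normalized Laplacian

# Rescale eigenvalues into [-1, 1]: L_norm = 2L/lambda_max - I
lam_max = np.linalg.eigvalsh(L).max()
L_norm = 2.0 * L / lam_max - np.eye(4)

# Chebyshev recurrence: T_0 = I, T_1 = L_norm,
# T_k = 2 * L_norm @ T_{k-1} - T_{k-2}
T = [np.eye(4), L_norm]
for k in range(2, 4):
    T.append(2.0 * L_norm @ T[-1] - T[-2])

# Spectral check: T_3(L_norm) must equal U diag(T_3(lambda)) U^T
w, U = np.linalg.eigh(L_norm)
T3_spectral = U @ np.diag(np.polynomial.chebyshev.chebval(w, [0, 0, 0, 1])) @ U.T
print(np.allclose(T[3], T3_spectral))  # True
```

The recurrence never touches the eigenvectors U, which is the whole point: it reaches the same matrix polynomial through K sparse multiplications instead of an O(n^3) eigendecomposition.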
PyG implementation
import torch
import torch.nn.functional as F
from torch_geometric.nn import ChebConv

class ChebNet(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels, K=3):
        super().__init__()
        self.conv1 = ChebConv(in_channels, hidden_channels, K=K)
        self.conv2 = ChebConv(hidden_channels, out_channels, K=K)

    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index)
        x = F.relu(x)
        x = F.dropout(x, p=0.5, training=self.training)
        x = self.conv2(x, edge_index)
        return x

# K=3: each layer captures 3-hop neighborhoods
# 2 layers = effective 6-hop receptive field
model = ChebNet(dataset.num_features, 32, dataset.num_classes, K=3)

With K=3 and 2 layers, each node sees a 6-hop neighborhood. An equivalent receptive field would require 6 stacked GCNConv layers.
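The receptive-field claim can be verified without PyG. A degree-K polynomial in the Laplacian has the same sparsity pattern as (I + A)^K, so tracking the support of a one-hot signal on a path graph shows exactly how far each layer reaches (a NumPy sketch; the graph and signal are illustrative):

```python
import numpy as np

n = 10
# Path graph adjacency: node i connected to node i+1
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0

# (I + A)^K has the same sparsity pattern as a degree-K
# Laplacian polynomial, so its support is the receptive field.
M = np.eye(n) + A
x = np.zeros(n)
x[0] = 1.0                                       # one-hot signal at node 0

one_layer = np.linalg.matrix_power(M, 3) @ x     # one K=3 layer
two_layers = np.linalg.matrix_power(M, 3) @ one_layer  # two K=3 layers

print(np.flatnonzero(one_layer))   # [0 1 2 3]        -> 3 hops
print(np.flatnonzero(two_layers))  # [0 1 2 3 4 5 6]  -> 6 hops
```

One layer spreads the signal 3 hops, two layers 6 hops, matching the comment in the model above.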
When to use ChebConv
- Multi-hop context in fewer layers. A single ChebConv layer with K=5 has the same receptive field as 5 GCNConv layers, avoiding over-smoothing from deep stacking. Note that the learned function class differs: ChebConv learns polynomial filters, while stacked GCNConv with nonlinearities learns more general nonlinear functions.
- Mesh and manifold processing. Spectral methods are natural for geometric data where the Laplacian captures surface properties (curvature, smoothness).
- Physics simulations. The spectral interpretation maps well to physical frequency decomposition (low-frequency = smooth variations, high-frequency = sharp changes).
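The frequency-decomposition view in the last bullet can be made concrete: the filter's frequency response is g(lambda) = Σ_k theta_k · T_k(lambda_scaled), which can be evaluated directly at sample eigenvalues. A NumPy sketch with made-up coefficients (theta is illustrative, not learned):

```python
import numpy as np
from numpy.polynomial import chebyshev

# Hypothetical "learned" coefficients theta_0..theta_3 (illustrative)
theta = np.array([0.5, -0.8, 0.3, 0.1])

# Eigenvalues of the normalized Laplacian lie in [0, 2];
# the Chebyshev filter sees them rescaled to [-1, 1].
lam = np.linspace(0.0, 2.0, 5)       # sample graph frequencies
lam_scaled = lam - 1.0               # 2*lam/lam_max - 1 with lam_max = 2

# Frequency response g(lam) = sum_k theta_k * T_k(lam_scaled)
response = chebyshev.chebval(lam_scaled, theta)
print(response)  # [ 1.5   0.85  0.2  -0.15  0.1 ]
```

Here low lambda (smooth variations) is amplified and high lambda (sharp changes) is suppressed: the filter acts as a low-pass. Different theta give band-pass or high-pass behavior, which is what the layer learns from data.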
When not to use ChebConv
- Graphs with very different spectra. ChebConv's polynomial filters are defined on the normalized Laplacian. The learned coefficients (theta_k) are parametric and transferable, but their effect depends on the graph's spectral properties. For inductive settings across structurally different graphs, spatial methods like SAGEConv are more robust.
- When GCNConv suffices. On standard citation benchmarks, ChebConv with K=2 performs similarly to GCNConv. The extra computation is not always justified.
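The spectrum-dependence is easy to see numerically: the same normalized-Laplacian construction produces different spectral ranges on structurally different graphs, so fixed theta_k act on different supports. A NumPy sketch comparing a path graph with a complete graph:

```python
import numpy as np

def norm_laplacian(A):
    """Symmetric normalized Laplacian I - D^{-1/2} A D^{-1/2}."""
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.eye(len(A)) - D_inv_sqrt @ A @ D_inv_sqrt

# Path graph on 4 nodes (bipartite) vs complete graph K4
path = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]], dtype=float)
complete = np.ones((4, 4)) - np.eye(4)

for name, A in [("path", path), ("complete", complete)]:
    lam_max = np.linalg.eigvalsh(norm_laplacian(A)).max()
    print(name, round(lam_max, 3))
# path 2.0, complete 1.333
```

A filter trained where the spectrum spans [0, 2] is evaluated on a different frequency range when the spectrum only spans [0, 1.33], which is why transfer across structurally different graphs can degrade.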