Original Paper
Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering
Defferrard et al., NeurIPS 2016
What ChebConv does
ChebConv defines a learnable filter in the spectral domain of the graph using Chebyshev polynomials:
- Compute Chebyshev polynomial basis up to order K using the graph Laplacian
- Learn coefficients for each polynomial basis (the filter)
- Apply the polynomial filter to node features
The polynomial recurrence (T_k(x) = 2x*T_{k-1}(x) - T_{k-2}(x)) means this is computed efficiently using only sparse matrix multiplications, without explicit eigendecomposition.
The math (simplified)
# Chebyshev polynomial filter
H' = Σ_{k=0}^{K} theta_k · T_k(L_norm) · X
Where:
T_k = Chebyshev polynomial of order k
L_norm = normalized graph Laplacian, rescaled as 2L/lambda_max - I so its eigenvalues lie in [-1, 1]
theta_k = learnable filter coefficients
K = filter order (receptive field = K hops)
Recurrence (no eigendecomposition needed):
T_0(L) = I
T_1(L) = L
T_k(L) = 2 · L · T_{k-1}(L) - T_{k-2}(L)
GCNConv is this with K=1 and theta_0 = -theta_1.
Each Chebyshev basis T_k captures k-hop information. The learnable coefficients theta_k control how much weight each hop gets.
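The recurrence can be checked numerically against the spectral definition T_k(L_norm) = U · diag(T_k(lambda)) · U^T. A minimal NumPy sketch on a toy 4-node path graph (the graph is illustrative; real implementations use sparse matrix products instead of dense ones):

```python
import numpy as np

# Toy graph: 4-node path. A is the adjacency matrix.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
deg = A.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
L = np.eye(4) - D_inv_sqrt @ A @ D_inv_sqrt   # normalized Laplacian

# Rescale eigenvalues into [-1, 1]: L_norm = 2L/lambda_max - I
lam_max = np.linalg.eigvalsh(L).max()
L_norm = 2.0 * L / lam_max - np.eye(4)

# Chebyshev recurrence: T_0 = I, T_1 = L_norm,
# T_k = 2 * L_norm @ T_{k-1} - T_{k-2}
T = [np.eye(4), L_norm]
for k in range(2, 4):
    T.append(2.0 * L_norm @ T[-1] - T[-2])

# Spectral check: T_3(L_norm) must equal U diag(T_3(lambda)) U^T
w, U = np.linalg.eigh(L_norm)
T3_spectral = U @ np.diag(np.polynomial.chebyshev.chebval(w, [0, 0, 0, 1])) @ U.T
print(np.allclose(T[3], T3_spectral))  # True
```

The recurrence never touches the eigenvectors U, which is the whole point: it reaches the same matrix polynomial through K sparse multiplications instead of an O(n^3) eigendecomposition.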
PyG implementation
import torch
import torch.nn.functional as F
from torch_geometric.nn import ChebConv

class ChebNet(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels, K=3):
        super().__init__()
        self.conv1 = ChebConv(in_channels, hidden_channels, K=K)
        self.conv2 = ChebConv(hidden_channels, out_channels, K=K)

    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index)
        x = F.relu(x)
        x = F.dropout(x, p=0.5, training=self.training)
        x = self.conv2(x, edge_index)
        return x

# K=3: each layer captures 3-hop neighborhoods
# 2 layers = effective 6-hop receptive field
model = ChebNet(dataset.num_features, 32, dataset.num_classes, K=3)

With K=3 and 2 layers, each node sees a 6-hop neighborhood. An equivalent receptive field would require 6 stacked GCNConv layers.
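The receptive-field claim can be verified without PyG. A degree-K polynomial in the Laplacian has the same sparsity pattern as (I + A)^K, so tracking the support of a one-hot signal on a path graph shows exactly how far each layer reaches (a NumPy sketch; the graph and signal are illustrative):

```python
import numpy as np

n = 10
# Path graph adjacency: node i connected to node i+1
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0

# (I + A)^K has the same sparsity pattern as a degree-K
# Laplacian polynomial, so its support is the receptive field.
M = np.eye(n) + A
x = np.zeros(n)
x[0] = 1.0                                       # one-hot signal at node 0

one_layer = np.linalg.matrix_power(M, 3) @ x     # one K=3 layer
two_layers = np.linalg.matrix_power(M, 3) @ one_layer  # two K=3 layers

print(np.flatnonzero(one_layer))   # [0 1 2 3]        -> 3 hops
print(np.flatnonzero(two_layers))  # [0 1 2 3 4 5 6]  -> 6 hops
```

One layer spreads the signal 3 hops, two layers 6 hops, matching the comment in the model above.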
When to use ChebConv
- Multi-hop context in fewer layers. A single ChebConv layer with K=5 has the same receptive field as 5 GCNConv layers, avoiding over-smoothing from deep stacking. Note that the learned function class differs: ChebConv learns polynomial filters, while stacked GCNConv with nonlinearities learns more general nonlinear functions.
- Mesh and manifold processing. Spectral methods are natural for geometric data where the Laplacian captures surface properties (curvature, smoothness).
- Physics simulations. The spectral interpretation maps well to physical frequency decomposition (low-frequency = smooth variations, high-frequency = sharp changes).
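The frequency-decomposition view in the last bullet can be made concrete: the filter's frequency response is g(lambda) = Σ_k theta_k · T_k(lambda_scaled), which can be evaluated directly at sample eigenvalues. A NumPy sketch with made-up coefficients (theta is illustrative, not learned):

```python
import numpy as np
from numpy.polynomial import chebyshev

# Hypothetical "learned" coefficients theta_0..theta_3 (illustrative)
theta = np.array([0.5, -0.8, 0.3, 0.1])

# Eigenvalues of the normalized Laplacian lie in [0, 2];
# the Chebyshev filter sees them rescaled to [-1, 1].
lam = np.linspace(0.0, 2.0, 5)       # sample graph frequencies
lam_scaled = lam - 1.0               # 2*lam/lam_max - 1 with lam_max = 2

# Frequency response g(lam) = sum_k theta_k * T_k(lam_scaled)
response = chebyshev.chebval(lam_scaled, theta)
print(response)  # [ 1.5   0.85  0.2  -0.15  0.1 ]
```

Here low lambda (smooth variations) is amplified and high lambda (sharp changes) is suppressed: the filter acts as a low-pass. Different theta give band-pass or high-pass behavior, which is what the layer learns from data.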
When not to use ChebConv
- Graphs with very different spectra. ChebConv's polynomial filters are defined on the normalized Laplacian. The learned coefficients (theta_k) are parametric and transferable, but their effect depends on the graph's spectral properties. For inductive settings across structurally different graphs, spatial methods like SAGEConv are more robust.
- When GCNConv suffices. On standard citation benchmarks, ChebConv with K=2 performs similarly to GCNConv. The extra computation is not always justified.
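The spectrum-dependence is easy to see numerically: the same normalized-Laplacian construction produces different spectral ranges on structurally different graphs, so fixed theta_k act on different supports. A NumPy sketch comparing a path graph with a complete graph:

```python
import numpy as np

def norm_laplacian(A):
    """Symmetric normalized Laplacian I - D^{-1/2} A D^{-1/2}."""
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.eye(len(A)) - D_inv_sqrt @ A @ D_inv_sqrt

# Path graph on 4 nodes (bipartite) vs complete graph K4
path = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]], dtype=float)
complete = np.ones((4, 4)) - np.eye(4)

for name, A in [("path", path), ("complete", complete)]:
    lam_max = np.linalg.eigvalsh(norm_laplacian(A)).max()
    print(name, round(lam_max, 3))
# path 2.0, complete 1.333
```

A filter trained where the spectrum spans [0, 2] is evaluated on a different frequency range when the spectrum only spans [0, 1.33], which is why transfer across structurally different graphs can degrade.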