Traffic congestion propagates through road networks as a spatial phenomenon, and graph neural networks (GNNs) are a natural fit for modeling this propagation. A slowdown on a highway entrance ramp causes congestion that spreads upstream over minutes. An accident on a major artery diverts traffic to parallel routes, causing secondary congestion. These are graph-structured patterns: the congestion signal travels along edges (roads) between nodes (intersections or sensors).
Traditional traffic forecasting treats each sensor as an independent time series. An LSTM at sensor A learns rush hour patterns for sensor A. But it cannot learn that congestion at sensor B (3 miles upstream) predicts congestion at sensor A in 12 minutes. GNNs capture this spatial dependency through message passing on the road graph.
The road network as a graph
Building a traffic graph requires two components:
- Nodes: each traffic sensor or road segment. Node features are time-varying: speed, flow (vehicles per hour), and occupancy (fraction of time the sensor is occupied).
- Edges: road connections between sensors. Edge weights typically decrease with road distance, so closer sensors have stronger connections; common choices are the inverse of the distance or a thresholded Gaussian kernel of it. Some models use directed edges to capture one-way streets and directional traffic flow.
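As a rough illustration of distance-based edge weights, the sketch below uses a thresholded Gaussian kernel, one common choice in the literature and an assumption here rather than the only option. The function name gaussian_kernel_adjacency, the threshold of 0.1, and the three-sensor distance matrix are all hypothetical.

```python
import numpy as np

def gaussian_kernel_adjacency(dist, threshold=0.1):
    """Turn pairwise road distances into edge weights.

    dist[i, j] is the road distance from sensor i to sensor j
    (np.inf where no route exists). Weights decay with distance,
    and weights below `threshold` are dropped to keep the graph sparse.
    """
    finite = dist[np.isfinite(dist)]
    sigma = finite.std()                          # kernel width taken from the data
    weights = np.exp(-np.square(dist / sigma))    # unreachable pairs (inf) become 0
    weights[weights < threshold] = 0.0
    return weights

# Hypothetical distances in miles for three sensors on a one-way road A -> B -> C.
inf = np.inf
dist = np.array([
    [0.0, 1.0, 2.0],
    [inf, 0.0, 1.0],
    [inf, inf, 0.0],
])
adj = gaussian_kernel_adjacency(dist)
```

Because the distance matrix is not symmetric, the resulting graph is directed: A is connected to B, but B is not connected back to A. The A-to-C weight falls below the threshold and is dropped, so C is only reachable from A in two hops.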
On METR-LA (207 sensors across Los Angeles highways), the graph has 207 nodes and approximately 1,500 edges. Each node carries a feature vector that updates every 5 minutes. The prediction task: given the past 60 minutes of sensor readings across the network (12 time steps), predict the next 15 to 60 minutes (3 to 12 steps) at all sensors simultaneously.
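The prediction task can be framed as sliding windows over the sensor readings. The sketch below assumes 5-minute readings stored as a (time, sensors, features) array and uses synthetic data in place of the real dataset; make_windows is an illustrative helper, not part of any benchmark tooling.

```python
import numpy as np

def make_windows(series, n_in=12, n_out=12):
    """Slice a (time, sensors, features) array into (input, target) pairs.

    With 5-minute readings, n_in=12 covers the past 60 minutes and
    n_out=12 covers the next 60 minutes.
    """
    n_samples = series.shape[0] - n_in - n_out + 1
    x = np.stack([series[i : i + n_in] for i in range(n_samples)])
    y = np.stack([series[i + n_in : i + n_in + n_out] for i in range(n_samples)])
    return x, y

# One day of synthetic readings: 288 steps, 207 sensors, 1 feature (speed in mph).
rng = np.random.default_rng(0)
readings = rng.uniform(20.0, 70.0, size=(288, 207, 1))
x, y = make_windows(readings)
# x and y both have shape (samples, 12, 207, 1)
```

Each sample pairs one hour of history at every sensor with the following hour at every sensor, which is what lets a model predict the whole network at once.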
Spatio-temporal architecture
The defining innovation in traffic GNNs is combining spatial graph operations with temporal sequence operations:
Spatial component: graph convolution
At each time step, a graph convolution propagates traffic information across the road network. After one layer, each sensor's embedding includes information from directly connected sensors. After two layers, it includes 2-hop information. This captures the spatial propagation of congestion.
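The hop-by-hop behavior can be seen in a minimal NumPy sketch. It assumes a row-normalized adjacency matrix with self-loops and an identity weight matrix, so that only the propagation is visible; normalize_adjacency and graph_conv are illustrative names, not taken from any of the models discussed here.

```python
import numpy as np

def normalize_adjacency(adj):
    """Add self-loops and row-normalize, so each node averages itself and its neighbors."""
    a_hat = adj + np.eye(adj.shape[0])
    return a_hat / a_hat.sum(axis=1, keepdims=True)

def graph_conv(a_norm, x, weight):
    """One graph convolution layer: aggregate neighbor features, transform, apply ReLU."""
    return np.maximum(a_norm @ x @ weight, 0.0)

# Four sensors in a chain: 0 - 1 - 2 - 3
adj = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
a_norm = normalize_adjacency(adj)

# A signal that is present only at sensor 0.
x = np.array([[1.0], [0.0], [0.0], [0.0]])
w = np.eye(1)  # identity weight: keeps the example about propagation only

h1 = graph_conv(a_norm, x, w)   # after one layer
h2 = graph_conv(a_norm, h1, w)  # after two layers
```

After one layer the signal has reached sensor 1 but not sensor 2; after two layers it has reached sensor 2 but still not sensor 3, which is three hops away.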
Temporal component: sequence modeling
Across time steps, a temporal model (1D convolution, GRU, or dilated causal convolution) captures patterns like rush hour cycles, weekend effects, and holiday impacts. The temporal component operates on the spatially-enriched embeddings from the graph convolution.
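Of the temporal options listed, the dilated causal convolution is the least self-explanatory, so here is a minimal sketch for a single sensor and a single channel. The function name and the all-ones kernels are illustrative assumptions; in a real model the kernels are learned and the operation runs over every sensor's embedding.

```python
import numpy as np

def causal_dilated_conv1d(x, kernel, dilation=1):
    """Causal 1D convolution along time.

    The output at time t depends only on x[t], x[t - d], x[t - 2d], ...
    where d is the dilation, so no future value leaks into the past.
    """
    k = len(kernel)
    pad = (k - 1) * dilation
    padded = np.concatenate([np.zeros(pad), x])   # left-pad with zeros only
    out = np.zeros_like(x, dtype=float)
    for j in range(k):
        # kernel[j] weights the input (k - 1 - j) * dilation steps in the past
        out += kernel[j] * padded[j * dilation : j * dilation + len(x)]
    return out

# Single layer: each output is x[t] + x[t - 2].
x = np.arange(8.0)
out = causal_dilated_conv1d(x, [1.0, 1.0], dilation=2)

# Stacking dilations 1, 2, 4 with kernel size 2 widens the receptive field to 8 steps.
h = np.zeros(12)
h[0] = 1.0  # impulse at t = 0
for d in (1, 2, 4):
    h = causal_dilated_conv1d(h, [1.0, 1.0], dilation=d)
receptive_field = np.count_nonzero(h)
```

Doubling the dilation at each layer grows the receptive field exponentially with depth, which is why a few layers can cover an hour of 5-minute readings without recurrence.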
Foundational models
Three architectures established the field:
- DCRNN (Diffusion Convolutional Recurrent Neural Network): models traffic as a diffusion process on the graph. Uses diffusion convolution for the spatial component and a GRU for the temporal component, inside an encoder-decoder (sequence-to-sequence) framework for multi-step forecasting.
- STGCN (Spatio-Temporal Graph Convolutional Network): uses spectral graph convolutions with gated 1D temporal convolutions. Faster to train than DCRNN because its convolutional temporal processing avoids step-by-step recurrence.
- Graph WaveNet: combines adaptive graph learning (the model learns the graph structure rather than relying solely on road connectivity) with dilated causal convolutions for long-range temporal dependencies.
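The adaptive graph learning idea in Graph WaveNet can be sketched as a row-wise softmax over the ReLU of a product of two node-embedding matrices. The snippet below is a NumPy illustration under stated assumptions: the embeddings are random stand-ins for parameters that would be learned by gradient descent, and the embedding size of 10 is chosen for illustration.

```python
import numpy as np

def adaptive_adjacency(e_src, e_dst):
    """Row-wise softmax(ReLU(e_src @ e_dst.T)): a dense, learned adjacency matrix."""
    scores = np.maximum(e_src @ e_dst.T, 0.0)
    scores = scores - scores.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(scores)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n_sensors, emb_dim = 207, 10  # illustrative sizes

# Random stand-ins for the source and target node embeddings.
e_src = rng.normal(size=(n_sensors, emb_dim))
e_dst = rng.normal(size=(n_sensors, emb_dim))

a_adp = adaptive_adjacency(e_src, e_dst)
```

Each row of a_adp sums to 1, so it can be used like a transition matrix alongside the road-based adjacency. Because no distances enter the computation, the model is free to connect sensors that are far apart on the road network but behave similarly.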
Why spatial modeling matters: a concrete example
Consider a 3-sensor stretch of highway: sensor B is between sensors A (upstream) and C (downstream). At 5:05 PM, sensor A detects a speed drop to 20 mph. An LSTM at sensor B does not see this. A GNN does:
- At 5:05 PM, sensor A's speed feature drops. Graph convolution propagates this to sensor B's embedding.
- The temporal component recognizes this pattern from training: upstream slowdown predicts local slowdown with a characteristic delay.
- The model then predicts that sensor B will drop to 25 mph at 5:10 PM and that sensor C will drop to 30 mph at 5:15 PM.
The LSTM at sensor B would not predict the slowdown until sensor B itself starts slowing down. The GNN predicts it 5-10 minutes earlier by learning the spatial propagation.
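A toy experiment makes the difference measurable. The sketch below uses synthetic speeds in which sensor B copies sensor A one step (5 minutes) later, and it compares two least-squares linear predictors as stand-ins for the univariate and spatial models. The data, the helper fit_and_score, and the train/test split are all assumptions made for illustration; no LSTM or GNN is trained here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps = 500

# Synthetic speeds (mph): sensor A slows down at random moments;
# sensor B follows A one step (5 minutes) later, plus measurement noise.
speed_a = np.where(rng.random(n_steps) < 0.2, 20.0, 60.0)
speed_b = np.empty(n_steps)
speed_b[0] = 60.0
speed_b[1:] = speed_a[:-1] + rng.normal(0.0, 2.0, n_steps - 1)

def fit_and_score(features, target, n_train=400):
    """Fit a linear predictor on the first n_train rows; return MAE on the rest."""
    design = np.column_stack([features, np.ones(len(target))])
    coef, *_ = np.linalg.lstsq(design[:n_train], target[:n_train], rcond=None)
    return float(np.mean(np.abs(design[n_train:] @ coef - target[n_train:])))

target = speed_b[1:]  # sensor B's speed one step ahead

# Univariate stand-in: only sensor B's current speed.
mae_own_history = fit_and_score(speed_b[:-1, None], target)

# Spatial stand-in: sensor B's current speed plus upstream sensor A's.
mae_with_upstream = fit_and_score(
    np.column_stack([speed_b[:-1], speed_a[:-1]]), target
)
```

In this setup the predictor that sees sensor A has a much lower error, because A's current reading already contains the slowdown that B will show one step later.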
Performance on benchmarks
On METR-LA (15-minute-ahead prediction), representative results:
- Historical Average: 7.80 MAE (mph)
- ARIMA: 5.55 MAE
- LSTM: 4.19 MAE
- DCRNN: 3.17 MAE
- Graph WaveNet: 2.99 MAE
The gap widens at longer horizons (60 minutes), where spatial propagation patterns become even more important. Every model loses accuracy as the horizon grows, but GNN models degrade more slowly than univariate models.
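For reference, the MAE metric can be computed per horizon step as in the sketch below. It assumes that a reading of 0.0 marks a missing value and should be excluded from the average, a convention often applied to loop-detector data but stated here as an assumption; the speeds are made up for illustration.

```python
import numpy as np

def masked_mae(pred, target, null_value=0.0):
    """Mean absolute error over entries where the target is a real reading."""
    mask = target != null_value
    if not mask.any():
        return float("nan")
    return float(np.abs(pred[mask] - target[mask]).mean())

# Hypothetical speeds (mph), shape (horizon steps, sensors); 0.0 marks a missing reading.
target = np.array([
    [60.0, 55.0, 0.0],
    [58.0, 0.0, 30.0],
])
pred = np.array([
    [57.0, 56.0, 40.0],
    [60.0, 45.0, 33.0],
])

overall = masked_mae(pred, target)
per_step = [masked_mae(pred[h], target[h]) for h in range(target.shape[0])]
```

Reporting the error per step, rather than only the overall average, is what makes it possible to compare models at 15, 30, and 60 minutes separately.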
Beyond road traffic
The same spatio-temporal graph architecture applies to any flow on a network:
- Public transit: predict passenger flow at stations using the transit network graph
- Ride-sharing demand: predict ride requests across city zones connected by travel patterns
- Air traffic: predict flight delays propagating through the airport network
- Logistics: predict delivery times across warehouse and route networks