Original Paper
Generic wrapper for heterogeneous message passing
PyG Team (2021). PyTorch Geometric
What HeteroConv does
HeteroConv takes a dictionary mapping edge types to layer instances. For each edge type, it runs the corresponding layer independently. Then it aggregates all messages arriving at each node:
- For each edge type, apply its assigned layer to produce per-type messages
- For each target node, collect messages from all relevant edge types
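- Aggregate across edge types (sum, mean, min, max, or cat; aggr=None returns the stacked per-type outputs so you can combine them yourself)
- Return updated node features, keyed by node type, for every node type that is the destination of at least one edge type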
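The steps above can be sketched in plain Python. This is a toy illustration of the grouping logic only, not PyG's implementation: the helper names `hetero_aggregate` and `sum_neighbors` are hypothetical, features are single floats instead of tensors, and only sum and mean are handled.

```python
from collections import defaultdict


def sum_neighbors(x_src, x_dst, edges):
    """Hypothetical stand-in for a conv layer: sum source features per target node."""
    out = [0.0] * len(x_dst)
    for s, d in edges:
        out[d] += x_src[s]
    return out


def hetero_aggregate(layers, x_dict, edges_dict, aggr="sum"):
    """Run one layer per edge type, then combine the outputs per destination type."""
    if aggr not in ("sum", "mean"):
        raise ValueError(f"unsupported aggr in this sketch: {aggr}")
    per_dst = defaultdict(list)
    # Steps 1-2: apply each edge type's layer and collect outputs by destination type.
    for edge_type, layer in layers.items():
        src, _, dst = edge_type
        if edge_type not in edges_dict:
            continue  # no edges of this type in the input
        per_dst[dst].append(layer(x_dict[src], x_dict[dst], edges_dict[edge_type]))
    # Step 3: combine the per-edge-type outputs node by node.
    out = {}
    for dst, outputs in per_dst.items():
        summed = [sum(vals) for vals in zip(*outputs)]
        out[dst] = summed if aggr == "sum" else [s / len(outputs) for s in summed]
    # Step 4: only node types that received messages appear in the result.
    return out


x_dict = {"user": [1.0, 2.0], "product": [10.0, 20.0, 30.0]}
edges_dict = {
    ("user", "purchases", "product"): [(0, 0), (1, 0), (1, 2)],
    ("product", "co_viewed", "product"): [(0, 1), (1, 2)],
}
layers = {edge_type: sum_neighbors for edge_type in edges_dict}

result = hetero_aggregate(layers, x_dict, edges_dict, aggr="sum")
result_mean = hetero_aggregate(layers, x_dict, edges_dict, aggr="mean")
```

Note that `result` has no "user" entry: no edge type in this toy graph points at users, so nothing is computed for them.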
PyG implementation
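```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import HeteroConv, GATConv, SAGEConv, GCNConv


class HeteroGNN(torch.nn.Module):
    def __init__(self, hidden_channels, out_channels):
        super().__init__()
        # Different layer types per edge type. Bipartite edge types pass
        # (-1, -1) so source and target feature sizes are inferred separately,
        # and GATConv disables self-loops, which do not apply across node types.
        self.conv1 = HeteroConv({
            ('user', 'purchases', 'product'): GATConv((-1, -1), hidden_channels, add_self_loops=False),
            ('user', 'reviews', 'product'): SAGEConv((-1, -1), hidden_channels),
            ('product', 'co_viewed', 'product'): GCNConv(-1, hidden_channels),
            # Assumed reverse edge type (e.g. added by T.ToUndirected()).
            # Without an edge type that targets 'user', conv1 returns no
            # 'user' features and conv2 has nothing to read on the source side.
            ('product', 'rev_purchases', 'user'): SAGEConv((-1, -1), hidden_channels),
        }, aggr='sum')
        self.conv2 = HeteroConv({
            ('user', 'purchases', 'product'): GATConv((hidden_channels, hidden_channels), out_channels, add_self_loops=False),
            ('user', 'reviews', 'product'): SAGEConv(hidden_channels, out_channels),
            ('product', 'co_viewed', 'product'): GCNConv(hidden_channels, out_channels),
            ('product', 'rev_purchases', 'user'): SAGEConv(hidden_channels, out_channels),
        }, aggr='sum')

    def forward(self, x_dict, edge_index_dict):
        x_dict = self.conv1(x_dict, edge_index_dict)
        x_dict = {k: F.relu(v) for k, v in x_dict.items()}
        x_dict = self.conv2(x_dict, edge_index_dict)
        return x_dict


# Usage with HeteroData (`data` and `num_classes` are assumed to be defined)
model = HeteroGNN(hidden_channels=64, out_channels=num_classes)
out_dict = model(data.x_dict, data.edge_index_dict)
```
Each edge type gets its own layer instance: GATConv for purchases (attention matters), SAGEConv for reviews (scalability matters), and GCNConv for co-views (simple baseline). The reverse edge type lets user nodes receive messages too, so their features are still available to the second layer.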
When to use HeteroConv
- Rapid prototyping. Quickly test heterogeneous architectures by mixing layers you already know. No need to learn HGTConv's API.
- Different edge types need different treatment. If purchase edges benefit from attention but co-view edges do not, HeteroConv lets you assign GATConv to one and GCNConv to the other.
- Graphs with many edge types. HeteroConv's cost grows roughly linearly with the number of edge types, since each type runs one lightweight base layer. HGTConv's cross-type attention can become expensive with many types.
- When you want maximum control. Choose layer types, hidden dimensions, and hyperparameters independently per edge type.
When not to use HeteroConv
- When cross-type attention matters. HeteroConv runs layers independently per type and only aggregates at the end. It cannot learn joint attention across types like HGTConv.
- Homogeneous graphs. No wrapper needed. Use the base layer directly.