HeteroConv: Build Heterogeneous GNNs from Any Layer

HeteroConv is PyG's Swiss Army knife for heterogeneous graphs. It wraps any message-passing layer and applies it per edge type, letting you build custom heterogeneous architectures from familiar building blocks like GCNConv, GATConv, and SAGEConv.

TL;DR

  • HeteroConv wraps any PyG layer and applies separate instances per edge type. You can mix GATConv for some edges and SAGEConv for others.
  • Messages from different edge types are aggregated (sum by default) at each target node. Each edge type has independent parameters.
  • More flexible than HGTConv (mix layer types) but without integrated cross-type attention. Choose based on whether you need flexibility or unified attention.
  • Use HeteroConv for rapid prototyping of heterogeneous architectures, or when different edge types genuinely need different treatment.

Original Paper

Generic wrapper for heterogeneous message passing

PyG Team (2021). PyTorch Geometric

What HeteroConv does

HeteroConv takes a dictionary mapping edge types to layer instances. For each edge type, it runs the corresponding layer independently. Then it aggregates all messages arriving at each node:

  1. For each edge type, apply its assigned layer to produce per-type messages
  2. For each target node, collect messages from all relevant edge types
  3. Aggregate across edge types (sum by default; mean, min, max, or concatenation are also available)
  4. Return updated node features per node type

PyG implementation

hetero_conv_model.py
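import torch
import torch.nn.functional as F
from torch_geometric.nn import HeteroConv, GATConv, SAGEConv, GCNConv

class HeteroGNN(torch.nn.Module):
    def __init__(self, hidden_channels, out_channels):
        super().__init__()
        # Different layer types per edge type. Bipartite edge types (source and
        # destination node types differ) take a tuple of input sizes, and GATConv
        # must not add self-loops there. -1 defers the input size to the first
        # forward pass (lazy initialization).
        self.conv1 = HeteroConv({
            ('user', 'purchases', 'product'):
                GATConv((-1, -1), hidden_channels, add_self_loops=False),
            ('user', 'reviews', 'product'): SAGEConv((-1, -1), hidden_channels),
            ('product', 'co_viewed', 'product'): GCNConv(-1, hidden_channels),
            # Assumed reverse edge type (e.g. added by T.ToUndirected()) so that
            # 'user' is also updated; HeteroConv only returns destination types.
            ('product', 'rev_purchases', 'user'): SAGEConv((-1, -1), hidden_channels),
        }, aggr='sum')

        # The last layer only predicts for 'product', so no reverse edge is needed.
        self.conv2 = HeteroConv({
            ('user', 'purchases', 'product'):
                GATConv(hidden_channels, out_channels, add_self_loops=False),
            ('user', 'reviews', 'product'): SAGEConv(hidden_channels, out_channels),
            ('product', 'co_viewed', 'product'): GCNConv(hidden_channels, out_channels),
        }, aggr='sum')

    def forward(self, x_dict, edge_index_dict):
        # Layer 1 updates both 'user' and 'product' features.
        x_dict = self.conv1(x_dict, edge_index_dict)
        x_dict = {k: F.relu(v) for k, v in x_dict.items()}
        # Layer 2 returns only 'product', the sole destination node type.
        x_dict = self.conv2(x_dict, edge_index_dict)
        return x_dict

# Usage with HeteroData. `data` (a HeteroData object containing the edge types
# above, including the reverse edge type) and `num_classes` are assumed to be
# defined elsewhere. The first call also initializes the lazy (-1) parameters.
model = HeteroGNN(hidden_channels=64, out_channels=num_classes)
out_dict = model(data.x_dict, data.edge_index_dict)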

Each edge type gets its own layer instance. GATConv for purchases (attention matters), SAGEConv for reviews (scalability matters), GCNConv for co-views (simple baseline).

When to use HeteroConv

  • Rapid prototyping. Quickly test heterogeneous architectures by mixing layers you already know. No need to learn HGTConv's API.
  • Different edge types need different treatment. If purchase edges benefit from attention but co-view edges do not, HeteroConv lets you assign GATConv to one and GCNConv to the other.
  • Graphs with many edge types. HeteroConv's compute and parameter count grow linearly with the number of edge types, so choosing a lightweight base layer per type keeps the whole model manageable. Integrated attention layers such as HGTConv can become more expensive as the number of types grows.
  • When you want maximum control. Choose layer types, hidden dimensions, and hyperparameters independently per edge type.

When not to use HeteroConv

  • When cross-type attention matters. HeteroConv runs layers independently per type and only aggregates at the end. It cannot learn joint attention across types like HGTConv.
  • Homogeneous graphs. No wrapper needed. Use the base layer directly.

Frequently asked questions

What is HeteroConv in PyTorch Geometric?

HeteroConv is a generic wrapper that applies any PyG message-passing layer independently per edge type, then aggregates the results per node. It lets you reuse existing homogeneous layers (GCNConv, GATConv, SAGEConv) on heterogeneous graphs without writing custom heterogeneous code.

How does HeteroConv differ from HGTConv?

HGTConv is a single integrated heterogeneous layer with type-specific attention. HeteroConv is a wrapper that applies separate instances of any base layer per edge type. HeteroConv is more flexible (mix different layer types per edge) but does not have cross-type attention like HGTConv.

Can I use different layer types per edge type with HeteroConv?

Yes. For example, you can use GATConv for 'purchases' edges (where attention matters), SAGEConv for 'reviews' edges (where scalability matters), and GCNConv for 'co_viewed' edges (where simplicity suffices). Each edge type gets its own layer instance.

How does HeteroConv aggregate messages from different edge types?

By default, HeteroConv sums the outputs of the per-edge-type layers at each destination node. You can change this with the aggr parameter, which accepts "sum", "mean", "min", "max", "cat", or None. Note that "cat" concatenates the per-edge-type outputs, so the output dimension grows with the number of edge types reaching a node type.

When should I use HeteroConv vs to_hetero()?

PyG's to_hetero() automatically converts a homogeneous model into a heterogeneous one by duplicating its message-passing layers for each edge type. HeteroConv gives you manual control: different layer types, different hidden dimensions, or different parameters per edge type. Use to_hetero() when the same layer type suits every edge type, and HeteroConv when you need fine-grained control.
