HeteroConv: Generic Heterogeneous Message Passing | PyG Guide

Original Paper

Generic wrapper for heterogeneous message passing

PyG Team (2021). PyTorch Geometric

What HeteroConv does

HeteroConv takes a dictionary mapping edge types to layer instances. For each edge type, it runs the corresponding layer independently. Then it aggregates all messages arriving at each node:

For each edge type, apply its assigned layer to produce per-type messages
For each target node, collect messages from all relevant edge types
Aggregate across edge types (sum by default; mean, min, max, or concatenation are also available)
Return updated node features per node type
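
The four steps above can be sketched without any PyG dependency. This is a minimal illustration of the control flow, not HeteroConv's actual implementation: features are single floats instead of tensors, and `make_toy_layer` and `hetero_conv_sketch` are hypothetical names introduced only for this example.

```python
from collections import defaultdict

def make_toy_layer(weight):
    """Hypothetical stand-in for a conv layer: weighted sum of source features."""
    def layer(x_src, num_dst, edges):
        out = [0.0] * num_dst
        for s, d in edges:  # each edge sends weight * x_src[s] to target node d
            out[d] += weight * x_src[s]
        return out
    return layer

def hetero_conv_sketch(layers, x_dict, edge_dict):
    per_dst = defaultdict(list)
    for (src, rel, dst), layer in layers.items():
        if (src, rel, dst) not in edge_dict:
            continue
        # Step 1: run this edge type's layer on its own edges only.
        out = layer(x_dict[src], len(x_dict[dst]), edge_dict[(src, rel, dst)])
        # Step 2: collect the output under the target node type.
        per_dst[dst].append(out)
    # Steps 3-4: sum across edge types and return one feature list per node type.
    return {dst: [sum(vals) for vals in zip(*outs)] for dst, outs in per_dst.items()}

x_dict = {"user": [1.0, 2.0], "product": [10.0, 20.0, 30.0]}
edge_dict = {
    ("user", "purchases", "product"): [(0, 0), (1, 2)],
    ("product", "co_viewed", "product"): [(0, 1), (2, 1)],
}
layers = {
    ("user", "purchases", "product"): make_toy_layer(1.0),
    ("product", "co_viewed", "product"): make_toy_layer(0.5),
}
out_dict = hetero_conv_sketch(layers, x_dict, edge_dict)
print(out_dict)  # {'product': [1.0, 20.0, 2.0]}
```

Only 'product' appears in the result, because both toy edge types point at product nodes. A node type that is never a target receives no output.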

PyG implementation

hetero_conv_model.py

import torch
import torch.nn.functional as F
from torch_geometric.nn import HeteroConv, GATConv, SAGEConv, GCNConv

class HeteroGNN(torch.nn.Module):
    def __init__(self, hidden_channels, out_channels):
        super().__init__()
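        # Different layer types per edge type. Bipartite edge types
        # (source type != target type) take a tuple of input sizes, and
        # GATConv needs add_self_loops=False on them.
        self.conv1 = HeteroConv({
            ('user', 'purchases', 'product'): GATConv((-1, -1), hidden_channels, add_self_loops=False),
            ('user', 'reviews', 'product'): SAGEConv((-1, -1), hidden_channels),
            ('product', 'co_viewed', 'product'): GCNConv(-1, hidden_channels),
            # Reverse edge type (e.g. added by T.ToUndirected()) so that 'user'
            # nodes also receive messages and remain in x_dict for conv2.
            ('product', 'rev_purchases', 'user'): SAGEConv((-1, -1), hidden_channels),
        }, aggr='sum')

        h = hidden_channels
        self.conv2 = HeteroConv({
            ('user', 'purchases', 'product'): GATConv((h, h), out_channels, add_self_loops=False),
            ('user', 'reviews', 'product'): SAGEConv((h, h), out_channels),
            ('product', 'co_viewed', 'product'): GCNConv(h, out_channels),
            ('product', 'rev_purchases', 'user'): SAGEConv((h, h), out_channels),
        }, aggr='sum')

    def forward(self, x_dict, edge_index_dict):
        # HeteroConv returns features only for node types that are the
        # target of at least one edge type.
        x_dict = self.conv1(x_dict, edge_index_dict)
        x_dict = {k: F.relu(v) for k, v in x_dict.items()}
        x_dict = self.conv2(x_dict, edge_index_dict)
        return x_dict

# Usage with HeteroData. `data` is a HeteroData object containing the edge
# types above and `num_classes` is defined elsewhere. Lazy (-1) input sizes
# are resolved on the first forward pass.
model = HeteroGNN(hidden_channels=64, out_channels=num_classes)
out_dict = model(data.x_dict, data.edge_index_dict)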

Each edge type gets its own layer instance. GATConv for purchases (attention matters), SAGEConv for reviews (scalability matters), GCNConv for co-views (simple baseline).
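
One thing worth checking when assembling the edge-type dictionary: HeteroConv returns features only for node types that are the destination of at least one edge type. The following dependency-free helper is a small sketch for flagging node types that would drop out of the output dictionary. The name `node_types_without_incoming` is hypothetical and not part of PyG, and `rev_purchases` is an illustrative relation name following the `rev_` pattern that T.ToUndirected() uses.

```python
def node_types_without_incoming(edge_types):
    """Return node types that are a source somewhere but never a destination."""
    sources = {src for src, _, _ in edge_types}
    destinations = {dst for _, _, dst in edge_types}
    return sorted(sources - destinations)

edge_types = [
    ("user", "purchases", "product"),
    ("user", "reviews", "product"),
    ("product", "co_viewed", "product"),
]
missing = node_types_without_incoming(edge_types)
print(missing)  # ['user']

# Adding a reverse relation gives 'user' nodes incoming messages.
with_reverse = edge_types + [("product", "rev_purchases", "user")]
missing_with_reverse = node_types_without_incoming(with_reverse)
print(missing_with_reverse)  # []
```

If a node type shows up in the first list, a second stacked HeteroConv layer has no features for it, so its outgoing edge types cannot be processed.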

When to use HeteroConv

Rapid prototyping. Quickly test heterogeneous architectures by mixing layers you already know. No need to learn HGTConv's API.
Different edge types need different treatment. If purchase edges benefit from attention but co-view edges do not, HeteroConv lets you assign GATConv to one and GCNConv to the other.
Graphs with many edge types. HeteroConv scales linearly with the number of edge types since each type uses a lightweight base layer. HGTConv's cross-type attention becomes expensive with many types.
When you want maximum control. Choose layer types, hidden dimensions, and hyperparameters independently per edge type.

When not to use HeteroConv

When cross-type attention matters. HeteroConv runs layers independently per type and only aggregates at the end. It cannot learn joint attention across types like HGTConv.
Homogeneous graphs. No wrapper needed. Use the base layer directly.

HeteroConv vs to_hetero(). PyG also offers torch_geometric.nn.to_hetero(model, metadata), which automatically converts a homogeneous model into a heterogeneous one. Use to_hetero() for quick prototyping with a uniform architecture. Use HeteroConv when you need different layer types or configurations per edge type.

Frequently asked questions

What is HeteroConv in PyTorch Geometric?

HeteroConv is a generic wrapper that applies any PyG message-passing layer independently per edge type, then aggregates the results per node. It lets you reuse existing homogeneous layers (GCNConv, GATConv, SAGEConv) on heterogeneous graphs without writing custom heterogeneous code.

How does HeteroConv differ from HGTConv?

HGTConv is a single integrated heterogeneous layer with type-specific attention. HeteroConv is a wrapper that applies separate instances of any base layer per edge type. HeteroConv is more flexible (mix different layer types per edge) but does not have cross-type attention like HGTConv.

Can I use different layer types per edge type with HeteroConv?

Yes. For example, you can use GATConv for 'purchases' edges (where attention matters), SAGEConv for 'reviews' edges (where scalability matters), and GCNConv for 'co_viewed' edges (where simplicity suffices). Each edge type gets its own layer instance.

How does HeteroConv aggregate messages from different edge types?

By default, HeteroConv sums the results from all edge types arriving at each node. You can switch to "mean", "min", "max", or "cat" (concatenation) by setting the aggr parameter. The aggregation combines the outputs of the per-edge-type layers, not the raw messages inside each layer.
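
To make the aggregation modes concrete, here is a small dependency-free sketch of how per-edge-type outputs for one node type could be combined. It is an illustration under stated assumptions rather than PyG's code: outputs are nested lists instead of tensors, and `combine_outputs` and the toy numbers are made up for this example.

```python
def combine_outputs(outputs, aggr="sum"):
    """Combine per-edge-type outputs; each output holds one feature row per node."""
    reducers = {
        "sum": sum,
        "mean": lambda c: sum(c) / len(c),
        "min": min,
        "max": max,
    }
    combined = []
    for rows in zip(*outputs):  # rows: this node's feature row from each edge type
        if aggr == "cat":
            # Concatenation keeps every edge type's features side by side.
            combined.append([v for row in rows for v in row])
        elif aggr in reducers:
            # Reduce each feature dimension across edge types.
            combined.append([reducers[aggr](col) for col in zip(*rows)])
        else:
            raise ValueError(f"unsupported aggr: {aggr}")
    return combined

# Two product nodes with two features each, from two edge types.
from_purchases = [[1.0, 2.0], [3.0, 4.0]]
from_co_viewed = [[5.0, 0.0], [1.0, 8.0]]
outputs = [from_purchases, from_co_viewed]

print(combine_outputs(outputs, "sum"))   # [[6.0, 2.0], [4.0, 12.0]]
print(combine_outputs(outputs, "mean"))  # [[3.0, 1.0], [2.0, 6.0]]
print(combine_outputs(outputs, "max"))   # [[5.0, 2.0], [3.0, 8.0]]
print(combine_outputs(outputs, "cat"))   # [[1.0, 2.0, 5.0, 0.0], [3.0, 4.0, 1.0, 8.0]]
```

Note that concatenation changes the feature width (here from 2 to 4), so any layer that follows has to expect the larger input size.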

When should I use HeteroConv vs to_hetero()?

PyG's to_hetero() automatically converts a homogeneous model to heterogeneous by replicating layers per type. HeteroConv gives you manual control: different layer types, different hidden dimensions, or different parameters per edge type. Use to_hetero() for quick prototyping, HeteroConv for fine-grained control.
