
HANConv: Meta-Path Attention for Heterogeneous Graphs

HANConv introduces hierarchical attention over meta-paths: first attend to neighbors within each meta-path, then attend across meta-paths to learn which composite relationships matter most. It is the interpretable approach to heterogeneous graph learning.

PyTorch Geometric

TL;DR

  • HANConv uses two-level attention: node-level (which neighbors within a meta-path?) and semantic-level (which meta-paths are important?). This provides built-in interpretability.
  • Meta-paths define composite relationships: Author-Paper-Author captures co-authorship, Author-Paper-Venue-Paper-Author captures same-venue connections.
  • Requires manual meta-path definition, which is both a strength (domain knowledge injection) and a weakness (requires expertise).
  • More interpretable than HGTConv but less flexible. Use HANConv when you want to understand which relationship patterns drive predictions.

Original Paper

Heterogeneous Graph Attention Network

Wang et al. (2019). WWW 2019

Read paper →

What HANConv does

HANConv operates in two stages:

  1. Node-level attention: For each meta-path, apply GAT-style attention to aggregate neighbors reachable via that meta-path. Each meta-path produces one embedding per node.
  2. Semantic-level attention: Attend across the meta-path-specific embeddings to produce a final node representation. This learns which meta-paths are most informative for the task.

The result is a node embedding that captures information from multiple relationship patterns, weighted by their relevance.

The math (simplified)

HANConv formula
# Stage 1: Node-level attention per meta-path P
For each meta-path P:
  z_i^P = Σ_j alpha_ij^P · W^P · h_j
  where j are neighbors reachable via meta-path P
  alpha_ij^P = attention weight (GAT-style)

# Stage 2: Semantic-level attention across meta-paths
w_P    = (1/N) · Σ_i q^T · tanh(W_sem · z_i^P + b)   # averaged over all N nodes
beta_P = softmax_P( w_P )
h_i'   = Σ_P beta_P · z_i^P

Where:
  P           = meta-path (e.g., Author-Paper-Author)
  beta_P      = importance of meta-path P (learned, shared across nodes)
  q, W_sem, b = semantic attention parameters

The semantic attention weights (beta) reveal which meta-paths the model relies on, providing interpretability that other heterogeneous layers lack.
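The semantic-level step is small enough to sketch directly. Here is a minimal NumPy version of Stage 2; the parameter names (`W_sem`, `b`, `q`) and all dimensions are illustrative, not PyG's internals:

```python
import numpy as np

def semantic_attention(z, W_sem, b, q):
    """Fuse per-meta-path embeddings z of shape (P, N, d).

    Returns fused embeddings (N, d) and meta-path weights beta (P,).
    """
    # Per-node importance scores, averaged over all nodes as in the paper.
    s = np.tanh(z @ W_sem + b) @ q           # (P, N)
    w = s.mean(axis=1)                       # (P,)
    beta = np.exp(w - w.max())
    beta /= beta.sum()                       # softmax over meta-paths
    # Weighted sum of the meta-path-specific embeddings.
    return np.tensordot(beta, z, axes=1), beta

rng = np.random.default_rng(0)
P, N, d = 2, 5, 8                            # 2 meta-paths, 5 nodes, dim 8
z = rng.normal(size=(P, N, d))
h, beta = semantic_attention(z, rng.normal(size=(d, d)),
                             rng.normal(size=d), rng.normal(size=d))
```

Because `beta` is a softmax over meta-paths, it is a valid importance distribution: the entries are positive and sum to 1.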

PyG implementation

han_model.py
import torch
import torch.nn.functional as F
from torch_geometric.nn import HANConv

class HAN(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels,
                 metadata, heads=8):
        super().__init__()
        # metadata = (node_types, edge_types) from HeteroData
        self.conv1 = HANConv(in_channels, hidden_channels,
                             metadata=metadata, heads=heads)
        self.conv2 = HANConv(hidden_channels, hidden_channels,
                             metadata=metadata, heads=heads)
        # HANConv splits its output dimension across attention heads, so it
        # must stay divisible by `heads`; project to the class count last.
        self.lin = torch.nn.Linear(hidden_channels, out_channels)

    def forward(self, x_dict, edge_index_dict):
        x_dict = self.conv1(x_dict, edge_index_dict)
        x_dict = {k: F.elu(v) for k, v in x_dict.items()}
        x_dict = self.conv2(x_dict, edge_index_dict)
        return {k: self.lin(v) for k, v in x_dict.items()}

# Define heterogeneous data with meta-paths
import torch_geometric.transforms as T
from torch_geometric.data import HeteroData
data = HeteroData()
data['author'].x = author_features
data['paper'].x = paper_features
data['author', 'writes', 'paper'].edge_index = writes_edges
data['paper', 'cites', 'paper'].edge_index = cites_edges

# Add reverse edges so every node type receives messages
# (otherwise 'author' has no incoming edge types and its output is None).
data = T.ToUndirected()(data)

model = HAN(in_channels=-1, hidden_channels=64,
            out_channels=num_classes, metadata=data.metadata())

HANConv in PyG works with HeteroData and automatically handles the meta-path aggregation based on the edge types present in the data.

When to use HANConv

  • When interpretability matters. The semantic attention weights reveal which meta-paths drive predictions. In a healthcare graph, you can see whether Patient-Doctor-Hospital or Patient-Drug-Condition paths matter more.
  • When you have domain knowledge about meta-paths. If you know which composite relationships are meaningful for your task, encoding them as meta-paths injects useful inductive bias.
  • Academic heterogeneous networks. Paper-Author, Paper-Venue, and Author-Institution relationships have well-studied meta-paths that HANConv can exploit directly.

When not to use HANConv

  • When you do not know the right meta-paths. Bad meta-path choices hurt performance. If you lack domain knowledge, use HGTConv which discovers patterns automatically.
  • Large graphs with many types. The number of meta-paths grows combinatorially with types. For enterprise databases with 20+ table types, HGTConv or HeteroConv is more practical.

Frequently asked questions

What is HANConv in PyTorch Geometric?

HANConv implements the Heterogeneous Graph Attention Network (HAN) from Wang et al. (2019). It uses two levels of attention for heterogeneous graphs: node-level attention (within each meta-path) and semantic-level attention (across meta-paths). This learns which meta-paths are most informative for the task.

What is a meta-path?

A meta-path is a sequence of node types and edge types that defines a composite relationship. For example, Author-Paper-Author means two authors who co-authored a paper. Author-Paper-Venue-Paper-Author means two authors who published in the same venue. Different meta-paths capture different semantics.
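A meta-path's neighborhood can be materialized by chaining adjacency matrices. A NumPy sketch for Author-Paper-Author, using a tiny hand-written "writes" matrix purely for illustration:

```python
import numpy as np

# Rows: 3 authors; columns: 2 papers. A_ap[i, j] = 1 if author i wrote paper j.
A_ap = np.array([[1, 0],
                 [1, 1],
                 [0, 1]])

# Author-Paper-Author: compose "writes" with its transpose ("written by").
apa = (A_ap @ A_ap.T) > 0
np.fill_diagonal(apa, False)  # drop self-connections
# Authors 0-1 co-authored paper 0 and authors 1-2 co-authored paper 1,
# so they are meta-path neighbors; authors 0 and 2 are not.
print(apa.astype(int))
```

Longer meta-paths compose the same way (e.g., Author-Paper-Venue-Paper-Author chains four adjacency matrices), which is also why the candidate set grows quickly with the number of types.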

How does HANConv differ from HGTConv?

HANConv requires pre-defined meta-paths and learns to weight them. HGTConv works directly on the heterogeneous graph without meta-paths, using type-specific attention. HANConv is more interpretable (you see which meta-paths matter). HGTConv is more flexible (no meta-path engineering required).

When should I use HANConv vs HGTConv?

Use HANConv when you have domain knowledge about meaningful meta-paths and want interpretable results. Use HGTConv when you want the model to discover relevant patterns automatically without manual meta-path design. For most practical purposes, HGTConv is more convenient.

How many meta-paths should I define for HANConv?

Typically 2-5 meta-paths that capture meaningfully different relationships. For an academic graph: Author-Paper-Author (co-authorship), Author-Paper-Venue-Paper-Author (same venue). More meta-paths add expressiveness but also complexity. The semantic-level attention will down-weight unhelpful ones.

Learn more about graph ML

PyTorch Geometric is the open-source foundation for graph neural networks. Explore more layers, concepts, and production patterns.