What HANConv does
HANConv operates in two stages:
- Node-level attention: For each meta-path, apply GAT-style attention to aggregate neighbors reachable via that meta-path. Each meta-path produces one embedding per node.
- Semantic-level attention: Attend across the meta-path-specific embeddings to produce a final node representation. This learns which meta-paths are most informative for the task.
The result is a node embedding that captures information from multiple relationship patterns, weighted by their relevance.
The math (simplified)
```
# Stage 1: Node-level attention per meta-path P
z_i^P = Σ_j α_ij^P · W^P · h_j
    # j ranges over neighbors of i reachable via meta-path P
    # α_ij^P = GAT-style attention weight for the pair (i, j)

# Stage 2: Semantic-level attention across meta-paths
w_P  = (1/|V|) Σ_i q^T · tanh(W_sem · z_i^P + b)   # score averaged over nodes
β_P  = softmax_P(w_P)                              # softmax across meta-paths
h_i' = Σ_P β_P · z_i^P
```
Where:
- P = meta-path (e.g., Author-Paper-Author)
- β_P = learned importance of meta-path P
- q, W_sem, b = semantic attention parameters

The semantic attention weights (β) reveal which meta-paths the model relies on, providing interpretability that other heterogeneous layers lack.
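To make the semantic stage concrete, here is a toy β calculation for two meta-paths in plain Python; the meta-path names and scores are made up for illustration:

```python
import math

# Made-up averaged scores w_P = (1/|V|) Σ_i q^T tanh(W_sem z_i^P + b)
w = {'APA': 0.9, 'APVPA': 0.3}

# Softmax across meta-paths yields the importance weights β_P
denom = sum(math.exp(v) for v in w.values())
beta = {p: math.exp(v) / denom for p, v in w.items()}
# APA gets roughly 0.65 of the weight, so it dominates the final h_i'
print(beta)
```

Note that the softmax runs across meta-paths, not across nodes: every node shares the same β weights.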
PyG implementation
```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import HANConv

class HAN(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels,
                 metadata, heads=8):
        super().__init__()
        # metadata = (node_types, edge_types) from HeteroData
        self.conv1 = HANConv(in_channels, hidden_channels,
                             metadata=metadata, heads=heads)
        # out_channels must be divisible by heads; a single head on the
        # output layer avoids constraining num_classes
        self.conv2 = HANConv(hidden_channels, out_channels,
                             metadata=metadata, heads=1)

    def forward(self, x_dict, edge_index_dict):
        x_dict = self.conv1(x_dict, edge_index_dict)
        # HANConv returns None for node types with no incoming edges;
        # add reverse edges (e.g., T.ToUndirected) to avoid this
        x_dict = {k: F.elu(v) for k, v in x_dict.items()}
        x_dict = self.conv2(x_dict, edge_index_dict)
        return x_dict
```
```python
# Define heterogeneous data
import torch_geometric.transforms as T
from torch_geometric.data import HeteroData

data = HeteroData()
data['author'].x = author_features
data['paper'].x = paper_features
data['author', 'writes', 'paper'].edge_index = writes_edges
data['paper', 'cites', 'paper'].edge_index = cites_edges

# Add reverse edges so every node type receives messages
data = T.ToUndirected()(data)

model = HAN(in_channels=-1, hidden_channels=64,
            out_channels=num_classes, metadata=data.metadata())
```

HANConv in PyG works with HeteroData and treats each edge type in the metadata as a meta-path, running semantic attention across them. Composite meta-paths such as Author-Paper-Author are not discovered automatically; materialize them as extra edge types first, for example with T.AddMetaPaths.
When to use HANConv
- When interpretability matters. The semantic attention weights reveal which meta-paths drive predictions. In a healthcare graph, you can see whether Patient-Doctor-Hospital or Patient-Drug-Condition paths matter more.
- When you have domain knowledge about meta-paths. If you know which composite relationships are meaningful for your task, encoding them as meta-paths injects useful inductive bias.
- Academic heterogeneous networks. Paper-Author, Paper-Venue, and Author-Institution relationships have well-studied meta-paths that HANConv can exploit directly.
When not to use HANConv
- When you do not know the right meta-paths. Bad meta-path choices hurt performance. If you lack domain knowledge, use HGTConv, which learns type-dependent attention over all edge types without requiring predefined meta-paths.
- Large graphs with many types. The number of meta-paths grows combinatorially with types. For enterprise databases with 20+ table types, HGTConv or HeteroConv is more practical.