What HANConv does
HANConv operates in two stages:
- Node-level attention: For each meta-path, apply GAT-style attention to aggregate neighbors reachable via that meta-path. Each meta-path produces one embedding per node.
- Semantic-level attention: Attend across the meta-path-specific embeddings to produce a final node representation. This learns which meta-paths are most informative for the task.
The result is a node embedding that captures information from multiple relationship patterns, weighted by their relevance.
The math (simplified)
```
# Stage 1: Node-level attention per meta-path P
z_i^P = Σ_j α_ij^P · W^P · h_j
    # j ranges over neighbors of i reachable via meta-path P
    # α_ij^P = GAT-style attention weight for the pair (i, j)

# Stage 2: Semantic-level attention across meta-paths
w_P  = (1/|V|) Σ_i q^T · tanh(W_sem · z_i^P + b)   # score averaged over nodes
β_P  = softmax_P(w_P)                              # softmax across meta-paths
h_i' = Σ_P β_P · z_i^P
```
Where:
- P = meta-path (e.g., Author-Paper-Author)
- β_P = learned importance of meta-path P
- q, W_sem, b = semantic attention parameters

The semantic attention weights (β) reveal which meta-paths the model relies on, providing interpretability that other heterogeneous layers lack.
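To make the semantic stage concrete, here is a toy β calculation for two meta-paths in plain Python; the meta-path names and scores are made up for illustration:

```python
import math

# Made-up averaged scores w_P = (1/|V|) Σ_i q^T tanh(W_sem z_i^P + b)
w = {'APA': 0.9, 'APVPA': 0.3}

# Softmax across meta-paths yields the importance weights β_P
denom = sum(math.exp(v) for v in w.values())
beta = {p: math.exp(v) / denom for p, v in w.items()}
# APA gets roughly 0.65 of the weight, so it dominates the final h_i'
print(beta)
```

Note that the softmax runs across meta-paths, not across nodes: every node shares the same β weights.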
PyG implementation
```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import HANConv

class HAN(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels,
                 metadata, heads=8):
        super().__init__()
        # metadata = (node_types, edge_types) from HeteroData
        self.conv1 = HANConv(in_channels, hidden_channels,
                             metadata=metadata, heads=heads)
        # out_channels must be divisible by heads; a single head on the
        # output layer avoids constraining num_classes
        self.conv2 = HANConv(hidden_channels, out_channels,
                             metadata=metadata, heads=1)

    def forward(self, x_dict, edge_index_dict):
        x_dict = self.conv1(x_dict, edge_index_dict)
        # HANConv returns None for node types with no incoming edges;
        # add reverse edges (e.g., T.ToUndirected) to avoid this
        x_dict = {k: F.elu(v) for k, v in x_dict.items()}
        x_dict = self.conv2(x_dict, edge_index_dict)
        return x_dict
```
```python
# Define heterogeneous data
import torch_geometric.transforms as T
from torch_geometric.data import HeteroData

data = HeteroData()
data['author'].x = author_features
data['paper'].x = paper_features
data['author', 'writes', 'paper'].edge_index = writes_edges
data['paper', 'cites', 'paper'].edge_index = cites_edges

# Add reverse edges so every node type receives messages
data = T.ToUndirected()(data)

model = HAN(in_channels=-1, hidden_channels=64,
            out_channels=num_classes, metadata=data.metadata())
```

HANConv in PyG works with HeteroData and treats each edge type in the metadata as a meta-path, running semantic attention across them. Composite meta-paths such as Author-Paper-Author are not discovered automatically; materialize them as extra edge types first, for example with T.AddMetaPaths.
When to use HANConv
- When interpretability matters. The semantic attention weights reveal which meta-paths drive predictions. In a healthcare graph, you can see whether Patient-Doctor-Hospital or Patient-Drug-Condition paths matter more.
- When you have domain knowledge about meta-paths. If you know which composite relationships are meaningful for your task, encoding them as meta-paths injects useful inductive bias.
- Academic heterogeneous networks. Paper-Author, Paper-Venue, and Author-Institution relationships have well-studied meta-paths that HANConv can exploit directly.
When not to use HANConv
- When you do not know the right meta-paths. Bad meta-path choices hurt performance. If you lack domain knowledge, use HGTConv, which learns type-dependent attention over all edge types without requiring predefined meta-paths.
- Large graphs with many types. The number of meta-paths grows combinatorially with types. For enterprise databases with 20+ table types, HGTConv or HeteroConv is more practical.