What is a hypergraph?

A hypergraph is a graph where edges (called hyperedges) can connect more than two nodes simultaneously. In a standard graph, an edge connects exactly two nodes. A hyperedge can connect 3, 10, or 100 nodes. This models group interactions: a meeting involves multiple people, a co-authored paper has multiple authors, a shopping cart contains multiple products.

Why can't regular graphs model group interactions?

Regular graphs decompose group interactions into pairwise edges, losing the group structure. A meeting of 4 people becomes 6 pairwise edges, indistinguishable from 4 people who each had separate one-on-one conversations. A hyperedge preserves the fact that all 4 participated in the same event.

How does message passing work on hypergraphs?

Hypergraph message passing has two phases: (1) nodes send messages to their hyperedges (aggregating member information), (2) hyperedges send messages back to their member nodes (distributing group information). This node-hyperedge-node propagation lets each node learn from all other members of its groups.

How do I represent a hypergraph in PyG?

Use a hyperedge index: a [2, N] tensor where row 0 is the node ID and row 1 is the hyperedge ID. Each (node, hyperedge) pair indicates membership. HypergraphConv in PyG handles this format natively. Alternatively, convert to a bipartite graph with node and hyperedge nodes.

What are common applications of hypergraphs?

Co-authorship networks (paper = hyperedge connecting authors), shopping baskets (cart = hyperedge connecting products), email threads (email = hyperedge connecting recipients), chemical reactions (reaction = hyperedge connecting molecules), and team sports (play = hyperedge connecting players).

Hypergraphs in GNNs: Edges Connecting Multiple Nodes | Kumo.ai

A hypergraph is a graph where edges can connect more than two nodes. In a standard graph, every edge connects exactly two nodes. In a hypergraph, an edge (called a hyperedge) can connect any number of nodes simultaneously. A co-authored paper connects 5 authors. A shopping cart connects 12 products. A meeting connects 8 attendees. These are natural hyperedges.

The limitation of standard graphs is that they decompose every group interaction into pairwise edges. A meeting of 4 people becomes 6 edges (every pair), which is indistinguishable from 4 people who each had separate bilateral conversations. A hyperedge preserves the group: all 4 were in the same room at the same time.

Pairwise vs group: what you lose

Consider a research collaboration network. Three researchers (A, B, C) co-author a paper. In a pairwise graph, this creates edges A-B, B-C, A-C. But this looks identical to a scenario where A and B wrote one paper, B and C wrote another, and A and C wrote a third. The group structure is lost.

A hyperedge {A, B, C} encodes the fact that all three collaborated simultaneously. This distinction matters: researchers who co-author in large groups behave differently from those who always work in pairs.

Hypergraph message passing

Message passing on hypergraphs has two phases per layer:

Node to hyperedge: each hyperedge aggregates features from all its member nodes. A meeting hyperedge collects information from all attendees.
Hyperedge to node: each node aggregates information from all hyperedges it belongs to. An attendee receives information from all their meetings.

hypergraph_conv.py

import torch
from torch_geometric.nn import HypergraphConv

# 10 nodes, each with 16 features
x = torch.randn(10, 16)

# Hyperedge index: (node_id, hyperedge_id)
# Hyperedge 0 connects nodes {0, 1, 2}
# Hyperedge 1 connects nodes {2, 3, 4, 5}
# Hyperedge 2 connects nodes {0, 5, 7, 8, 9}
hyperedge_index = torch.tensor([
    [0, 1, 2, 2, 3, 4, 5, 0, 5, 7, 8, 9],  # node IDs
    [0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2],  # hyperedge IDs
], dtype=torch.long)

conv = HypergraphConv(16, 32)
out = conv(x, hyperedge_index)
# out.shape: [10, 32]
# Node 2 received info from hyperedges 0 and 1
# Node 5 received info from hyperedges 1 and 2

HypergraphConv handles the two-phase propagation internally. Each node learns from all members of all its groups.

Enterprise example: market basket analysis

A grocery retailer models shopping baskets as a hypergraph. Each basket is a hyperedge connecting the products purchased together:

Basket 1 = {bread, butter, milk, eggs}
Basket 2 = {bread, cheese, wine}
Basket 3 = {milk, cereal, bananas, yogurt}

After hypergraph message passing, bread's embedding reflects that it co-occurs with dairy products (basket 1) and with wine/cheese (basket 2). These are different shopping contexts. The hypergraph captures both without conflating them, unlike pairwise co-occurrence which would merge all signals.

Applications: bundle recommendations (suggest products that complete a basket), store layout optimization (products in the same hyperedges should be near each other), and demand forecasting (when one product in a basket trends up, others may follow).

Converting to standard graphs

If you cannot use HypergraphConv, two conversion strategies:

Clique expansion: replace each hyperedge with all pairwise edges between its members. Simple but loses group structure and creates many edges.
Bipartite expansion: create explicit hyperedge nodes. Each member node connects to its hyperedge node. This preserves group structure exactly and works with standard GNN layers on a bipartite graph.

bipartite_expansion.py

# Convert hypergraph to bipartite graph
# Original: hyperedge 0 = {node_0, node_1, node_2}
# Bipartite: node_0 -- hedge_0, node_1 -- hedge_0, node_2 -- hedge_0

# Add hyperedge nodes with IDs offset by num_nodes
num_nodes = 10
num_hyperedges = 3

# Hyperedge nodes get initial features via aggregation
hedge_features = scatter_mean(x[hyperedge_index[0]], hyperedge_index[1])

# Build bipartite edge_index: member_nodes <-> hyperedge_nodes
bipartite_edges = torch.stack([
    hyperedge_index[0],                    # member node IDs
    hyperedge_index[1] + num_nodes,        # offset hyperedge IDs
])
# Now use standard GNN on the bipartite graph

Bipartite expansion preserves group structure exactly. Standard layers work on the resulting bipartite graph.

Key Takeaways

1A hypergraph allows edges (hyperedges) to connect any number of nodes. This models group interactions that pairwise edges destroy: co-authorship, shopping baskets, meetings, chemical reactions.
2Hypergraph message passing is two-phase: nodes aggregate to hyperedges, then hyperedges distribute back to nodes. Each node learns from all members of all its groups.
3HypergraphConv in PyG takes a hyperedge_index tensor mapping nodes to hyperedge IDs. It handles the two-phase propagation internally.
4If HypergraphConv is unavailable, convert to a bipartite graph with explicit hyperedge nodes. This preserves group structure and works with standard GNN layers.
5Enterprise applications include market basket analysis, multi-party transaction detection, team collaboration modeling, and drug combination analysis where group synergies matter.

Hypergraphs: Edges Connecting Multiple Nodes