A bipartite graph is a graph whose nodes split into two disjoint sets, with every edge connecting a node from one set to a node from the other. Users and products form a bipartite graph: users buy products, but users do not buy other users, and products do not buy other products. Every edge crosses the boundary between the two sets.
This structure appears everywhere in enterprise systems: customers and products, job seekers and job listings, patients and medications, students and courses, advertisers and publishers. Whenever your data has two fundamentally different entity types connected by interactions, you have a bipartite graph.
Why bipartite structure matters
When you apply message passing to a bipartite graph, information flows in a specific pattern:
- Layer 1: user nodes receive messages from their purchased items. Item nodes receive messages from their buyers.
- Layer 2: user nodes now receive messages from items, which already absorbed information from other buyers. Users who share items get similar representations.
This is collaborative filtering implemented as message passing. Users who bought similar items naturally end up with similar embeddings, and items bought by similar users naturally end up with similar embeddings. The GNN learns this structure automatically.
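The two-layer flow above can be sketched with plain tensor operations, no GNN library required. A minimal illustration on a made-up toy graph (three users, two items), using mean aggregation:

```python
import torch

# Toy bipartite graph: users 0 and 1 both bought item 0; user 2 bought item 1.
x_users = torch.eye(3)          # one-hot user features

# Dense user-item adjacency for clarity
A = torch.tensor([[1., 0.],     # user 0 -> item 0
                  [1., 0.],     # user 1 -> item 0
                  [0., 1.]])    # user 2 -> item 1

def mean_agg(adj, x):
    # Mean of neighbor features (each row of adj selects one node's neighbors)
    deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
    return (adj @ x) / deg

# Layer 1: items absorb their buyers' features
h_items = mean_agg(A.t(), x_users)   # [2, 3]
# Layer 2: users absorb item representations
h_users = mean_agg(A, h_items)       # [3, 3]

# Users 0 and 1 (co-buyers of item 0) end up with identical embeddings;
# user 2, who shares no items with them, does not.
print(torch.allclose(h_users[0], h_users[1]))  # True
print(torch.allclose(h_users[0], h_users[2]))  # False
```

A real GNN layer adds learned weights and nonlinearities around this aggregation, but the collaborative-filtering effect comes from the two-hop structure itself.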
Bipartite graphs in PyG
PyG supports bipartite message passing natively. Pass a tuple of feature matrices instead of a single tensor:
import torch
from torch_geometric.nn import SAGEConv
# 100 users with 16 features, 500 items with 32 features
x_users = torch.randn(100, 16)
x_items = torch.randn(500, 32)
# Bipartite edges: user -> item (purchases)
edge_index = torch.stack([
    torch.randint(0, 100, (2000,)),  # row 0: source user indices
    torch.randint(0, 500, (2000,)),  # row 1: destination item indices
])
# SAGEConv accepts bipartite input as (x_src, x_dst) tuple
conv = SAGEConv((16, 32), 64) # (src_dim, dst_dim) -> out_dim
# Messages flow from users to items
out_items = conv((x_users, x_items), edge_index)
# out_items.shape: [500, 64]
# Reverse edges for items -> users message passing
conv_rev = SAGEConv((32, 16), 64)
out_users = conv_rev((x_items, x_users), edge_index.flip(0))
Bipartite message passing: source and destination have different feature dimensions. Flip edges for the reverse direction.
Enterprise example: product recommendations
Consider an e-commerce platform with 10 million users and 500,000 products. The bipartite graph connects users to products they purchased, clicked, or added to cart:
- User features: demographics, account age, average spend
- Item features: price, category embedding, rating, inventory
- Edge features: interaction type (purchase/click/cart), timestamp, quantity
After 2 layers of bipartite message passing, each user's embedding encodes not just their own features but the characteristics of their purchased items and, transitively, the behavior of users who bought similar items. This is strictly more powerful than matrix factorization because it incorporates node features and multi-hop collaborative signal.
from torch_geometric.data import HeteroData
from torch_geometric.nn import SAGEConv, to_hetero
import torch
class BipartiteGNN(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # (-1, -1) lazily infers the source and destination input dims,
        # which differ here (users: 16 features, products: 32)
        self.conv1 = SAGEConv((-1, -1), 128)
        self.conv2 = SAGEConv((-1, -1), 64)

    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index).relu()
        return self.conv2(x, edge_index)
# Build bipartite HeteroData
data = HeteroData()
data['user'].x = torch.randn(10000, 16)
data['product'].x = torch.randn(5000, 32)
data['user', 'purchased', 'product'].edge_index = ...
# Add reverse edges so user nodes also receive messages;
# to_hetero requires every node type's representation to get updated
import torch_geometric.transforms as T
data = T.ToUndirected()(data)
# Convert to heterogeneous model
model = BipartiteGNN()
model = to_hetero(model, data.metadata(), aggr='sum')
# Output: user and product embeddings in shared 64-dim space
out = model(data.x_dict, data.edge_index_dict)
# Score: dot product between user and product embeddings
A bipartite recommendation model using HeteroData. User and product embeddings land in the same space for scoring.
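Once user and product embeddings share a space, scoring is a dot product and recommendation is a top-k over products. A sketch of the serving step, with random stand-in tensors in place of the model output and a hypothetical `recommend` helper:

```python
import torch

# Stand-ins for out['user'] and out['product'] from a trained model
user_emb = torch.randn(10000, 64)
prod_emb = torch.randn(5000, 64)

def recommend(user_id, user_emb, prod_emb, k=5):
    # Score every product for one user with a dot product,
    # then return the indices of the k highest-scoring products
    scores = prod_emb @ user_emb[user_id]   # [num_products]
    return scores.topk(k).indices

top5 = recommend(42, user_emb, prod_emb)
print(top5.shape)  # torch.Size([5])
```

For training, the same dot product typically feeds a ranking loss (e.g. BPR or binary cross-entropy against negative samples) on observed versus unobserved edges.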
Scaling bipartite GNNs
Bipartite graphs in enterprise settings are often extremely large (millions of users, millions of items, billions of edges). Key scaling strategies:
- Neighbor sampling: sample a fixed number of neighbors per node instead of using the full neighborhood. PyG's NeighborLoader handles this.
- Two-tower architecture: encode users and items independently, then score with a dot product. This allows precomputing all item embeddings offline and serving recommendations with approximate nearest neighbor search.
- Mini-batch training: sample subgraphs containing target edges and their local neighborhoods. Each batch trains on a small bipartite subgraph.