
Co-Purchase Graphs: Products Linked by Being Bought Together

When customers who buy a camera also buy a memory card and a case, those products form a triangle in the co-purchase graph. GNNs learn these product relationships to power recommendations, bundle pricing, and demand forecasting.


TL;DR

  • Co-purchase graphs connect products frequently bought together by the same customers, with edge weights reflecting co-purchase frequency. The graph captures complementarity (camera + case) and substitutability (iPhone vs Samsung).
  • GNN representations embed complementary products nearby and substitutes in the same region. 2-hop paths reveal indirect relationships: camera buyers buy memory cards, memory card buyers buy card readers.
  • The ogbn-products dataset (2.4M products, 61M edges), built from Amazon co-purchase data, is the standard large-scale benchmark: 47 product categories, node classification task.
  • Applications: "frequently bought together" recommendations, bundle pricing optimization, demand forecasting (correlated demand through graph edges), and category/taxonomy discovery.
  • Combine co-purchase with co-view, co-review, and shared-category edges for a rich heterogeneous product graph. Each edge type carries different signal strength.

A co-purchase graph connects products that are frequently bought together by the same customers. If 30% of customers who buy a camera also buy a memory card within the same session, an edge connects those two products, weighted by the co-purchase frequency. The resulting graph captures product complementarity (items bought together) and, through cluster structure, substitutability (items that compete for the same need).

Construction

Building a co-purchase graph from transaction data:

  1. Define co-purchase window: Products bought by the same customer within a time window (same session, same day, or same week) are considered co-purchased.
  2. Count co-occurrences: For each product pair, count how many customers bought both within the window.
  3. Filter and normalize: Remove low-count edges (noise from random co-occurrence). Normalize by product popularity to avoid popular items dominating.
  4. Add node features: Product attributes (price, category, brand, description embedding, rating, review count).
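The four steps above can be sketched in plain Python. The toy `baskets`, the `MIN_COUNT` threshold, and the min-popularity normalization are illustrative choices, not the only reasonable ones (PMI or lift normalization are common alternatives):

```python
from collections import Counter
from itertools import combinations

# Toy transactions: each basket is one customer's co-purchase window
# (hypothetical data for illustration).
baskets = [
    {"camera", "memory_card", "case"},
    {"camera", "memory_card"},
    {"camera", "tripod"},
    {"memory_card", "card_reader"},
]

pair_counts = Counter()
item_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    for a, b in combinations(sorted(basket), 2):
        pair_counts[(a, b)] += 1  # step 2: count co-occurrences

MIN_COUNT = 2  # step 3: drop edges from random co-occurrence
edges = {}
for (a, b), c in pair_counts.items():
    if c < MIN_COUNT:
        continue
    # Normalize by the less popular item so popular products don't dominate.
    edges[(a, b)] = c / min(item_counts[a], item_counts[b])
```

Here only the camera–memory card pair survives the count filter; node features (step 4) would then be attached per product before training.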

What GNNs learn from co-purchase structure

1-hop: Direct complements

A camera node's immediate neighbors are memory cards, camera bags, tripods, and lens filters. After one layer of message passing, the camera's representation encodes “I am typically bought with photography accessories.”

2-hop: Indirect relationships

Camera → memory card → card reader. The camera does not directly co-occur with card readers, but through the memory card connection, the GNN learns this indirect relationship. Two layers of message passing let the camera representation encode this second-order pattern.
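A minimal sketch of this two-hop propagation, using plain mean aggregation over a hand-written adjacency list (the products are invented, and the unweighted `mean_aggregate` stands in for a real GNN layer with learned parameters):

```python
# Toy co-purchase graph as adjacency lists (hypothetical products).
adj = {
    "camera": ["memory_card", "case"],
    "memory_card": ["camera", "card_reader"],
    "case": ["camera"],
    "card_reader": ["memory_card"],
}
# Each product starts with a one-hot feature: it only knows itself.
products = list(adj)
feats = {p: [1.0 if p == q else 0.0 for q in products] for p in products}

def mean_aggregate(h):
    """One round of message passing: average self + neighbor features."""
    out = {}
    for p, nbrs in adj.items():
        vecs = [h[p]] + [h[n] for n in nbrs]
        out[p] = [sum(col) / len(vecs) for col in zip(*vecs)]
    return out

h1 = mean_aggregate(feats)  # 1-hop: camera sees memory_card, case
h2 = mean_aggregate(h1)     # 2-hop: camera now picks up card_reader

idx = products.index("card_reader")
```

After one layer the camera's representation carries no card-reader signal; after two layers it does, because the signal flowed camera ← memory card ← card reader.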

Cluster structure: Substitutes

Products that are substitutes (iPhone vs Samsung Galaxy) rarely appear in the same basket but share many of the same co-purchase neighbors (phone cases, screen protectors, chargers). In the GNN embedding space, substitutes cluster together even without direct edges, because their neighborhoods are similar.
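One way to see why: the neighbor sets of two substitutes overlap heavily even though the products themselves never co-occur. A toy Jaccard-overlap check (the products and neighbor sets are made up):

```python
# Substitutes rarely share an edge but share neighbors (hypothetical data).
neighbors = {
    "iphone": {"case_a", "screen_protector", "charger"},
    "galaxy": {"case_b", "screen_protector", "charger"},
    "camera": {"memory_card", "tripod"},
}

def jaccard(a, b):
    """Overlap of two products' co-purchase neighborhoods, in [0, 1]."""
    return len(neighbors[a] & neighbors[b]) / len(neighbors[a] | neighbors[b])
```

The two phones overlap on half their neighborhood while the camera overlaps with neither, which is exactly the similarity a GNN's aggregation step picks up when it embeds substitutes nearby.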

Applications

  • Recommendations: “Frequently bought together” powered by 1-hop neighbors. “Customers also bought” powered by 2-hop GNN representations.
  • Bundle pricing: Identify product bundles (strongly connected subgraphs) and optimize bundle discounts based on co-purchase strength.
  • Demand forecasting: Demand for complementary products is correlated. A spike in camera sales predicts memory card demand. The co-purchase graph encodes these correlations.
  • Category discovery: Graph clustering on co-purchase structure reveals natural product categories that may differ from the retailer's taxonomy.
  • Cold-start products: A new product with no purchase history but known category and attributes can borrow representations from its nearest neighbors in feature space.
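For the recommendation use case, a sketch of nearest-neighbor retrieval in embedding space. The 2-d embeddings here are invented placeholders; in practice they would come from a trained GNN:

```python
import math

# Hypothetical learned product embeddings (normally the output of a GNN).
emb = {
    "camera":      [0.9, 0.1],
    "memory_card": [0.8, 0.2],
    "toaster":     [0.1, 0.9],
}

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (nu * nv)

def recommend(product, k=1):
    """Return the k products closest to `product` in embedding space."""
    scores = {q: cosine(emb[product], e) for q, e in emb.items() if q != product}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

At production scale the brute-force loop would be replaced by an approximate nearest-neighbor index, but the retrieval logic is the same.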

Heterogeneous product graphs

Co-purchase edges are just one type. A rich product graph includes:

  • Co-purchase: Bought together (strong intent signal)
  • Co-view: Viewed in the same session (weaker but abundant)
  • Co-review: Reviewed by the same customer
  • Same-category: Share a taxonomy category
  • Same-brand: From the same manufacturer

Using relation type encoding, the GNN learns different weights for each edge type, combining strong purchase signals with weaker but complementary behavioral signals.
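A minimal sketch of per-relation weighting, with fixed constants standing in for the relation weights a GNN would learn (the edge lists and weight values are invented for illustration):

```python
# Edges grouped by relation type, as in a heterogeneous product graph.
edge_index = {
    "co_purchase": [("camera", "memory_card")],
    "co_view":     [("camera", "galaxy"), ("camera", "tripod")],
    "same_brand":  [("camera", "lens")],
}
# Fixed stand-ins for learned per-relation weights: purchase > brand > view.
relation_weight = {"co_purchase": 1.0, "co_view": 0.3, "same_brand": 0.5}

def weighted_neighbors(product):
    """Score neighbors with per-relation weights, as a relational GNN layer would."""
    scores = {}
    for rel, pairs in edge_index.items():
        for u, v in pairs:
            if u == product:
                scores[v] = scores.get(v, 0.0) + relation_weight[rel]
    return scores
```

The same idea, with learned weights and neural message functions per relation, is what relational GNN layers (and PyTorch Geometric's heterogeneous wrappers) implement.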

Frequently asked questions

What is a co-purchase graph?

A co-purchase graph is a product-product graph where edges connect products that are frequently bought together by the same customers. If customers who buy product A often also buy product B, an edge (weighted by co-purchase frequency) connects A and B. The resulting graph captures product complementarity and substitutability.

How are co-purchase graphs used for recommendations?

GNNs on co-purchase graphs learn product representations where complementary products (phone + case) are embedded nearby. For a customer's recent purchases, the model recommends products close in the embedding space to what they bought. This captures 'frequently bought together' patterns and extends them through multi-hop paths.

What is the difference between co-purchase and co-view graphs?

Co-purchase edges are strong signals (user spent money). Co-view edges are weaker but more abundant (user looked but may not buy). In practice, both are used as separate edge types in a heterogeneous graph, with the GNN learning different weights for each signal type.

Learn more about graph ML

PyTorch Geometric is the open-source foundation for graph neural networks. Explore more layers, concepts, and production patterns.