
Cross-Sell & Upsell: Next-Product Prediction on Purchase Graphs

Cross-sell and upsell drive 10-30% of e-commerce revenue, but association rules capture only direct co-purchases. Here is how to build a GNN that discovers multi-hop purchase sequences and cross-category expansion patterns.


TL;DR

  • Cross-sell is a temporal link prediction problem on the purchase graph. GNNs capture multi-hop purchase sequences that association rules miss: X leads to Y leads to Z across categories.
  • SAGEConv with temporal edge features learns sequential purchase patterns: which products follow which, and when, based on the behavior of similar customers.
  • On RelBench benchmarks, GNNs achieve 75.83 AUROC vs 62.44 for flat-table LightGBM. Multi-hop purchase sequences drive the improvement.
  • The model predicts next-product probabilities per customer. Campaign orchestration (email, in-app, sales outreach) happens downstream.
  • KumoRFM predicts next-product purchases with one PQL query (76.71 AUROC zero-shot), discovering cross-sell patterns from temporal purchase sequences automatically.

The business problem

Cross-sell and upsell generate 10-30% of revenue for mature e-commerce and B2B companies. Amazon attributes 35% of revenue to its recommendation engine, much of which is cross-sell (frequently bought together, customers also bought). The opportunity is enormous: acquiring a new customer costs 5-25x more than expanding an existing relationship.

Association rules (“people who bought X also bought Y”) capture obvious co-purchase patterns but miss the sequential, multi-hop journeys that drive expansion: a camera buyer becomes a lens buyer, then a tripod buyer, then a lighting buyer. Each step in the journey depends on the previous purchases and the behavior of similar customers.

Why flat ML fails

  • No sequence awareness: Flat models predict next-purchase from current features. They miss that the purchase of a specific product often triggers a predictable sequence of follow-on purchases.
  • No cross-customer learning: Customer A bought camera then lens. Customer B just bought a camera. The graph connects them through the shared product, transferring the pattern.
  • No category crossing: Cross-category expansion (electronics to accessories to services) requires multi-hop graph traversal that flat features cannot capture.
  • Timing matters: The cross-sell window varies by product pair. Camera-to-lens is 30 days. Software-to-training is 90 days. Temporal edge features capture this.

The relational schema

schema.txt
Node types:
  Customer  (id, segment, signup_date, total_spend)
  Product   (id, category, price, product_line)
  Category  (id, name, avg_expansion_rate)

Edge types:
  Customer --[purchased]-->   Product  (amount, timestamp, channel)
  Product  --[upsell_to]-->   Product  (conversion_rate)
  Product  --[cross_sell]-->  Product  (lift_factor)
  Product  --[in_category]--> Category

Purchase edges carry timestamps for sequence modeling. Product-to-product edges encode known cross-sell and upsell relationships.

PyG architecture: SAGEConv for next-product prediction

cross_sell_model.py
import torch
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv, HeteroConv, Linear

class CrossSellGNN(torch.nn.Module):
    def __init__(self, hidden_dim=128):
        super().__init__()
        self.customer_lin = Linear(-1, hidden_dim)
        self.product_lin = Linear(-1, hidden_dim)

        # Reverse purchase edges are required: HeteroConv only returns
        # embeddings for destination node types, so without a
        # product->customer edge type the customer embeddings would be
        # dropped after the first layer. T.ToUndirected() adds them as
        # ('product', 'rev_purchased', 'customer').
        def make_conv():
            return HeteroConv({
                ('customer', 'purchased', 'product'): SAGEConv(
                    (hidden_dim, hidden_dim), hidden_dim),
                ('product', 'rev_purchased', 'customer'): SAGEConv(
                    (hidden_dim, hidden_dim), hidden_dim),
                ('product', 'upsell_to', 'product'): SAGEConv(
                    hidden_dim, hidden_dim),
                ('product', 'cross_sell', 'product'): SAGEConv(
                    hidden_dim, hidden_dim),
            }, aggr='sum')

        self.conv1 = make_conv()
        self.conv2 = make_conv()

    def encode(self, x_dict, edge_index_dict):
        x_dict = {
            'customer': self.customer_lin(x_dict['customer']),
            'product': self.product_lin(x_dict['product']),
        }
        x_dict = {k: F.relu(v) for k, v in
                  self.conv1(x_dict, edge_index_dict).items()}
        x_dict = self.conv2(x_dict, edge_index_dict)
        return x_dict

    def predict_next(self, customer_emb, product_emb):
        # Score all candidate products for each customer
        return torch.sigmoid(customer_emb @ product_emb.T)
SAGEConv propagates purchase patterns through the product graph. The model scores all candidate products for each customer, ranking by purchase probability.
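Training treats this as temporal link prediction: purchases after a cutoff date are positives, and the same customers paired with randomly sampled products serve as negatives. A minimal sketch of the loss, assuming dot-product scoring as in the model above (the uniform negative-sampling scheme is one simple choice, not the only one):

```python
import torch
import torch.nn.functional as F

def cross_sell_loss(cust_emb, prod_emb, pos_edge_index):
    """BCE loss on observed next purchases vs. sampled negatives.

    pos_edge_index: [2, num_pos] customer->product purchases that fall
    in the label window (the 'next purchase' targets).
    """
    pos_src, pos_dst = pos_edge_index

    # Positive pairs: dot-product score for each observed purchase.
    pos_score = (cust_emb[pos_src] * prod_emb[pos_dst]).sum(dim=-1)

    # Negative pairs: same customers, uniformly sampled random products.
    neg_dst = torch.randint(0, prod_emb.size(0), pos_dst.shape)
    neg_score = (cust_emb[pos_src] * prod_emb[neg_dst]).sum(dim=-1)

    return F.binary_cross_entropy_with_logits(
        torch.cat([pos_score, neg_score]),
        torch.cat([torch.ones_like(pos_score),
                   torch.zeros_like(neg_score)]))
```

Each training step would call `model.encode(...)` to get fresh embeddings, compute this loss, and backpropagate; at inference time, `predict_next` applies the sigmoid to the same dot products.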

Expected performance

  • Association rules: ~45 AUROC
  • LightGBM (flat features): 62.44 AUROC
  • GNN (SAGEConv): 75.83 AUROC
  • KumoRFM (zero-shot): 76.71 AUROC
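AUROC here measures the probability that a held-out actual purchase is scored above a sampled non-purchase. As a quick illustration of the metric with made-up scores, using scikit-learn's `roc_auc_score`:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical model scores: positives are held-out actual next
# purchases, negatives are sampled customer-product non-purchases.
pos_scores = np.array([0.8, 0.7, 0.9, 0.4])
neg_scores = np.array([0.3, 0.5, 0.2, 0.6])

y_true = np.concatenate([np.ones_like(pos_scores),
                         np.zeros_like(neg_scores)])
y_score = np.concatenate([pos_scores, neg_scores])

# Fraction of (positive, negative) pairs ranked correctly: 14 of 16.
auroc = roc_auc_score(y_true, y_score)  # → 0.875
```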

Or use KumoRFM in one line

KumoRFM PQL
PREDICT next_product FOR customer
USING customer, product, purchase, category

One PQL query. KumoRFM discovers cross-sell and upsell patterns from temporal purchase sequences automatically.

Frequently asked questions

How does GNN cross-sell differ from association rules?

Association rules (people who bought X also bought Y) only capture direct co-purchase patterns. GNNs see multi-hop patterns: people who bought X later bought Y, and people who bought Y in that context went on to buy Z. This enables cross-category recommendations that association rules miss.

What is the difference between cross-sell and upsell in a GNN context?

Cross-sell predicts complementary products (laptop buyer gets recommended a bag). Upsell predicts product upgrades (basic plan user gets recommended premium). Both are link prediction tasks on the purchase graph, but with different edge type targets. The GNN handles both by predicting which products the customer will purchase next.

How do you encode purchase sequences in a graph?

Each purchase becomes an edge with a timestamp. The graph encodes not just what was bought, but the sequence and timing. Temporal features on edges (days since last purchase, purchase position in sequence) let the GNN learn sequential patterns: customers typically buy X, then Y within 30 days, then Z within 90 days.
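The two temporal features mentioned above can be derived straight from a purchase log; a sketch with pandas (table and column names are illustrative):

```python
import pandas as pd

# Hypothetical purchase log: one row per customer->product purchase edge.
purchases = pd.DataFrame({
    'customer_id': [1, 1, 1, 2, 2],
    'product_id':  [10, 11, 12, 10, 13],
    'timestamp': pd.to_datetime(
        ['2024-01-01', '2024-01-20', '2024-03-01',
         '2024-02-01', '2024-02-15']),
})

purchases = purchases.sort_values(['customer_id', 'timestamp'])
grp = purchases.groupby('customer_id')

# Position of each purchase within the customer's sequence (0-based),
# and days since that customer's previous purchase (0 for the first).
purchases['seq_position'] = grp.cumcount()
purchases['days_since_prev'] = (
    grp['timestamp'].diff().dt.days.fillna(0).astype(int))
```

These two columns then become `edge_attr` values on the `purchased` edges alongside amount and raw timestamp.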

Can GNN cross-sell models handle cold-start products?

Yes. New products have category, price, and description features that connect them to the graph. The GNN transfers purchase patterns from similar existing products, enabling cross-sell recommendations for newly launched items before any purchase data exists.

How does KumoRFM handle cross-sell predictions?

KumoRFM takes your customer, product, and purchase tables and predicts next-product purchases with one PQL query. It automatically discovers cross-sell and upsell patterns from temporal purchase sequences without manual rule creation.

Learn more about graph ML

PyTorch Geometric is the open-source foundation for graph neural networks. Explore more layers, concepts, and production patterns.