Product Recommendations
“For each customer, what products will they purchase in the next 30 days?”
Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.


A real-world example
For each customer, what products will they purchase in the next 30 days?
Most retailers still show generic bestseller lists or rely on collaborative filtering that only considers the user-item interaction matrix. Cross-category purchase patterns, return signals, browse-to-buy sequences, and shared merchant affinities are invisible. Kumo learns from the full purchase-product-customer graph — capturing signals that collaborative filtering structurally cannot see. For a mid-size retailer doing $500M in ecommerce revenue, even a 1% conversion lift is worth $5M annually.
Quick answer
Product recommendation engines predict which products each customer will purchase next. Traditional collaborative filtering only sees the user-item interaction matrix, missing cross-category purchase patterns, return signals, and browse-to-buy sequences. Graph-based recommendations learn from the full purchase-product-customer graph, delivering 15-30% lift in click-through rate over standard approaches.
Approaches compared
4 ways to solve this problem
1. Popularity-Based (Bestseller Lists)
Show the same top-selling products to everyone. No personalization, no ML. The default for most retailers who have not invested in recommendation infrastructure.
Best for
Low-traffic sites where there is not enough interaction data for personalization. Also works as a fallback for new users with no history.
Watch out for
Zero personalization. A trail runner sees the same shoes as a road runner. Click-through rates typically sit at 2-4%, far below what personalized approaches achieve.
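A popularity baseline is simple enough to sketch in a few lines. The snippet below uses a toy purchase log (hypothetical customer and product IDs, not data from this page) to show why every user sees the same list:

```python
from collections import Counter

# Toy purchase log: (customer_id, product_id) pairs
purchases = [
    ("C001", "P203"), ("C001", "P087"), ("C002", "P042"),
    ("C003", "P203"), ("C004", "P203"), ("C004", "P042"),
]

def bestsellers(purchase_log, n=2):
    """Rank products by total purchase count; every customer gets the same list."""
    counts = Counter(product for _, product in purchase_log)
    return [product for product, _ in counts.most_common(n)]

print(bestsellers(purchases))  # same top-2 for every visitor
```

There is no per-customer input anywhere in the ranking, which is exactly the limitation described above.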
2. Collaborative Filtering (Matrix Factorization)
Learn latent factors from the user-item interaction matrix. Recommend items purchased by similar users. The industry standard for most recommendation engines.
Best for
Products with dense interaction data (many users, many items, many purchases). Good at capturing 'people who bought X also bought Y' patterns.
Watch out for
Only sees the (user, item) matrix. Cannot incorporate product attributes, browsing behavior, return signals, or cross-category patterns. Fails completely for new products (cold-start problem) and struggles with sparse categories.
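The core of matrix factorization can be sketched with a truncated SVD on a tiny toy interaction matrix (illustrative only, not a production recommender):

```python
import numpy as np

# Toy user-item matrix (4 users x 3 items); 1 = purchased
R = np.array([
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
    [1, 1, 1],
], dtype=float)

# Rank-2 truncated SVD gives latent user and item factors
U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 2
scores = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]  # low-rank predicted scores

def recommend(user_idx, n=1):
    """Highest-scored items the user has not already purchased."""
    unseen = np.where(R[user_idx] == 0)[0]
    ranked = unseen[np.argsort(-scores[user_idx, unseen])]
    return ranked[:n].tolist()
```

Note that the only input is the (user, item) matrix itself: a brand-new item is an all-zero column, so it can never be ranked highly, which is the cold-start failure mode mentioned above.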
3. Content-Based Filtering + Hybrid
Recommend products with similar attributes (category, brand, price range) to what the user previously purchased. Often combined with collaborative filtering in hybrid systems.
Best for
Catalogs with rich product metadata and users with consistent category preferences.
Watch out for
Over-specializes. If a customer bought running shoes, the model recommends more running shoes but never surfaces the hydration pack that trail runners also buy. The cross-category discovery that drives basket expansion is missing.
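The over-specialization failure is easy to see in a minimal content-based sketch. The catalog below is a made-up attribute encoding (category one-hots plus a scaled price), not real product data:

```python
import math

# Toy catalog: product -> attribute vector (category one-hot + scaled price)
catalog = {
    "trail_shoes":    [1, 0, 0, 0.9],   # Footwear
    "road_shoes":     [1, 0, 0, 0.8],   # Footwear
    "hydration_pack": [0, 1, 0, 1.2],   # Outdoor Gear
    "earbuds":        [0, 0, 1, 0.3],   # Electronics
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def content_recommend(purchased, n=1):
    """Rank unpurchased products by best attribute similarity to anything owned."""
    owned = [catalog[p] for p in purchased]
    cands = {p: max(cosine(v, o) for o in owned)
             for p, v in catalog.items() if p not in purchased}
    return sorted(cands, key=cands.get, reverse=True)[:n]

# A trail-shoe buyer gets... more shoes, never the hydration pack
print(content_recommend(["trail_shoes"]))
```

Because similarity is dominated by the category one-hot, the cross-category item can never outrank another shoe, mirroring the basket-expansion gap described above.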
4. KumoRFM (Graph Neural Networks on Relational Data)
Builds a heterogeneous graph connecting customers, purchases, products, browsing sessions, and returns. Graph transformers traverse the full relational structure to predict which products each customer will buy next, capturing cross-category patterns and browse-to-buy sequences automatically.
Best for
Retailers with multi-table data (purchases, browsing, returns, reviews) who want maximum recommendation accuracy and cross-category discovery.
Watch out for
The graph advantage is largest for catalogs with meaningful cross-product relationships (co-purchase patterns, shared categories, brand connections). For very small catalogs with independent products, simpler models may suffice.
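To build intuition for the graph signal, here is a toy 2-hop traversal over a bipartite purchase graph. This is not the KumoRFM graph transformer; it only illustrates the kind of neighborhood evidence ("12 similar customers bought P203") that a GNN aggregates, using invented customers and products:

```python
from collections import defaultdict, Counter

# Toy bipartite purchase graph: customer -> products purchased
purchases = {
    "C001": {"trail_shoes", "hydration_pack"},
    "C002": {"trail_shoes", "gps_watch"},
    "C003": {"hydration_pack", "gps_watch", "backpack"},
    "C004": {"earbuds"},
}

# Reverse edges: product -> customers who bought it
buyers = defaultdict(set)
for cust, prods in purchases.items():
    for p in prods:
        buyers[p].add(cust)

def graph_recommend(customer, n=2):
    """2-hop walk: my products -> their co-purchasers -> those customers' other products."""
    owned = purchases[customer]
    votes = Counter()
    for p in owned:
        for neighbor in buyers[p] - {customer}:
            for q in purchases[neighbor] - owned:
                votes[q] += 1
    return [p for p, _ in votes.most_common(n)]

print(graph_recommend("C001"))  # surfaces cross-category gear via graph neighbors
```

Even this crude traversal reaches the GPS watch through two different neighbors, something a pure (user, item) factorization of such sparse data would struggle to rank; a learned GNN does the same aggregation over many more edge types (browsing, returns, attributes).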
Key metric: Graph-based recommendations deliver 15-30% CTR lift over collaborative filtering. RelBench benchmark: 76.71 vs 62.44 for flat baselines, driven by cross-category discovery and cold-start performance.
Why relational data changes the answer
Customer C001 (Sarah Chen, premium) bought Trail Running Shoes and a Hydration Pack. Collaborative filtering says 'recommend more trail running shoes and hydration packs.' But the relational graph reveals a richer pattern: C001's browse-to-cart ratio for the Outdoor category is 0.38 (high intent), 12 similar customers in her graph neighborhood bought product P203, and her return rate is 0.02 (very low), meaning recommendations are likely to stick. More importantly, the cross-category link between Footwear and Outdoor Gear purchases suggests C001 is an outdoor enthusiast who would respond to trail GPS watches and lightweight backpacks, products that collaborative filtering would never surface because they are in different categories with no direct co-purchase signal.
These cross-category discovery patterns require connecting the PURCHASES table to the PRODUCTS table (for category information) and then to the CUSTOMERS table (for segment patterns). The graph neural network propagates purchase signals across product categories, discovering that trail shoe buyers who also buy hydration gear are 4.2x more likely to purchase trail navigation equipment. On the RelBench benchmark, graph-based recommendation models score 76.71 vs 62.44 for flat-table baselines. For product recommendations specifically, the graph advantage translates to 15-30% higher click-through rates because the model surfaces cross-category items that collaborative filtering structurally cannot discover.
Collaborative filtering is like a bookstore that only tracks what books people bought together. A relational recommendation model is like a bookseller who also knows what genres each reader browses, which books they returned, what their book club friends are reading, and that readers who buy this author's mystery novels also love that other author's thrillers. The purchase matrix is one signal; the relational context is what powers genuine discovery.
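The PURCHASES-to-PRODUCTS-to-CUSTOMERS join described above can be sketched in plain Python using the example rows from the tables on this page:

```python
from collections import defaultdict

# Rows from the example CUSTOMERS, PRODUCTS, and PURCHASES tables
customers = {"C001": "premium", "C002": "standard", "C003": "premium"}
products = {"P203": "Footwear", "P087": "Outdoor Gear", "P042": "Electronics"}
purchases = [("PUR001", "C001", "P203"), ("PUR002", "C001", "P087"),
             ("PUR003", "C002", "P042")]

# Join PURCHASES -> PRODUCTS (category) -> CUSTOMERS (segment)
joined = [(cust, products[prod], customers[cust]) for _, cust, prod in purchases]

# Cross-category signal: which categories has each customer touched?
cats = defaultdict(set)
for cust, category, _segment in joined:
    cats[cust].add(category)

print(cats["C001"])  # the Footwear + Outdoor Gear link the prose describes
```

This two-table hop is all it takes to expose C001's cross-category pattern; the interaction matrix alone cannot represent it because category lives in a different table.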
How KumoRFM solves this
Relational intelligence for true personalization
Kumo's graph transformers traverse the full relational structure — customer demographics, purchase history, product attributes, browsing sessions, returns, and reviews — to predict which products each customer will buy next. Unlike matrix factorization that only sees (user, item) pairs, Kumo captures that Customer C001 bought running shoes, viewed trail gear, and shares purchase patterns with outdoor enthusiasts — surfacing cross-category recommendations that collaborative filtering misses entirely.
From data to predictions
See the full pipeline in action
Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.
Your data
The relational tables Kumo learns from
CUSTOMERS
| customer_id | name | segment | signup_date |
|---|---|---|---|
| C001 | Sarah Chen | premium | 2023-06-15 |
| C002 | Michael Torres | standard | 2024-01-20 |
| C003 | Priya Kapoor | premium | 2022-11-03 |
PURCHASES
| purchase_id | customer_id | product_id | amount | timestamp |
|---|---|---|---|---|
| PUR001 | C001 | P203 | 89.99 | 2025-02-10 |
| PUR002 | C001 | P087 | 124.50 | 2025-02-14 |
| PUR003 | C002 | P042 | 34.99 | 2025-02-11 |
PRODUCTS
| product_id | product_name | category | price |
|---|---|---|---|
| P203 | Trail Running Shoes | Footwear | 89.99 |
| P087 | Hydration Pack 2L | Outdoor Gear | 124.50 |
| P042 | Wireless Earbuds | Electronics | 34.99 |
Write your PQL query
Describe what to predict in 2–3 lines — Kumo handles the rest
PREDICT LIST_DISTINCT(PURCHASES.PRODUCT_ID, 0, 30, days) FOR EACH CUSTOMERS.CUSTOMER_ID
Prediction output
Every entity gets a score, updated continuously
| CUSTOMER_ID | CLASS | SCORE | TIMESTAMP |
|---|---|---|---|
| C001 | P203 | 0.92 | 2025-03-12 |
| C001 | P087 | 0.85 | 2025-03-12 |
| C002 | P042 | 0.78 | 2025-03-12 |
Understand why
Every prediction includes feature attributions — no black boxes
Customer C001 (Sarah Chen, premium segment)
Predicted: Will purchase P203 (Trail Running Shoes) — score 0.92
Top contributing features
| Feature | Value | Attribution |
|---|---|---|
| Previous category purchases (Footwear) | 4 purchases in 90 days | 34% |
| Graph neighbors with same product | 12 similar customers bought P203 | 28% |
| Browse-to-cart ratio (Outdoor) | 0.38 (high intent) | 19% |
| Days since last Footwear purchase | 47 days | 12% |
| Return rate for category | 0.02 (very low) | 7% |
Feature attributions are computed automatically for every prediction. No separate tooling required. Learn more about Kumo explainability
PQL Documentation
Learn the Predictive Query Language — SQL-like syntax for defining any prediction task in 2–3 lines.
Python SDK
Integrate Kumo predictions into your pipelines. Train, evaluate, and deploy models programmatically.
Explainability Docs
Understand feature attributions, model evaluation metrics, and how to build trust with stakeholders.
Frequently asked questions
Common questions about product recommendations
How much does graph-based product recommendation improve over collaborative filtering?
Graph-based models deliver 15-30% lift in recommendation click-through rate compared to collaborative filtering. The improvement is largest for cross-category recommendations, new products (cold-start), and customers with sparse purchase histories. On the RelBench benchmark, graph models score 76.71 vs 62.44 for flat baselines.
Can graph recommendations solve the cold-start problem?
Yes. New products with zero interaction history can be recommended through the relational graph: same brand, similar category, comparable price point, shared attributes with popular products. Collaborative filtering fails completely for new products because there is no interaction data. Graph models transfer knowledge through product relationships.
What data improves product recommendations the most?
Beyond purchase history, the highest-value additions are: browsing/session data (shows intent even without purchase), return data (negative signals that prevent bad recommendations), product attributes (enables cross-category discovery), and review/rating data (captures quality signals). Each additional data source improves recommendation quality by 5-15% incrementally.
How do you measure recommendation quality?
The primary metric is click-through rate (CTR) on recommendations, followed by add-to-cart rate and conversion rate. Revenue per recommendation impression is the ultimate measure. Avoid using accuracy alone since a model that always recommends bestsellers has high accuracy but zero personalization value.
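These metrics are straightforward to compute from logged impressions. A minimal sketch (toy numbers, with precision@k included as a common offline companion to CTR):

```python
def ctr(impressions, clicks):
    """Click-through rate: clicks per recommendation impression."""
    return clicks / impressions

def precision_at_k(recommended, purchased, k):
    """Fraction of the top-k recommended products the customer actually bought."""
    top_k = recommended[:k]
    return sum(1 for p in top_k if p in purchased) / k

# Example: 50 clicks on 1,000 impressions; 3 of the top-5 recs were purchased
print(ctr(1000, 50))
print(precision_at_k(["P203", "P087", "P042", "P010", "P011"],
                     {"P203", "P042", "P011"}, k=5))
```

As the FAQ notes, pair these with a personalization check: a bestseller-only model can score well on raw precision while adding no discovery value, so compare against the popularity baseline, not against zero.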
Bottom line: 15-30% lift in recommendation click-through rate. Each percentage point of conversion improvement equals $2-5M annually for mid-size retailers.
One Platform. One Model. Infinite Predictions.
KumoRFM
Relational Foundation Model
Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.
For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.




