#6 · Ranking · Cross-Sell

Basket Analysis

“What will this customer add to their cart?”

Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.

A real-world example

What will this customer add to their cart?

The average e-commerce basket contains 3.2 items, but product affinity analysis suggests optimal baskets should contain 4.5-5.0 items (Baymard Institute). Increasing average basket size by just one item adds $15-25 per order, translating to $150-250M annually for a retailer processing 10M orders per year. Traditional association rules ('customers who bought X also bought Y') are static, ignoring the customer's current session context, inventory availability, margin contribution, and real-time browsing signals. They also suffer from popularity bias, recommending the same high-volume items to everyone.
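The revenue range quoted above is straightforward arithmetic. A minimal sketch follows; the function name and layout are illustrative, and the inputs (10M orders per year, one extra item per order, $15-25 per item) are the figures from the text.

```python
def incremental_revenue(orders_per_year, extra_items_per_order, item_value):
    """Annual incremental revenue from adding items to every order."""
    return orders_per_year * extra_items_per_order * item_value

ORDERS_PER_YEAR = 10_000_000  # retailer size used in the text

# One extra item per order, valued at the low and high end of the $15-25 range.
low = incremental_revenue(ORDERS_PER_YEAR, 1.0, 15.0)
high = incremental_revenue(ORDERS_PER_YEAR, 1.0, 25.0)

print(f"${low / 1e6:.0f}M to ${high / 1e6:.0f}M per year")
```

The same function can be reused with a different uplift (for example, a fractional number of extra items per order) to compare scenarios.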

Quick answer

Market basket analysis predicts what a customer will add to their cart based on what is already in it, their purchase history, and real-time session context. Traditional association rules (if pasta then sauce) are static and suffer from popularity bias. A relational model connects the current cart to the customer's organic preference, the product's margin contribution, and real-time inventory availability. On the RelBench product recommendation task, KumoRFM scores 76.71 vs 62.44 for the next-best method because it captures these multi-table signals that static rules miss.

Approaches compared

4 ways to solve this problem

1. Association rules (Apriori, FP-Growth)

Mine frequent itemsets from historical baskets. Generate rules like 'if pasta and sauce then garlic bread' with support and confidence scores.

Best for

Quick wins on high-frequency item pairs where the co-purchase pattern is strong and stable across all customers.

Watch out for

No personalization. The rules are the same for every customer. Popularity bias means you always recommend the most common add-ons, missing long-tail items with higher margins. Cannot incorporate inventory or margin data.
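To make support, confidence, and lift concrete, here is a minimal sketch that scores a single rule on a handful of toy baskets. The baskets are invented for illustration, and the code only evaluates one given rule; it does not implement the frequent-itemset search that Apriori or FP-Growth perform.

```python
# Toy baskets, invented for illustration only.
baskets = [
    {"pasta", "sauce", "garlic bread"},
    {"pasta", "sauce"},
    {"pasta", "garlic bread"},
    {"milk", "bread"},
    {"pasta", "sauce", "parmesan"},
]

def rule_metrics(baskets, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    a, c = set(antecedent), set(consequent)
    n = len(baskets)
    count_a = sum(1 for b in baskets if a <= b)            # baskets containing the antecedent
    count_c = sum(1 for b in baskets if c <= b)            # baskets containing the consequent
    count_both = sum(1 for b in baskets if (a | c) <= b)   # baskets containing both
    support = count_both / n
    confidence = count_both / count_a if count_a else 0.0
    lift = confidence / (count_c / n) if count_c else 0.0
    return support, confidence, lift

support, confidence, lift = rule_metrics(baskets, {"pasta"}, {"garlic bread"})
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

Note that nothing in `rule_metrics` refers to a customer: the resulting scores are identical for every shopper, which is the lack of personalization described above.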

2. Collaborative filtering on basket data

Treat baskets as implicit feedback and use matrix factorization to find latent product associations beyond simple co-occurrence.

Best for

E-commerce platforms with large basket histories where co-purchase patterns are complex and non-obvious.

Watch out for

Ignores the current session context. A customer browsing premium organic items should not see the same recommendations as a budget shopper, but collaborative filtering treats them identically if their basket contents overlap.
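As a rough illustration of matrix factorization on basket data, the sketch below factorizes a small basket-by-product matrix with plain stochastic gradient descent. The data, hyperparameters, and function names are all assumptions chosen for readability; a production system would typically use a dedicated library and a weighting scheme designed for implicit feedback.

```python
import random

products = ["pasta", "sauce", "garlic bread", "parmesan", "milk"]

# Rows are baskets, columns are products; 1 means the product was in the basket.
# Toy data, invented for illustration.
matrix = [
    [1, 1, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [1, 0, 1, 0, 0],
]

def factorize(matrix, rank=2, epochs=300, lr=0.05, reg=0.01, seed=0):
    """Learn basket factors P and product factors Q so that P[i] . Q[j] ~ matrix[i][j]."""
    rng = random.Random(seed)
    n_rows, n_cols = len(matrix), len(matrix[0])
    P = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_rows)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_cols)]
    for _ in range(epochs):
        for i in range(n_rows):
            for j in range(n_cols):
                pred = sum(P[i][k] * Q[j][k] for k in range(rank))
                err = matrix[i][j] - pred
                for k in range(rank):
                    p, q = P[i][k], Q[j][k]
                    P[i][k] += lr * (err * q - reg * p)
                    Q[j][k] += lr * (err * p - reg * q)
    return P, Q

def predict(P, Q, basket_index, product_index):
    """Predicted affinity of a product for a basket."""
    return sum(p * q for p, q in zip(P[basket_index], Q[product_index]))

def squared_error(matrix, P, Q):
    """Total squared reconstruction error over all cells."""
    return sum(
        (matrix[i][j] - predict(P, Q, i, j)) ** 2
        for i in range(len(matrix))
        for j in range(len(matrix[0]))
    )

P, Q = factorize(matrix)
# Score products that are not yet in basket 2 (pasta, sauce).
for j, name in enumerate(products):
    if matrix[2][j] == 0:
        print(f"{name}: {predict(P, Q, 2, j):.2f}")
```

The learned scores depend only on which products co-occur in baskets. Session signals such as what the shopper is browsing right now never enter the matrix, which is the limitation noted above.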

3. Gradient-boosted models with cart features

Train XGBoost models with features derived from current cart contents, customer history, and product attributes to predict next-item probability.

Best for

Teams that can invest in feature engineering to capture cart composition, customer segment, and product attribute interactions.

Watch out for

Feature engineering for cart context is complex. You need to manually encode meal patterns, brand preferences, and replenishment timing. SAP SALT shows a 75% accuracy ceiling for tabular approaches.
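The kind of hand-built feature row this approach relies on can be sketched as follows. This covers feature construction only, not model training. The cart items and the last-purchase date come from the example tables later on this page; the category labels on cart items, the 30-day cycle for a "Monthly" product, the 0.8 "due" threshold, and the session date of 2025-09-15 are assumptions made for illustration.

```python
from datetime import date

def cart_features(cart, candidate, history, today):
    """Build one feature row for a (current cart, candidate product) pair."""
    past = history.get(candidate["product_id"])
    days_since = (today - past["last_purchased"]).days if past else None
    cart_categories = [item["category"] for item in cart]
    return {
        "cart_size": len(cart),
        "cart_total": round(sum(item["price"] for item in cart), 2),
        "same_category_in_cart": cart_categories.count(candidate["category"]),
        "bought_before": past is not None,
        "days_since_last_purchase": days_since,
        # Hypothetical rule: "due" once 80% of the usual purchase cycle has elapsed.
        "replenishment_due": past is not None and days_since >= 0.8 * past["cycle_days"],
    }

cart = [
    {"product_id": "P-2001", "category": "Pasta", "price": 3.49},
    {"product_id": "P-2002", "category": "Sauce", "price": 8.99},
    {"product_id": "P-2003", "category": "Meat", "price": 7.99},
]
candidate = {"product_id": "P-2010", "category": "Bread"}
history = {"P-2010": {"last_purchased": date(2025, 8, 20), "cycle_days": 30}}

features = cart_features(cart, candidate, history, today=date(2025, 9, 15))
print(features)
```

Every additional signal (brand preference, meal pattern, promotion status) needs another hand-written entry in this dictionary, which is where the engineering cost accumulates.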

4. KumoRFM (relational foundation model)

Connects current cart, customer history, product affinities, inventory status, and margin data into a relational graph. Predicts next-item probability with personalization and business constraints.

Best for

Retailers who want personalized, margin-aware basket recommendations that account for inventory availability and customer preferences simultaneously.

Watch out for

Most impactful when you have rich customer history alongside cart data. For anonymous visitors with no history, it falls back to association-rule-level performance.

Key metric: RelBench product recommendation: KumoRFM 76.71 vs next-best 62.44. Personalized basket suggestions add 1.0-1.2 items vs 0.3-0.4 for generic rules.

Why relational data changes the answer

Association rules see that pasta and garlic bread appear together in 42% of baskets. They cannot see that this specific customer buys organic garlic bread monthly, last purchased it 26 days ago (due for replenishment), and has an 85% organic product preference. These signals live in the purchase_history and customer_preferences tables, not in the current cart.

A relational model joins the current cart to the customer's behavioral graph and surfaces recommendations that are both contextually relevant and personalized. It ranks organic garlic bread above conventional because of the customer's preference, ranks parmigiano above mozzarella because the cart pattern matches an Italian meal, and excludes items that are out of stock. The result is a 1.2-item average basket increase vs 0.4 items from static association rules, because personalized recommendations convert at 3x the rate of generic ones.

Static basket rules are like a waiter who always suggests garlic bread with pasta because it is the most popular side. A relational basket model is like a waiter who remembers you always order the organic garlic bread, notices you have not been in for a month (time for your usual), sees that the parmigiano you like is on special today, and knows the tiramisu is out. Same restaurant, very different upsell.
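The difference between a static rule and a personalized ranking can be illustrated with a small heuristic. This is not how KumoRFM computes its scores; it is a hand-written sketch with hypothetical weights, and the conventional garlic bread and organic focaccia entries are invented to show the effect of preference and stock filtering. The 42% co-purchase rate, the 85% organic preference, and the 26-day gap come from the text.

```python
def personalized_score(candidate, organic_preference):
    """Adjust a static co-purchase rate using customer-level signals (hypothetical weights)."""
    score = candidate["co_purchase_rate"]
    # Weight by how well the product matches the customer's organic preference.
    score *= organic_preference if candidate["organic"] else 1 - organic_preference
    # Boost items the customer usually buys and is due to repurchase.
    days_since, cycle = candidate["days_since_last"], candidate["cycle_days"]
    if days_since is not None and days_since >= 0.8 * cycle:
        score *= 1.5
    return score

def rerank(candidates, organic_preference):
    """Drop out-of-stock items, then sort by personalized score, best first."""
    available = [c for c in candidates if c["in_stock"]]
    return sorted(
        available,
        key=lambda c: personalized_score(c, organic_preference),
        reverse=True,
    )

candidates = [
    {"name": "Conventional Garlic Bread", "co_purchase_rate": 0.42, "organic": False,
     "in_stock": True, "days_since_last": None, "cycle_days": None},
    {"name": "Organic Garlic Bread", "co_purchase_rate": 0.42, "organic": True,
     "in_stock": True, "days_since_last": 26, "cycle_days": 30},
    {"name": "Organic Focaccia", "co_purchase_rate": 0.30, "organic": True,
     "in_stock": False, "days_since_last": None, "cycle_days": None},
]

ranked = rerank(candidates, organic_preference=0.85)
print([c["name"] for c in ranked])
```

A static rule would score both garlic breads at 0.42 and could still surface the out-of-stock item; the personalized ranking separates them using signals that live outside the cart.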

How KumoRFM solves this

Relational intelligence built for retail and e-commerce data

Kumo builds a relational graph connecting the current cart contents, customer purchase history, browsing session, product attributes, inventory levels, and margin data. The model predicts in real time that a customer with pasta and marinara sauce in their cart will add garlic bread (72% probability) and parmesan cheese (65% probability), and that recommending these items at checkout will generate $8.40 in incremental margin. The graph captures that this specific customer prefers organic products, so it ranks the organic garlic bread above the conventional option.

From data to predictions

See the full pipeline in action

Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.

Your data

The relational tables Kumo learns from

CURRENT_CART

session_id	customer_id	product_id	product_name	price
SS-7701	CU-3012	P-2001	De Cecco Spaghetti	$3.49
SS-7701	CU-3012	P-2002	Rao's Marinara Sauce	$8.99
SS-7701	CU-3012	P-2003	Organic Ground Beef 1lb	$7.99

PURCHASE_HISTORY

customer_id	product_id	category	frequency	last_purchased
CU-3012	P-2010	Organic Garlic Bread	Monthly	2025-08-20
CU-3012	P-2011	Parmigiano Reggiano	Monthly	2025-08-20
CU-3012	P-2015	Organic Mixed Greens	Weekly	2025-09-10

PRODUCT_AFFINITIES

product_a	product_b	co_purchase_rate	lift	category_pair
P-2001	P-2010	42%	3.8	Pasta + Bread
P-2002	P-2011	38%	4.2	Sauce + Cheese
P-2001	P-2015	22%	1.5	Pasta + Salad

INVENTORY_STATUS

product_id	name	in_stock	margin_pct	on_promotion
P-2010	Organic Garlic Bread	True	42%	False
P-2011	Parmigiano Reggiano	True	35%	True
P-2015	Organic Mixed Greens	True	48%	False
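One way to picture these four tables as a graph is to treat each session, customer, and product as a node and each row as an edge between the keys it mentions. The sketch below is only an illustration of that idea, not Kumo's internal representation; the node labels and the `candidate_products` helper are assumptions. The rows are copied from the tables above, reduced to their key columns.

```python
from collections import defaultdict

current_cart = [
    ("SS-7701", "CU-3012", "P-2001"),
    ("SS-7701", "CU-3012", "P-2002"),
    ("SS-7701", "CU-3012", "P-2003"),
]
purchase_history = [("CU-3012", "P-2010"), ("CU-3012", "P-2011"), ("CU-3012", "P-2015")]
product_affinities = [("P-2001", "P-2010"), ("P-2002", "P-2011"), ("P-2001", "P-2015")]
in_stock = {"P-2010": True, "P-2011": True, "P-2015": True}

graph = defaultdict(set)

def add_edge(u, v):
    """Add an undirected edge between two (node_type, id) nodes."""
    graph[u].add(v)
    graph[v].add(u)

for session_id, customer_id, product_id in current_cart:
    add_edge(("session", session_id), ("customer", customer_id))
    add_edge(("session", session_id), ("product", product_id))
for customer_id, product_id in purchase_history:
    add_edge(("customer", customer_id), ("product", product_id))
for product_a, product_b in product_affinities:
    add_edge(("product", product_a), ("product", product_b))

def candidate_products(session_id):
    """In-stock products two hops from the session that are not already in the cart."""
    start = ("session", session_id)
    in_cart = {node for node in graph[start] if node[0] == "product"}
    reachable = set()
    for neighbor in graph[start]:
        for node in graph[neighbor]:
            if node[0] == "product":
                reachable.add(node)
    return sorted(pid for _, pid in reachable - in_cart if in_stock.get(pid, False))

print(candidate_products("SS-7701"))
```

Each candidate is reachable through more than one path (for example, P-2010 via both the customer's history and its affinity with the pasta in the cart), which is the kind of multi-table signal a flat association rule cannot express.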

Write your PQL query

Describe what to predict in 2–3 lines — Kumo handles the rest

PQL

PREDICT BOOL(ORDERS.PRODUCT_ID, 0, 0, days)
FOR EACH CURRENT_CART.SESSION_ID, PRODUCTS.PRODUCT_ID
RANK TOP 3

Prediction output

Every entity gets a score, updated continuously

SESSION_ID	RECOMMENDED_PRODUCT	ADD_PROB	MARGIN_UPLIFT	RANK
SS-7701	Organic Garlic Bread	0.72	$2.18	1
SS-7701	Parmigiano Reggiano	0.65	$2.94	2
SS-7701	Organic Mixed Greens	0.51	$2.88	3

Understand why

Every prediction includes feature attributions — no black boxes

Session SS-7701 (Cart: pasta, sauce, ground beef)

Predicted: Organic Garlic Bread: 72% add probability

Top contributing features

Feature	Value	Attribution
Historical co-purchase with pasta	Monthly buyer	30%
Cart context (Italian meal pattern)	3 Italian items	25%
Category affinity lift	3.8x baseline	20%
Customer organic preference	85% organic	14%
Replenishment timing	26 days since last	11%

Feature attributions are computed automatically for every prediction. No separate tooling required.

Frequently asked questions

Common questions about basket analysis

What is the difference between basket analysis and product recommendations?

Basket analysis predicts what to add to an active cart in the current session. Product recommendations predict what to show a customer on a homepage, email, or category page based on their profile. Basket analysis operates in real time with cart context as the primary signal. Recommendations operate on longer time horizons with browsing and purchase history as primary signals. Both benefit from relational data, but basket analysis is more time-sensitive and context-dependent.

How much revenue does basket analysis add?

The average e-commerce basket contains 3.2 items, but affinity analysis suggests optimal baskets should contain 4.5-5.0 items (Baymard Institute). Adding 1 item per order at $15-25 average item value translates to $150-250M annually for a retailer processing 10M orders per year. The key is personalization: generic 'frequently bought together' suggestions add 0.3-0.4 items per basket, while personalized relational recommendations add 1.0-1.2 items.

Can basket analysis work in physical stores?

Yes, through in-store app suggestions, loyalty program notifications, and checkout screen prompts. The model uses the customer's loyalty ID to connect their current basket (from scanned items or self-checkout) to their purchase history and preferences. A grocery loyalty app can push a notification when pasta is scanned: 'Your usual organic garlic bread is on aisle 4.' The relational signal is the same; the delivery channel is different.

Bottom line: Increase average basket size by 1.2 items and basket value by $18 per order, generating $150-250M in incremental annual revenue for a 10M-order retailer.

One Platform. One Model. Infinite Predictions.

KumoRFM

Relational Foundation Model

Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.

For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.
