Basket Analysis
“What will this customer add to their cart?”
Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.
A real-world example
What will this customer add to their cart?
The average e-commerce basket contains 3.2 items, but product affinity analysis suggests optimal baskets should contain 4.5-5.0 items (Baymard Institute). Increasing average basket size by just one item adds $15-25 per order, translating to $150-250M annually for a retailer processing 10M orders per year. Traditional association rules ('customers who bought X also bought Y') are static, ignoring the customer's current session context, inventory availability, margin contribution, and real-time browsing signals. They also suffer from popularity bias, recommending the same high-volume items to everyone.
Quick answer
Market basket analysis predicts what a customer will add to their cart based on what is already in it, their purchase history, and real-time session context. Traditional association rules (if pasta then sauce) are static and suffer from popularity bias. A relational model connects the current cart to the customer's organic preference, the product's margin contribution, and real-time inventory availability. On the RelBench product recommendation task, KumoRFM scores 76.71 vs 62.44 for the next-best method because it captures these multi-table signals that static rules miss.
Approaches compared
4 ways to solve this problem
1. Association rules (Apriori, FP-Growth)
Mine frequent itemsets from historical baskets. Generate rules like 'if pasta and sauce then garlic bread' with support and confidence scores.
Best for
Quick wins on high-frequency item pairs where the co-purchase pattern is strong and stable across all customers.
Watch out for
No personalization. The rules are the same for every customer. Popularity bias means you always recommend the most common add-ons, missing long-tail items with higher margins. Cannot incorporate inventory or margin data.
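As a rough illustration of the support, confidence, and lift scores behind such rules, here is a minimal sketch that scores item pairs only. It is not a full Apriori or FP-Growth implementation, and the toy `baskets` data and the `pair_rules` helper are hypothetical names invented for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy baskets; each basket is a set of item names.
baskets = [
    {"pasta", "sauce", "garlic bread"},
    {"pasta", "sauce"},
    {"pasta", "garlic bread"},
    {"sauce", "cheese"},
    {"pasta", "sauce", "cheese"},
]


def pair_rules(baskets):
    """Return {(antecedent, consequent): (support, confidence, lift)} for item pairs."""
    n = len(baskets)
    item_counts = Counter(item for b in baskets for item in b)
    pair_counts = Counter(
        pair for b in baskets for pair in combinations(sorted(b), 2)
    )
    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]       # P(y | x)
            lift = confidence / (item_counts[y] / n)  # P(y | x) / P(y)
            rules[(x, y)] = (support, confidence, lift)
    return rules


rules = pair_rules(baskets)
support, confidence, lift = rules[("pasta", "garlic bread")]
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

Note that the function never looks at who is shopping: every customer with pasta in the cart gets the same rule, which is the lack of personalization described above.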
2. Collaborative filtering on basket data
Treat baskets as implicit feedback and use matrix factorization to find latent product associations beyond simple co-occurrence.
Best for
E-commerce platforms with large basket histories where co-purchase patterns are complex and non-obvious.
Watch out for
Ignores the current session context. A customer browsing premium organic items should not see the same recommendations as a budget shopper, but collaborative filtering treats them identically if their basket contents overlap.
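To make the idea concrete, the sketch below uses item-item cosine similarity over basket co-occurrence. This is a simpler neighborhood-based form of collaborative filtering swapped in for readability, not matrix factorization. The toy `baskets` data and the `item_cosine` and `recommend` helpers are hypothetical.

```python
from math import sqrt

# Hypothetical toy baskets; each basket is a set of item names.
baskets = [
    {"pasta", "sauce", "garlic bread"},
    {"pasta", "sauce", "garlic bread"},
    {"pasta", "garlic bread"},
    {"sauce", "cheese"},
    {"pasta", "sauce", "cheese"},
]


def item_cosine(baskets):
    """Cosine similarity between items, treating each item as a 0/1 vector over baskets."""
    items = sorted({item for b in baskets for item in b})
    # item -> set of basket indices that contain it
    occurs = {i: {k for k, b in enumerate(baskets) if i in b} for i in items}
    sim = {}
    for a in items:
        for b in items:
            if a != b:
                overlap = len(occurs[a] & occurs[b])
                sim[(a, b)] = overlap / sqrt(len(occurs[a]) * len(occurs[b]))
    return sim


def recommend(cart, sim, top_k=2):
    """Score every item not in the cart by its summed similarity to the cart items."""
    scores = {}
    for (a, b), s in sim.items():
        if a in cart and b not in cart:
            scores[b] = scores.get(b, 0.0) + s
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_k]


sim = item_cosine(baskets)
print(recommend({"pasta", "sauce"}, sim))
```

The only input to `recommend` is the cart itself, so two shoppers with the same cart contents receive the same list, which is the limitation described above.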
3. Gradient-boosted models with cart features
Train XGBoost models with features derived from current cart contents, customer history, and product attributes to predict next-item probability.
Best for
Teams that can invest in feature engineering to capture cart composition, customer segment, and product attribute interactions.
Watch out for
Feature engineering for cart context is complex. You need to manually encode meal patterns, brand preferences, and replenishment timing. SAP SALT shows a 75% accuracy ceiling for tabular approaches.
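A minimal sketch of what hand-built cart features might look like before they reach a gradient-boosted model. The `cart_features` helper, the feature names, and the `category` labels are assumptions made for illustration; only the product names and prices come from the sample cart on this page.

```python
# Cart rows for session SS-7701; the 'category' values are hypothetical labels.
cart = [
    {"name": "De Cecco Spaghetti", "category": "pasta", "price": 3.49},
    {"name": "Rao's Marinara Sauce", "category": "sauce", "price": 8.99},
    {"name": "Organic Ground Beef 1lb", "category": "meat", "price": 7.99},
]


def cart_features(cart):
    """Build a flat feature row from a non-empty cart (list of item dicts)."""
    n = len(cart)
    category_counts = {}
    for item in cart:
        category_counts[item["category"]] = category_counts.get(item["category"], 0) + 1
    return {
        "cart_size": n,
        "cart_total": round(sum(item["price"] for item in cart), 2),
        # Crude proxy: share of items whose name mentions "organic".
        "organic_share": sum("organic" in item["name"].lower() for item in cart) / n,
        "n_categories": len(category_counts),
    }


features = cart_features(cart)
print(features)
```

Each additional signal (meal pattern, brand preference, replenishment timing) needs another hand-written rule like these, which is where the engineering cost accumulates.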
4. KumoRFM (relational foundation model)
Connects current cart, customer history, product affinities, inventory status, and margin data into a relational graph. Predicts next-item probability with personalization and business constraints.
Best for
Retailers who want personalized, margin-aware basket recommendations that account for inventory availability and customer preferences simultaneously.
Watch out for
Most impactful when you have rich customer history alongside cart data. For anonymous visitors with no history, performance falls back to the level of association rules.
Key metric: RelBench product recommendation: KumoRFM 76.71 vs next-best 62.44. Personalized basket suggestions add 1.0-1.2 items vs 0.3-0.4 for generic rules.
Why relational data changes the answer
Association rules see that pasta and garlic bread appear together in 42% of baskets. They cannot see that this specific customer buys organic garlic bread monthly, last purchased it 26 days ago (due for replenishment), and has an 85% organic product preference. These signals live in the purchase_history and customer_preferences tables, not in the current cart.
A relational model joins the current cart to the customer's behavioral graph and surfaces recommendations that are both contextually relevant and personalized. It ranks organic garlic bread above conventional because of the customer's preference, ranks parmigiano above mozzarella because the cart pattern matches an Italian meal, and excludes items that are out of stock. The result is a 1.2-item average basket increase vs 0.4 items from static association rules, because personalized recommendations convert at 3x the rate of generic ones.
Static basket rules are like a waiter who always suggests garlic bread with pasta because it is the most popular side. A relational basket model is like a waiter who remembers you always order the organic garlic bread, notices you have not been in for a month (time for your usual), sees that the parmigiano you like is on special today, and knows the tiramisu is out. Same restaurant, very different upsell.
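The re-ranking idea in this section can be sketched with a simple heuristic. This is an illustration under stated assumptions, not how KumoRFM computes its scores: the `Candidate` class, the `rerank` function, the weighting rule, the conventional garlic bread item, and the 0.30 and 0.35 base scores are all hypothetical. The 0.42 co-purchase rate and the 85% organic preference are taken from the text above.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    base_score: float  # e.g. a co-purchase rate from static rules
    organic: bool
    in_stock: bool


def rerank(candidates, organic_preference):
    """Drop out-of-stock items, then weight each base score by preference match."""
    ranked = []
    for c in candidates:
        if not c.in_stock:
            continue  # never recommend something the customer cannot buy
        # Hypothetical weighting: the customer's organic share for organic items,
        # and the remaining share for conventional ones.
        weight = organic_preference if c.organic else 1.0 - organic_preference
        ranked.append((c.base_score * weight, c.name))
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in ranked]


candidates = [
    Candidate("Conventional Garlic Bread", 0.42, organic=False, in_stock=True),
    Candidate("Organic Garlic Bread", 0.30, organic=True, in_stock=True),
    Candidate("Tiramisu", 0.35, organic=False, in_stock=False),
]
print(rerank(candidates, organic_preference=0.85))
```

A static rule would rank the conventional item first on base score alone. With the preference weight applied, the organic item moves to the top and the out-of-stock item disappears.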
How KumoRFM solves this
Relational intelligence built for retail and e-commerce data
Kumo builds a relational graph connecting the current cart contents, customer purchase history, browsing session, product attributes, inventory levels, and margin data. The model predicts in real time that a customer with pasta and marinara sauce in their cart will add garlic bread (72% probability) and parmesan cheese (65% probability), and that recommending these items at checkout will generate $8.40 in incremental margin. The graph captures that this specific customer prefers organic products, so it ranks the organic garlic bread above the conventional option.
From data to predictions
See the full pipeline in action
Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.
Your data
The relational tables Kumo learns from
CURRENT_CART
| session_id | customer_id | product_id | product_name | price |
|---|---|---|---|---|
| SS-7701 | CU-3012 | P-2001 | De Cecco Spaghetti | $3.49 |
| SS-7701 | CU-3012 | P-2002 | Rao's Marinara Sauce | $8.99 |
| SS-7701 | CU-3012 | P-2003 | Organic Ground Beef 1lb | $7.99 |
PURCHASE_HISTORY
| customer_id | product_id | category | frequency | last_purchased |
|---|---|---|---|---|
| CU-3012 | P-2010 | Organic Garlic Bread | Monthly | 2025-08-20 |
| CU-3012 | P-2011 | Parmigiano Reggiano | Monthly | 2025-08-20 |
| CU-3012 | P-2015 | Organic Mixed Greens | Weekly | 2025-09-10 |
PRODUCT_AFFINITIES
| product_a | product_b | co_purchase_rate | lift | category_pair |
|---|---|---|---|---|
| P-2001 | P-2010 | 42% | 3.8 | Pasta + Bread |
| P-2002 | P-2011 | 38% | 4.2 | Sauce + Cheese |
| P-2001 | P-2015 | 22% | 1.5 | Pasta + Salad |
INVENTORY_STATUS
| product_id | name | in_stock | margin_pct | on_promotion |
|---|---|---|---|---|
| P-2010 | Organic Garlic Bread | True | 42% | False |
| P-2011 | Parmigiano Reggiano | True | 35% | True |
| P-2015 | Organic Mixed Greens | True | 48% | False |
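To show how these four tables relate, here is a plain-Python sketch that joins the sample rows by hand: cart items to affinities, affinities to inventory, and candidates to the customer's purchase history. It only illustrates the join keys and is not the Kumo pipeline. The `candidates_for_cart` helper is hypothetical, and the as-of date of 2025-09-15 is an assumption chosen to match the "26 days ago" figure mentioned earlier.

```python
from datetime import date

# CURRENT_CART product ids for session SS-7701 (customer CU-3012).
current_cart = ["P-2001", "P-2002", "P-2003"]

# PRODUCT_AFFINITIES rows: (product_a, product_b, lift).
affinities = [
    ("P-2001", "P-2010", 3.8),
    ("P-2002", "P-2011", 4.2),
    ("P-2001", "P-2015", 1.5),
]

# INVENTORY_STATUS: product_id -> (name, in_stock, on_promotion).
inventory = {
    "P-2010": ("Organic Garlic Bread", True, False),
    "P-2011": ("Parmigiano Reggiano", True, True),
    "P-2015": ("Organic Mixed Greens", True, False),
}

# PURCHASE_HISTORY for CU-3012: product_id -> last_purchased.
last_purchased = {
    "P-2010": date(2025, 8, 20),
    "P-2011": date(2025, 8, 20),
    "P-2015": date(2025, 9, 10),
}


def candidates_for_cart(cart, as_of):
    """Join cart -> affinities -> inventory -> history and keep in-stock candidates."""
    in_cart = set(cart)
    rows = []
    for product_a, product_b, lift in affinities:
        if product_a not in in_cart or product_b in in_cart:
            continue
        name, in_stock, on_promotion = inventory[product_b]
        if not in_stock:
            continue
        last = last_purchased.get(product_b)
        rows.append({
            "product_id": product_b,
            "name": name,
            "lift": lift,
            "on_promotion": on_promotion,
            "days_since_last": (as_of - last).days if last else None,
        })
    return rows


AS_OF = date(2025, 9, 15)  # assumed "today"; not stated on the page
candidate_rows = candidates_for_cart(current_cart, AS_OF)
for row in candidate_rows:
    print(row)
```

Each candidate row now carries signals from three tables besides the cart, which is the kind of multi-table context a static rule cannot see.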
Write your PQL query
Describe what to predict in 2–3 lines — Kumo handles the rest
PREDICT BOOL(ORDERS.PRODUCT_ID, 0, 0, days) FOR EACH CURRENT_CART.SESSION_ID, PRODUCTS.PRODUCT_ID RANK TOP 3
Prediction output
Every entity gets a score, updated continuously
| SESSION_ID | RECOMMENDED_PRODUCT | ADD_PROB | MARGIN_UPLIFT | RANK |
|---|---|---|---|---|
| SS-7701 | Organic Garlic Bread | 0.72 | $2.18 | 1 |
| SS-7701 | Parmigiano Reggiano | 0.65 | $2.94 | 2 |
| SS-7701 | Organic Mixed Greens | 0.51 | $2.88 | 3 |
Understand why
Every prediction includes feature attributions — no black boxes
Session SS-7701 (Cart: pasta, sauce, ground beef)
Predicted: Organic Garlic Bread: 72% add probability
Top contributing features
| Feature | Value | Attribution |
|---|---|---|
| Historical co-purchase with pasta | Monthly buyer | 30% |
| Cart context (Italian meal pattern) | 3 Italian items | 25% |
| Category affinity lift | 3.8x baseline | 20% |
| Customer organic preference | 85% organic | 14% |
| Replenishment timing | 26 days since last | 11% |
Feature attributions are computed automatically for every prediction. No separate tooling required.
PQL Documentation
Learn the Predictive Query Language — SQL-like syntax for defining any prediction task in 2–3 lines.
Python SDK
Integrate Kumo predictions into your pipelines. Train, evaluate, and deploy models programmatically.
Explainability Docs
Understand feature attributions, model evaluation metrics, and how to build trust with stakeholders.
Frequently asked questions
Common questions about basket analysis
What is the difference between basket analysis and product recommendations?
Basket analysis predicts what to add to an active cart in the current session. Product recommendations predict what to show a customer on a homepage, email, or category page based on their profile. Basket analysis operates in real time with cart context as the primary signal. Recommendations operate on longer time horizons with browsing and purchase history as primary signals. Both benefit from relational data, but basket analysis is more time-sensitive and context-dependent.
How much revenue does basket analysis add?
The average e-commerce basket contains 3.2 items, but affinity analysis suggests optimal baskets should contain 4.5-5.0 items (Baymard Institute). Adding 1 item per order at $15-25 average item value translates to $150-250M annually for a retailer processing 10M orders per year. The key is personalization: generic 'frequently bought together' suggestions add 0.3-0.4 items per basket, while personalized relational recommendations add 1.0-1.2 items.
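The arithmetic behind these figures can be checked with a short sketch. The `annual_uplift` and `uplift_range_musd` helpers are hypothetical names, and the calculation assumes every extra item is worth $15-25 and that the uplift applies to all 10M orders.

```python
def annual_uplift(extra_items, item_value, orders_per_year):
    """Incremental annual revenue = extra items per order x value per item x orders."""
    return extra_items * item_value * orders_per_year


ORDERS_PER_YEAR = 10_000_000
ITEM_VALUE_RANGE = (15.0, 25.0)  # dollars per added item, from the text above


def uplift_range_musd(items_low, items_high):
    """Return the (low, high) annual uplift in millions of dollars."""
    low = annual_uplift(items_low, ITEM_VALUE_RANGE[0], ORDERS_PER_YEAR)
    high = annual_uplift(items_high, ITEM_VALUE_RANGE[1], ORDERS_PER_YEAR)
    return low / 1e6, high / 1e6


one_item = uplift_range_musd(1.0, 1.0)
generic = uplift_range_musd(0.3, 0.4)
print("one extra item per order ($M):", one_item)
print("generic rules, 0.3-0.4 items ($M):", generic)
```

Under the same assumptions, generic suggestions adding 0.3-0.4 items would be worth roughly $45-100M, compared with $150-250M for one full extra item per order.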
Can basket analysis work in physical stores?
Yes, through in-store app suggestions, loyalty program notifications, and checkout screen prompts. The model uses the customer's loyalty ID to connect their current basket (from scanned items or self-checkout) to their purchase history and preferences. A grocery loyalty app can push a notification when pasta is scanned: 'Your usual organic garlic bread is on aisle 4.' The relational signal is the same; the delivery channel is different.
Bottom line: Increase average basket size by 1.2 items and basket value by $18 per order, generating $150-250M in incremental annual revenue for a 10M-order retailer.
One Platform. One Model. Infinite Predictions.
KumoRFM
Relational Foundation Model
Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.
For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.