Basket Analysis

What will this customer add to their cart?

Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.

A real-world example

What will this customer add to their cart?

The average e-commerce basket contains 3.2 items, but product affinity analysis suggests optimal baskets should contain 4.5-5.0 items (Baymard Institute). Increasing average basket size by just one item adds $15-25 per order, translating to $150-250M annually for a retailer processing 10M orders per year. Traditional association rules ('customers who bought X also bought Y') are static, ignoring the customer's current session context, inventory availability, margin contribution, and real-time browsing signals. They also suffer from popularity bias, recommending the same high-volume items to everyone.
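
The revenue arithmetic above can be checked in a few lines. This is a minimal sketch: the function name `incremental_revenue` is illustrative, and the inputs are simply the figures quoted in the paragraph.

```python
def incremental_revenue(orders_per_year, extra_items_per_order, item_value):
    """Annual revenue added by extra items per order at a given item value."""
    return orders_per_year * extra_items_per_order * item_value


ORDERS_PER_YEAR = 10_000_000  # retailer size quoted in the text

low = incremental_revenue(ORDERS_PER_YEAR, 1, 15)   # one extra item worth $15
high = incremental_revenue(ORDERS_PER_YEAR, 1, 25)  # one extra item worth $25

print(f"${low / 1e6:.0f}M to ${high / 1e6:.0f}M per year")
```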

Quick answer

Market basket analysis predicts what a customer will add to their cart based on what is already in it, their purchase history, and real-time session context. Traditional association rules (if pasta then sauce) are static and suffer from popularity bias. A relational model connects the current cart to the customer's organic preference, the product's margin contribution, and real-time inventory availability. On the RelBench product recommendation task, KumoRFM scores 76.71 vs 62.44 for the next-best method because it captures these multi-table signals that static rules miss.

Approaches compared

4 ways to solve this problem

1. Association rules (Apriori, FP-Growth)

Mine frequent itemsets from historical baskets. Generate rules like 'if pasta and sauce then garlic bread' with support and confidence scores.

Best for

Quick wins on high-frequency item pairs where the co-purchase pattern is strong and stable across all customers.

Watch out for

No personalization. The rules are the same for every customer. Popularity bias means you always recommend the most common add-ons, missing long-tail items with higher margins. Cannot incorporate inventory or margin data.
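
For readers who have not worked with association rules, the support, confidence, and lift scores mentioned above can be computed directly from basket counts. The sketch below scores a single rule on made-up toy baskets; it does not implement Apriori or FP-Growth themselves, which add efficient search over candidate itemsets.

```python
# Toy baskets, invented for illustration only.
baskets = [
    {"pasta", "sauce", "garlic bread"},
    {"pasta", "sauce"},
    {"pasta", "garlic bread"},
    {"sauce", "cheese"},
    {"milk", "bread"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets containing the antecedent, the share that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


def lift(antecedent, consequent, baskets):
    """Confidence relative to how often the consequent appears overall."""
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)


rule = ({"pasta", "sauce"}, {"garlic bread"})
print("support:   ", round(support(rule[0] | rule[1], baskets), 2))
print("confidence:", round(confidence(*rule, baskets), 2))
print("lift:      ", round(lift(*rule, baskets), 2))
```

Note that nothing in these functions refers to a customer: every shopper with pasta and sauce in the cart gets the same rule, which is the personalization gap described above.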

2. Collaborative filtering on basket data

Treat baskets as implicit feedback and use matrix factorization to find latent product associations beyond simple co-occurrence.

Best for

E-commerce platforms with large basket histories where co-purchase patterns are complex and non-obvious.

Watch out for

Ignores the current session context. A customer browsing premium organic items should not see the same recommendations as a budget shopper, but collaborative filtering treats them identically if their basket contents overlap.
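
The limitation is easy to see in code. The sketch below uses item-to-item cosine similarity over basket membership, a simpler neighbor of matrix factorization swapped in here to keep the example short; the baskets and the `recommend` helper are invented for illustration.

```python
from collections import defaultdict
from math import sqrt

# Toy baskets, invented for illustration only.
baskets = [
    {"pasta", "sauce", "garlic bread"},
    {"pasta", "sauce", "parmesan"},
    {"pasta", "garlic bread"},
    {"greens", "parmesan"},
]


def item_vectors(baskets):
    """Map each item to the set of basket indices that contain it."""
    vectors = defaultdict(set)
    for idx, basket in enumerate(baskets):
        for item in basket:
            vectors[item].add(idx)
    return vectors


def cosine(a, b):
    """Cosine similarity between two binary vectors stored as index sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def recommend(cart, baskets, top_k=3):
    """Score each item outside the cart by its summed similarity to the cart items."""
    vectors = item_vectors(baskets)
    scores = {}
    for candidate, cand_vec in vectors.items():
        if candidate in cart:
            continue
        scores[candidate] = sum(
            cosine(vectors[item], cand_vec) for item in cart if item in vectors
        )
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_k]


print(recommend({"pasta", "sauce"}, baskets, top_k=2))
```

The only input is the cart, so a premium organic shopper and a budget shopper holding the same two items receive an identical list.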

3. Gradient-boosted models with cart features

Train XGBoost models with features derived from current cart contents, customer history, and product attributes to predict next-item probability.

Best for

Teams that can invest in feature engineering to capture cart composition, customer segment, and product attribute interactions.

Watch out for

Feature engineering for cart context is complex. You need to manually encode meal patterns, brand preferences, and replenishment timing. SAP SALT shows a 75% accuracy ceiling for tabular approaches.
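
To give a sense of what that manual encoding involves, here is a hedged sketch of one feature row for a (cart, candidate product) pair. Everything in it is an assumption made for illustration: the field names, the `CADENCE_DAYS` mapping, the cuisine tags, and the reference date, which is chosen so the gap since the last purchase is 26 days. A real pipeline would feed many such rows to a gradient-boosted model.

```python
from datetime import date

# Assumed mapping from purchase-frequency labels to days.
CADENCE_DAYS = {"Weekly": 7, "Monthly": 30}


def candidate_features(cart, purchase, today):
    """Flatten cart context and purchase history into one feature row."""
    days_since_last = (today - purchase["last_purchased"]).days
    organic_items = sum(1 for item in cart if item["organic"])
    return {
        "cart_size": len(cart),
        "italian_items": sum(1 for item in cart if item["cuisine"] == "italian"),
        "cart_organic_share": organic_items / len(cart) if cart else 0.0,
        "days_since_last": days_since_last,
        # Values near or above 1.0 suggest the item is due for replenishment.
        "replenishment_ratio": days_since_last / CADENCE_DAYS[purchase["frequency"]],
    }


# Hypothetical cart; cuisine and organic tags are hand-assigned.
cart = [
    {"name": "De Cecco Spaghetti", "cuisine": "italian", "organic": False},
    {"name": "Rao's Marinara Sauce", "cuisine": "italian", "organic": False},
    {"name": "Organic Ground Beef 1lb", "cuisine": "italian", "organic": True},
]
garlic_bread = {"frequency": "Monthly", "last_purchased": date(2025, 8, 20)}

features = candidate_features(cart, garlic_bread, today=date(2025, 9, 15))
print(features)
```

Each of these columns has to be designed, computed, and kept in sync with the source tables by hand, which is the cost this approach carries.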

4. KumoRFM (relational foundation model)

Connects current cart, customer history, product affinities, inventory status, and margin data into a relational graph. Predicts next-item probability with personalization and business constraints.

Best for

Retailers who want personalized, margin-aware basket recommendations that account for inventory availability and customer preferences simultaneously.

Watch out for

Most impactful when you have rich customer history alongside cart data. For anonymous visitors with no history, it falls back to association-rule-level performance.

Key metric: RelBench product recommendation: KumoRFM 76.71 vs next-best 62.44. Personalized basket suggestions add 1.0-1.2 items vs 0.3-0.4 for generic rules.

Why relational data changes the answer

Association rules see that pasta and garlic bread appear together in 42% of baskets. They cannot see that this specific customer buys organic garlic bread monthly, last purchased it 26 days ago (due for replenishment), and has an 85% organic product preference. These signals live in the purchase_history and customer_preferences tables, not in the current cart.

A relational model joins the current cart to the customer's behavioral graph and surfaces recommendations that are both contextually relevant and personalized. It ranks organic garlic bread above conventional because of the customer's preference, ranks parmigiano above mozzarella because the cart pattern matches an Italian meal, and excludes items that are out of stock. The result is a 1.2-item average basket increase vs 0.4 items from static association rules, because personalized recommendations convert at 3x the rate of generic ones.

Static basket rules are like a waiter who always suggests garlic bread with pasta because it is the most popular side. A relational basket model is like a waiter who remembers you always order the organic garlic bread, notices you have not been in for a month (time for your usual), sees that the parmigiano you like is on special today, and knows the tiramisu is out. Same restaurant, very different upsell.
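
The difference between the two waiters can be sketched as a re-ranking step. The code below is a hand-written heuristic, not KumoRFM's learned model: the `rerank` function, the organic boost formula, and the candidate affinity values are all assumptions chosen to mirror the example (an 85% organic preference and an out-of-stock tiramisu).

```python
def rerank(candidates, organic_preference, top_k=3):
    """Drop out-of-stock items, then weight base affinity by personal fit."""
    ranked = []
    for candidate in candidates:
        if not candidate["in_stock"]:
            continue  # never recommend what cannot be fulfilled
        # Hypothetical weighting: organic items get a boost proportional
        # to the customer's share of organic purchases.
        fit = 1.0 + organic_preference if candidate["organic"] else 1.0
        ranked.append((candidate["affinity"] * fit, candidate["name"]))
    # Highest score first; ties broken alphabetically for a stable order.
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in ranked[:top_k]]


# Illustrative candidates; affinity values are made up.
candidates = [
    {"name": "Conventional Garlic Bread", "affinity": 0.42, "organic": False, "in_stock": True},
    {"name": "Organic Garlic Bread", "affinity": 0.42, "organic": True, "in_stock": True},
    {"name": "Tiramisu", "affinity": 0.50, "organic": False, "in_stock": False},
]

print(rerank(candidates, organic_preference=0.85))
```

With equal base affinity, the organic option wins for this customer and the tiramisu never appears, even though it has the highest raw affinity. A static rule would have ranked on affinity alone.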

How KumoRFM solves this

Relational intelligence built for retail and e-commerce data

Kumo builds a relational graph connecting the current cart contents, customer purchase history, browsing session, product attributes, inventory levels, and margin data. The model predicts in real time that a customer with pasta and marinara sauce in their cart will add garlic bread (72% probability) and parmesan cheese (65% probability), and that recommending these items at checkout will generate $8.40 in incremental margin. The graph captures that this specific customer prefers organic products, so it ranks the organic garlic bread above the conventional option.

From data to predictions

See the full pipeline in action

Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.

1. Your data

The relational tables Kumo learns from

CURRENT_CART

| session_id | customer_id | product_id | product_name | price |
|---|---|---|---|---|
| SS-7701 | CU-3012 | P-2001 | De Cecco Spaghetti | $3.49 |
| SS-7701 | CU-3012 | P-2002 | Rao's Marinara Sauce | $8.99 |
| SS-7701 | CU-3012 | P-2003 | Organic Ground Beef 1lb | $7.99 |

PURCHASE_HISTORY

| customer_id | product_id | category | frequency | last_purchased |
|---|---|---|---|---|
| CU-3012 | P-2010 | Organic Garlic Bread | Monthly | 2025-08-20 |
| CU-3012 | P-2011 | Parmigiano Reggiano | Monthly | 2025-08-20 |
| CU-3012 | P-2015 | Organic Mixed Greens | Weekly | 2025-09-10 |

PRODUCT_AFFINITIES

| product_a | product_b | co_purchase_rate | lift | category_pair |
|---|---|---|---|---|
| P-2001 | P-2010 | 42% | 3.8 | Pasta + Bread |
| P-2002 | P-2011 | 38% | 4.2 | Sauce + Cheese |
| P-2001 | P-2015 | 22% | 1.5 | Pasta + Salad |

INVENTORY_STATUS

| product_id | name | in_stock | margin_pct | on_promotion |
|---|---|---|---|---|
| P-2010 | Organic Garlic Bread | True | 42% | False |
| P-2011 | Parmigiano Reggiano | True | 35% | True |
| P-2015 | Organic Mixed Greens | True | 48% | False |
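
To show how these tables connect through their keys, the sketch below loads the sample rows into an in-memory SQLite database and joins the cart to affinities and inventory. This is a plain SQL lookup for illustration, not how Kumo builds or queries its graph; the lowercase table names and the choice of SQLite are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE current_cart (session_id TEXT, customer_id TEXT, product_id TEXT);
CREATE TABLE product_affinities (product_a TEXT, product_b TEXT,
                                 co_purchase_rate REAL, lift REAL);
CREATE TABLE inventory_status (product_id TEXT, name TEXT, in_stock INTEGER,
                               margin_pct REAL, on_promotion INTEGER);
INSERT INTO current_cart VALUES
  ('SS-7701', 'CU-3012', 'P-2001'),
  ('SS-7701', 'CU-3012', 'P-2002'),
  ('SS-7701', 'CU-3012', 'P-2003');
INSERT INTO product_affinities VALUES
  ('P-2001', 'P-2010', 0.42, 3.8),
  ('P-2002', 'P-2011', 0.38, 4.2),
  ('P-2001', 'P-2015', 0.22, 1.5);
INSERT INTO inventory_status VALUES
  ('P-2010', 'Organic Garlic Bread', 1, 0.42, 0),
  ('P-2011', 'Parmigiano Reggiano', 1, 0.35, 1),
  ('P-2015', 'Organic Mixed Greens', 1, 0.48, 0);
""")

# Cart item -> affinity pair -> in-stock candidate, strongest co-purchase first.
rows = conn.execute(
    """
    SELECT i.name, a.co_purchase_rate, a.lift
    FROM current_cart AS c
    JOIN product_affinities AS a ON a.product_a = c.product_id
    JOIN inventory_status AS i ON i.product_id = a.product_b
    WHERE c.session_id = ? AND i.in_stock = 1
    ORDER BY a.co_purchase_rate DESC
    """,
    ("SS-7701",),
).fetchall()

for name, rate, lift in rows:
    print(f"{name}: co-purchase {rate:.0%}, lift {lift}")
```

The join recovers the three candidates in affinity order, but it still ignores PURCHASE_HISTORY, so it cannot tell which of them this particular customer is due to buy.
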

2. Write your PQL query

Describe what to predict in 2–3 lines — Kumo handles the rest

PQL
PREDICT BOOL(ORDERS.PRODUCT_ID, 0, 0, days)
FOR EACH CURRENT_CART.SESSION_ID, PRODUCTS.PRODUCT_ID
RANK TOP 3

3. Prediction output

Every entity gets a score, updated continuously

| SESSION_ID | RECOMMENDED_PRODUCT | ADD_PROB | MARGIN_UPLIFT | RANK |
|---|---|---|---|---|
| SS-7701 | Organic Garlic Bread | 0.72 | $2.18 | 1 |
| SS-7701 | Parmigiano Reggiano | 0.65 | $2.94 | 2 |
| SS-7701 | Organic Mixed Greens | 0.51 | $2.88 | 3 |

4. Understand why

Every prediction includes feature attributions — no black boxes

Session SS-7701 (Cart: pasta, sauce, ground beef)

Predicted: Organic Garlic Bread: 72% add probability

Top contributing features

| Feature | Value | Attribution |
|---|---|---|
| Historical co-purchase with pasta | Monthly buyer | 30% |
| Cart context (Italian meal pattern) | 3 Italian items | 25% |
| Category affinity lift | 3.8x baseline | 20% |
| Customer organic preference | 85% organic | 14% |
| Replenishment timing | 26 days since last | 11% |

Feature attributions are computed automatically for every prediction. No separate tooling required.

Frequently asked questions

Common questions about basket analysis

What is the difference between basket analysis and product recommendations?

Basket analysis predicts what to add to an active cart in the current session. Product recommendations predict what to show a customer on a homepage, email, or category page based on their profile. Basket analysis operates in real time with cart context as the primary signal. Recommendations operate on longer time horizons with browsing and purchase history as primary signals. Both benefit from relational data, but basket analysis is more time-sensitive and context-dependent.

How much revenue does basket analysis add?

The average e-commerce basket contains 3.2 items, but affinity analysis suggests optimal baskets should contain 4.5-5.0 items (Baymard Institute). Adding 1 item per order at $15-25 average item value translates to $150-250M annually for a retailer processing 10M orders per year. The key is personalization: generic 'frequently bought together' suggestions add 0.3-0.4 items per basket, while personalized relational recommendations add 1.0-1.2 items.

Can basket analysis work in physical stores?

Yes, through in-store app suggestions, loyalty program notifications, and checkout screen prompts. The model uses the customer's loyalty ID to connect their current basket (from scanned items or self-checkout) to their purchase history and preferences. A grocery loyalty app can push a notification when pasta is scanned: 'Your usual organic garlic bread is on aisle 4.' The relational signal is the same; the delivery channel is different.

Bottom line: Increase average basket size by 1.2 items and basket value by $18 per order, generating $150-250M in incremental annual revenue for a 10M-order retailer.

Topics covered

basket analysis AI, market basket prediction, cart recommendation engine, cross-sell retail AI, graph neural network basket, KumoRFM, relational deep learning retail, add-on product prediction, basket size optimization, retail cross-sell analytics

One Platform. One Model. Infinite Predictions.

KumoRFM

Relational Foundation Model

Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.

For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.
