Classification · Cold Start

New Product Launch Prediction

Which customers will buy this new product with zero purchase history?

Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.

A real-world example

Which customers will buy this new product with zero purchase history?

Retailers launch 25,000-50,000 new SKUs annually, but 70-80% fail to meet sales targets in the first 90 days (Nielsen). Traditional demand models require 8-12 weeks of sales history before making accurate predictions, leaving the critical launch window unoptimized. Overstocking a failed product wastes $50-200K per SKU in inventory carrying and markdown costs. Understocking a hit product forfeits $200-500K in lost revenue during the peak-demand window. The cold-start problem costs large retailers $500M-$1B annually in misallocated launch inventory.

Quick answer

The cold-start problem is the hardest challenge in retail prediction: how do you forecast demand for a product with zero sales history? Traditional time-series models cannot. They need 8-12 weeks of data before making accurate predictions. Relational models solve this by transferring learned demand patterns from similar products through the graph. When a new organic protein bar launches, the model connects its attributes (organic, high-protein, $3.49) to customers who buy similar products and stores where health-food trends are strong. On the RelBench product demand task, KumoRFM scores 76.71 against 62.44 for the next-best approach, and the advantage is larger still in cold-start settings because relational models do not require per-item training data.

Approaches compared

4 ways to solve this problem

1. Buyer committee and analog planning

Merchants select 2-3 analog products and use their historical sales as a proxy for the new item. The most common approach in retail buying offices.

Best for

Small product launches where a senior buyer has deep category expertise and can identify the right analog products.

Watch out for

Highly subjective. Two buyers pick different analogs and get 2x different forecasts. No systematic way to weight attributes, store-level trends, or customer segment affinity. Nielsen reports 70-80% of new SKUs miss targets.

2. Attribute-based regression models

Train a regression model on product attributes (category, price point, brand tier) to predict first-week demand based on how similar attribute combinations performed historically.

Best for

Retailers with structured product attribute data and a history of launches in similar categories.

Watch out for

Treats all stores identically and ignores local customer base composition. A new organic snack will sell 5x more at a health-food-focused urban store than a rural conventional grocer, but attribute regression cannot see store-level context.

3. XGBoost with product and store features

Gradient-boosted models trained on historical launch data with features for product attributes, store demographics, similar-product velocity, and seasonal timing.

Best for

Analytics teams that can build launch-specific feature pipelines from product master data and store performance tables.

Watch out for

Feature engineering for cold-start is especially hard because you are trying to encode similarity relationships manually. Which attributes matter? How do you weight brand vs. price point vs. category? SAP SALT shows a 75% accuracy ceiling.

4. KumoRFM (relational foundation model)

Connects new products to existing products, customer preferences, and store trends through the relational graph. Transfers demand patterns from similar products without manual analog selection.

Best for

Retailers launching 1,000+ new SKUs per year who need automated, store-level demand forecasts without waiting for sales data to accumulate.

Watch out for

Accuracy improves with the richness of the product attribute graph. If new products have minimal attribute data (just name and price), the graph has less to connect to.

Key metric: RelBench product demand: KumoRFM 76.71 vs next-best 62.44. Cold-start advantage: store-level forecasts from day zero vs 8-12 week blind period.
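Approach 1 can be sketched in a few lines to show why it is sensitive to analog choice. This is a minimal illustration, assuming the weekly averages from the SIMILAR_PRODUCTS table further down the page; the function name `analog_forecast` and the two buyers' picks are hypothetical.

```python
from statistics import mean

# Weekly unit averages for existing products (values from the SIMILAR_PRODUCTS table).
WEEKLY_UNITS_AVG = {"P-5801": 420, "P-5802": 380, "P-5803": 310}

def analog_forecast(analog_ids):
    """Forecast weekly units for a new item as the plain mean of its chosen analogs."""
    return mean(WEEKLY_UNITS_AVG[pid] for pid in analog_ids)

# Hypothetical picks: two buyers choose different analogs for the same new product.
buyer_a = analog_forecast(["P-5801", "P-5802"])  # the two strongest sellers
buyer_b = analog_forecast(["P-5803"])            # a single, slower analog

print(buyer_a, buyer_b, round(buyer_a / buyer_b, 2))
```

The same product gets two different forecasts purely because of who chose the analogs, and nothing in the calculation accounts for store or customer context.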

Why relational data changes the answer

The cold-start problem exists because traditional models need a product's own sales history to make predictions. A relational model does not, because it learns demand patterns from the graph structure connecting products, customers, and stores. When Product P-6001 (new organic protein bar) enters the graph, it connects to existing organic snacks through shared attributes, to health-conscious customers through their purchase patterns, and to specific stores through their category performance data.

The model transfers learned demand signals through these connections. It predicts 285 units in the first week at Store S-14 because that store has high organic snack velocity, 62% of its customers have medium-to-high organic affinity, and the product's attribute profile (organic, high-protein, gluten-free, $3.49) is 92% similar to existing top sellers. Store S-37 gets a prediction of only 55 units because its health-food index is 4.2/10 and similar products sell slowly there. This store-level granularity from day zero is impossible with models that require per-item sales history.
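One way to picture a product "entering the graph" is as a set of typed edges added for the new node, followed by a single hop of signal transfer from its neighbors. The sketch below is illustrative only and is not Kumo's internal representation: the edge-type names (`similar_to`, `stocked_at`), the helper functions, and the mapping from customer overlap to a numeric weight are assumptions made for this example. The IDs and weekly averages come from the tables shown later on this page.

```python
from collections import defaultdict

# Assumed mapping from the customer_overlap label to an edge weight.
OVERLAP_WEIGHT = {"High": 1.0, "Medium": 0.5, "Low": 0.0}

# product_id -> (weekly_units_avg, customer_overlap), from SIMILAR_PRODUCTS
SIMILAR = {"P-5801": (420, "High"), "P-5802": (380, "High"), "P-5803": (310, "Medium")}

graph = defaultdict(list)  # node -> list of (edge_type, neighbor, weight)

def add_edge(src, edge_type, dst, weight=1.0):
    graph[src].append((edge_type, dst, weight))

# The new product has no sales rows, but it still gets edges on day zero.
for pid, (_, overlap) in SIMILAR.items():
    add_edge("P-6001", "similar_to", pid, OVERLAP_WEIGHT[overlap])
for store_id in ("S-14", "S-22", "S-37"):
    add_edge("P-6001", "stocked_at", store_id)

def neighbors(node, edge_type):
    """Return (neighbor, weight) pairs reachable from node over one edge type."""
    return [(dst, w) for et, dst, w in graph[node] if et == edge_type]

def transferred_velocity(node):
    """Weighted mean of similar products' weekly units: one hop of transfer."""
    pairs = neighbors(node, "similar_to")
    total_weight = sum(w for _, w in pairs)
    return sum(SIMILAR[dst][0] * w for dst, w in pairs) / total_weight

print(transferred_velocity("P-6001"))
```

A learned model would fit these weights from data and aggregate over several hops and edge types (customers, stores, attributes); the fixed weights here only show the mechanics of borrowing signal from connected nodes.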

Launching a new product without relational data is like a real estate agent pricing a house that was just built on a street where no house has ever sold. With only the house's features (3 bedrooms, 2 baths), they are guessing. A relational model is like an agent who knows the neighborhood demographics, the prices of similar houses across town, which buyers have been searching for this type of home, and which streets are trending up. They can price the house accurately on day one because the information lives in the connections, not in the house's own sales history.

How KumoRFM solves this

Relational intelligence built for retail and e-commerce data

Kumo does not need sales history for the new product because it learns from the relational graph connecting product attributes, similar products, customer preferences, and market signals. When a new organic protein bar (P-6001) launches, Kumo's graph neural network recognizes its attributes (organic, high-protein, $3.49 price point) and connects them to customers who buy similar products. Customer CU-3012 has bought 6 organic snack products in the past 90 days and lives near a store where health-food trends are strong. The model predicts first-week demand at each store without any prior sales data.

From data to predictions

See the full pipeline in action

Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.

1. Your data

The relational tables Kumo learns from

NEW_PRODUCT

| product_id | name | category | attributes | price | launch_date |
| --- | --- | --- | --- | --- | --- |
| P-6001 | Peak Organic Protein Bar | Snacks | Organic, High-Protein, Gluten-Free | $3.49 | 2025-10-01 |

SIMILAR_PRODUCTS

| product_id | name | category | weekly_units_avg | customer_overlap |
| --- | --- | --- | --- | --- |
| P-5801 | RX Bar Protein | Snacks | 420 | High |
| P-5802 | Kind Protein Bar | Snacks | 380 | High |
| P-5803 | Clif Organic Bar | Snacks | 310 | Medium |

CUSTOMER_PREFERENCES

| customer_id | organic_affinity | protein_purchases_90d | snack_spend_90d |
| --- | --- | --- | --- |
| CU-3012 | High | 12 | $84.50 |
| CU-3045 | Medium | 4 | $32.00 |
| CU-3078 | Low | 0 | $8.50 |

STORE_TRENDS

| store_id | health_food_index | organic_growth_yoy | similar_product_velocity |
| --- | --- | --- | --- |
| S-14 | 8.4 | +22% | High |
| S-22 | 6.1 | +12% | Medium |
| S-37 | 4.2 | +5% | Low |

2. Write your PQL query

Describe what to predict in 2–3 lines — Kumo handles the rest

PQL
PREDICT BOOL(ORDERS.PRODUCT_ID = 'P-6001', 0, 7, days)
FOR EACH CUSTOMERS.CUSTOMER_ID
WHERE CUSTOMER_PREFERENCES.ORGANIC_AFFINITY IN ('High', 'Medium')
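
To make the prediction target concrete, the sketch below computes the same yes/no label by hand, under one reading of the query: for each customer with High or Medium organic affinity, did at least one order for P-6001 fall within 7 days of an anchor date? It is not Kumo's implementation. The `orders` rows, the anchor date, the window boundary handling, and the function name `label_customers` are hypothetical; the customer affinities come from the CUSTOMER_PREFERENCES table.

```python
from datetime import date, timedelta

# organic_affinity per customer, from CUSTOMER_PREFERENCES
AFFINITY = {"CU-3012": "High", "CU-3045": "Medium", "CU-3078": "Low"}

# Hypothetical order rows: (customer_id, product_id, order_date)
orders = [
    ("CU-3012", "P-6001", date(2025, 10, 3)),
    ("CU-3045", "P-5801", date(2025, 10, 2)),   # different product
    ("CU-3045", "P-6001", date(2025, 10, 20)),  # outside the 7-day window
    ("CU-3078", "P-6001", date(2025, 10, 4)),   # excluded by the affinity filter
]

def label_customers(orders, anchor, product_id="P-6001", horizon_days=7):
    """Return {customer_id: bool} for customers that pass the affinity filter."""
    end = anchor + timedelta(days=horizon_days)
    eligible = [c for c, a in AFFINITY.items() if a in ("High", "Medium")]
    labels = {c: False for c in eligible}
    for customer_id, pid, order_date in orders:
        in_window = anchor <= order_date < end  # boundary handling is an assumption
        if customer_id in labels and pid == product_id and in_window:
            labels[customer_id] = True
    return labels

labels = label_customers(orders, anchor=date(2025, 10, 1))
print(labels)
```

The model's task is to estimate the probability of this label for each customer before the 7-day window has happened.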

3. Prediction output

Every entity gets a score, updated continuously

| STORE_ID | PREDICTED_WEEK1_UNITS | TARGET_CUSTOMERS | STOCK_REC | CONFIDENCE |
| --- | --- | --- | --- | --- |
| S-14 | 285 | 1,420 | 350 | High |
| S-22 | 140 | 680 | 180 | Medium |
| S-37 | 55 | 210 | 75 | Medium |
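
In every row, STOCK_REC sits above the predicted units. A quick calculation using only the figures in the table shows the size of that gap. Reading STOCK_REC as the forecast plus a safety buffer is an assumption, since the page does not define the column, and the helper name `buffer_pct` is hypothetical.

```python
# (store_id, predicted_week1_units, stock_rec), from the prediction output table
rows = [("S-14", 285, 350), ("S-22", 140, 180), ("S-37", 55, 75)]

def buffer_pct(predicted, stock_rec):
    """Extra stock recommended beyond the forecast, as a percentage of the forecast."""
    return 100 * (stock_rec - predicted) / predicted

for store_id, predicted, stock_rec in rows:
    print(f"{store_id}: +{buffer_pct(predicted, stock_rec):.1f}% over forecast")
```

The relative gap widens as the forecast shrinks, from about 23% at S-14 to about 36% at S-37.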

4. Understand why

Every prediction includes feature attributions — no black boxes

New Product P-6001 (Peak Organic Protein Bar) at Store S-14

Predicted: 285 units in the first week

Top contributing features

| Feature | Value | Attribution |
| --- | --- | --- |
| Similar product velocity at this store | High | 28% |
| Customer base organic affinity | 62% High/Med | 25% |
| Attribute similarity to top sellers | 92% match | 21% |
| Store health-food trend index | 8.4/10 | 15% |
| Price point within target range | $3.49 (sweet spot) | 11% |

Feature attributions are computed automatically for every prediction. No separate tooling required.

Frequently asked questions

Common questions about new product launch prediction

How do you predict demand for a product with no sales history?

You transfer learned demand patterns from similar products through a relational graph. The model connects the new product's attributes (category, price, brand tier, ingredients) to existing products and their historical performance. It also connects to the customer base (who buys similar items?) and to stores (which locations over-index on this category?). This graph-based transfer learning produces store-level demand forecasts from day zero, eliminating the 8-12 week blind period that traditional models require.

What percentage of new product launches fail in retail?

Nielsen reports that 70-80% of new SKUs fail to meet sales targets in the first 90 days. A major driver is inventory misallocation: overstocking failures wastes $50-200K per SKU in carrying and markdown costs, while understocking hits forfeits $200-500K in lost revenue during peak demand. For large retailers launching 25,000-50,000 new SKUs annually, misallocation costs $500M-$1B. Better day-one forecasting addresses both sides: you stock less of likely failures and more of likely hits.

How quickly does a new product launch model improve with actual sales data?

Relational models produce their best cold-start predictions from day zero, then refine as real sales data flows in. After 1 week of actual sales, the model blends graph-based predictions with observed velocity, typically improving accuracy by 15-20%. By week 4, the model relies primarily on actual data with graph signals as a secondary input. The critical value is in the first 2-4 weeks, where traditional models have nothing and relational models are already 40-60% more accurate than manual analog planning.
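
The blending described in the last answer can be sketched as a weighted average whose weight shifts from the graph-based forecast to observed sales over the first four weeks. The linear schedule, the 0.2 floor, the function names, and the observed figure of 310 units are assumptions for illustration; the page does not specify how the blend is computed.

```python
def prior_weight(weeks_of_sales, handoff_weeks=4, floor=0.2):
    """Weight on the graph-based forecast: 1.0 at launch, falling linearly to a floor.

    The floor keeps the graph signal as a secondary input after the handoff (assumed).
    """
    progress = min(weeks_of_sales, handoff_weeks) / handoff_weeks
    return 1.0 - (1.0 - floor) * progress

def blended_forecast(graph_forecast, observed_weekly_avg, weeks_of_sales):
    """Blend the day-zero graph forecast with observed weekly sales."""
    if weeks_of_sales == 0 or observed_weekly_avg is None:
        return float(graph_forecast)  # day zero: nothing observed yet
    w = prior_weight(weeks_of_sales)
    return w * graph_forecast + (1 - w) * observed_weekly_avg

# Store S-14: the graph forecast of 285 units is from the page; 310 observed is hypothetical.
for week in range(5):
    print(week, round(blended_forecast(285, 310, week), 1))
```

With these assumed settings the forecast moves from 285 at launch toward the observed 310, reaching 305 by week 4, when observed sales carry most of the weight.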

Bottom line: Accurately forecast first-week demand for new products with zero sales history, reducing launch inventory misallocation by 40-60% and recovering $500M-$1B in industry-wide launch losses.

Topics covered

new product launch prediction, cold start problem AI, product launch demand forecasting, new SKU prediction, graph neural network cold start, KumoRFM, relational deep learning retail, product launch analytics, zero-shot product prediction, new item demand forecasting

One Platform. One Model. Infinite Predictions.

KumoRFM

Relational Foundation Model

Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.

For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.
