New Product Launch Prediction
“Which customers will buy this new product with zero purchase history?”
Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.

A real-world example
Retailers launch 25,000-50,000 new SKUs annually, but 70-80% fail to meet sales targets in the first 90 days (Nielsen). Traditional demand models require 8-12 weeks of sales history before making accurate predictions, leaving the critical launch window unoptimized. Overstocking a failed product wastes $50-200K per SKU in inventory carrying and markdown costs. Understocking a hit product forfeits $200-500K in lost revenue during the peak-demand window. The cold-start problem costs large retailers $500M-$1B annually in misallocated launch inventory.
Quick answer
The cold-start problem is the hardest challenge in retail prediction: how do you forecast demand for a product with zero sales history? Traditional time-series models cannot, because they need 8-12 weeks of data before making accurate predictions. Relational models solve this by transferring learned demand patterns from similar products through the graph. When a new organic protein bar launches, the model connects its attributes (organic, high-protein, $3.49) to customers who buy similar products and to stores where health-food trends are strong. On the RelBench product demand task, KumoRFM scores 76.71 vs 62.44 for the next-best approach, and the cold-start advantage is even larger because relational models do not require per-item training data.
Approaches compared
4 ways to solve this problem
1. Buyer committee and analog planning
Merchants select 2-3 analog products and use their historical sales as a proxy for the new item. The most common approach in retail buying offices.
Best for
Small product launches where a senior buyer has deep category expertise and can identify the right analog products.
Watch out for
Highly subjective. Two buyers can pick different analogs and arrive at forecasts that differ by 2x. There is no systematic way to weight attributes, store-level trends, or customer segment affinity. Nielsen reports that 70-80% of new SKUs miss their sales targets.
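To make the subjectivity concrete, here is a minimal sketch of analog planning, assuming the forecast is simply the mean weekly velocity of the chosen analogs. The velocities are taken from the SIMILAR_PRODUCTS table further down; the `analog_forecast` helper and its `scale` adjustment are hypothetical names used only for illustration.

```python
from statistics import mean

# Average weekly units for candidate analogs (from the SIMILAR_PRODUCTS table).
weekly_units_avg = {
    "P-5801": 420,
    "P-5802": 380,
    "P-5803": 310,
}


def analog_forecast(analog_ids, scale=1.0):
    """Forecast weekly units as the scaled mean of the chosen analogs.

    `scale` is a hypothetical merchant adjustment, e.g. 0.5 for a cautious launch.
    """
    return scale * mean(weekly_units_avg[pid] for pid in analog_ids)


# Two buyers, same new product, different analog choices.
buyer_a = analog_forecast(["P-5801", "P-5802"])  # the two fastest sellers
buyer_b = analog_forecast(["P-5803"])            # only the organic analog

print(f"Buyer A: {buyer_a:.0f} units/week, Buyer B: {buyer_b:.0f} units/week")
```

Nothing in the method says which buyer is right, and neither forecast varies by store.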
2. Attribute-based regression models
Train a regression model on product attributes (category, price point, brand tier) to predict first-week demand based on how similar attribute combinations performed historically.
Best for
Retailers with structured product attribute data and a history of launches in similar categories.
Watch out for
Treats all stores identically and ignores local customer base composition. A new organic snack can sell 5x more at a health-food-focused urban store than at a rural conventional grocer, but attribute regression cannot see store-level context.
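A stripped-down sketch of the idea, assuming a single attribute (price) and an ordinary least-squares line fitted by hand. The launch history below is invented for illustration, and a real model would include category, brand tier, and other attributes as additional features.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical past launches in the same category: (price, first-week units).
past_prices = [2.00, 3.00, 4.00, 5.00]
past_first_week_units = [500, 400, 300, 200]

slope, intercept = fit_line(past_prices, past_first_week_units)

# The new product's only input is its own attribute, so the forecast
# is identical for every store.
new_price = 3.49
forecast = slope * new_price + intercept
store_forecasts = {store: forecast for store in ("S-14", "S-22", "S-37")}
print(store_forecasts)
```

Because no store feature enters the fit, S-14 and S-37 receive the same number, which is exactly the limitation described above.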
3. XGBoost with product and store features
Gradient-boosted models trained on historical launch data with features for product attributes, store demographics, similar-product velocity, and seasonal timing.
Best for
Analytics teams that can build launch-specific feature pipelines from product master data and store performance tables.
Watch out for
Feature engineering for cold-start is especially hard because you are trying to encode similarity relationships manually. Which attributes matter? How do you weight brand vs. price point vs. category? SAP SALT shows a 75% accuracy ceiling.
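The sketch below shows what such a launch feature pipeline might look like before any model is trained. The store values come from the STORE_TRENDS table further down; the ordinal encoding, the attribute flag list (including "Vegan"), and the interaction term are all hand-picked assumptions, which is the point: every one of them is a manual decision.

```python
# Hand-chosen ordinal encoding for a categorical store signal (an assumption).
VELOCITY_RANK = {"Low": 0, "Medium": 1, "High": 2}

# Attribute flags the team decided to encode; "Vegan" is a hypothetical extra.
ATTRIBUTE_FLAGS = ["Organic", "High-Protein", "Gluten-Free", "Vegan"]

product = {
    "product_id": "P-6001",
    "price": 3.49,
    "attributes": {"Organic", "High-Protein", "Gluten-Free"},
}

stores = [
    {"store_id": "S-14", "health_food_index": 8.4, "organic_growth_yoy": 0.22,
     "similar_product_velocity": "High"},
    {"store_id": "S-22", "health_food_index": 6.1, "organic_growth_yoy": 0.12,
     "similar_product_velocity": "Medium"},
    {"store_id": "S-37", "health_food_index": 4.2, "organic_growth_yoy": 0.05,
     "similar_product_velocity": "Low"},
]


def feature_row(product, store):
    """Flatten one (product, store) pair into a single feature dictionary."""
    row = {
        "is_" + flag.lower().replace("-", "_"): int(flag in product["attributes"])
        for flag in ATTRIBUTE_FLAGS
    }
    row["price"] = product["price"]
    row["health_food_index"] = store["health_food_index"]
    row["organic_growth_yoy"] = store["organic_growth_yoy"]
    row["similar_velocity_rank"] = VELOCITY_RANK[store["similar_product_velocity"]]
    # Hand-built interaction: organic products should matter more in health-focused stores.
    row["organic_x_health_index"] = row["is_organic"] * store["health_food_index"]
    return row


rows = {s["store_id"]: feature_row(product, s) for s in stores}
print(rows["S-14"])
```

Rows like these would then be fed to a gradient-boosted regressor trained on historical launches.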
4. KumoRFM (relational foundation model)
Connects new products to existing products, customer preferences, and store trends through the relational graph. Transfers demand patterns from similar products without manual analog selection.
Best for
Retailers launching 1,000+ new SKUs per year who need automated, store-level demand forecasts without waiting for sales data to accumulate.
Watch out for
Accuracy improves with the richness of the product attribute graph. If new products have minimal attribute data (just name and price), the graph has less to connect to.
Key metric: RelBench product demand: KumoRFM 76.71 vs next-best 62.44. Cold-start advantage: store-level forecasts from day zero vs 8-12 week blind period.
Why relational data changes the answer
The cold-start problem exists because traditional models need a product's own sales history to make predictions. A relational model does not, because it learns demand patterns from the graph structure connecting products, customers, and stores. When Product P-6001 (new organic protein bar) enters the graph, it connects to existing organic snacks through shared attributes, to health-conscious customers through their purchase patterns, and to specific stores through their category performance data.
The model transfers learned demand signals through these connections. It predicts 285 units in the first week at Store S-14 because that store has high organic snack velocity, 62% of its customers have medium-to-high organic affinity, and the product's attribute profile (organic, high-protein, gluten-free, $3.49) is 92% similar to existing top sellers. Store S-37 gets a prediction of only 55 units because its health-food index is 4.2/10 and similar products sell slowly there. This store-level granularity from day zero is impossible with models that require per-item sales history.
Launching a new product without relational data is like a real estate agent pricing a house that was just built on a street where no house has ever sold. With only the house's features (3 bedrooms, 2 baths), they are guessing. A relational model is like an agent who knows the neighborhood demographics, the prices of similar houses across town, which buyers have been searching for this type of home, and which streets are trending up. They can price the house accurately on day one because the information lives in the connections, not in the house's own sales history.
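As a toy stand-in for this transfer (not the graph neural network itself), the sketch below links the new product to existing products by shared attributes and computes a similarity-weighted average of their weekly velocity. The Jaccard measure and the attribute sets assigned to the existing products are assumptions made for illustration; only the velocities and the new product's attributes come from the tables on this page.

```python
def jaccard(a, b):
    """Share of attributes two products have in common (0.0 to 1.0)."""
    return len(a & b) / len(a | b)


new_product_attrs = {"Organic", "High-Protein", "Gluten-Free"}  # P-6001

# (attribute set, weekly_units_avg). Attribute sets here are hypothetical.
existing_products = {
    "P-5801": ({"High-Protein", "Gluten-Free"}, 420),
    "P-5802": ({"High-Protein"}, 380),
    "P-5803": ({"Organic", "High-Protein", "Gluten-Free"}, 310),
}

# Edge weights between the new product and each existing product.
edge_weights = {
    pid: jaccard(new_product_attrs, attrs)
    for pid, (attrs, _) in existing_products.items()
}

# Demand "transferred" along those edges: a similarity-weighted mean.
transferred_units = sum(
    edge_weights[pid] * units for pid, (_, units) in existing_products.items()
) / sum(edge_weights.values())

print(edge_weights)
print(f"Transferred weekly units: {transferred_units:.1f}")
```

A relational model extends the same idea across customer and store connections and learns the weights instead of fixing them by formula.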
How KumoRFM solves this
Relational intelligence built for retail and e-commerce data
Kumo does not need sales history for the new product because it learns from the relational graph connecting product attributes, similar products, customer preferences, and market signals. When a new organic protein bar (P-6001) launches, Kumo's graph neural network recognizes its attributes (organic, high-protein, $3.49 price point) and connects them to customers who buy similar products. Customer CU-3012 has bought 6 organic snack products in the past 90 days and lives near a store where health-food trends are strong. The model predicts first-week demand at each store without any prior sales data.
From data to predictions
See the full pipeline in action
Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.
Your data
The relational tables Kumo learns from
NEW_PRODUCT
| product_id | name | category | attributes | price | launch_date |
|---|---|---|---|---|---|
| P-6001 | Peak Organic Protein Bar | Snacks | Organic, High-Protein, Gluten-Free | $3.49 | 2025-10-01 |
SIMILAR_PRODUCTS
| product_id | name | category | weekly_units_avg | customer_overlap |
|---|---|---|---|---|
| P-5801 | RX Bar Protein | Snacks | 420 | High |
| P-5802 | Kind Protein Bar | Snacks | 380 | High |
| P-5803 | Clif Organic Bar | Snacks | 310 | Medium |
CUSTOMER_PREFERENCES
| customer_id | organic_affinity | protein_purchases_90d | snack_spend_90d |
|---|---|---|---|
| CU-3012 | High | 12 | $84.50 |
| CU-3045 | Medium | 4 | $32.00 |
| CU-3078 | Low | 0 | $8.50 |
STORE_TRENDS
| store_id | health_food_index | organic_growth_yoy | similar_product_velocity |
|---|---|---|---|
| S-14 | 8.4 | +22% | High |
| S-22 | 6.1 | +12% | Medium |
| S-37 | 4.2 | +5% | Low |
Write your PQL query
Describe what to predict in 2–3 lines — Kumo handles the rest
PREDICT BOOL(ORDERS.PRODUCT_ID = 'P-6001', 0, 7, days) FOR EACH CUSTOMERS.CUSTOMER_ID WHERE CUSTOMER_PREFERENCES.ORGANIC_AFFINITY IN ('High', 'Medium')
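One way to read the query's target in plain Python: for each customer with High or Medium organic affinity, did they place any order for P-6001 within 7 days of an anchor date? The ORDERS rows below are hypothetical, and the window boundaries (anchor excluded, day 7 included) are an assumption rather than documented PQL behavior.

```python
from datetime import date, timedelta

# From the CUSTOMER_PREFERENCES table.
organic_affinity = {"CU-3012": "High", "CU-3045": "Medium", "CU-3078": "Low"}

# Hypothetical ORDERS rows: (customer_id, product_id, order_date).
orders = [
    ("CU-3012", "P-6001", date(2025, 10, 3)),
    ("CU-3045", "P-5801", date(2025, 10, 2)),   # different product
    ("CU-3078", "P-6001", date(2025, 10, 4)),   # filtered out by the WHERE clause
    ("CU-3045", "P-6001", date(2025, 10, 12)),  # outside the 7-day window
]


def bought_in_window(customer_id, product_id, anchor, days=7):
    """True if the customer ordered the product in the window (anchor, anchor + days]."""
    end = anchor + timedelta(days=days)
    return any(
        cust == customer_id and prod == product_id and anchor < ordered <= end
        for cust, prod, ordered in orders
    )


anchor = date(2025, 10, 1)  # launch_date of P-6001
labels = {
    customer_id: bought_in_window(customer_id, "P-6001", anchor)
    for customer_id, level in organic_affinity.items()
    if level in ("High", "Medium")
}
print(labels)
```

At launch no such order rows exist yet, so the model has to estimate this outcome per customer rather than look it up.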
Prediction output
Every entity gets a score, updated continuously
| STORE_ID | PREDICTED_WEEK1_UNITS | TARGET_CUSTOMERS | STOCK_REC | CONFIDENCE |
|---|---|---|---|---|
| S-14 | 285 | 1,420 | 350 | High |
| S-22 | 140 | 680 | 180 | Medium |
| S-37 | 55 | 210 | 75 | Medium |
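A quick arithmetic pass over the table shows how much safety buffer each stock recommendation carries above the predicted units, and how many units are expected per target customer. The column meanings are taken at face value from the headers; the derived ratios are not part of the prediction output itself.

```python
# (store_id, predicted_week1_units, target_customers, stock_rec) from the table above.
predictions = [
    ("S-14", 285, 1420, 350),
    ("S-22", 140, 680, 180),
    ("S-37", 55, 210, 75),
]

buffer_pct = {}
units_per_customer = {}
for store_id, units, customers, stock_rec in predictions:
    # Extra stock recommended beyond the point forecast, as a percentage of the forecast.
    buffer_pct[store_id] = 100 * (stock_rec - units) / units
    units_per_customer[store_id] = units / customers
    print(
        f"{store_id}: buffer {buffer_pct[store_id]:.1f}%, "
        f"{units_per_customer[store_id]:.2f} units per target customer"
    )
```

The relative buffer grows as predicted volume shrinks (roughly 23%, 29%, and 36%), which fits the lower confidence shown for S-22 and S-37.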
Understand why
Every prediction includes feature attributions — no black boxes
New Product P-6001 (Peak Organic Protein Bar) at Store S-14
Predicted: 285 units in the first week
Top contributing features
| Feature | Value | Attribution |
|---|---|---|
| Similar product velocity at this store | High | 28% |
| Customer base organic affinity | 62% High/Med | 25% |
| Attribute similarity to top sellers | 92% match | 21% |
| Store health-food trend index | 8.4/10 | 15% |
| Price point within target range | $3.49 (sweet spot) | 11% |
Feature attributions are computed automatically for every prediction. No separate tooling required.
PQL Documentation
Learn the Predictive Query Language — SQL-like syntax for defining any prediction task in 2–3 lines.
Python SDK
Integrate Kumo predictions into your pipelines. Train, evaluate, and deploy models programmatically.
Explainability Docs
Understand feature attributions, model evaluation metrics, and how to build trust with stakeholders.
Frequently asked questions
Common questions about new product launch prediction
How do you predict demand for a product with no sales history?
You transfer learned demand patterns from similar products through a relational graph. The model connects the new product's attributes (category, price, brand tier, ingredients) to existing products and their historical performance. It also connects to the customer base (who buys similar items?) and to stores (which locations over-index on this category?). This graph-based transfer learning produces store-level demand forecasts from day zero, eliminating the 8-12 week blind period that traditional models require.
What percentage of new product launches fail in retail?
Nielsen reports that 70-80% of new SKUs fail to meet sales targets in the first 90 days. A major driver is inventory misallocation: overstocking a failed product wastes $50-200K per SKU in carrying and markdown costs, while understocking a hit forfeits $200-500K in lost revenue during peak demand. For large retailers launching 25,000-50,000 new SKUs annually, misallocation costs $500M-$1B per year. Better day-one forecasting addresses both sides: you stock less of likely failures and more of likely hits.
How quickly does a new product launch model improve with actual sales data?
Relational models produce cold-start predictions from day zero, then refine them as real sales data flows in. After 1 week of actual sales, the model blends graph-based predictions with observed velocity, typically improving accuracy by 15-20%. By week 4, the model relies primarily on actual data, with graph signals as a secondary input. The critical value is in the first 2-4 weeks, when traditional models have nothing to work with and relational models are already 40-60% more accurate than manual analog planning.
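The blending can be pictured with a simple weighted average whose weight on observed sales grows over the first four weeks. This is only an illustrative sketch: the linear ramp, the 0.8 cap, and the observed figure of 320 units are assumptions, not the schedule the model actually uses.

```python
def observed_weight(weeks_of_sales, full_weight=0.8, ramp_weeks=4):
    """Weight on observed sales: 0 at launch, rising linearly to `full_weight`."""
    return full_weight * min(weeks_of_sales, ramp_weeks) / ramp_weeks


def blended_forecast(graph_forecast, observed_weekly_units, weeks_of_sales):
    """Blend the day-zero graph forecast with observed weekly velocity."""
    if weeks_of_sales == 0 or observed_weekly_units is None:
        return float(graph_forecast)  # nothing observed yet: graph signal only
    w = observed_weight(weeks_of_sales)
    return (1 - w) * graph_forecast + w * observed_weekly_units


graph_forecast = 285   # day-zero prediction for Store S-14
observed_units = 320   # hypothetical actual weekly velocity

for week in (0, 1, 2, 4):
    print(week, round(blended_forecast(graph_forecast, observed_units, week), 1))
```

With these settings the forecast starts at the graph-based 285 and moves most of the way toward observed velocity by week 4, while the graph signal keeps a small share.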
Bottom line: Accurately forecast first-week demand for new products with zero sales history, reducing launch inventory misallocation by 40-60% and recovering part of the $500M-$1B that large retailers lose annually to misallocated launch inventory.
One Platform. One Model. Infinite Predictions.
KumoRFM
Relational Foundation Model
Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.
For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.