#8 · Classification · Cold Start

New Product Launch Prediction

“Which customers will buy this new product with zero purchase history?”

Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.


A real-world example

Which customers will buy this new product with zero purchase history?

Retailers launch 25,000-50,000 new SKUs annually, but 70-80% fail to meet sales targets in the first 90 days (Nielsen). Traditional demand models require 8-12 weeks of sales history before making accurate predictions, leaving the critical launch window unoptimized. Overstocking a failed product wastes $50-200K per SKU in inventory carrying and markdown costs. Understocking a hit product forfeits $200-500K in lost revenue during the peak-demand window. The cold-start problem costs large retailers $500M-$1B annually in misallocated launch inventory.

Quick answer

The cold-start problem is the hardest challenge in retail prediction: how do you forecast demand for a product with zero sales history? Traditional time-series models cannot, because they need 8-12 weeks of data before making accurate predictions. Relational models solve this by transferring learned demand patterns from similar products through the graph. When a new organic protein bar launches, the model connects its attributes (organic, high-protein, $3.49) to customers who buy similar products and stores where health-food trends are strong. On the RelBench product demand task, KumoRFM scores 76.71 vs 62.44 for the next-best approach, and the cold-start advantage is even larger because relational models do not require per-item training data.

Approaches compared

4 ways to solve this problem

1. Buyer committee and analog planning

Merchants select 2-3 analog products and use their historical sales as a proxy for the new item. The most common approach in retail buying offices.

Best for

Small product launches where a senior buyer has deep category expertise and can identify the right analog products.

Watch out for

Highly subjective. Two buyers can pick different analogs and arrive at forecasts that differ by 2x. There is no systematic way to weight attributes, store-level trends, or customer segment affinity. Nielsen reports 70-80% of new SKUs miss targets.
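A minimal Python sketch of analog planning, to show where the subjectivity comes from. The weekly unit figures are taken from the SIMILAR_PRODUCTS table later on this page; the two "buyers" and their analog choices are hypothetical, and a plain average is only one of several ways a buying office might combine analogs.

```python
from statistics import mean

# Weekly unit averages copied from the SIMILAR_PRODUCTS table on this page.
ANALOG_WEEKLY_UNITS = {
    "P-5801": 420,  # RX Bar Protein
    "P-5802": 380,  # Kind Protein Bar
    "P-5803": 310,  # Clif Organic Bar
}

def analog_forecast(analog_ids):
    """Forecast weekly units for a new item as the plain average of its analogs."""
    return mean(ANALOG_WEEKLY_UNITS[pid] for pid in analog_ids)

# Two hypothetical buyers choose different analog sets for the same new product.
buyer_a = analog_forecast(["P-5801", "P-5802"])
buyer_b = analog_forecast(["P-5803"])
print(buyer_a, buyer_b)
```

Nothing in the data says which analog set is right, so the forecast depends on who picks.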

2. Attribute-based regression models

Train a regression model on product attributes (category, price point, brand tier) to predict first-week demand based on how similar attribute combinations performed historically.

Best for

Retailers with structured product attribute data and a history of launches in similar categories.

Watch out for

Treats all stores identically and ignores local customer base composition. A new organic snack will sell 5x more at a health-food-focused urban store than a rural conventional grocer, but attribute regression cannot see store-level context.
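A rough Python sketch of attribute-based regression, assuming NumPy is available. The past-launch rows and unit counts are made-up toy data, and ordinary least squares stands in for whatever regression a retailer would actually use. Note that the prediction function takes no store argument, which is the limitation described above.

```python
import numpy as np

# Hypothetical past launches: (is_organic, is_high_protein, price).
# These rows and the unit counts below are toy data for illustration only.
past_launches = np.array([
    [1, 1, 3.29],
    [1, 0, 2.99],
    [0, 1, 2.49],
    [0, 0, 1.99],
    [1, 1, 3.99],
])
first_week_units = np.array([300.0, 220.0, 260.0, 180.0, 270.0])

# Add an intercept column and fit ordinary least squares.
X = np.column_stack([np.ones(len(past_launches)), past_launches])
coef, *_ = np.linalg.lstsq(X, first_week_units, rcond=None)

def predict_units(is_organic, is_high_protein, price):
    """Predict first-week units from product attributes alone (no store input)."""
    return float(np.array([1.0, is_organic, is_high_protein, price]) @ coef)

# Every store receives this same number for the new protein bar.
forecast = predict_units(1, 1, 3.49)
print(round(forecast, 1))
```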

3. XGBoost with product and store features

Gradient-boosted models trained on historical launch data with features for product attributes, store demographics, similar-product velocity, and seasonal timing.

Best for

Analytics teams that can build launch-specific feature pipelines from product master data and store performance tables.

Watch out for

Feature engineering for cold-start is especially hard because you are trying to encode similarity relationships manually. Which attributes matter? How do you weight brand vs. price point vs. category? SAP SALT shows a 75% accuracy ceiling.

4. KumoRFM (relational foundation model)

Connects new products to existing products, customer preferences, and store trends through the relational graph. Transfers demand patterns from similar products without manual analog selection.

Best for

Retailers launching 1,000+ new SKUs per year who need automated, store-level demand forecasts without waiting for sales data to accumulate.

Watch out for

Accuracy improves with the richness of the product attribute graph. If new products have minimal attribute data (just name and price), the graph has less to connect to.

Key metric: RelBench product demand: KumoRFM 76.71 vs next-best 62.44. Cold-start advantage: store-level forecasts from day zero vs 8-12 week blind period.

Why relational data changes the answer

The cold-start problem exists because traditional models need a product's own sales history to make predictions. A relational model does not, because it learns demand patterns from the graph structure connecting products, customers, and stores. When Product P-6001 (new organic protein bar) enters the graph, it connects to existing organic snacks through shared attributes, to health-conscious customers through their purchase patterns, and to specific stores through their category performance data.

The model transfers learned demand signals through these connections. It predicts 285 units in the first week at Store S-14 because that store has high organic snack velocity, 62% of its customers have medium-to-high organic affinity, and the product's attribute profile (organic, high-protein, gluten-free, $3.49) is 92% similar to existing top sellers. Store S-37 gets a prediction of only 55 units because its health-food index is 4.2/10 and similar products sell slowly there. This store-level granularity from day zero is impossible with models that require per-item sales history.
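The idea of transferring demand through attribute similarity can be illustrated with a toy Python calculation. This is not how KumoRFM computes its predictions; it is a simple similarity-weighted average, assuming Jaccard overlap as the similarity measure. The existing products (X-1 to X-3), their attribute sets, and their store-level unit counts are all hypothetical.

```python
def jaccard(a, b):
    """Share of attributes two products have in common (0.0 to 1.0)."""
    return len(a & b) / len(a | b)

new_item = {"organic", "high-protein", "gluten-free"}

# Hypothetical existing products: (attribute set, weekly units at one store).
existing = {
    "X-1": ({"organic", "high-protein", "gluten-free"}, 300),
    "X-2": ({"organic", "high-protein"}, 240),
    "X-3": ({"high-protein", "conventional"}, 120),
}

def transferred_demand(new_attrs, catalog):
    """Similarity-weighted average of existing products' units at the store."""
    weights = {pid: jaccard(new_attrs, attrs) for pid, (attrs, _) in catalog.items()}
    total = sum(weights.values())
    return sum(weights[pid] * units for pid, (_, units) in catalog.items()) / total

estimate = transferred_demand(new_item, existing)
print(round(estimate, 1))
```

Running the same calculation per store, with each store's own unit counts, gives a different estimate per location even though the new product has no sales of its own.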

Forecasting a new product without relational data is like pricing a house that was just built on a street where no house has ever sold. With only the house's features (3 bedrooms, 2 baths), the real estate agent is guessing. A relational model is like an agent who knows the neighborhood demographics, the prices of similar houses across town, which buyers have been searching for this type of home, and which streets are trending up. That agent can price the house accurately on day one because the information lives in the connections, not in the house's own sales history.

How KumoRFM solves this

Relational intelligence built for retail and e-commerce data

Kumo does not need sales history for the new product because it learns from the relational graph connecting product attributes, similar products, customer preferences, and market signals. When a new organic protein bar (P-6001) launches, Kumo's graph neural network recognizes its attributes (organic, high-protein, $3.49 price point) and connects them to customers who buy similar products. Customer CU-3012 has bought 6 organic snack products in the past 90 days and lives near a store where health-food trends are strong. The model predicts first-week demand at each store without any prior sales data.

From data to predictions

See the full pipeline in action

Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.

Your data

The relational tables Kumo learns from

NEW_PRODUCT

product_id	name	category	attributes	price	launch_date
P-6001	Peak Organic Protein Bar	Snacks	Organic, High-Protein, Gluten-Free	$3.49	2025-10-01

SIMILAR_PRODUCTS

product_id	name	category	weekly_units_avg	customer_overlap
P-5801	RX Bar Protein	Snacks	420	High
P-5802	Kind Protein Bar	Snacks	380	High
P-5803	Clif Organic Bar	Snacks	310	Medium

CUSTOMER_PREFERENCES

customer_id	organic_affinity	protein_purchases_90d	snack_spend_90d
CU-3012	High	12	$84.50
CU-3045	Medium	4	$32.00
CU-3078	Low	0	$8.50

STORE_TRENDS

store_id	health_food_index	organic_growth_yoy	similar_product_velocity
S-14	8.4	+22%	High
S-22	6.1	+12%	Medium
S-37	4.2	+5%	Low
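To make "connecting through the graph" concrete, here is a small Python sketch that turns the product rows above into a graph and finds which existing products the new item can reach. The linking rule (products connect through a shared category node) and the helper names are assumptions for illustration, not Kumo's internal representation.

```python
from collections import defaultdict

# Rows copied from the NEW_PRODUCT and SIMILAR_PRODUCTS tables above.
products = [
    {"product_id": "P-6001", "category": "Snacks", "weekly_units_avg": None},
    {"product_id": "P-5801", "category": "Snacks", "weekly_units_avg": 420},
    {"product_id": "P-5802", "category": "Snacks", "weekly_units_avg": 380},
    {"product_id": "P-5803", "category": "Snacks", "weekly_units_avg": 310},
]

# Undirected adjacency list; nodes are (type, id) tuples.
graph = defaultdict(set)

def add_edge(a, b):
    graph[a].add(b)
    graph[b].add(a)

# Assumption: a shared category value acts as the link between products.
for row in products:
    add_edge(("product", row["product_id"]), ("category", row["category"]))

def two_hop_products(product_id):
    """Return products reachable from product_id via one shared intermediate node."""
    start = ("product", product_id)
    found = set()
    for middle in graph[start]:
        for node in graph[middle]:
            if node != start and node[0] == "product":
                found.add(node[1])
    return found

neighbors = two_hop_products("P-6001")
history = {p["product_id"]: p["weekly_units_avg"] for p in products}
borrowed = sorted(history[pid] for pid in neighbors)
print(neighbors, borrowed)
```

P-6001 has no sales history of its own, but its two-hop neighbors do, and that history is what the new product can borrow.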

Write your PQL query

Describe what to predict in 2–3 lines — Kumo handles the rest

PQL

PREDICT BOOL(ORDERS.PRODUCT_ID = 'P-6001', 0, 7, days)
FOR EACH CUSTOMERS.CUSTOMER_ID
WHERE CUSTOMER_PREFERENCES.ORGANIC_AFFINITY IN ('High', 'Medium')

Prediction output

Every entity gets a score, updated continuously

STORE_ID	PREDICTED_WEEK1_UNITS	TARGET_CUSTOMERS	STOCK_REC	CONFIDENCE
S-14	285	1,420	350	High
S-22	140	680	180	Medium
S-37	55	210	75	Medium
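A quick Python check of how the stock recommendation relates to the predicted units in the table above. The figures are copied from the table; reading the gap as a safety buffer is an interpretation, not something the page states.

```python
# Rows copied from the prediction output table: (store, predicted units, stock rec).
rows = [("S-14", 285, 350), ("S-22", 140, 180), ("S-37", 55, 75)]

def buffer_pct(predicted, stocked):
    """Percentage by which the stock recommendation exceeds the prediction."""
    return round((stocked / predicted - 1) * 100, 1)

buffers = {store: buffer_pct(pred, stock) for store, pred, stock in rows}
total_predicted = sum(pred for _, pred, _ in rows)
total_stock = sum(stock for _, _, stock in rows)
print(buffers, total_predicted, total_stock)
```

The buffers come out to 22.8% for S-14, 28.6% for S-22, and 36.4% for S-37, on 480 predicted units and 605 recommended units in total. The relative buffer is largest at the lowest-volume store.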

Understand why

Every prediction includes feature attributions — no black boxes

New Product P-6001 (Peak Organic Protein Bar) at Store S-14

Predicted: 285 units in first week

Top contributing features

Common questions about new product launch prediction

How do you predict demand for a product with no sales history?

You transfer learned demand patterns from similar products through a relational graph. The model connects the new product's attributes (category, price, brand tier, ingredients) to existing products and their historical performance. It also connects to the customer base (who buys similar items?) and to stores (which locations over-index on this category?). This graph-based transfer learning produces store-level demand forecasts from day zero, eliminating the 8-12 week blind period that traditional models require.

What percentage of new product launches fail in retail?

Nielsen reports that 70-80% of new SKUs fail to meet sales targets in the first 90 days. A major driver is inventory misallocation: overstocking a failed product wastes $50-200K per SKU in carrying and markdown costs, while understocking a hit forfeits $200-500K in lost revenue during peak demand. For large retailers launching 25,000-50,000 new SKUs annually, misallocation costs $500M-$1B a year. Better day-one forecasting addresses both sides: you stock less of likely failures and more of likely hits.

How quickly does a new product launch model improve with actual sales data?

Relational models produce cold-start predictions from day zero, then refine them as real sales data flows in. After 1 week of actual sales, the model blends graph-based predictions with observed velocity, typically improving accuracy by 15-20%. By week 4, the model relies primarily on actual data with graph signals as a secondary input. The critical value is in the first 2-4 weeks, where traditional models have nothing and relational models are already 40-60% more accurate than manual analog planning.
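The shift from graph-based prediction to observed sales can be pictured as a weighted blend. The Python sketch below is an illustration only: the weighting schedule, the constant k, and the observed sales figure are hypothetical, and the page does not describe how the model actually combines the two signals. The prior of 285 units is the day-zero prediction for Store S-14 shown earlier.

```python
def observed_weight(weeks_of_sales, k=1.0):
    """Hypothetical schedule: weight on observed sales grows as weeks accumulate."""
    return weeks_of_sales / (weeks_of_sales + k)

def blended_forecast(graph_prior, observed_weekly_avg, weeks_of_sales):
    """Mix the day-zero graph prediction with the observed weekly average."""
    w = observed_weight(weeks_of_sales)
    return (1 - w) * graph_prior + w * observed_weekly_avg

prior = 285      # day-zero prediction for Store S-14 from this page
observed = 320   # hypothetical observed weekly average

# Forecast at launch, after 1 week, and after 4 weeks of sales.
schedule = [blended_forecast(prior, observed, n) for n in (0, 1, 4)]
print(schedule)
```

With no sales the forecast equals the graph prior; by week 4 this schedule puts 80% of the weight on observed data, matching the description of graph signals becoming a secondary input.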

Bottom line: Accurately forecast first-week demand for new products with zero sales history, reducing launch inventory misallocation by 40-60% and recovering $500M-$1B in industry-wide launch losses.



One Platform. One Model. Infinite Predictions.

KumoRFM

Relational Foundation Model

Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.

For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.
