Demand Forecasting
“How many units of each product will sell at each store over the next 3 months?”
Book a demo and get a free trial of the full platform: research agent, fine-tuning capabilities, and forward-deployed engineer support.
A real-world example
Many retailers order inventory based on last year's averages, which leads to 25-30% overstock on slow items and stockouts on trending ones. A single stockout event costs $1M+ in lost revenue for large retailers. Accurate SKU-level forecasts at each store would let you order 2,400 Classic Tees instead of 5,000 and 350 Flannels instead of 1,000, freeing millions in working capital while keeping shelves stocked.
Quick answer
Demand forecasting predicts how many units of each product will sell at each store over a future time window. The best models go beyond isolated time-series forecasting by connecting products to transactions, stores, suppliers, and promotions in a relational graph, capturing cross-product substitution effects and promotional lifts that single-SKU models miss.
Approaches compared
4 ways to solve this problem
1. Historical Averages / Moving Averages
Forecast demand based on last year's same-period sales or a rolling average. The most common approach in mid-market retail, often implemented in spreadsheets.
Best for
Stable, mature product categories with minimal seasonal variation and no new product introductions.
Watch out for
Misses trends, promotional effects, and new product cannibalization. A 25-30% error rate on individual SKUs is typical, leading to significant overstock and stockout costs.
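As a point of reference, the two baselines described above can be sketched in a few lines. This is a minimal illustration, not a production method: the function names and the monthly sales figures are hypothetical and do not come from any real retailer.

```python
from statistics import mean

def moving_average_forecast(history, window=3):
    """Forecast the next period as the mean of the last `window` periods."""
    if len(history) < window:
        raise ValueError("need at least `window` observations")
    return mean(history[-window:])

def same_period_last_year(history, season_length=12):
    """Forecast the next period as the value observed one full season earlier."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    return history[-season_length]

# Hypothetical monthly unit sales for one SKU at one store (illustrative only).
monthly_units = [120, 130, 125, 140, 160, 180, 210, 205, 190, 170, 150, 230]

print(moving_average_forecast(monthly_units, window=3))
print(same_period_last_year(monthly_units))
```

Neither function sees price, promotions, or any other product, which is why these baselines miss the effects listed above.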
2. Time-Series Models (ARIMA, Prophet, ETS)
Fit statistical time-series models to each SKU-store pair's historical sales data. Captures trend, seasonality, and holiday effects.
Best for
Products with long, clean sales histories and strong seasonal patterns. Good at capturing regular cyclical demand.
Watch out for
Treats each SKU-store pair in isolation. Cannot see that a promotion on Product A cannibalized Product B, or that a new store opening shifted demand from a nearby location. Forecast accuracy drops sharply for new or slow-moving products.
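The isolation problem is easy to see in code. The sketch below uses simple exponential smoothing, the most basic member of the exponential smoothing family (no trend or seasonal terms), as a stand-in for ARIMA, Prophet, or full ETS. The sales series and the `alpha` value are hypothetical.

```python
def simple_exponential_smoothing(series, alpha=0.5):
    """Return the one-step-ahead forecast from simple exponential smoothing."""
    if not series:
        raise ValueError("series must not be empty")
    level = series[0]
    for value in series[1:]:
        # Blend the newest observation with the running level.
        level = alpha * value + (1 - alpha) * level
    return level

# Hypothetical weekly sales keyed by (article_id, store_id).
sales = {
    ("A001", "S-14"): [100, 110, 105, 120],
    ("A003", "S-14"): [40, 38, 20, 18],
}

# Each pair is fitted on its own series only; nothing flows between keys.
forecasts = {key: simple_exponential_smoothing(s) for key, s in sales.items()}
print(forecasts)
```

Even if the drop in A003 was caused by a promotion on A001, the A003 forecast has no way to know that, because each call receives a single series.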
3. Gradient Boosted Trees (LightGBM/XGBoost)
Train a regression model on hand-crafted features: lagged sales, price, promotions, holidays, weather. The current industry standard for retail demand forecasting.
Best for
Teams with strong feature engineering capability and large training datasets. Handles non-linear relationships well.
Watch out for
Requires weeks of feature engineering per model iteration. Still treats each prediction as an independent row. Cross-product substitution, supplier disruption, and regional demand shifts require explicit feature creation.
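To make the feature-engineering burden concrete, here is a minimal sketch of how lagged-sales rows might be built before they are handed to a tree model. The helper name, the choice of lags, and the data are assumptions for illustration; a real pipeline would add price, holiday, and weather columns and then train LightGBM or XGBoost on the result.

```python
def build_feature_rows(units, promo_flags, lags=(1, 2)):
    """Turn one SKU-store series into (features, target) rows with lagged sales."""
    max_lag = max(lags)
    rows = []
    for t in range(max_lag, len(units)):
        # Lag features look back k periods from the period being predicted.
        features = {f"lag_{k}": units[t - k] for k in lags}
        features["promo"] = promo_flags[t]
        rows.append((features, units[t]))
    return rows

# Hypothetical weekly units and promotion flags for one SKU at one store.
units = [100, 110, 105, 150, 120]
promo = [0, 0, 0, 1, 0]

rows = build_feature_rows(units, promo)
for features, target in rows:
    print(features, "->", target)
```

Every row stands alone: a cross-product or supplier signal only exists if someone adds a column for it by hand.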
4. KumoRFM (Graph Neural Networks on Relational Data)
Connects products, transactions, stores, suppliers, and promotions into a single relational graph. The GNN learns cross-product, cross-store, and cross-supplier signals automatically. No feature engineering required.
Best for
Multi-store retailers with complex product relationships, supplier dependencies, and promotional calendars.
Watch out for
The graph advantage is largest when products share suppliers, stores overlap geographically, and promotions affect multiple SKUs. For a single-store, single-product business, simpler models may suffice.
Key metric: Graph-based demand models score 76.71 vs 62.44 for flat-table baselines on the RelBench benchmark, reducing forecast error by 25-40% compared to moving averages and freeing $2-5M in working capital per quarter for mid-size retailers.
Why relational data changes the answer
Article A001 (Classic Tee) sold 2,412 units last quarter at Union Square. A time-series model would forecast next quarter based on this trajectory plus seasonal adjustment. But the relational graph reveals much more: A001 shares supplier SUP-12 with A003 (Cargo Shorts), and that supplier just extended lead times from 14 to 21 days, a change that has historically slowed restocking. Meanwhile, a fall campaign promotion is scheduled for both flagship stores, and similar promotions have historically lifted Classic Tee sales by 34%. A competitor also recently opened a location within 500 meters of the Midtown Mall store, pulling away 15% of its foot traffic.
None of these cross-table signals appear in a single SKU's time series. Supplier lead time changes live in the SUPPLIERS table. Promotional calendars are in PROMOTIONS. Competitor proximity requires the STORES table. On the RelBench benchmark, graph-based demand models score 76.71 vs 62.44 for flat-table baselines. For retail demand forecasting specifically, the improvement is often larger because the relational structure (products-stores-suppliers-promotions) is inherently rich. Each additional table connection adds signal that single-SKU models structurally cannot access.
Forecasting demand with a time-series model is like predicting rush-hour traffic by only looking at one road's historical traffic counts. A relational model sees that a concert is scheduled downtown (promotion), a highway on-ramp closed for construction (supplier disruption), and a new office building opened nearby (store opening). The historical pattern matters, but the connected context is what separates a useful forecast from a guess.
How KumoRFM solves this
Relational intelligence for every forecast
Kumo learns from the full relational graph: products connected to transactions, stores, suppliers, promotions, and seasonal calendars. Traditional time-series models see each SKU-store pair in isolation. Kumo sees that Article A001 shares a supplier and seasonal patterns with similar items, amplifying the demand signal even for new or slow-moving products. The graph structure captures cross-product substitution effects, regional preferences, and promotional lifts that flat models miss entirely.
From data to predictions
See the full pipeline in action
Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.
Your data
The relational tables Kumo learns from
ARTICLES
| article_id | article_name | category | supplier_id |
|---|---|---|---|
| A001 | Classic Tee | apparel | SUP-12 |
| A002 | Slim Flannel | apparel | SUP-07 |
| A003 | Cargo Shorts | apparel | SUP-12 |
TRANSACTIONS
| txn_id | article_id | store_id | quantity | revenue | timestamp |
|---|---|---|---|---|---|
| TXN-90001 | A001 | S-14 | 3 | $74.97 | 2025-09-15 |
| TXN-90002 | A002 | S-14 | 1 | $48.00 | 2025-09-15 |
| TXN-90003 | A003 | S-22 | 2 | $69.98 | 2025-09-16 |
STORES
| store_id | store_name | region | format |
|---|---|---|---|
| S-14 | Union Square | West | flagship |
| S-22 | Midtown Mall | Northeast | standard |
| S-37 | Lakeside Plaza | Midwest | outlet |
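The links between these tables can be followed with plain Python to show what "connected" means here. The sketch below loads only the sample rows shown above and walks two relationships: articles that share a supplier, and units sold per store format. It is an illustration of the table structure, not a description of how Kumo builds or traverses its graph internally.

```python
from collections import defaultdict

# Sample rows copied from the ARTICLES, TRANSACTIONS, and STORES tables above.
articles = [
    {"article_id": "A001", "article_name": "Classic Tee", "supplier_id": "SUP-12"},
    {"article_id": "A002", "article_name": "Slim Flannel", "supplier_id": "SUP-07"},
    {"article_id": "A003", "article_name": "Cargo Shorts", "supplier_id": "SUP-12"},
]
transactions = [
    {"txn_id": "TXN-90001", "article_id": "A001", "store_id": "S-14", "quantity": 3},
    {"txn_id": "TXN-90002", "article_id": "A002", "store_id": "S-14", "quantity": 1},
    {"txn_id": "TXN-90003", "article_id": "A003", "store_id": "S-22", "quantity": 2},
]
stores = {
    "S-14": {"store_name": "Union Square", "format": "flagship"},
    "S-22": {"store_name": "Midtown Mall", "format": "standard"},
    "S-37": {"store_name": "Lakeside Plaza", "format": "outlet"},
}

# Articles under the same supplier are two hops apart: article -> supplier -> article.
by_supplier = defaultdict(list)
for a in articles:
    by_supplier[a["supplier_id"]].append(a["article_id"])

def supplier_neighbors(article_id):
    """Return the other articles that share this article's supplier."""
    supplier = next(a["supplier_id"] for a in articles if a["article_id"] == article_id)
    return [other for other in by_supplier[supplier] if other != article_id]

# Following transaction -> store attaches store attributes to each sale.
units_by_format = defaultdict(int)
for t in transactions:
    fmt = stores[t["store_id"]]["format"]
    units_by_format[(t["article_id"], fmt)] += t["quantity"]

print(supplier_neighbors("A001"))
print(dict(units_by_format))
```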
Write your PQL query
Describe what to predict in 2–3 lines — Kumo handles the rest
PREDICT SUM(TRANSACTIONS.QUANTITY, 0, 3, months)
FOR EACH ARTICLES.ARTICLE_ID
WHERE ARTICLES.CATEGORY = "apparel"
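One plausible reading of this query is "for each apparel article, sum the transaction quantity over the three months after an anchor date". The Python sketch below computes that quantity on the sample transactions, as a way to see what the target represents. The window boundaries (start exclusive, end inclusive), the `add_months` helper, and the anchor dates are assumptions made for illustration; the PQL documentation is the authority on the exact semantics.

```python
import calendar
from datetime import date

def add_months(d, months):
    """Shift a date forward by whole calendar months, clamping the day."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

# Category per article and (article_id, quantity, timestamp) from the sample tables.
articles = {"A001": "apparel", "A002": "apparel", "A003": "apparel"}
transactions = [
    ("A001", 3, date(2025, 9, 15)),
    ("A002", 1, date(2025, 9, 15)),
    ("A003", 2, date(2025, 9, 16)),
]

def target_sum(anchor, months=3, category="apparel"):
    """Sum quantity per article over the assumed window (anchor, anchor + months]."""
    end = add_months(anchor, months)
    totals = {a: 0 for a, c in articles.items() if c == category}
    for article_id, quantity, ts in transactions:
        if article_id in totals and anchor < ts <= end:
            totals[article_id] += quantity
    return totals

labels = target_sum(date(2025, 9, 1))
print(labels)
```

Computed on past anchor dates, totals like these are the kind of historical label a model would learn to predict for future windows.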
Prediction output
Every entity gets a score, updated continuously
| ARTICLE_ID | TIMESTAMP | TARGET_PRED |
|---|---|---|
| A001 | 2025-10-01 | 2,412 |
| A002 | 2025-10-01 | 876 |
| A003 | 2025-10-01 | 341 |
Understand why
Every prediction includes feature attributions — no black boxes
Article A001 (Classic Tee)
Predicted: 2,412 units sold in next 3 months
Top contributing features
| Feature | Value | Attribution |
|---|---|---|
| Seasonal trend (Q4 uplift) | +34% | 31% |
| Store traffic (flagship locations) | High | 24% |
| Promotion active (fall campaign) | Yes | 19% |
| Supplier lead time | 14 days | 14% |
| Price point vs. category avg | -8% | 12% |
Feature attributions are computed automatically for every prediction. No separate tooling required. Learn more about Kumo explainability
PQL Documentation
Learn the Predictive Query Language — SQL-like syntax for defining any prediction task in 2–3 lines.
Python SDK
Integrate Kumo predictions into your pipelines. Train, evaluate, and deploy models programmatically.
Explainability Docs
Understand feature attributions, model evaluation metrics, and how to build trust with stakeholders.
Frequently asked questions
Common questions about demand forecasting
How accurate is AI demand forecasting compared to traditional methods?
Graph-based demand models reduce forecast error by 25-40% compared to moving averages and 15-25% compared to time-series methods. On the RelBench benchmark, relational models score 76.71 vs 62.44 for flat baselines. The improvement is largest for new products, promotional periods, and stores with cross-location dependencies.
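A relative error reduction like the ones quoted above can be measured as follows. The page does not say which error metric underlies its figures, so this sketch assumes weighted absolute percentage error (WAPE), a common choice in retail forecasting. All numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def wape(actual, forecast):
    """Weighted absolute percentage error: total absolute error over total actuals."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    total_actual = sum(actual)
    if total_actual == 0:
        raise ValueError("WAPE is undefined when actuals sum to zero")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / total_actual

def relative_error_reduction(actual, baseline, candidate):
    """Fraction by which the candidate forecast lowers WAPE versus the baseline."""
    return 1 - wape(actual, candidate) / wape(actual, baseline)

# Hypothetical actual sales and two competing forecasts for four SKUs.
actual = [100, 200, 50, 150]
baseline = [130, 160, 80, 150]   # e.g. a moving-average forecast
candidate = [120, 175, 65, 140]  # e.g. a richer model

print(f"baseline WAPE:  {wape(actual, baseline):.2f}")
print(f"candidate WAPE: {wape(actual, candidate):.2f}")
print(f"reduction:      {relative_error_reduction(actual, baseline, candidate):.0%}")
```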
Can demand forecasting work for new products with no history?
Yes. Graph-based models solve the cold-start problem by transferring knowledge from similar products through the relational graph. A new t-shirt with no sales history can borrow demand patterns from similar items that share the same supplier, category, price point, and store placement. Traditional time-series models fail completely in this scenario.
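The intuition behind borrowing demand from similar items can be shown with a deliberately crude stand-in: average the demand of existing articles that share the new item's supplier and category. This is not how a graph neural network works internally, and it is not Kumo's method. The new article A004 is hypothetical, and the `quarterly_units` values reuse the figures from the prediction output table purely as illustrative history.

```python
from statistics import mean

# Existing articles; unit figures reuse the prediction table values for illustration.
catalog = {
    "A001": {"category": "apparel", "supplier_id": "SUP-12", "quarterly_units": 2412},
    "A002": {"category": "apparel", "supplier_id": "SUP-07", "quarterly_units": 876},
    "A003": {"category": "apparel", "supplier_id": "SUP-12", "quarterly_units": 341},
}

def cold_start_estimate(new_item, catalog, keys=("category", "supplier_id")):
    """Average the demand of catalog items that match the new item on every key."""
    neighbors = [
        item["quarterly_units"]
        for item in catalog.values()
        if all(item[k] == new_item[k] for k in keys)
    ]
    # With no matching neighbors there is nothing to borrow from.
    return mean(neighbors) if neighbors else None

# Hypothetical new article with no sales history.
new_tee = {"article_id": "A004", "category": "apparel", "supplier_id": "SUP-12"}

print(cold_start_estimate(new_tee, catalog))
```

A learned model weighs many such relationships at once instead of applying one fixed matching rule, but the source of the signal is the same: attributes shared through the relational links.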
What data do I need for SKU-level demand forecasting?
At minimum: a products table, a transactions table with timestamps, and a stores/locations table. High-value additions include supplier data, promotional calendars, pricing history, and weather data. The more relational tables you connect, the more cross-product and cross-store signals the graph captures.
How do graph models capture promotional demand lift?
The graph connects promotions to products and stores, learning the historical lift pattern for each combination of promotion type, product category, and store format. It also captures cross-category cannibalization: a 20%-off electronics promotion may pull sales from the accessories category. These interaction effects are invisible to models that forecast each SKU independently.
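Lift and cannibalization are both simple ratios once promoted and non-promoted periods are separated. The sketch below assumes lift is defined as average promoted sales over average baseline sales, minus one. The weekly figures are hypothetical, picked so the promoted item shows a 34% lift and a related item shows a 20% drop.

```python
from statistics import mean

def promo_lift(baseline_units, promo_units):
    """Relative change in average sales during a promotion versus baseline weeks."""
    base = mean(baseline_units)
    if base == 0:
        raise ValueError("baseline average must be non-zero")
    return mean(promo_units) / base - 1

# Hypothetical weekly units for a promoted item and a related, non-promoted item.
promoted_item = {"baseline": [100, 100, 100], "promo": [134, 134]}
related_item = {"baseline": [50, 50, 50], "promo": [40, 40]}

lift = promo_lift(promoted_item["baseline"], promoted_item["promo"])
cannibalization = promo_lift(related_item["baseline"], related_item["promo"])

print(f"promoted item lift: {lift:+.0%}")
print(f"related item change: {cannibalization:+.0%}")
```

A negative value for the related item during the same weeks is the cannibalization signal that a per-SKU model never gets to see.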
Bottom line: Reduce overstock by 25% and eliminate stockouts on high-demand SKUs — freeing $2–5M in working capital per quarter.
One Platform. One Model. Infinite Predictions.
KumoRFM
Relational Foundation Model
Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.
For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.