The headline result: SAP SALT benchmark
Before comparing individual tools, here is the result that matters most. The SAP SALT benchmark is an enterprise-grade evaluation where real business analysts and data scientists attempt prediction tasks on SAP enterprise data. It measures how accurately different approaches predict real business outcomes on production-quality enterprise databases with multiple related tables.
| approach | accuracy | what_it_means |
|---|---|---|
| LLM + AutoML | 63% | Language model generates features, AutoML selects model |
| PhD Data Scientist + XGBoost | 75% | Expert spends weeks hand-crafting features, tunes XGBoost |
| KumoRFM (zero-shot) | 91% | No feature engineering, no training, reads relational tables directly |
SAP SALT benchmark: KumoRFM outperforms expert data scientists by 16 percentage points and LLM+AutoML by 28 percentage points on real enterprise prediction tasks.
KumoRFM scores 91% where PhD-level data scientists with weeks of feature engineering and hand-tuned XGBoost score 75%. The 16 percentage point gap is the value of reading relational data natively instead of flattening it into a single table.
Why demand forecasting is harder than it looks
Every retailer and CPG company has a demand forecasting model. Most of them work the same way: take historical sales by SKU-store-week, feed the time series into ARIMA, Prophet, or XGBoost, and project forward. Add some calendar features (holidays, day-of-week), maybe a promotion flag, and call it done.
These models work. They capture seasonality, trend, and basic promotional effects. For stable products in stable markets, they can hit 70-75% accuracy at the SKU-store-week level.
But they miss the hard cases. And the hard cases are where the money is. The product whose demand spikes because a competing product stocked out. The store cluster that responds 3x to a specific promotion type while neighboring stores barely move. The new product with zero sales history that shares a category, supplier, and price point with three existing products. The supplier disruption that ripples across 200 SKUs with a 21-day lag.
What makes demand forecasting different at enterprise scale
Enterprise demand forecasting differs from simple time-series projection in three ways that most tools handle poorly:
- Cross-product dependencies. Products do not exist in isolation. They share suppliers, compete for shelf space, substitute for each other during stockouts, and respond differently to the same promotion. A model that forecasts each product independently misses these interactions entirely.
- Multi-granularity signals. Demand is influenced by signals at every level: SKU-level attributes (size, flavor, price tier), store-level characteristics (format, region, demographics), supplier-level constraints (lead time, reliability), and promotion-level parameters (type, depth, timing). These signals live in different tables, connected by foreign keys.
- Promotional lift complexity. A 20% discount on premium SKUs in urban stores during Q4 produces a fundamentally different demand response than the same discount on value SKUs in suburban stores during Q2. The combinatorial explosion of product x store x promotion x season interactions is too large for manual feature engineering to capture.
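To get a feel for the scale of that interaction space, here is a rough counting sketch. The cardinalities (number of SKUs, stores, promotion types, seasons) and the two years of weekly history are purely hypothetical values chosen for illustration, not figures from any benchmark.

```python
from math import prod

# Hypothetical cardinalities, chosen only to illustrate the order of magnitude.
cardinalities = {
    "sku": 5_000,
    "store": 200,
    "promotion_type": 12,
    "season": 4,
}
weeks_of_history = 104  # assumption: two years of weekly data


def interaction_cells(dims):
    """Number of distinct cells in the full cross of the given dimensions."""
    return prod(dims.values())


total_cells = interaction_cells(cardinalities)

# Each SKU-store-week observation falls into exactly one interaction cell.
observations = cardinalities["sku"] * cardinalities["store"] * weeks_of_history
avg_obs_per_cell = observations / total_cells

print(f"{total_cells:,} product x store x promotion x season cells")
print(f"{avg_obs_per_cell:.2f} observations per cell on average")
```

Under these assumed numbers there are only a couple of observations per cell on average, which is why hand-written interaction features tend to be either too sparse to learn from or too coarse to be useful.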
The 7 best AI demand forecasting tools, compared
| Tool | Approach | Handles Cross-Product Effects | Multi-Table Native | Promotional Lift | Granularity (SKU/Store) | Best For |
|---|---|---|---|---|---|---|
| Kumo.ai | Relational foundation model (GNN) | Yes - graph captures substitution + cannibalization | Yes - reads raw relational tables | Automatic cross-table discovery | SKU-store-day | Enterprises with complex multi-table supply chain data |
| o9 Solutions | AI-powered integrated planning | Partial - configurable demand sensing | No - requires data pipeline | Configurable promotion models | SKU-store-week | Large enterprises wanting end-to-end planning |
| Anaplan | Connected planning + scenario modeling | Partial - via manual model configuration | No - model-based data integration | Manual promotion calendars | Configurable | Finance + supply chain integrated planning |
| Blue Yonder | Supply chain AI + ML forecasting | Partial - category-level modeling | No - requires ETL into platform | Built-in promotion engine | SKU-store-week | Retail and CPG supply chain optimization |
| Kinaxis Maestro | Concurrent planning + AI | Limited - scenario-based | No - requires data integration | Scenario-based promotion planning | Configurable | Supply chain orchestration across functions |
| RELEX Solutions | Unified retail planning + ML | Partial - category management features | No - retail data model | Retail-specific promotion optimization | SKU-store-day | Retail-focused demand and replenishment |
| DataRobot | AutoML on flat time-series table | No - each series modeled independently | No - requires flat feature table | Only if manually engineered as features | Depends on input table | Data science teams wanting model automation |
Kumo.ai is the only tool in this comparison that reads multi-table relational data natively and captures cross-product effects (substitution, cannibalization, promotional lift) through graph traversal. The other tools require manual data pipelines or model each product in isolation.
1. Kumo.ai - relational demand forecasting
Kumo.ai takes a fundamentally different approach to demand forecasting. Instead of treating each SKU-store pair as an isolated time series, it connects products, transactions, stores, suppliers, and promotions in a temporal heterogeneous graph. Foreign key relationships become edges. The graph neural network traverses this structure, automatically discovering which cross-table patterns are predictive of future demand.
This means Kumo sees what time-series models cannot: that Product A and Product B share a supplier with a 21-day lead time constraint, that Store S-14 is in a cluster with 40% higher Q4 volume, that a fall promotion is running across the category, and that Product C stocked out last week (so demand is shifting to A and B). These signals are automatically discovered from the relational structure - no manual feature engineering required.
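The idea that foreign keys become edges can be illustrated with a small standard-library sketch. This is not Kumo's implementation: the table names, column names, and the `build_graph` helper are hypothetical, and the time dimension is left as a plain row attribute rather than modeled.

```python
from collections import defaultdict

# Toy rows; all table and column names are hypothetical.
tables = {
    "products": [
        {"product_id": "A", "supplier_id": "SUP-1"},
        {"product_id": "B", "supplier_id": "SUP-1"},
    ],
    "transactions": [
        {"txn_id": "T1", "product_id": "A", "store_id": "S-14", "week": 1},
        {"txn_id": "T2", "product_id": "B", "store_id": "S-14", "week": 2},
    ],
}

# table -> (node type, primary key column, [(foreign key column, parent node type)])
schema = {
    "products": ("product", "product_id", [("supplier_id", "supplier")]),
    "transactions": (
        "transaction",
        "txn_id",
        [("product_id", "product"), ("store_id", "store")],
    ),
}


def build_graph(tables, schema):
    """Return an undirected adjacency map over (node_type, id) nodes."""
    adjacency = defaultdict(set)
    for table, rows in tables.items():
        child_type, pk_col, fks = schema[table]
        for row in rows:
            child = (child_type, row[pk_col])
            for fk_col, parent_type in fks:
                parent = (parent_type, row[fk_col])
                # Each foreign key reference becomes one edge.
                adjacency[child].add(parent)
                adjacency[parent].add(child)
    return adjacency


graph = build_graph(tables, schema)

# Two hops from product A through its supplier reach the products sharing it.
start = ("product", "A")
via_supplier = {
    neighbor
    for supplier in graph[start]
    if supplier[0] == "supplier"
    for neighbor in graph[supplier]
    if neighbor != start
}
print(via_supplier)
```

The two-hop walk is the kind of path a flat per-SKU table cannot express without someone first deciding to engineer a "shares a supplier with" feature.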
The substitution effect: the blind spot of time-series models
One of the largest sources of forecast error in retail and CPG is the substitution effect. When Product A stocks out, customers do not simply forgo the purchase - they buy Product B instead. This demand shift is invisible to time-series models because they forecast each product independently.
Consider what happens from each model's perspective when Product A stocks out:
- Time-series model for Product A: Sees sales drop to zero. Attributes it to demand decline. Future forecasts for Product A are suppressed.
- Time-series model for Product B: Sees an unexplained demand spike. Attributes it to noise or trend. Future forecasts for Product B are inflated.
- Kumo.ai: Sees the stockout event for Product A, the category relationship between A and B, and the historical substitution pattern. Correctly attributes the demand shift. Product A's forecast reflects true demand (not suppressed sales), and Product B's forecast returns to normal when A is restocked.
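The attribution problem in the list above can be made concrete with a toy calculation. This is a deliberately simple heuristic on made-up numbers, not Kumo's method: it assumes we know which weeks Product A was out of stock and compares averages inside and outside those weeks.

```python
from statistics import mean

# Hypothetical weekly unit sales; Product A is out of stock in weeks 4 and 5.
sales_a = [100, 104, 96, 0, 0, 100]
sales_b = [50, 52, 48, 80, 82, 50]
a_in_stock = [True, True, True, False, False, True]


def split_by_stock(series, in_stock):
    """Split a series into values observed while A was in stock vs. stocked out."""
    normal = [v for v, ok in zip(series, in_stock) if ok]
    stockout = [v for v, ok in zip(series, in_stock) if not ok]
    return normal, stockout


a_normal, _ = split_by_stock(sales_a, a_in_stock)
b_normal, b_stockout = split_by_stock(sales_b, a_in_stock)

# What an isolated model of A sees: zeros are treated as real demand.
naive_a = mean(sales_a)

# Stockout-aware view: A's demand is estimated from in-stock weeks only.
a_baseline = mean(a_normal)

# B's uplift during A's stockout, as a share of A's usual demand.
b_baseline = mean(b_normal)
b_uplift = mean(b_stockout) - b_baseline
substitution_rate = b_uplift / a_baseline

print(f"naive A average: {naive_a:.1f}, stockout-aware A baseline: {a_baseline:.1f}")
print(f"B baseline: {b_baseline:.1f}, estimated substitution rate: {substitution_rate:.2f}")
```

Even this crude version shows both failure modes: the naive average understates A's demand, and B's spike is explained by A's stockout rather than by trend.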
Demand forecasting in one PQL query
PQL Query
PREDICT SUM(TRANSACTIONS.QUANTITY, 0, 3, months) FOR EACH ARTICLES.ARTICLE_ID
This query predicts the total transaction quantity over the next 3 months for each article. Kumo automatically discovers the cross-table signals that drive demand: product category and attributes, store characteristics, supplier lead times, active promotions, and substitution patterns from related products. No manual feature engineering, no flattening tables, no building time-series features by hand.
Output
| article_id | predicted_qty_3m | top_demand_drivers | promo_lift_factor | substitution_risk |
|---|---|---|---|---|
| A-1042 | 12,400 | Q4 seasonality + fall promo | 2.1x | Low (no related stockouts) |
| A-1043 | 8,200 | Stable category, urban stores | 1.0x (no promo) | High (A-1042 substitute) |
| A-2271 | 3,100 | New product, similar to A-2270 | 1.4x | Medium |
| A-3305 | 18,900 | Category leader, 40% Q4 lift in region NE | 3.2x | Low |
Compare this to the traditional approach: a data scientist spends weeks joining transaction, product, store, and promotion tables, computing hundreds of features (rolling averages, lags, promotional indicators, price elasticities), flattening everything into a single row per SKU-store-week, then training an XGBoost model. The PQL query replaces that entire pipeline.
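For contrast, here is a minimal sketch of the flattening step in that traditional pipeline, written with the standard library only. The toy tables, column names, and the `flatten` helper are hypothetical; a real pipeline would compute far more features and typically use a dataframe or SQL engine.

```python
from collections import defaultdict

# Hypothetical toy tables.
products = {"A-1042": {"category": "snacks", "price_tier": "premium"}}
stores = {"S-14": {"format": "urban"}}
promotions = {("A-1042", 3)}  # (article_id, week) pairs that were on promotion
transactions = [
    {"article_id": "A-1042", "store_id": "S-14", "week": 1, "quantity": 10},
    {"article_id": "A-1042", "store_id": "S-14", "week": 1, "quantity": 5},
    {"article_id": "A-1042", "store_id": "S-14", "week": 2, "quantity": 12},
    {"article_id": "A-1042", "store_id": "S-14", "week": 3, "quantity": 30},
]


def flatten(transactions, products, stores, promotions):
    """Build one feature row per article-store-week from the separate tables."""
    # Step 1: aggregate raw transactions to article-store-week totals.
    weekly = defaultdict(int)
    for t in transactions:
        weekly[(t["article_id"], t["store_id"], t["week"])] += t["quantity"]

    # Step 2: join in attributes and hand-build lag and promotion features.
    rows = []
    for (article, store, week), quantity in sorted(weekly.items()):
        rows.append(
            {
                "article_id": article,
                "store_id": store,
                "week": week,
                "quantity": quantity,
                # None when there is no previous week in the data.
                "lag_1_quantity": weekly.get((article, store, week - 1)),
                "on_promo": (article, week) in promotions,
                **products[article],
                **stores[store],
            }
        )
    return rows


flat_rows = flatten(transactions, products, stores, promotions)
for row in flat_rows:
    print(row)
```

Every signal the downstream model can use has to be anticipated and written into this function; anything not joined in here, such as a related product's stockout, never reaches the model.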
2. o9 Solutions - AI-powered integrated planning
o9 Solutions is an AI-powered platform that combines demand sensing, supply planning, and integrated business planning in one system. Its demand forecasting component uses ML models enhanced with demand sensing - short-term signals like POS data, weather, and social trends - to adjust statistical forecasts in near real-time.
Strengths: End-to-end planning from demand to supply to finance. Strong demand sensing capabilities that incorporate external signals. The knowledge graph architecture connects planning dimensions. Good for large enterprises that want a unified planning platform rather than a point forecasting solution.
Limitations: Requires significant implementation effort (6-12 months for full deployment). Demand sensing improves short-term accuracy but does not fundamentally change how cross-product relationships are modeled. The ML models still operate on pre-configured data pipelines, not raw relational tables. Expensive - enterprise pricing starts in the high six figures.
3. Anaplan - connected planning and scenario modeling
Anaplan is a connected planning platform that spans finance, supply chain, and sales. Its demand forecasting capabilities are embedded within a broader planning model that lets planners build scenarios, run what-if analyses, and connect demand plans to financial outcomes.
Strengths: Best-in-class scenario modeling and what-if analysis. Strong finance-to-supply-chain connectivity. The Hyperblock engine handles large-scale planning models. Good for organizations where demand planning is tightly coupled with financial planning and S&OP processes.
Limitations: The forecasting engine is more statistical than ML-native. Cross-product effects require manual model configuration. Not designed for SKU-store-level granularity at scale - better suited to category-level or region-level planning. The platform's flexibility means implementation complexity is high.
4. Blue Yonder - supply chain AI and demand forecasting
Blue Yonder (formerly JDA Software) is one of the most established supply chain platforms, with deep capabilities in demand forecasting, inventory optimization, and replenishment. Its ML forecasting engine operates at SKU-store granularity and includes a built-in promotion engine for modeling promotional lifts.
Strengths: Deep retail and CPG domain expertise built over decades. The promotion engine is one of the most mature in the market. Strong inventory optimization that connects demand forecasts to ordering decisions. Large customer base means extensive benchmarks and best practices.
Limitations: Legacy architecture means the platform can feel monolithic. Data integration requires ETL into Blue Yonder's data model. Cross-product effects are handled at the category level, not at the individual product graph level. Implementation timelines of 6-18 months are common.
5. Kinaxis Maestro - concurrent planning and orchestration
Kinaxis Maestro (formerly RapidResponse) focuses on supply chain orchestration - the ability to plan across demand, supply, inventory, and logistics concurrently rather than sequentially. Its demand forecasting capabilities are integrated into this broader concurrent planning framework.
Strengths: Best-in-class concurrent planning that connects demand with supply constraints in real time. Strong scenario and what-if capabilities. The orchestration engine excels at identifying and resolving conflicts between demand plans and supply capacity. Good for complex manufacturing supply chains.
Limitations: The forecasting engine is not its primary differentiator - the orchestration layer is. Cross-product demand effects are handled through scenario modeling, not automatic discovery. Requires significant configuration to model your specific supply chain. Better for supply-constrained environments than demand-driven retail.
6. RELEX Solutions - retail-focused demand planning
RELEX Solutions is a unified supply chain planning platform with a strong focus on retail. Its demand forecasting engine operates at SKU-store-day granularity and includes category management features that account for shelf space, assortment changes, and promotional effects specific to retail environments.
Strengths: Purpose-built for retail, which means the platform understands retail-specific demand patterns (shelf life, planogram effects, markdown optimization). SKU-store-day granularity is finer than most competitors. The unified platform connects demand forecasting to replenishment, workforce planning, and space optimization.
Limitations: Retail-focused design means it is less suited for non-retail supply chains (manufacturing, B2B distribution). Cross-product effects are modeled through category management features, not through automatic graph discovery. Data integration follows a retail-specific data model that may not map cleanly to your warehouse structure.
7. DataRobot - AutoML for demand forecasting
DataRobot applies AutoML to demand forecasting: you upload a time-series feature table, and it tries dozens of model architectures (XGBoost, LightGBM, neural nets, ARIMA variants), tunes hyperparameters, and returns the best-performing model. It is one of the most sophisticated AutoML platforms for time-series forecasting.
Strengths: Best-in-class automated model selection and hyperparameter tuning for time-series problems. Excellent explainability (SHAP values, feature importance). Strong MLOps features for model monitoring, drift detection, and automated retraining. Enterprise-grade security and governance.
Limitations: Requires a pre-built flat time-series table. All feature engineering is manual - joining product attributes, store characteristics, promotion calendars, and supplier data into a single table is your team's responsibility. Each SKU-store series is modeled independently. Cannot capture cross-product substitution, relational promotional lifts, or supplier-level demand constraints. Accuracy is bounded by the features you build.
The cross-table signal gap: what time-series tools miss
The single biggest differentiator in demand forecasting accuracy is whether a tool can model cross-table relationships. Here is why:
| Signal Type | Example | Visible in Time-Series Table | Relative Predictive Power |
|---|---|---|---|
| Historical trend | Product A sells 15% more each Q4 | Yes | Moderate (captures seasonality, misses causality) |
| Calendar effects | Holiday week, day-of-week patterns | Yes | Moderate (easy to model, limited upside) |
| Basic promotion flag | Product A is on promotion this week | Yes (if added as feature) | Moderate (binary flag misses interaction effects) |
| Cross-product substitution | Product A stocked out, demand shifts to B | No - requires product relationship graph | High (5-8% of SKUs affected weekly) |
| Promotional lift interactions | 20% off premium SKUs in urban stores = 3.2x lift | No - requires product x store x promo traversal | Very High (promotions drive 20-40% of volume) |
| Supplier constraint propagation | Supplier delay ripples across 200 SKUs with 21-day lag | No - requires supplier-product graph | High (causes systematic forecast bias) |
The three strongest demand signals beyond basic trend and seasonality - cross-product substitution, promotional lift interactions, and supplier constraint propagation - are invisible to any tool that models each SKU-store pair as an isolated time series.
The implication is direct. If your product catalog has substitution relationships, if promotions are a significant volume driver, or if supplier constraints affect demand fulfillment, a time-series-only model is structurally incapable of capturing the most predictive signals. No amount of hyperparameter tuning on the same flat table will fix a data gap.
How to choose the right tool
The right demand forecasting tool depends on three factors: your supply chain complexity, your team's technical depth, and whether you need a point solution or an end-to-end planning platform.
| If you... | Consider | Why |
|---|---|---|
| Need end-to-end planning (demand + supply + finance) | o9 Solutions or Anaplan | Best integrated planning with demand sensing and scenario modeling |
| Are a retailer/CPG wanting a proven supply chain platform | Blue Yonder or RELEX | Deepest retail domain expertise and promotion engines |
| Need concurrent planning across complex supply chains | Kinaxis Maestro | Best supply chain orchestration and conflict resolution |
| Have a data science team and want model control | DataRobot | Best AutoML and model transparency on flat time-series data |
| Have complex multi-table data and need maximum forecast accuracy | Kumo.ai | Only tool that captures cross-product substitution, promotional lifts, and supplier effects natively from relational data |
If your demand is driven by cross-product effects, promotional interactions, and supplier constraints - and accuracy matters more than planning workflows - the relational approach captures signals that time-series tools structurally cannot.
The accuracy ceiling is a data ceiling
The most important insight in demand forecasting is that the accuracy ceiling of most tools is not a model limitation - it is a data limitation. Better algorithms on the same time-series table yield diminishing returns. The jump from ARIMA to Prophet might add 2-3 points. The jump from Prophet to XGBoost might add 3-5 more. But you are still modeling each SKU-store pair in isolation, blind to the cross-table signals that drive 25-30% of demand variation.
The jump from isolated time series to multi-table relational data adds 10-15 points at the SKU-store level, because you are adding entirely new categories of signals: product substitution graphs, cross-dimensional promotional effects, supplier constraint propagation, and store clustering patterns. Enterprise benchmarks show 25% overstock reduction and $2-5M in working capital freed per quarter when switching from isolated time-series forecasts to relational demand models.
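The comparison between algorithm upgrades and new data can be laid out as simple range arithmetic. The sketch below uses only the ranges quoted in this section and assumes, as a simplification, that the gains add up linearly.

```python
# Ranges quoted in the text, in accuracy points as (low, high).
algorithm_upgrades = {
    "ARIMA -> Prophet": (2, 3),
    "Prophet -> XGBoost": (3, 5),
}
relational_data = (10, 15)  # isolated time series -> multi-table relational data


def add_ranges(ranges):
    """Sum (low, high) ranges, assuming the gains are additive."""
    lows, highs = zip(*ranges)
    return sum(lows), sum(highs)


algo_low, algo_high = add_ranges(algorithm_upgrades.values())

# Most conservative and most generous ratio of data gain to algorithm gain.
ratio_low = relational_data[0] / algo_high
ratio_high = relational_data[1] / algo_low

print(f"Algorithm upgrades combined: {algo_low}-{algo_high} points")
print(f"Relational data: {relational_data[0]}-{relational_data[1]} points")
print(f"Data gain is {ratio_low:.2f}x to {ratio_high:.2f}x the algorithm gain")
```

On these quoted ranges, even the most conservative pairing puts the data change ahead of both algorithm upgrades combined.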
For enterprises with complex product catalogs, multi-format store networks, and active promotional calendars, the question is not "which time-series algorithm should we use?" It is "which tool can read our full relational data - products, transactions, stores, suppliers, promotions - without requiring six months of feature engineering first?"