The business problem
Retailers and manufacturers hold $1.1 trillion in excess inventory globally while simultaneously losing $634 billion to out-of-stocks. This paradox exists because inventory is optimized per SKU-location independently, ignoring the network effects: inventory at a nearby warehouse can cover stockouts at a store, substitute products can absorb demand, and supplier lead time variability affects the entire downstream network.
Why flat ML fails
- Independent optimization: Each SKU-location gets its own safety stock formula. No cross-location pooling or substitution effects are captured.
- No network effects: A stockout at Store A drives customers to Store B, increasing B's demand. Flat models cannot anticipate this demand transfer.
- No substitution modeling: When Product A is out of stock, demand shifts to substitute Product B. Joint inventory planning across substitutes reduces total stock needed.
- Static lead times: Flat models treat lead time as a fixed input, but actual lead times vary with supplier load, which depends on all customers' orders. The supply graph captures these dependencies.
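The pooling effect named in the first bullet can be made concrete with the textbook safety stock formula. This is a minimal sketch, assuming independent, identically distributed normal demand at each location and a fixed lead time. Those are simplifying assumptions, and the function names and numbers are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def safety_stock(sigma_demand, lead_time, service_level):
    """Textbook safety stock: z * sigma * sqrt(lead time), fixed lead time assumed."""
    z = NormalDist().inv_cdf(service_level)
    return z * sigma_demand * sqrt(lead_time)


def independent_vs_pooled(n_locations, sigma_demand, lead_time, service_level):
    """Compare n separately planned locations with one pooled stock.

    Assumes demands are independent, so the pooled standard deviation
    is sigma * sqrt(n) rather than sigma * n.
    """
    independent = n_locations * safety_stock(sigma_demand, lead_time, service_level)
    pooled = safety_stock(sigma_demand * sqrt(n_locations), lead_time, service_level)
    return independent, pooled


independent, pooled = independent_vs_pooled(
    n_locations=4, sigma_demand=20.0, lead_time=2.0, service_level=0.95
)
print(f"independent: {independent:.1f} units, pooled: {pooled:.1f} units")
```

Under these assumptions, pooling four locations halves the total safety stock (a factor of 1/sqrt(n)). Positively correlated demand across locations shrinks that saving.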
The relational schema
Node types:
Product (id, category, unit_cost, shelf_life)
Location (id, type, capacity, geo_lat, geo_lon)
Supplier (id, lead_time_mean, lead_time_var, moq)
Edge types:
Location --[supplies]--> Location (transit_time, cost)
Product --[substitute_of]--> Product (substitution_rate)
Supplier --[provides]--> Product (lead_time, cost)
Product --[stocked_at]--> Location (on_hand, demand_rate)
Multi-echelon supply chain: suppliers feed warehouses, warehouses feed stores. Substitute products share demand.
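To make the schema concrete, the sketch below turns a few toy records into per-type integer node indices and COO edge lists keyed by (source type, relation, target type) triples. The records, IDs, and the `build_edge_index` helper are hypothetical. In practice the lists would be converted to tensors, for example inside a PyG `HeteroData` object.

```python
from collections import defaultdict

# Toy records; every ID here is made up for illustration.
relations = {
    ("location", "supplies", "location"): [("WH1", "S1"), ("WH1", "S2")],
    ("product", "substitute_of", "product"): [("P1", "P2"), ("P2", "P1")],
    ("supplier", "provides", "product"): [("SUP1", "P1"), ("SUP1", "P2")],
    ("product", "stocked_at", "location"): [("P1", "S1"), ("P2", "S1"), ("P1", "S2")],
}


def build_edge_index(relations):
    """Map raw IDs to contiguous integers per node type and return
    (node_ids, edge_index), with each edge set stored as [sources, targets]."""
    node_ids = defaultdict(dict)  # node type -> {raw id: integer index}
    edge_index = {}
    for (src_type, rel, dst_type), pairs in relations.items():
        sources, targets = [], []
        for src, dst in pairs:
            # setdefault assigns the next free index the first time an ID is seen
            src_ids = node_ids[src_type]
            sources.append(src_ids.setdefault(src, len(src_ids)))
            dst_ids = node_ids[dst_type]
            targets.append(dst_ids.setdefault(dst, len(dst_ids)))
        edge_index[(src_type, rel, dst_type)] = [sources, targets]
    return dict(node_ids), edge_index


node_ids, edge_index = build_edge_index(relations)
for edge_type, (sources, targets) in edge_index.items():
    print(edge_type, sources, targets)
```

Keying the edge lists by type triple keeps each relation separate, so a different message-passing layer can be attached to each one.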
PyG architecture: SAGEConv for network optimization
import torch
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv, HeteroConv, Linear

STOCKED_AT = ('product', 'stocked_at', 'location')


class InventoryGNN(torch.nn.Module):
    def __init__(self, hidden_dim=128):
        super().__init__()
        # Lazy input projections: -1 infers the feature size on the first call
        self.product_lin = Linear(-1, hidden_dim)
        self.location_lin = Linear(-1, hidden_dim)
        self.supplier_lin = Linear(-1, hidden_dim)

        self.conv1 = HeteroConv({
            ('location', 'supplies', 'location'): SAGEConv(hidden_dim, hidden_dim),
            ('product', 'substitute_of', 'product'): SAGEConv(hidden_dim, hidden_dim),
            ('supplier', 'provides', 'product'): SAGEConv(hidden_dim, hidden_dim),
            STOCKED_AT: SAGEConv(hidden_dim, hidden_dim),
        }, aggr='mean')

        # Suppliers receive no incoming edges, so layer 1 returns no supplier
        # embeddings; layer 2 therefore omits the 'provides' relation.
        self.conv2 = HeteroConv({
            ('location', 'supplies', 'location'): SAGEConv(hidden_dim, hidden_dim),
            ('product', 'substitute_of', 'product'): SAGEConv(hidden_dim, hidden_dim),
            STOCKED_AT: SAGEConv(hidden_dim, hidden_dim),
        }, aggr='mean')

        # Predict demand distribution parameters (mean, std) from a
        # concatenated [product, location] embedding
        self.demand_head = torch.nn.Sequential(
            Linear(2 * hidden_dim, 64),
            torch.nn.ReLU(),
            Linear(64, 2),  # mean and std of demand
        )

    def forward(self, x_dict, edge_index_dict):
        # Build a new dict instead of mutating the caller's inputs
        x_dict = {
            'product': self.product_lin(x_dict['product']),
            'location': self.location_lin(x_dict['location']),
            'supplier': self.supplier_lin(x_dict['supplier']),
        }

        x_dict = {k: F.relu(v) for k, v in
                  self.conv1(x_dict, edge_index_dict).items()}
        x_dict = self.conv2(x_dict, edge_index_dict)

        # Output one demand distribution per product-location pair,
        # i.e. per stocked_at edge (the original head scored product nodes only)
        src, dst = edge_index_dict[STOCKED_AT]
        pair = torch.cat([x_dict['product'][src],
                          x_dict['location'][dst]], dim=-1)
        params = self.demand_head(pair)
        mean = F.softplus(params[:, 0])  # softplus keeps both outputs positive
        std = F.softplus(params[:, 1])
        return mean, std

The GNN predicts demand distribution parameters (mean, std) per product-location, accounting for substitution, network supply, and lead time effects. Feed these into safety stock calculations.
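Feeding the predicted mean and std into a safety stock calculation might look like the sketch below. It uses the textbook formula for demand over a variable lead time, which assumes demand per period and lead time are independent and roughly normal. The `reorder_point` function and the numbers are illustrative, and lead time is assumed to be expressed in the same period as the demand forecast.

```python
from math import sqrt
from statistics import NormalDist


def reorder_point(demand_mean, demand_std, lead_time_mean, lead_time_var,
                  service_level=0.95):
    """Return (safety_stock, reorder_point) for one product-location.

    demand_mean and demand_std are per period (e.g. the model's outputs);
    lead_time_mean and lead_time_var are in periods (e.g. supplier attributes).
    """
    z = NormalDist().inv_cdf(service_level)
    # Variance of demand over the lead time: demand noise plus lead time noise
    var_lead_time_demand = (lead_time_mean * demand_std ** 2
                            + demand_mean ** 2 * lead_time_var)
    safety_stock = z * sqrt(var_lead_time_demand)
    return safety_stock, demand_mean * lead_time_mean + safety_stock


ss, rop = reorder_point(demand_mean=100.0, demand_std=20.0,
                        lead_time_mean=2.0, lead_time_var=0.25)
print(f"safety stock: {ss:.1f}, reorder point: {rop:.1f}")
```

A sharper demand forecast (smaller std) lowers the first variance term directly. The second term grows with lead time variance, so supplier variability affects every downstream safety stock.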
Expected performance
Inventory optimization is measured by demand forecast accuracy, reported here as weighted mean absolute percentage error (WMAPE, lower is better), and by the resulting service level improvements:
- EOQ formula (heuristic): ~20% WMAPE, baseline service levels
- LightGBM (independent demand): ~14% WMAPE
- GNN (network demand): ~10-12% WMAPE
- KumoRFM (zero-shot): ~10% WMAPE
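For reference, WMAPE can be computed as in the sketch below, assuming the common definition (total absolute error divided by total actual demand). The sample numbers are made up and are not the benchmark data behind the figures above.

```python
def wmape(actual, forecast):
    """Weighted MAPE: total absolute error divided by total actual demand."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    total_actual = sum(abs(a) for a in actual)
    if total_actual == 0:
        raise ValueError("WMAPE is undefined when total actual demand is zero")
    total_error = sum(abs(a - f) for a, f in zip(actual, forecast))
    return total_error / total_actual


actual = [100, 0, 50, 250]
forecast = [90, 5, 60, 235]
print(f"WMAPE: {wmape(actual, forecast):.1%}")  # prints "WMAPE: 10.0%"
```

Unlike plain MAPE, this stays defined when individual SKU-locations have zero demand in a period, which is common for slow-moving items.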
Or use KumoRFM in one line
PREDICT weekly_demand FOR product, location
USING product, location, supplier, sales_history
One PQL query. KumoRFM predicts demand per SKU-location, capturing substitution and network effects for inventory planning.