
How DoorDash Improved Prediction Accuracy by 30% with Relational Deep Learning

Delivery platforms depend on dozens of interconnected predictions: ETAs, driver availability, demand by zone, prep times. Traditional ML sees each prediction in isolation. Relational deep learning connects the entire data landscape and finds patterns that flat-table models structurally cannot.

TL;DR

  • On the SAP SALT enterprise benchmark, KumoRFM scores 91% accuracy vs 75% for PhD data scientists with XGBoost and 63% for LLM+AutoML, with zero feature engineering and zero training time.
  • DoorDash achieved a 30% accuracy improvement over their internal model by applying relational deep learning to delivery predictions. The key: connecting drivers, orders, restaurants, customers, and zones into a unified graph instead of flattening them into isolated feature tables.
  • Delivery prediction depends on relationships across 10+ connected tables. Traditional ML requires engineers to manually join and aggregate this data, losing cross-entity patterns. Relational deep learning reads all tables simultaneously and discovers multi-hop patterns automatically.
  • What previously required 4-5 years of iterative feature engineering by internal teams was achieved in months. The model automatically discovers signals like driver-restaurant efficiency, zone-level demand shifts, and customer ordering patterns that would take years to encode manually.
  • The 30% improvement is not from a better algorithm on the same features. It is from seeing data that flat-table models structurally cannot access: the relationships between entities, temporal dynamics, and multi-hop patterns across the full relational database.

The delivery prediction challenge

Platforms like DoorDash sit on top of one of the most complex prediction problems in consumer technology. Every order triggers a cascade of interdependent predictions: How long will the restaurant take to prepare this order? Which driver is best positioned to pick it up? How long will the delivery take given current traffic, weather, and zone congestion? Will this customer order again this week?

These predictions are not independent. A driver's delivery time depends on the restaurant's prep speed. Restaurant prep speed depends on current order volume. Order volume depends on zone-level demand. Zone-level demand depends on time of day, weather, and local events. Everything connects to everything.

The data that drives these predictions lives across 10 or more connected tables: drivers, orders, restaurants, customers, zones, menus, ratings, promotions, weather feeds, and event calendars. Getting accurate predictions requires understanding the relationships between these tables, not just the data within them.
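To make the shape of the problem concrete, here is an illustrative slice of such a schema as plain Python data. The table and column names are assumptions for illustration, not DoorDash's actual schema:

```python
# Illustrative (hypothetical) slice of a delivery platform's schema:
# each table has a primary key, and foreign keys link tables together.
schema = {
    "drivers":     {"pk": "driver_id",     "fks": {}},
    "zones":       {"pk": "zone_id",       "fks": {}},
    "restaurants": {"pk": "restaurant_id", "fks": {"zone_id": "zones"}},
    "customers":   {"pk": "customer_id",   "fks": {"zone_id": "zones"}},
    "menus":       {"pk": "menu_item_id",  "fks": {"restaurant_id": "restaurants"}},
    "orders": {
        "pk": "order_id",
        "fks": {
            "driver_id": "drivers",
            "restaurant_id": "restaurants",
            "customer_id": "customers",
        },
    },
}

# Orders are the hub: one order links a driver, a restaurant, and a
# customer, and reaches zones indirectly through the restaurant.
print(sorted(schema["orders"]["fks"].values()))  # ['customers', 'drivers', 'restaurants']
```

The point of the sketch: accuracy-relevant signal lives in the `fks` links, not just in any single table's columns.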

The headline result: SAP SALT benchmark

The SAP SALT benchmark is an enterprise-grade evaluation where real business analysts and data scientists attempt prediction tasks on SAP enterprise data. It measures how accurately different approaches predict real business outcomes on production-quality enterprise databases with multiple related tables.

| approach | accuracy | what it means |
| --- | --- | --- |
| LLM + AutoML | 63% | Language model generates features, AutoML selects model |
| PhD Data Scientist + XGBoost | 75% | Expert spends weeks hand-crafting features, tunes XGBoost |
| KumoRFM (zero-shot) | 91% | No feature engineering, no training, reads relational tables directly |

SAP SALT benchmark: KumoRFM outperforms expert data scientists by 16 percentage points and LLM+AutoML by 28 percentage points on real enterprise prediction tasks.

KumoRFM scores 91% where PhD-level data scientists with weeks of feature engineering and hand-tuned XGBoost score 75%. The 16 percentage point gap is the value of reading relational data natively instead of flattening it into a single table.

Why traditional ML plateaus

The standard approach to delivery prediction follows a familiar pattern: extract data from multiple tables, join and aggregate it into a flat feature table with one row per prediction, and train a gradient-boosted model (XGBoost, LightGBM) on that table.
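A minimal sketch of that flat-table step, on made-up toy rows: every cross-table signal must be pre-aggregated into one row per prediction before the gradient-boosted model ever sees it.

```python
# Toy order history (hypothetical rows, for illustration only).
orders = [
    {"order_id": 1, "zone": "Z1", "driver": "D1", "minutes": 30},
    {"order_id": 2, "zone": "Z1", "driver": "D2", "minutes": 34},
    {"order_id": 3, "zone": "Z2", "driver": "D1", "minutes": 22},
]

def flat_features(zone, driver):
    """Hand-written join/aggregate step: collapse history into one flat
    row of averages, which is all an XGBoost-style model will see."""
    zone_vals = [o["minutes"] for o in orders if o["zone"] == zone]
    driver_vals = [o["minutes"] for o in orders if o["driver"] == driver]
    return {
        "zone_avg": sum(zone_vals) / len(zone_vals),
        "driver_avg": sum(driver_vals) / len(driver_vals),
    }

row = flat_features("Z1", "D1")
print(row)  # {'zone_avg': 32.0, 'driver_avg': 26.0}
```

Note what is already lost: the row records how this driver does on average and how this zone does on average, but not how this driver does in this zone.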

This works, up to a point. A well-engineered flat-table model can capture obvious signals: average delivery time by zone, restaurant average prep time, driver average speed. But it misses the relational patterns that drive the most important variations:

  • Driver-restaurant affinity. A specific driver may be 15% faster at restaurants in a specific zone because they know the parking, the pickup flow, and the fastest route out. This signal lives in the relationship between the driver table and the restaurant table, filtered by the zone table.
  • Prep time by order complexity. A restaurant's average prep time is misleading. Prep time varies dramatically by order composition: number of items, menu category mix, and whether the order includes items that share cooking infrastructure. This requires joining orders, order items, and menu tables.
  • Customer temporal patterns. A customer who always orders 20 minutes before their usual dinner time is highly predictable, but only if the model can see the customer's full order history with timestamps correlated against their profile.
  • Zone-event correlations. Demand at zone X spikes when there is a sporting event at the nearby stadium. This pattern requires connecting zone data to external event data and learning the radius of impact.
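The first bullet's driver-restaurant affinity, for example, only appears after joining orders with restaurants and filtering by zone. A toy sketch with hypothetical rows:

```python
# Toy tables (hypothetical). The affinity signal spans both of them.
restaurants = {"R1": {"zone": "Z1"}, "R2": {"zone": "Z2"}}
orders = [
    {"driver": "D1", "restaurant": "R1", "minutes": 18},
    {"driver": "D1", "restaurant": "R1", "minutes": 20},
    {"driver": "D1", "restaurant": "R2", "minutes": 30},
    {"driver": "D2", "restaurant": "R1", "minutes": 25},
]

def driver_zone_avg(driver, zone):
    """Average delivery minutes for one driver within one zone:
    orders joined to restaurants (for the zone), filtered by driver."""
    vals = [
        o["minutes"]
        for o in orders
        if o["driver"] == driver and restaurants[o["restaurant"]]["zone"] == zone
    ]
    return sum(vals) / len(vals)

# D1 averages 19 min in zone Z1 vs 25 min for D2 -- a relational signal
# invisible to per-driver or per-zone averages taken separately.
print(driver_zone_avg("D1", "Z1"), driver_zone_avg("D2", "Z1"))  # 19.0 25.0
```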

The relational approach

Relational deep learning takes a fundamentally different approach. Instead of flattening the database into a single table, it connects all tables into a graph that preserves the full relational structure.

Every row in every table becomes a node: each driver, each order, each restaurant, each customer, each zone, each menu item. Every foreign key relationship becomes an edge: orders connect to restaurants, drivers connect to orders, customers connect to orders, restaurants connect to zones. Timestamps are preserved as temporal attributes, so the model knows when each relationship was active.
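The rows-to-nodes, foreign-keys-to-edges construction can be sketched generically. The table and column names below are toy assumptions:

```python
# Build a graph from toy relational rows: every row becomes a node,
# every foreign-key reference becomes an edge. Illustrative only.
tables = {
    "drivers": [{"driver_id": "D1"}],
    "zones": [{"zone_id": "Z1"}],
    "restaurants": [{"restaurant_id": "R1", "zone_id": "Z1"}],
    "orders": [{"order_id": "O1", "driver_id": "D1",
                "restaurant_id": "R1", "ts": "2024-05-01T18:02"}],
}
pks = {"drivers": "driver_id", "zones": "zone_id",
       "restaurants": "restaurant_id", "orders": "order_id"}
fks = {"restaurants": {"zone_id": "zones"},
       "orders": {"driver_id": "drivers", "restaurant_id": "restaurants"}}

nodes, edges = [], []
for table, rows in tables.items():
    for row in rows:
        node = (table, row[pks[table]])
        nodes.append(node)  # one node per row
        for col, target in fks.get(table, {}).items():
            edges.append((node, (target, row[col])))  # one edge per FK

print(len(nodes), len(edges))  # 4 3
```

Timestamps like `ts` ride along as node attributes, which is how the model knows when each relationship was active.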

A graph neural network then processes this structure by passing messages along edges. Information flows from restaurants to their orders, from orders to their drivers, from drivers to their zones. After multiple layers of message passing, each node accumulates information from its full relational neighborhood.
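One round of message passing can be illustrated in a few lines. This toy version uses plain mean aggregation; real GNN layers apply learned weight matrices and nonlinearities:

```python
# Toy message passing: each node's new state blends its own feature with
# its incoming neighbors' features. Real GNNs use learned transforms.
features = {"restaurant": 2.0, "order": 4.0, "driver": 6.0}
edges = [("restaurant", "order"), ("order", "driver")]  # directed src -> dst

def message_pass(feats, edges):
    incoming = {n: [] for n in feats}
    for src, dst in edges:
        incoming[dst].append(feats[src])  # messages flow along edges
    return {
        n: (feats[n] + sum(msgs)) / (1 + len(msgs)) if msgs else feats[n]
        for n, msgs in incoming.items()
    }

step1 = message_pass(features, edges)   # order now blends restaurant info
step2 = message_pass(step1, edges)      # driver now sees restaurant info, two hops away
print(step1["order"], step2["driver"])  # 3.0 4.0
```

After two layers the driver node carries restaurant information it has no direct edge to, which is exactly the multi-hop effect described above.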

The result is that the model automatically learns the cross-entity patterns that flat-table models miss:

  • This driver is 15% faster at restaurants in this zone (driver → orders → restaurant → zone).
  • This customer always orders 20 minutes before their usual dinner time (customer → orders → timestamps).
  • Demand at zone X spikes when there is an event at the nearby stadium (zone → events → temporal patterns).
  • This restaurant's prep time increases 40% when more than 3 items share the same cooking station (restaurant → orders → order items → menu items → cooking categories).

Results: 30% accuracy improvement

By applying relational deep learning to delivery predictions, DoorDash achieved a 30% accuracy improvement over their existing internal model. To put this in context: their internal model had been refined over years of iterative feature engineering by a world-class data science team. The 30% gain did not come from better tuning of the same features. It came from seeing data that the previous approach structurally could not access.

Equally significant was the time-to-value. What had previously taken 4-5 years of iterative feature engineering, where data scientists would hypothesize a new feature, compute it, test it, and repeat, was achieved in months. The relational model discovered cross-entity patterns automatically that would have taken additional years to find manually.

| prediction task | traditional ML signals | relational ML signals |
| --- | --- | --- |
| Delivery time | Avg zone delivery time, distance, time of day | + Driver-restaurant affinity, route history, current zone congestion patterns |
| Driver availability | Driver shift schedule, current location | + Driver acceptance patterns by restaurant type, fatigue modeling from recent order sequence |
| Demand by zone | Historical hourly demand, day of week | + Event proximity, weather-demand correlation, promotional cascade effects across zones |
| Restaurant prep time | Restaurant avg prep time, order count | + Order complexity by menu mix, cooking station contention, prep time by order sequence position |
| Customer reorder | Days since last order, order frequency | + Menu preference shifts, response to promotions, restaurant closure impact on reorder |

Traditional ML captures single-table averages. Relational ML captures the cross-entity patterns that explain the variance traditional models miss.

Why years of feature engineering compressed into months

The traditional path to improving delivery predictions follows a slow, manual cycle. A data scientist hypothesizes that driver familiarity with a restaurant affects delivery time. They write SQL to join driver and order history, compute a familiarity score, add it to the feature table, retrain, and evaluate. If it helps, they move on to the next hypothesis. If not, they try a different formulation.

Each iteration takes days to weeks. Over 4-5 years, a team might explore a few hundred features out of the hundreds of thousands of possible cross-table combinations. The model improves incrementally, a few percentage points per year.

Relational deep learning bypasses this entire cycle. The graph neural network explores the full combinatorial space of cross-table patterns simultaneously. It does not need a human to hypothesize that driver-restaurant familiarity matters. It discovers it, along with thousands of other relational patterns, during training.

PQL Query

PREDICT delivery_time_minutes
FOR EACH orders.order_id
WHERE orders.status = 'pending'

One predictive query replaces the entire delivery time prediction pipeline. The model reads raw tables (drivers, orders, restaurants, customers, zones, menus) directly and returns predictions that incorporate cross-entity patterns a flat-table model would need years of feature engineering to approximate.

Output

| order_id | predicted_minutes | confidence | key_factors |
| --- | --- | --- | --- |
| ORD-44201 | 28 | 0.92 | Driver-restaurant familiarity, low zone congestion |
| ORD-44202 | 47 | 0.85 | Complex order (6 items, 3 cooking stations), new driver to zone |
| ORD-44203 | 22 | 0.94 | Regular customer route, restaurant in fast-prep phase |
| ORD-44204 | 55 | 0.78 | Event-driven zone surge, restaurant prep backlog detected |
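A query like this could be issued programmatically along the following lines. The client class and method names here are hypothetical placeholders to show the shape of the workflow, not Kumo's actual SDK:

```python
# Hypothetical client wrapper -- names are illustrative placeholders,
# not a real SDK. A real client would POST the PQL string to a
# prediction service and return per-row predictions.
class PredictiveClient:
    def predict(self, pql: str) -> dict:
        """Submit a PQL query; this stub only echoes a parsed view."""
        target = pql.split("PREDICT", 1)[1].split()[0]
        return {"target": target, "status": "submitted"}

pql = """
PREDICT delivery_time_minutes
FOR EACH orders.order_id
WHERE orders.status = 'pending'
"""

resp = PredictiveClient().predict(pql)
print(resp["target"])  # delivery_time_minutes
```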

What this means for delivery and logistics platforms

DoorDash's 30% improvement is not an outlier. It reflects a structural advantage of relational deep learning over flat-table approaches for any prediction that depends on interconnected entities. Delivery and logistics platforms are particularly well-suited because their data is inherently relational: every order connects a customer, a restaurant, a driver, a zone, and a time window.

The implications extend beyond delivery time prediction. The same relational approach applies to demand forecasting, driver dispatch optimization, dynamic pricing, customer retention, and restaurant quality scoring. Each of these problems depends on the same underlying relational structure, and each benefits from the same ability to see cross-entity patterns.

For platforms still relying on flat-table models with manually engineered features, the gap will only widen. Every year of manual feature engineering yields diminishing returns as the easy features are already built. Relational deep learning does not have this ceiling because it reads the full relational structure directly. The more complex and interconnected the data, the larger the advantage.

Frequently asked questions

What does a 30% accuracy improvement mean for a delivery platform?

A 30% accuracy improvement in delivery time prediction means the gap between predicted and actual delivery times shrinks by nearly a third. For a platform processing millions of daily orders, this translates to fewer late deliveries, more accurate customer ETAs, better driver utilization, and reduced compensation payouts for missed promises. Even a 10% improvement at DoorDash's scale can represent tens of millions in annual operational savings.
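As a back-of-envelope illustration with made-up numbers (the 10-minute baseline below is an assumption, not a DoorDash figure):

```python
# If a baseline ETA model is off by 10 minutes on average, a 30%
# reduction in that error brings the average miss down to 7 minutes.
baseline_mae_minutes = 10.0
improved_mae_minutes = round(baseline_mae_minutes * (1 - 0.30), 2)
print(improved_mae_minutes)  # 7.0
```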

Why can't traditional ML models capture the full picture for delivery prediction?

Traditional ML models require a single flat feature table as input. To predict delivery time, an engineer must manually join and aggregate data from drivers, orders, restaurants, customers, zones, and menus into one row per prediction. This process loses cross-entity relationships: a driver's efficiency at specific restaurant types, a restaurant's prep time variation by order complexity, or zone-level demand shifts correlated with nearby events. Relational deep learning reads all tables simultaneously and discovers these patterns automatically.

How does relational deep learning handle delivery logistics data?

Relational deep learning connects all tables (drivers, orders, restaurants, customers, zones, menus) into a graph structure. Each row becomes a node, each foreign key becomes an edge, and timestamps are preserved. A graph neural network then passes messages along these edges, learning cross-entity patterns: which drivers are fastest at which restaurant types, how zone demand shifts based on time and events, and how customer ordering patterns correlate with external factors. The model discovers these patterns automatically without manual feature engineering.

How long does it take to build a relational ML model for delivery prediction?

With traditional ML, building an accurate delivery prediction model requires 4-5 years of iterative feature engineering as the data science team discovers which cross-table patterns matter. With relational deep learning, the same level of accuracy can be achieved in months because the model discovers cross-entity patterns automatically rather than requiring engineers to hypothesize and manually compute each feature.

What types of predictions benefit most from relational deep learning in logistics?

Any prediction that depends on relationships between multiple entities benefits from relational deep learning. In delivery logistics, this includes delivery time estimation, driver availability forecasting, demand prediction by zone, restaurant prep time estimation, customer reorder probability, and optimal dispatch routing. These all depend on patterns that span drivers, orders, restaurants, customers, and geographic zones, exactly the kind of multi-table relationships that flat-table models cannot capture.

Can relational deep learning work with existing data warehouse infrastructure?

Yes. Relational deep learning reads directly from the relational tables already in your data warehouse. There is no need to restructure data or build a separate feature store. The model connects to your existing tables via their foreign key relationships and builds the graph representation automatically. This is one reason deployment timelines shrink from years to months: you skip the feature engineering pipeline entirely.

See it in action

KumoRFM delivers predictions on relational data in seconds. No feature engineering, no ML pipelines. Try it free.