ETA Prediction
“When will this shipment arrive?”
Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.

A real-world example
When will this shipment arrive?
Inaccurate ETAs ripple through the supply chain: warehouses staff for arrivals that don't come, production lines idle waiting for delayed components, and customers receive wrong delivery promises. Carrier-provided ETAs are 40-60% inaccurate beyond 3 days out. For a logistics company managing 100K shipments per month, reducing ETA error by 30% saves $18M annually in wasted dock labor, expediting fees, and customer penalties.
Quick answer
Graph neural networks predict shipment arrival times by learning how delays propagate across the logistics network: port congestion, weather disruptions, carrier performance patterns, and route-level delay cascading. Unlike carrier-provided ETAs that are 40-60% inaccurate beyond 3 days out, graph-based models reduce ETA error by 30%, saving $18M annually for a logistics company managing 100K monthly shipments.
Approaches compared
4 ways to solve this problem
1. Carrier-provided ETAs
Use the carrier's own estimated arrival time, typically based on scheduled transit times with basic adjustments for known delays.
Best for
Zero engineering effort. Available immediately for every shipment.
Watch out for
40-60% inaccurate beyond 3 days out. Carriers have limited visibility into port congestion, weather routing, and other carriers' delays on shared routes. ETAs are often optimistic because carriers have commercial incentives to under-report delays.
2. Historical transit time models
Calculate average transit time per carrier-route combination from historical data. Use percentile-based estimates for confidence intervals.
Best for
Better baseline than carrier ETAs. Accounts for carrier-specific performance on each route.
Watch out for
Backward-looking. Cannot incorporate real-time signals like current port congestion, in-transit weather events, or vessel position data. Treats each shipment independently.
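As a rough illustration of the percentile-based approach described above, the sketch below groups past transit times by carrier-route pair and reports a median and a 90th-percentile estimate. The function name `transit_time_estimates` and the input shape are assumptions made for this example, not part of any Kumo API.

```python
from collections import defaultdict
from statistics import quantiles


def transit_time_estimates(history):
    """Summarise observed transit days per (carrier, route) pair.

    `history` is an iterable of (carrier_id, route_id, transit_days) tuples.
    Returns {(carrier_id, route_id): {"p50": float, "p90": float}}.
    """
    grouped = defaultdict(list)
    for carrier_id, route_id, transit_days in history:
        grouped[(carrier_id, route_id)].append(transit_days)

    estimates = {}
    for key, days in grouped.items():
        if len(days) < 2:
            # Too few observations for percentiles: fall back to the single value.
            estimates[key] = {"p50": days[0], "p90": days[0]}
            continue
        # n=10 yields nine cut points (10th to 90th percentile).
        cuts = quantiles(days, n=10, method="inclusive")
        estimates[key] = {"p50": cuts[4], "p90": cuts[8]}
    return estimates
```

The median serves as the point estimate and the 90th percentile as a conservative upper bound. Because the inputs are only past transit times, the estimate cannot react to today's congestion or weather, which is the limitation noted above.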
3. Regression on shipment features (XGBoost, random forest)
Engineer features like 'carrier on-time rate,' 'current port congestion,' 'season,' and 'shipment weight' and train a regression model to predict transit time.
Best for
Incorporates real-time signals that historical averages miss. Good accuracy improvement over carrier ETAs.
Watch out for
Treats each shipment independently. Cannot model delay propagation -- when congestion at Shanghai delays 50 vessels, all downstream ETAs on those routes should shift, but a per-shipment model misses this cascading effect.
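To make the feature-engineering step concrete, here is a minimal sketch that flattens one shipment into a single feature row, using values from the sample CARRIERS and PORT_CONGESTION tables shown later on this page. The helper `shipment_features` and the use of ship month as a stand-in for season are assumptions for illustration; shipment weight is omitted because the sample tables do not include it.

```python
from datetime import date

# Values copied from the sample CARRIERS and PORT_CONGESTION tables on this page.
CARRIERS = {
    "CAR01": {"on_time_rate": 0.72, "avg_delay_days": 2.4},
    "CAR02": {"on_time_rate": 0.85, "avg_delay_days": 1.1},
}
PORT_CONGESTION = {
    "Los Angeles": {"vessels_waiting": 42, "avg_wait_days": 3.2},
    "New York": {"vessels_waiting": 18, "avg_wait_days": 1.0},
    "Seattle": {"vessels_waiting": 12, "avg_wait_days": 0.5},
}


def shipment_features(shipment):
    """Flatten one shipment into a feature row (hypothetical helper)."""
    carrier = CARRIERS[shipment["carrier_id"]]
    port = PORT_CONGESTION[shipment["destination"]]
    return {
        "carrier_on_time_rate": carrier["on_time_rate"],
        "carrier_avg_delay_days": carrier["avg_delay_days"],
        "dest_vessels_waiting": port["vessels_waiting"],
        "dest_avg_wait_days": port["avg_wait_days"],
        "ship_month": shipment["ship_date"].month,  # crude stand-in for season
    }


row = shipment_features({
    "shipment_id": "SHP501",
    "carrier_id": "CAR01",
    "destination": "Los Angeles",
    "ship_date": date(2025, 2, 20),
})
print(row)
```

Rows like this would be passed to a regressor such as XGBoost. Each row is built and scored on its own, so nothing in it says which other shipments are queued at the same port or sailing with the same carrier.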
4. KumoRFM (relational graph ML)
Connect shipments, carriers, routes, weather, and port congestion into a logistics graph. The GNN learns how delays propagate across the network and compound at intermediate points.
Best for
Captures delay cascading: when port congestion + weather + carrier history compound into a 4-day delay that independent models underestimate. Updates continuously as new signals arrive.
Watch out for
Requires shipment data with carrier, route, and timing information, plus real-time port and weather data feeds. Most impactful for international logistics with multi-leg shipments.
Key metric: 30% reduction in ETA error saves $18M annually for a logistics company managing 100K monthly shipments across international routes.
Why relational data changes the answer
Shipment delays are not independent events. When port congestion in Los Angeles reaches 42 vessels waiting, every shipment arriving at LA in the next week will be delayed. But the delay compounds: a vessel that was already behind schedule due to a Pacific storm hits the congestion queue and adds 4 days, not just the average 3.2-day wait. The carrier's historical pattern on this route shows they lose an additional day during congestion because of their dock assignment priority. These signals live in different tables -- shipments, carriers, routes, weather, port congestion -- and the interactions between them determine the actual arrival time.
Relational models connect the full logistics graph. They learn that SHP501's 4-day delay reflects the combined effect of LA port congestion (3.2 days), a Pacific storm en route (0.5 days), carrier OceanLine's historical underperformance during congestion events (0.3 days), and the vessel's current behind-schedule position. On the RelBench benchmark, relational models score 76.71 vs 62.44 for single-table approaches. For ETA prediction, that accuracy gap is the difference between warehouse teams prepared for the actual arrival and costly idle time spent waiting for shipments that show up days late.
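The arithmetic behind the SHP501 example can be checked in a few lines. This sketch only adds up the three contributions that are quantified above; the vessel's behind-schedule position is not given a number in that breakdown, so it is left out. The variable names are illustrative.

```python
# Per-signal delay contributions quoted for shipment SHP501, in days.
contributions = {
    "la_port_congestion": 3.2,
    "pacific_storm": 0.5,
    "carrier_congestion_underperformance": 0.3,
}

# An estimate that looks only at the port's average wait.
port_only_estimate = contributions["la_port_congestion"]

# The compound delay once all quantified signals are combined.
compound_delay = round(sum(contributions.values()), 1)

# How much the port-only view underestimates the delay.
underestimate = round(compound_delay - port_only_estimate, 1)

print(f"compound delay: {compound_delay} days")
print(f"port-only view is short by: {underestimate} days")
```

The gap of 0.8 days is the part of the delay that only shows up when the weather and carrier signals are read together with the congestion figure.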
Carrier-provided ETAs are like airline departure boards that show the scheduled time even though three flights ahead of yours are delayed and the airport is running at capacity. Graph-based ETA prediction is like the air traffic control system that sees every aircraft in the queue, every weather delay en route, and every runway constraint, producing an honest arrival estimate that accounts for the full picture.
How KumoRFM solves this
Graph-powered intelligence for supply chains
Kumo connects shipments, carriers, routes, weather forecasts, and port congestion into a logistics graph. The GNN learns how delays propagate: when port congestion in Shanghai affects carrier X's transit times on route Y, and how weather patterns at intermediate points compound into final delivery delays. PQL predicts arrival time per shipment, updating continuously as new signals arrive.
From data to predictions
See the full pipeline in action
Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.
Your data
The relational tables Kumo learns from
SHIPMENTS
| shipment_id | carrier_id | origin | destination | ship_date |
|---|---|---|---|---|
| SHP501 | CAR01 | Shanghai | Los Angeles | 2025-02-20 |
| SHP502 | CAR02 | Rotterdam | New York | 2025-02-22 |
| SHP503 | CAR01 | Busan | Seattle | 2025-02-25 |
CARRIERS
| carrier_id | name | on_time_rate | avg_delay_days |
|---|---|---|---|
| CAR01 | OceanLine Express | 72% | 2.4 |
| CAR02 | Atlantic Cargo | 85% | 1.1 |
ROUTES
| route_id | origin | destination | avg_transit_days | stops |
|---|---|---|---|---|
| RT01 | Shanghai | Los Angeles | 14 | 0 |
| RT02 | Rotterdam | New York | 10 | 1 |
| RT03 | Busan | Seattle | 11 | 0 |
WEATHER
| region | date | condition | severity |
|---|---|---|---|
| Pacific | 2025-03-02 | Storm | Moderate |
| Atlantic | 2025-03-01 | Clear | None |
| Pacific | 2025-03-04 | Fog | Light |
PORT_CONGESTION
| port | date | vessels_waiting | avg_wait_days |
|---|---|---|---|
| Los Angeles | 2025-03-01 | 42 | 3.2 |
| New York | 2025-03-01 | 18 | 1.0 |
| Seattle | 2025-03-01 | 12 | 0.5 |
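One way to picture how these tables become a graph is to treat rows as nodes and shared keys as edges. The sketch below is a plain adjacency-set illustration built from the sample rows above, not a description of how Kumo constructs its graph internally. The helper names and the choice to link shipments to carrier, route, and destination port are assumptions; WEATHER is left out because the sample tables carry no key tying a route to a weather region.

```python
from collections import defaultdict

# Rows copied from the sample SHIPMENTS and ROUTES tables.
SHIPMENTS = [
    {"shipment_id": "SHP501", "carrier_id": "CAR01",
     "origin": "Shanghai", "destination": "Los Angeles"},
    {"shipment_id": "SHP502", "carrier_id": "CAR02",
     "origin": "Rotterdam", "destination": "New York"},
    {"shipment_id": "SHP503", "carrier_id": "CAR01",
     "origin": "Busan", "destination": "Seattle"},
]
ROUTES = {
    ("Shanghai", "Los Angeles"): "RT01",
    ("Rotterdam", "New York"): "RT02",
    ("Busan", "Seattle"): "RT03",
}


def build_graph(shipments, routes):
    """Return an undirected adjacency map keyed by (node_type, node_id)."""
    graph = defaultdict(set)

    def link(a, b):
        graph[a].add(b)
        graph[b].add(a)

    for s in shipments:
        node = ("shipment", s["shipment_id"])
        link(node, ("carrier", s["carrier_id"]))
        link(node, ("route", routes[(s["origin"], s["destination"])]))
        link(node, ("port", s["destination"]))
    return graph


def related_shipments(graph, shipment_id):
    """Shipments reachable in two hops, i.e. sharing a carrier, route, or port."""
    start = ("shipment", shipment_id)
    found = set()
    for neighbour in graph[start]:
        for other in graph[neighbour]:
            if other[0] == "shipment" and other != start:
                found.add(other[1])
    return found


graph = build_graph(SHIPMENTS, ROUTES)
print(related_shipments(graph, "SHP501"))
```

In this sample, SHP501 and SHP503 are connected through carrier CAR01, so a signal about that carrier's performance on one shipment is one hop away from the other. A per-shipment feature row has no such path.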
Write your PQL query
Describe what to predict in 2–3 lines — Kumo handles the rest
PREDICT FIRST(SHIPMENTS.actual_arrival, 0, 30, days) FOR EACH SHIPMENTS.shipment_id
Prediction output
Every entity gets a score, updated continuously
| SHIPMENT_ID | CARRIER | ORIGINAL_ETA | PREDICTED_ETA | DELAY_DAYS |
|---|---|---|---|---|
| SHP501 | OceanLine Express | 2025-03-06 | 2025-03-10 | +4 |
| SHP502 | Atlantic Cargo | 2025-03-04 | 2025-03-05 | +1 |
| SHP503 | OceanLine Express | 2025-03-08 | 2025-03-09 | +1 |
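The columns in this output can be cross-checked against the input tables. In the sample rows, ORIGINAL_ETA happens to equal the ship date plus the route's `avg_transit_days`, and DELAY_DAYS is the gap between the predicted and original dates. That relationship is an observation about the sample data, not a documented rule, and the helper names below are assumptions.

```python
from datetime import date, timedelta

# avg_transit_days from the sample ROUTES table, keyed by (origin, destination).
ROUTE_TRANSIT_DAYS = {
    ("Shanghai", "Los Angeles"): 14,
    ("Rotterdam", "New York"): 10,
    ("Busan", "Seattle"): 11,
}

# Ship dates from SHIPMENTS, predicted ETAs from the prediction output table.
SHIPMENTS = {
    "SHP501": {"origin": "Shanghai", "destination": "Los Angeles",
               "ship_date": date(2025, 2, 20), "predicted_eta": date(2025, 3, 10)},
    "SHP502": {"origin": "Rotterdam", "destination": "New York",
               "ship_date": date(2025, 2, 22), "predicted_eta": date(2025, 3, 5)},
    "SHP503": {"origin": "Busan", "destination": "Seattle",
               "ship_date": date(2025, 2, 25), "predicted_eta": date(2025, 3, 9)},
}


def scheduled_eta(shipment):
    """Ship date plus the route's average transit time."""
    transit = ROUTE_TRANSIT_DAYS[(shipment["origin"], shipment["destination"])]
    return shipment["ship_date"] + timedelta(days=transit)


def delay_days(shipment):
    """Days between the predicted arrival and the scheduled arrival."""
    return (shipment["predicted_eta"] - scheduled_eta(shipment)).days


for shipment_id, shipment in SHIPMENTS.items():
    print(shipment_id, scheduled_eta(shipment), f"+{delay_days(shipment)}")
```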
Understand why
Every prediction includes feature attributions — no black boxes
Shipment SHP501 -- Shanghai to Los Angeles via OceanLine Express
Predicted arrival: March 10 (+4 days delay)
Top contributing features
| Feature | Value | Attribution |
|---|---|---|
| LA port congestion (42 vessels waiting) | 3.2 day avg wait | 32% |
| Pacific storm on route (Mar 2) | Moderate severity | 26% |
| Carrier OceanLine historical delay rate | 28% late | 18% |
| Current vessel position (behind schedule) | -1.5 days | 14% |
| Fog advisory at destination (Mar 4) | Light | 10% |
Feature attributions are computed automatically for every prediction. No separate tooling required. Learn more in the Kumo explainability documentation.
PQL Documentation
Learn the Predictive Query Language — SQL-like syntax for defining any prediction task in 2–3 lines.
Python SDK
Integrate Kumo predictions into your pipelines. Train, evaluate, and deploy models programmatically.
Explainability Docs
Understand feature attributions, model evaluation metrics, and how to build trust with stakeholders.
Frequently asked questions
Common questions about ETA prediction
How do you predict shipment arrival times accurately?
Model your logistics network as a graph connecting shipments, carriers, routes, weather, and port congestion. The key is capturing delay propagation -- how congestion at one port compounds with weather delays and carrier performance to produce the actual arrival time. Graph models do this naturally; per-shipment regression models cannot.
Why are carrier-provided ETAs so inaccurate?
Carriers have limited visibility beyond their own operations. They don't see port congestion at the destination until close to arrival, cannot predict weather routing changes, and have commercial incentives to provide optimistic estimates. Their ETAs are 40-60% inaccurate beyond 3 days out because they miss the cascading delay signals across the logistics network.
What data do you need for shipment ETA prediction?
Shipment records with carrier, route, origin, destination, and ship date. Carrier performance history (on-time rates, average delays by route). Real-time port congestion data (vessels waiting, average wait times). Weather forecasts along shipping routes. The more connected data you have, the better the model predicts compound delays.
How does port congestion affect shipment ETAs?
Port congestion creates a queue that affects every inbound vessel, but the delay varies by carrier (dock priority), vessel size, and current congestion trajectory (growing or shrinking). Graph models learn these conditional patterns: OceanLine Express at LA during 40+ vessel congestion averages a 3.8-day delay, while Atlantic Cargo at the same congestion level averages 2.1 days because of better dock assignments.
What is the ROI of better ETA prediction?
A logistics company managing 100K monthly shipments saves $18M annually by reducing ETA error by 30%. The savings come from three sources: reduced dock labor waste ($6M from staffing to actual arrivals), lower expediting fees ($8M from proactive rerouting), and fewer customer penalties ($4M from accurate delivery promises).
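The savings breakdown can be sanity-checked with a short calculation. The sketch below sums the three quoted sources and derives an average saving per shipment; the per-shipment figure is simple division on the numbers above, not a separately reported metric.

```python
# Annual savings by source, in USD, as quoted in the answer above.
SAVINGS_USD = {
    "dock_labor": 6_000_000,
    "expediting_fees": 8_000_000,
    "customer_penalties": 4_000_000,
}
MONTHLY_SHIPMENTS = 100_000

total_savings = sum(SAVINGS_USD.values())
annual_shipments = MONTHLY_SHIPMENTS * 12
savings_per_shipment = total_savings / annual_shipments

print(f"total: ${total_savings:,}")
print(f"per shipment: ${savings_per_shipment:.2f}")
```

The three sources add up to the $18M headline figure, which works out to about $15 per shipment across 1.2 million shipments a year.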
Bottom line: A logistics company managing 100K monthly shipments saves $18M per year by reducing ETA error by 30%. Kumo's logistics graph captures delay propagation across routes, ports, weather, and carrier patterns that carrier-provided ETAs systematically miss.
One Platform. One Model. Infinite Predictions.
KumoRFM
Relational Foundation Model
Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.
For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.