Binary Classification · Predictive Maintenance

Predictive Maintenance

Which machines will fail in the next 7 days?

Book a demo and get a free trial of the full platform: research agent, fine-tune capabilities, and forward-deployed engineer support.



A real-world example

Which machines will fail in the next 7 days?

Unplanned downtime costs manufacturers $50B per year globally. Time-based maintenance over-services healthy equipment and misses early failure modes. Sensor-only models detect anomalies but generate too many false alarms and miss failures caused by interaction effects between equipment, parts, and operating conditions. For a plant with 500 machines, reducing unplanned downtime by 30% saves $8-12M annually.

Quick answer

Predictive maintenance AI applies machine learning to sensor data, maintenance logs, and production patterns to predict which machines will fail within 7 days. The best models connect equipment relationships in a graph, catching interaction effects between machines that single-sensor anomaly detection misses. Plants with 500+ machines typically save $8-12M annually by reducing unplanned downtime by 30%.

Approaches compared

4 ways to solve this problem

1. Threshold-Based Alerts

Set fixed thresholds on individual sensors (vibration > 6 mm/s, temperature > 85 °C) and alert when one is crossed. Simple to implement and easy to explain to maintenance teams.

Best for

Single-mode failures with clear sensor signatures, like bearing wear on isolated equipment.

Watch out for

Generates excessive false alarms (30-50% false positive rate) and misses failures caused by multi-sensor interactions. Thresholds require constant manual tuning as equipment ages.
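A minimal sketch of the threshold approach, using the two limits quoted above. The metric names and the sample readings are hypothetical and only illustrate why per-sensor checks miss combined conditions.

```python
# Fixed per-metric limits, taken from the thresholds quoted in the text.
THRESHOLDS = {"vibration_mm_s": 6.0, "temperature_c": 85.0}


def threshold_alerts(reading: dict) -> list:
    """Return the names of metrics whose value exceeds its fixed threshold."""
    return [
        metric
        for metric, limit in THRESHOLDS.items()
        if reading.get(metric, float("-inf")) > limit
    ]


# Hypothetical readings. Each sensor is checked on its own, so a machine that
# sits just under both limits at the same time raises no alert at all.
near_both_limits = {"vibration_mm_s": 5.9, "temperature_c": 84.0}
over_one_limit = {"vibration_mm_s": 6.4, "temperature_c": 72.0}

print(threshold_alerts(near_both_limits))  # []
print(threshold_alerts(over_one_limit))    # ['vibration_mm_s']
```

The first reading is arguably the more worrying one, yet only the second triggers an alert, which is the multi-sensor blind spot described above.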

2. Statistical Anomaly Detection

Use statistical models (ARIMA, exponential smoothing) to learn normal sensor patterns per machine and flag deviations. More adaptive than fixed thresholds.

Best for

Detecting gradual degradation trends in well-instrumented equipment with long historical baselines.

Watch out for

Treats each machine independently, so it misses failure patterns that depend on interactions between upstream and downstream equipment. Cannot incorporate maintenance history or part age.
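As a rough illustration of the per-machine statistical approach, the sketch below tracks an exponentially weighted mean and variance for one sensor and flags readings that deviate by more than a few standard deviations. The smoothing factor, the 3-sigma limit, the warm-up length, and the vibration series are all assumptions chosen for the example, not values from the text.

```python
import math


def ewma_anomalies(values, alpha=0.3, z_limit=3.0, warmup=5):
    """Return indices whose deviation from the running mean exceeds z_limit sigmas."""
    mean = values[0]
    var = 0.0
    flagged = []
    for i, x in enumerate(values[1:], start=1):
        resid = x - mean
        std = math.sqrt(var)
        # Only flag once enough history exists to trust the variance estimate.
        if i >= warmup and std > 0 and abs(resid) > z_limit * std:
            flagged.append(i)
        # Standard exponentially weighted updates of mean and variance.
        mean += alpha * resid
        var = (1 - alpha) * (var + alpha * resid * resid)
    return flagged


# Hypothetical vibration readings (mm/s) for a single machine.
vibration = [4.8, 4.9, 4.7, 4.8, 5.0, 4.9, 4.8, 4.9, 7.5, 4.8]
print(ewma_anomalies(vibration))  # [8]
```

Note that the function sees one machine's series only. Nothing in it can express that an upstream machine changed behaviour, which is the limitation called out above.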

3. Single-Table ML (XGBoost/Random Forest)

Train gradient-boosted models on flattened feature tables combining sensor readings, equipment metadata, and maintenance history. The current industry standard for predictive maintenance.

Best for

Mid-complexity environments where most failures have clear feature signatures in a single machine's data.

Watch out for

Flattening relational data loses the graph structure. A machine's risk depends on its neighbors, shared parts suppliers, and production line context. Feature engineering is manual and brittle.
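To make the flattening step concrete, here is a small sketch that collapses sensor, part, and maintenance rows into one feature row per machine. The rows mirror the sample tables shown later on this page; the feature names (`max_part_wear`, `corrective_count`) are hypothetical choices, and a real pipeline would hand-engineer many more.

```python
sensors = [
    {"equipment_id": "EQ001", "metric": "vibration_mm_s", "value": 4.8},
    {"equipment_id": "EQ001", "metric": "temperature_c", "value": 72.0},
    {"equipment_id": "EQ002", "metric": "pressure_bar", "value": 148.0},
]
parts = [
    {"equipment_id": "EQ001", "age_hours": 3200, "rated_life_hours": 5000},
    {"equipment_id": "EQ002", "age_hours": 800, "rated_life_hours": 4000},
]
maintenance_logs = [
    {"equipment_id": "EQ001", "type": "Preventive"},
    {"equipment_id": "EQ002", "type": "Corrective"},
]


def flatten(equipment_id):
    """Build one flat feature row for a machine from its own related rows."""
    row = {"equipment_id": equipment_id}
    for s in sensors:
        if s["equipment_id"] == equipment_id:
            row[s["metric"]] = s["value"]
    wear = [
        p["age_hours"] / p["rated_life_hours"]
        for p in parts
        if p["equipment_id"] == equipment_id
    ]
    row["max_part_wear"] = max(wear, default=0.0)
    row["corrective_count"] = sum(
        1
        for m in maintenance_logs
        if m["equipment_id"] == equipment_id and m["type"] == "Corrective"
    )
    return row


print(flatten("EQ001"))
```

Every column in the resulting row describes EQ001 alone. Any signal from neighbouring machines would have to be added as yet another hand-built aggregate, which is where the manual, brittle feature engineering comes from.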

4. Graph Neural Networks (Kumo's Approach)

Model the factory as a graph connecting equipment, sensors, parts, maintenance logs, and production runs. GNNs learn failure patterns from equipment interactions automatically, without manual feature engineering.

Best for

Complex plants where failures depend on multi-equipment interactions, shared parts, and operating condition combinations.

Watch out for

Requires relational data (not just sensor feeds). Best when you have at least 6-12 months of maintenance history across interconnected equipment.
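The factory graph can be pictured as rows becoming nodes and foreign keys becoming edges. The sketch below builds such a graph as a plain adjacency map from the sample tables shown later on this page. Promoting the `line` column to its own node type is an assumption made for illustration, and this is not a description of how Kumo constructs its graph internally.

```python
from collections import defaultdict

equipment = [
    ("EQ001", "CNC Lathe", "Line-A"),
    ("EQ002", "Press Machine", "Line-A"),
    ("EQ003", "Conveyor Motor", "Line-B"),
]
# (row_id, equipment_id) pairs for each table that references EQUIPMENT.
child_rows = {
    "SENSORS": [("SEN101", "EQ001"), ("SEN102", "EQ001"), ("SEN103", "EQ002")],
    "PARTS": [("PRT301", "EQ001"), ("PRT302", "EQ002"), ("PRT303", "EQ003")],
    "MAINTENANCE_LOGS": [("ML201", "EQ001"), ("ML202", "EQ002"), ("ML203", "EQ003")],
    "PRODUCTION_RUNS": [("RUN501", "EQ001"), ("RUN502", "EQ002"), ("RUN503", "EQ003")],
}


def build_graph():
    """Return an undirected adjacency map keyed by (table, id) node tuples."""
    adj = defaultdict(set)

    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)

    for eq_id, _eq_type, line in equipment:
        link(("EQUIPMENT", eq_id), ("LINE", line))
    for table, rows in child_rows.items():
        for row_id, eq_id in rows:
            link((table, row_id), ("EQUIPMENT", eq_id))
    return adj


graph = build_graph()
print(len(graph[("EQUIPMENT", "EQ001")]))  # 6
print(sorted(graph[("LINE", "Line-A")]))
```

In this toy graph EQ001 and EQ002 are two hops apart through the shared `Line-A` node, so a model that passes information along edges can relate them without a hand-written join.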

Key metric: On SAP's SALT benchmark, graph-based predictive maintenance achieves 91% accuracy, compared with 75% for deep learning on flat data and 63% for gradient-boosted trees. The gap is driven by multi-equipment interaction patterns.

Why relational data changes the answer

Most predictive maintenance systems treat each machine as an island. They monitor Machine A's vibration, Machine A's temperature, and Machine A's operating hours in isolation. But in a real factory, failures are relational. When Machine A's vibration increases, it changes the load profile on Machine B downstream. When a specific bearing batch from Supplier X is installed across 15 machines, failures cluster. When production runs push equipment above 90% load for consecutive shifts, the risk compounds across the entire line.

This is exactly why flat-table ML models plateau at 63% accuracy on equipment failure prediction, while graph-based approaches reach 91% (based on SAP's SALT benchmark). The gap comes from relational signals: part-equipment-condition triplets, production line cascade effects, and maintenance history patterns across similar equipment. RelBench benchmarks confirm this pattern, with GNN-based models scoring 76.71 vs 62.44 for gradient-boosted trees on relational prediction tasks. The factory is a graph. Treating it like a spreadsheet leaves the most predictive signals on the table.
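The intuition behind relational signals can be shown with a toy neighbour-averaging step, the basic operation that graph neural networks repeat and learn weights for. This is an untrained, hand-weighted simplification, not Kumo's model. The three-machine line, the feature values, and the 0.5 self-weight are all made up for the example.

```python
# Hypothetical production line: A feeds B, B feeds C.
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
# Hypothetical normalised vibration level per machine (A is running rough).
features = {"A": 0.9, "B": 0.2, "C": 0.1}


def aggregate(feats, nbrs, self_weight=0.5):
    """Blend each node's value with the mean of its neighbours' values."""
    out = {}
    for node, x in feats.items():
        adjacent = nbrs.get(node, [])
        if adjacent:
            nbr_mean = sum(feats[n] for n in adjacent) / len(adjacent)
        else:
            nbr_mean = x
        out[node] = self_weight * x + (1 - self_weight) * nbr_mean
    return out


one_hop = aggregate(features, neighbors)
two_hop = aggregate(one_hop, neighbors)
print(one_hop)
print(two_hop)
```

After one step, B's representation already reflects A's elevated vibration. After two steps, C's does as well, even though C is not directly connected to A. A flat per-machine row has no mechanism for this kind of propagation.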

Think of a factory like a human body. A doctor who only checks your heart rate will miss that your chest pain is caused by a pinched nerve in your spine affecting your posture, which strains your breathing, which elevates your heart rate. Good diagnostics trace the chain of causation across connected systems. Predictive maintenance works the same way: the vibration spike in Machine A is a symptom, but the root cause might be the worn bearing in Machine B that changed the load profile across the entire production line.

How KumoRFM solves this

Graph-powered intelligence for manufacturing

Kumo connects equipment, sensors, maintenance logs, parts, and production runs into a factory graph. The GNN learns failure patterns that depend on equipment interactions: when machine A's vibration increase coincides with machine B's temperature drift downstream, and how specific part-equipment-operating condition combinations predict failure. PQL predicts which machines will fail within 7 days, giving maintenance teams time to schedule repairs during planned downtime windows.

From data to predictions

See the full pipeline in action

Connect your tables, write a PQL query, and get predictions with built-in explainability — all in minutes, not months.

1. Your data

The relational tables Kumo learns from

EQUIPMENT

equipment_id | type | install_date | line
EQ001 | CNC Lathe | 2020-06-15 | Line-A
EQ002 | Press Machine | 2018-03-10 | Line-A
EQ003 | Conveyor Motor | 2022-01-20 | Line-B

SENSORS

sensor_id | equipment_id | metric | latest_value | threshold
SEN101 | EQ001 | Vibration (mm/s) | 4.8 | 6.0
SEN102 | EQ001 | Temperature (C) | 72 | 85
SEN103 | EQ002 | Pressure (bar) | 148 | 160

MAINTENANCE_LOGS

log_id | equipment_id | type | description | date
ML201 | EQ001 | Preventive | Bearing replacement | 2025-01-15
ML202 | EQ002 | Corrective | Hydraulic seal repair | 2025-02-10
ML203 | EQ003 | Preventive | Belt tension adjust | 2025-02-20

PARTS

part_id | equipment_id | name | age_hours | rated_life_hours
PRT301 | EQ001 | Spindle Bearing | 3,200 | 5,000
PRT302 | EQ002 | Hydraulic Seal | 800 | 4,000
PRT303 | EQ003 | Drive Belt | 1,500 | 3,000

PRODUCTION_RUNS

run_id | equipment_id | duration_hours | load_pct | date
RUN501 | EQ001 | 12 | 92% | 2025-03-01
RUN502 | EQ002 | 8 | 78% | 2025-03-01
RUN503 | EQ003 | 16 | 95% | 2025-03-01

2. Write your PQL query

Describe what to predict in 2–3 lines — Kumo handles the rest

PQL
PREDICT BOOL(MAINTENANCE_LOGS.type = 'Corrective', 0, 7, days)
FOR EACH EQUIPMENT.equipment_id
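One plausible reading of the query above is: for each machine, is there at least one corrective maintenance log in the 7 days after the prediction time? The Python sketch below spells out that label under this reading. The anchor date is hypothetical, and treating the window as open at the start and closed at the end is an assumption, not something stated on this page.

```python
from datetime import date, timedelta

# Rows from the MAINTENANCE_LOGS sample table: (log_id, equipment_id, type, date).
maintenance_logs = [
    ("ML201", "EQ001", "Preventive", date(2025, 1, 15)),
    ("ML202", "EQ002", "Corrective", date(2025, 2, 10)),
    ("ML203", "EQ003", "Preventive", date(2025, 2, 20)),
]


def failure_label(equipment_id, anchor, horizon_days=7):
    """True if the machine has a corrective log in (anchor, anchor + horizon]."""
    end = anchor + timedelta(days=horizon_days)
    return any(
        eq == equipment_id and kind == "Corrective" and anchor < when <= end
        for _log_id, eq, kind, when in maintenance_logs
    )


anchor = date(2025, 2, 5)  # hypothetical prediction time
labels = {eq: failure_label(eq, anchor) for eq in ("EQ001", "EQ002", "EQ003")}
print(labels)  # {'EQ001': False, 'EQ002': True, 'EQ003': False}
```

With this anchor, only EQ002 is a positive example, because its hydraulic seal repair on 2025-02-10 falls inside the 7-day window. The preventive entries never count as failures.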

3. Prediction output

Every entity gets a score, updated continuously

EQUIPMENT_ID | TYPE | FAILURE_PROB_7D | RISK_TIER
EQ001 | CNC Lathe | 0.68 | High
EQ002 | Press Machine | 0.11 | Low
EQ003 | Conveyor Motor | 0.42 | Medium
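The page does not say how a probability is mapped to a risk tier. As a sketch only, the cut-offs below (0.30 and 0.60) are hypothetical values picked because they reproduce the three tiers in the table above. A real deployment would set them from maintenance capacity and the cost of a missed failure.

```python
def risk_tier(prob, medium_cutoff=0.30, high_cutoff=0.60):
    """Map a failure probability to a coarse tier using assumed cut-offs."""
    if not 0.0 <= prob <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if prob >= high_cutoff:
        return "High"
    if prob >= medium_cutoff:
        return "Medium"
    return "Low"


# Scores from the prediction output table.
scores = {"EQ001": 0.68, "EQ002": 0.11, "EQ003": 0.42}
tiers = {eq: risk_tier(p) for eq, p in scores.items()}
print(tiers)  # {'EQ001': 'High', 'EQ002': 'Low', 'EQ003': 'Medium'}
```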

4. Understand why

Every prediction includes feature attributions — no black boxes

Equipment EQ001 -- CNC Lathe on Line-A

Predicted: 68% failure probability in next 7 days (High risk)

Top contributing features

Vibration trend (14-day slope): +32% increase, 30% attribution

Spindle bearing age vs rated life: 64% consumed, 24% attribution

Operating load above 90% for 5+ days: 92% avg, 20% attribution

Temperature drift correlated with downstream press: +3.5C, 15% attribution

Similar equipment failure pattern on Line-B: Failed last month, 11% attribution
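Two of the numbers above can be checked directly against the sample tables. The short sketch below recomputes the bearing's consumed life from the PARTS row for EQ001 and confirms that the listed attributions account for the whole prediction. The dictionary keys are shortened labels introduced here for readability.

```python
# PARTS row for PRT301 (Spindle Bearing on EQ001).
age_hours = 3200
rated_life_hours = 5000
life_consumed_pct = 100 * age_hours / rated_life_hours
print(life_consumed_pct)  # 64.0

# Attribution percentages as listed for EQ001.
attributions = {
    "vibration_trend": 30,
    "bearing_age_vs_rated_life": 24,
    "load_above_90_pct": 20,
    "temperature_drift_downstream": 15,
    "similar_equipment_failure": 11,
}
print(sum(attributions.values()))  # 100

# Features ranked from most to least influential.
ranked = sorted(attributions, key=attributions.get, reverse=True)
print(ranked[0])  # vibration_trend
```

The last two features describe other equipment (the downstream press and a similar machine on Line-B), so roughly a quarter of this example's attribution comes from relational context rather than EQ001's own readings.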

Feature attributions are computed automatically for every prediction. No separate tooling required.

Frequently asked questions

Common questions about predictive maintenance

How much data do I need to start predictive maintenance with AI?

You need at least 6-12 months of sensor data, maintenance logs, and production records to train a reliable model. The key is not volume but variety: you need examples of both failures and normal operation across different operating conditions. Most plants already have this data in their CMMS and historian systems. Start with your highest-cost failure modes and expand from there.

What is the ROI of predictive maintenance vs preventive maintenance?

Predictive maintenance typically delivers 25-35% reduction in unplanned downtime, 10-20% reduction in maintenance costs (by eliminating unnecessary preventive work), and 5-15% extension of equipment life. For a plant with 500 machines, this translates to $8-12M in annual savings. The ROI timeline is usually 6-9 months from deployment to measurable returns.
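The savings figure can be sanity-checked by working backwards. The sketch below takes the quoted $8-12M savings and the 30% downtime reduction and derives the baseline downtime cost they imply. That baseline is back-solved here and is not a figure stated on this page, and the calculation assumes the savings come entirely from avoided downtime.

```python
MACHINES = 500
DOWNTIME_REDUCTION = 0.30


def implied_baseline(annual_savings, reduction=DOWNTIME_REDUCTION):
    """Annual downtime cost implied by a given saving at a given reduction."""
    return annual_savings / reduction


for savings in (8_000_000, 12_000_000):
    baseline = implied_baseline(savings)
    per_machine = baseline / MACHINES
    print(f"savings ${savings:,} -> baseline ${baseline:,.0f}, "
          f"${per_machine:,.0f} per machine")
```

Under these assumptions the quoted range corresponds to roughly $27-40M of unplanned downtime cost per year for the plant, or about $53-80K per machine. Comparing that to a plant's actual downtime cost is a quick way to judge whether the headline savings are realistic for it.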

Can predictive maintenance AI work with legacy equipment that has limited sensors?

Yes, but with reduced prediction horizons. Graph-based models compensate for sparse sensor data by pulling signals from connected equipment, maintenance history, and production context. Failures on a machine with only 2 sensors can still be predicted accurately if its neighbors are well-instrumented. Many plants start by retrofitting 3-5 key sensors per critical machine and achieve 70%+ prediction accuracy.

How does predictive maintenance handle new equipment with no failure history?

Graph-based models handle cold-start better than traditional ML because they transfer knowledge from similar equipment. A new CNC lathe inherits failure patterns from existing CNC lathes on the same production line, with the same parts, under similar operating conditions. Prediction accuracy for new equipment typically reaches 80% of mature equipment accuracy within 3 months of operation.

What is the difference between condition monitoring and predictive maintenance?

Condition monitoring tells you the current state of equipment (vibration is elevated). Predictive maintenance tells you the future state (this machine has a 68% chance of failing in 7 days). The gap is the prediction model that translates current conditions, combined with historical patterns and equipment context, into actionable forecasts with enough lead time to schedule repairs.

Bottom line: A plant with 500 machines saves $8-12M annually by reducing unplanned downtime by 30%. Kumo's factory graph detects multi-equipment interaction patterns and part degradation trajectories that sensor-only anomaly detection misses.

Topics covered

predictive maintenance AI, equipment failure prediction, machine learning maintenance, condition-based maintenance, industrial IoT prediction, KumoRFM manufacturing, asset failure forecasting, maintenance optimization ML

One Platform. One Model. Infinite Predictions.

KumoRFM

Relational Foundation Model

Turn structured relational data into predictions in seconds. KumoRFM delivers zero-shot predictions that rival months of traditional data science. No training, feature engineering, or infrastructure required. Just connect your data and start predicting.

For critical use cases, fine-tune KumoRFM on your data using the Kumo platform and Research Agent for 30%+ higher accuracy than traditional models.
