Unplanned downtime in manufacturing is among the most costly operational failures an enterprise can face. For our client — a global manufacturer of industrial equipment operating 14 plants across 3 continents — unplanned downtime cost an average of $180,000 per hour. The goal was clear: use sensor data and machine learning to predict equipment failures before they happen.
The Data Challenge
Each plant operated independently, with its own sensor infrastructure, PLC manufacturers, and historian databases (primarily OSIsoft PI and Honeywell Uniformance). Phase 1 was pure data engineering: building a unified sensor data platform on Google Cloud that ingests 2.3 billion data points per day from 40,000+ sensors, with a latency requirement of under 5 minutes.
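To put those volumes in perspective, the short calculation below converts the quoted daily figures into per-second and per-sensor rates. It is a back-of-envelope sketch only: it assumes load is spread evenly across the day and across sensors, which real plants will not match, and the function name is made up for illustration.

```python
# Back-of-envelope ingestion rates derived from the figures quoted in the text.
# Assumption: readings are spread evenly over time and over sensors.
POINTS_PER_DAY = 2_300_000_000   # 2.3 billion data points per day
SENSOR_COUNT = 40_000            # "40,000+" sensors, so per-sensor rates are upper bounds
SECONDS_PER_DAY = 24 * 60 * 60


def ingestion_rates(points_per_day: int, sensor_count: int) -> dict:
    """Convert a daily point count into average per-second and per-sensor rates."""
    points_per_second = points_per_day / SECONDS_PER_DAY
    points_per_sensor_per_day = points_per_day / sensor_count
    seconds_between_readings = SECONDS_PER_DAY / points_per_sensor_per_day
    return {
        "points_per_second": points_per_second,
        "points_per_sensor_per_day": points_per_sensor_per_day,
        "seconds_between_readings": seconds_between_readings,
    }


rates = ingestion_rates(POINTS_PER_DAY, SENSOR_COUNT)
print(f"Average throughput: {rates['points_per_second']:,.0f} points/s")
print(f"Per sensor: {rates['points_per_sensor_per_day']:,.0f} points/day")
print(f"Average sampling interval: {rates['seconds_between_readings']:.2f} s")
```

On these averages the platform has to sustain roughly 26,600 points per second, with each sensor reporting about once every 1.5 seconds. Peak rates would need separate measurement.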
Feature Engineering for Industrial IoT
Raw sensor readings are rarely useful as ML features directly. The feature engineering phase took 6 weeks and involved close collaboration with process engineers who understood the physical meaning of each sensor. Key feature families included rolling statistics (mean, variance, rate of change over 5-minute, 1-hour, and 24-hour windows), cross-sensor correlations indicative of bearing wear, and process state features that add context to anomaly signals.
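The rolling-statistics family can be sketched as follows. This is a minimal standard-library illustration rather than the project's pipeline: the function name, the feature names, and the choice of trailing windows that include the current reading are all assumptions. It computes mean, variance, and rate of change for one sensor over the three window lengths named above.

```python
from bisect import bisect_right
from datetime import datetime, timedelta
from statistics import fmean, pvariance

# Window lengths taken from the text; the dictionary keys are illustrative labels.
WINDOWS = {
    "5min": timedelta(minutes=5),
    "1h": timedelta(hours=1),
    "24h": timedelta(hours=24),
}


def rolling_features(timestamps, values, windows=WINDOWS):
    """Compute trailing-window features for one sensor.

    timestamps must be sorted ascending and aligned with values.
    Each window covers the interval (now - width, now].
    """
    rows = []
    for i, now in enumerate(timestamps):
        row = {"timestamp": now}
        for name, width in windows.items():
            # First index whose timestamp is strictly later than now - width.
            start = bisect_right(timestamps, now - width, 0, i + 1)
            window = values[start:i + 1]
            elapsed = (now - timestamps[start]).total_seconds()
            row[f"mean_{name}"] = fmean(window)
            row[f"var_{name}"] = pvariance(window)
            # Rate of change in sensor units per second across the window.
            row[f"rate_{name}"] = (window[-1] - window[0]) / elapsed if elapsed > 0 else 0.0
        rows.append(row)
    return rows


# Made-up example: one reading per minute, rising by 0.5 units each time.
start_time = datetime(2024, 1, 1)
example_times = [start_time + timedelta(minutes=k) for k in range(10)]
example_values = [50.0 + 0.5 * k for k in range(10)]
feature_rows = rolling_features(example_times, example_values)
print(feature_rows[-1])
```

At production volumes this per-reading loop would be too slow. A dataframe library with time-based rolling windows, or windowed aggregation in the streaming layer, would be the more likely choice.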
Model Architecture
We tested four model architectures: isolation forest as an anomaly detection baseline, XGBoost with tabular features, LSTM for sequential modeling of time series, and a hybrid ensemble combining XGBoost on engineered features with a lightweight LSTM for sequential patterns. The hybrid ensemble won on both predictive accuracy and interpretability — the XGBoost component produced SHAP values that process engineers could validate against their domain knowledge.
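The text does not say how the two components of the hybrid were combined. One simple possibility is a weighted average of their failure probabilities, sketched below. The weight, the alert threshold, and all names are hypothetical and chosen only for illustration. The two input scores stand in for the outputs of the XGBoost and LSTM models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnsembleConfig:
    tabular_weight: float = 0.6   # hypothetical weight on the tabular (XGBoost-style) score
    alert_threshold: float = 0.5  # hypothetical cut-off for raising a maintenance alert


def blend_scores(tabular_prob: float, sequence_prob: float,
                 config: EnsembleConfig = EnsembleConfig()) -> float:
    """Weighted average of two failure probabilities, each in [0, 1]."""
    for prob in (tabular_prob, sequence_prob):
        if not 0.0 <= prob <= 1.0:
            raise ValueError(f"probability out of range: {prob}")
    weight = config.tabular_weight
    return weight * tabular_prob + (1.0 - weight) * sequence_prob


def should_alert(tabular_prob: float, sequence_prob: float,
                 config: EnsembleConfig = EnsembleConfig()) -> bool:
    """Raise an alert when the blended score reaches the threshold."""
    return blend_scores(tabular_prob, sequence_prob, config) >= config.alert_threshold


# Made-up scores: the tabular model is confident, the sequence model less so.
blended = blend_scores(0.8, 0.4)
print(f"blended score: {blended:.2f}, alert: {should_alert(0.8, 0.4)}")
```

A fixed weight keeps the tabular model's contribution easy to inspect, which fits the interpretability goal described above. A stacked meta-learner trained on both outputs is a common alternative when enough labeled failures are available.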
Results After 6 Months
The system went live across 3 pilot plants in Q3 2024. Unplanned downtime decreased by 40% compared to the same period in the prior year. The model correctly predicted 73% of failure events, with a median lead time of 18 hours, enough time to schedule maintenance during planned production windows. Estimated savings over the pilot period total $4.2M. Full rollout to all 14 plants is underway.
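For readers who want to reproduce this kind of evaluation, the sketch below shows one way the share of predicted failures (recall) and the median lead time could be computed from alert and failure logs. The matching rule, the 72-hour horizon, and the sample timestamps are assumptions made up for illustration. They are not the project's data or methodology.

```python
from datetime import datetime, timedelta
from statistics import median
from typing import List, Optional, Tuple


def evaluate_alerts(failures: List[datetime], alerts: List[datetime],
                    max_lead: timedelta = timedelta(hours=72)) -> Tuple[float, Optional[float]]:
    """Return (recall, median lead time in hours) for one piece of equipment.

    A failure counts as predicted if at least one alert precedes it by no more
    than max_lead (a hypothetical horizon). Lead time is measured from the
    earliest such alert to the failure.
    """
    lead_times = []
    for failure in failures:
        qualifying = [a for a in alerts if timedelta(0) < failure - a <= max_lead]
        if qualifying:
            lead_times.append(failure - min(qualifying))
    recall = len(lead_times) / len(failures) if failures else 0.0
    median_lead_hours = (
        median(lt.total_seconds() for lt in lead_times) / 3600 if lead_times else None
    )
    return recall, median_lead_hours


# Made-up logs: four failures, three of them preceded by an alert.
base = datetime(2024, 1, 1)
failures = [base + timedelta(days=d) for d in (2, 5, 9, 12)]
alerts = [
    failures[0] - timedelta(hours=10),
    failures[1] - timedelta(hours=20),
    failures[2] - timedelta(hours=30),
]
recall, median_lead_hours = evaluate_alerts(failures, alerts)
print(f"recall: {recall:.0%}, median lead time: {median_lead_hours:.0f} h")
```

Recall alone says nothing about false alarms. A fuller evaluation would also track how many alerts were not followed by a failure within the horizon.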