Performance evaluation
The comparative analysis of Logistic Boosting, Random Forest (RF), and Support Vector Machines (SVM) on 15,000 industrial IoT instances (17.4% anomaly prevalence) revealed significant performance differences. Logistic Boosting demonstrated superior capability with 96.6% accuracy (RF: 95.6%, SVM: 93.8%), 94.1% F1-score, and 0.992 ROC-AUC (Fig. 11). The hybrid XGBoost-SVM architecture (Fig. 6) achieved particularly strong discrimination, with balanced precision (93.5%) and recall (94.8%).
Feature analysis revealed key operational insights:

- Power consumption and motion detection were most discriminative (6.2% and 4.8% F1-score drops when excluded, respectively)
- Environmental features showed limited impact (< 2% performance variation)
- A strong correlation existed between power consumption and temperature (r = 0.82)
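The exclusion figures above follow a standard leave-one-feature-out ablation: re-evaluate the model with one feature removed and record the F1-score drop. A minimal sketch of that loop, in which the `predict` rule, the feature mask, and the data in the usage example are purely illustrative stand-ins for the paper's Logistic Boosting model:

```python
# Leave-one-feature-out ablation: measure the F1 drop when each feature
# is masked out. The threshold rule below is a toy stand-in classifier.

def f1_score(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def predict(row, mask):
    # Hypothetical rule: flag an anomaly when masked power draw is high
    # or the masked motion reading is active.
    power, motion = row[0] * mask[0], row[1] * mask[1]
    return 1 if power > 0.8 or motion > 0.5 else 0

def ablation_drops(X, y, n_features):
    base = f1_score(y, [predict(r, [1] * n_features) for r in X])
    drops = {}
    for j in range(n_features):
        mask = [1] * n_features
        mask[j] = 0  # exclude feature j
        drops[j] = base - f1_score(y, [predict(r, mask) for r in X])
    return drops
```

In the paper's setting, each masked evaluation would involve retraining the full model without that feature, as summarized in Table 3.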
Statistical validation
Logistic Boosting’s balanced error profile (134 FPs, 117 FNs) outperformed both RF (401 FNs) and SVM (280 FPs), with statistical significance confirmed (ANOVA p < 0.05, Tukey’s HSD). The model showed consistent performance across:
- Production loads (95.8–97.1% accuracy)
- Shift patterns (93.3–94.7% F1-score)
- Seasonal variations (93.9–95.4% recall)
Training convergence occurred within 80 iterations (Fig. 7), with final models demonstrating high reliability (Cohen’s κ = 0.88, MCC = 0.89).
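Both agreement statistics can be computed directly from binary confusion-matrix counts. A minimal sketch in plain Python; the counts in the usage example are illustrative placeholders, not the study's tallies:

```python
# Cohen's kappa and Matthews correlation coefficient (MCC) from the four
# confusion-matrix counts of a binary classifier.
import math

def cohen_kappa(tp, tn, fp, fn):
    n = tp + tn + fp + fn
    po = (tp + tn) / n  # observed agreement (accuracy)
    # expected agreement under chance, from the marginal frequencies
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (po - pe) / (1 - pe)

def mcc(tp, tn, fp, fn):
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

Unlike raw accuracy, both metrics discount chance agreement, which is why they are informative under the 17.4% anomaly prevalence reported here.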
Deployment considerations
Three key challenges emerged for industrial implementation:
1. Latency requirements necessitate model optimization (pruning/quantization) for real-time streaming
2. Resource constraints demand edge-compatible implementations
3. Data drift requires continuous learning mechanisms
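Quantization, one of the optimizations mentioned above, can be illustrated with a symmetric int8 scheme for a weight vector. This is a minimal sketch under the assumption of symmetric per-tensor scaling, not the paper's deployment pipeline:

```python
# Symmetric post-training int8 quantization of a float weight vector:
# map weights to integers in [-128, 127] via a single scale factor.

def quantize_int8(weights):
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Approximate reconstruction; error is bounded by scale / 2 per weight.
    return [v * scale for v in q]
```

Storing int8 values instead of 32-bit floats cuts model memory roughly fourfold, which is the kind of saving edge-constrained factory devices require.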
Comparative benchmarks showed Logistic Boosting’s practical advantages over alternatives:
- 12.3% fewer false negatives than RF
- 18.7% fewer false positives than SVM
- Superior stability across operational conditions
Experimental framework and model performance
Our evaluation using rigorous tenfold cross-validation demonstrated Logistic Boosting’s superior performance for industrial anomaly detection. The model achieved exceptional metrics including 96.6% accuracy, 94.1% F1-score (93.5% precision, 94.8% recall), and 0.992 AUC, significantly outperforming both Random Forest (95.6% accuracy) and SVM (93.8% accuracy). Statistical validation through ANOVA (p < 0.05) and Tukey’s HSD tests confirmed these performance differences were significant.
Feature importance analysis revealed power consumption and motion detection as the most critical predictors, with exclusion leading to 6.2% and 4.8% F1-score reductions respectively. Environmental features showed minimal impact, affecting performance by less than 2%. The model maintained consistent reliability across varying production conditions (95.8–97.1% accuracy), shift patterns (93.3–94.7% F1-score), and seasonal changes (93.9–95.4% recall), demonstrating robust adaptability to real-world industrial environments. Figure 4 shows the dataset distribution following min–max normalization, while Fig. 5 presents the feature-correlation matrix, illustrating the relationships among the various characteristics within the dataset.

Dataset distribution following min–max normalization.

Feature-correlation matrix showing the relationships among the various characteristics within the dataset.
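The min–max normalization behind Fig. 4 rescales each feature column to the [0, 1] range using its observed minimum and maximum. A minimal per-column sketch:

```python
# Min-max normalization of one feature column: x' = (x - min) / (max - min).

def min_max_normalize(column):
    lo, hi = min(column), max(column)
    if hi == lo:  # constant feature: map to 0.0 to avoid division by zero
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]
```

Putting heterogeneous sensor readings (power, temperature, motion counts) on a common scale prevents large-magnitude features from dominating distance- and margin-based learners such as SVM.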
Operational implementation and validation
The framework achieved practical computational efficiency with training completed in 9.8 ± 1.2 s, suitable for real-time deployment. However, three key implementation challenges emerged: latency requirements necessitating model optimization through pruning or quantization, resource constraints demanding edge-compatible solutions, and data drift requiring continuous learning mechanisms. Compared to alternatives, Logistic Boosting offered 12.3% fewer false negatives than Random Forest and 18.7% fewer false positives than Support Vector Machines (SVM), while maintaining balanced error rates (2.9% false positives, 1.6% false negatives).
Performance metrics followed standard formulations, with accuracy calculated as (TP + TN)/(TP + TN + FP + FN), precision as TP/(TP + FP), recall as TP/(TP + FN), and F1-score as their harmonic mean. The model showed strong agreement metrics (Cohen’s κ = 0.88, MCC = 0.89) and rapid convergence, reaching optimal performance within 80 training iterations while maintaining interpretability through feature importance analysis.
This comprehensive evaluation establishes Logistic Boosting as particularly suitable for industrial IoT security applications, combining superior detection capability with practical implementation advantages. The consistent performance across diverse anomaly types and operational conditions suggests ensemble boosting methods should be prioritized for smart manufacturing systems, while identifying clear directions for future optimization through hybrid architectures and edge computing implementations. Table 3 presents the feature-contribution ablation results.
The performance metrics were computed as follows53,54,55:
$$Accuracy =\frac{\left(TP + TN\right)}{\left(TP + TN + FP + FN\right)}$$
where TP, TN, FP, and FN represent true positives, true negatives, false positives, and false negatives respectively. This comprehensive evaluation framework not only quantifies detection capability but also provides operational insights for industrial deployment, particularly in balancing precision (reducing false alarms) and recall (minimizing missed detections)30. The implementation achieved computational efficiency (9.8 ± 1.2s training time) suitable for real-time applications while maintaining model interpretability through feature importance analysis33.
Accuracy signifies the ratio of correct predictions made by an algorithm to the total number of predictions56. Precision, in turn, quantifies the proportion of instances predicted as positive that are truly positive:
$$Precision = \frac{TP}{\left(TP + FP\right)}$$
In industrial settings, recall represents the proportion of positive instances in the dataset correctly identified as positive:
$$Recall = \frac{TP}{\left(TP + FN\right)}$$
The F1-score serves as a balanced measure between precision and recall, calculated as the harmonic mean of these two metrics6,57,58,59:
$$F1\,Score = \frac{2 \times Precision \times Recall}{\left(Precision + Recall\right)}$$
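The four formulas above can be combined into one helper that derives every metric from confusion-matrix counts. A minimal sketch; the counts in the usage example are placeholders, not the study's values:

```python
# Accuracy, precision, recall, and F1 from confusion-matrix counts,
# mirroring the formulas given in the text.

def classification_metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

The zero-denominator guards matter in anomaly detection: a degenerate model that predicts no positives would otherwise divide by zero when computing precision.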
Deployment considerations and computational efficiency
The Logistic Boosted model demonstrated efficient training, completing in under 10 s on a standard i7 CPU with 16 GB RAM for 15,000 instances.
However, real-world deployment raises several challenges:
- Latency in streaming environments requires optimization via pruning or model quantization.
- Resource constraints in factory devices necessitate lightweight models or inference-on-edge strategies.
- Data drift and sensor recalibration may degrade model performance over time, calling for continuous learning mechanisms or scheduled retraining.
To statistically validate the performance differences, a one-way ANOVA was applied to accuracy, precision, recall, and F1-score values across the three models. The ANOVA results showed statistically significant differences (p < 0.05) among the models. Post-hoc Tukey’s HSD test confirmed that Logistic Boosting significantly outperformed both Random Forest and Support Vector Machines (SVM) in all metrics.
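The one-way ANOVA reduces to an F-statistic comparing between-group and within-group variability of the per-fold metric values. A minimal sketch (the grouped scores in the usage example are illustrative; a complete analysis would also look up the p-value from the F-distribution and then run Tukey's HSD):

```python
# One-way ANOVA F-statistic: ratio of between-group mean square to
# within-group mean square, over k groups with n total observations.

def one_way_anova_f(groups):
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    # between-group sum of squares (group means vs. grand mean)
    ssb = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    # within-group sum of squares (observations vs. their group mean)
    ssw = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    return (ssb / (k - 1)) / (ssw / (n - k))
```

In this study each group would hold one model's ten cross-validation scores for a given metric, so k = 3 and n = 30 per comparison.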
Logistic boosting model
The Logistic Boosting model achieved exceptional results in industrial anomaly detection, with 96.6% accuracy, 93.5% precision, 94.8% recall, and 0.941 F1-score on test data. These metrics demonstrate robust performance in handling class-imbalanced IoT data12,30. The confusion matrix analysis revealed balanced error rates (2.9% false positives, 1.6% false negatives), correctly identifying 7,487 normal instances and 77 anomalies in sampled cases. Figure 6 shows the proposed architecture of the hybrid Logistic Boosting model for anomaly detection, while Fig. 7 illustrates the min–max normalization effects and the performance of the Logistic Boosting model.

The proposed architecture of the Hybrid Logistic Boosting Model for anomaly detection.

Min–max normalization effects and performance of the Logistic Boosting model.
The model showed rapid convergence within 50 iterations and stable performance after 80 iterations (Fig. 7), outperforming Random Forest (12.3% fewer false negatives) and SVM (18.7% fewer false positives). Its effectiveness stems from three capabilities: handling class imbalance, managing complex feature interactions, and filtering noisy sensor data. These translate to practical benefits including reduced downtime, enhanced security, and improved operational efficiency.
With training times under 10 s for 15,000 instances38, the model demonstrates strong potential for real-world deployment. Future research directions include developing hybrid architectures combining these strengths with deep learning approaches34.
Model performance visualization analysis
Figure 8a,b presents key insights into model optimization and evaluation. Figure 8a shows performance plateauing at 80 iterations, providing empirical guidance for hyperparameter tuning that balances computational cost against marginal gains12. Figure 8b's confusion matrix analysis demonstrates robust detection capability, with 94.8% sensitivity (77 true anomalies identified) and high specificity (7,487 true negatives), while maintaining balanced error rates (29 false positives, 16 false negatives) critical for industrial applications30,38.

(a,b) Confusion matrix of the proposed model, visualizing its classification performance.
These visualizations collectively support three key analytical functions: (1) evidence-based complexity selection, (2) performance trade-off evaluation, and (3) operational threshold optimization33. The framework effectively bridges model development with practical deployment requirements in industrial settings12.
The comparative performance of our proposed method against prior works is summarized in Table 4. As evident, while earlier studies such as Wang & Li (2017)20, Li & Zhang (2018)19, and Choi et al. (2023)30 achieved competitive results on simulated, controlled, and semi-real IoT datasets, our approach demonstrated superior performance on real-world factory IoT data, attaining the highest accuracy and F1-score, along with improved metric balance in a deployment-ready scenario.
The confusion matrices reveal distinct behavioral patterns across models60,61,62,63,64,65,66,67,68. Logistic Boosting’s superior performance (134 FPs, 117 FNs) stems from its iterative weighting of misclassified instances, effectively handling class imbalance as in Fig. 8(a, b). Comparatively, Support Vector Machines (SVM) produced more false positives (280) due to margin sensitivity in high-dimensional spaces, while Random Forest generated more false negatives (401) from minority class boundary challenges. These results demonstrate ensemble boosting’s advantage in capturing nuanced anomalies.
Figure 9’s comparative histogram shows all models achieving high accuracy, with SVM leading in recall (94.6%) but Logistic Boosting maintaining the best precision-recall balance (F1-score = 94.1%). The visualizations enable direct performance comparisons, highlighting Logistic Boosting’s optimal trade-offs for industrial applications where both false alarms (2.9%) and missed detections (1.6%) carry significant operational consequences.

The histogram illustrates the performance outcomes of each model.
Comparative algorithm performance
Our evaluation of three machine learning approaches identified Logistic Boosting as the optimal solution for industrial anomaly detection. The boxplot analysis (Fig. 10) demonstrates its consistent superiority across all metrics:
- Accuracy: 96.6%
- Precision: 93.5%
- Recall: 94.8%
- F1-score: 94.1%

Boxplot representation of model performance metrics, including accuracy, precision, recall, and F1-score.
The model achieved exceptional discrimination (AUC = 0.992) with balanced error rates (134 FPs, 117 FNs), outperforming both Random Forest (162 FPs, 401 FNs) and SVM (280 FPs, 376 FNs). Minimal metric variance (< 1.5%) confirms its reliability across operational conditions, making it particularly suitable for industrial deployment where consistent performance is critical.
The results in Fig. 10 demonstrate the model's superior consistency and effectiveness compared to Random Forest and SVM. The minimal variance in accuracy and the highest area under the curve (AUC = 0.992) further highlight its robust classification performance.
Model performance and robustness
Figure 11 demonstrates the logistic boosting model’s exceptional outlier detection capability (AUC = 0.992), attributable to its ensemble architecture. The algorithm’s effectiveness stems from three key mechanisms:
1. Iterative weighting of misclassified instances
2. Adaptive threshold optimization for class imbalance
3. Aggregation of weak classifiers into a strong predictor
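The first and third mechanisms are the essence of boosting. A minimal AdaBoost-style sketch with decision stumps, shown only to illustrate instance reweighting and weighted-vote aggregation; the paper's model is a gradient-boosted (XGBoost-based) hybrid, not this toy one-dimensional version:

```python
# AdaBoost sketch: each round fits the best threshold stump under the
# current instance weights, then upweights misclassified instances.
import math

def stump_predict(x, threshold, sign):
    return sign if x > threshold else -sign

def fit_adaboost(xs, ys, rounds=10):
    n = len(xs)
    w = [1.0 / n] * n  # start with uniform instance weights
    ensemble = []
    for _ in range(rounds):
        # pick the stump (threshold, sign) with the lowest weighted error
        best = None
        for t in sorted(set(xs)):
            for sign in (1, -1):
                err = sum(wi for wi, x, y in zip(w, xs, ys)
                          if stump_predict(x, t, sign) != y)
                if best is None or err < best[0]:
                    best = (err, t, sign)
        err, t, sign = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # clamp for the log below
        alpha = 0.5 * math.log((1 - err) / err)  # stump vote weight
        ensemble.append((alpha, t, sign))
        # reweight: boost the influence of misclassified instances
        w = [wi * math.exp(-alpha * y * stump_predict(x, t, sign))
             for wi, x, y in zip(w, xs, ys)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(a * stump_predict(x, t, s) for a, t, s in ensemble)
    return 1 if score >= 0 else -1
```

The reweighting step is what lets boosting concentrate on the rare anomaly class, which uniform-weight learners tend to neglect.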

ROC curves (AUC: 0.992 vs. 0.982 for RF, 0.968 for SVM).
While particularly effective for industrial anomaly detection (achieving 94.8% recall with 93.5% precision), the model’s performance remains dependent on dataset characteristics. These results highlight the importance of comparative algorithm evaluation for specific industrial applications, where balanced error rates (134 FPs, 117 FNs) are often operationally critical.
In Table 5, the Logistic Boosted model also achieved a Cohen’s Kappa score of 0.88 and an MCC of 0.89, indicating strong agreement and high classification quality, respectively. In comparison, Random Forest yielded a Kappa of 0.83 and MCC of 0.84, while SVM achieved 0.79 and 0.77. These metrics confirm the superior balance of true/false positives and negatives in the Logistic Boosting model. ROC curves (AUC: 0.992 vs. 0.982 for RF, 0.968 for SVM) are described in Fig. 11.
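The reported AUC values can be read through the rank-sum identity: AUC equals the probability that a randomly chosen anomaly receives a higher score than a randomly chosen normal instance. A minimal sketch with illustrative scores:

```python
# ROC-AUC via the Mann-Whitney rank identity: fraction of
# (anomaly, normal) pairs ranked correctly, with ties counting half.

def roc_auc(scores, labels):
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.992 therefore means the model ranks an anomaly above a normal instance in roughly 99.2% of such pairs, independent of any decision threshold.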
Comparative performance analysis
Our evaluation demonstrates Logistic Boosting's superiority for industrial IoT anomaly detection, achieving 96.6% accuracy and 94.1% F1-score (Tables 4 and 5; Fig. 12), outperforming both Random Forest (95.6% accuracy) and SVM (93.8% accuracy). The model's 0.992 AUC30 and balanced error rates (134 FPs, 117 FNs) significantly exceed alternative approaches (RF: 162 FPs/401 FNs; SVM: 280 FPs/376 FNs)12,31.

Comparative performance analysis of anomaly detection algorithms.
Key advantages include:
1. Effective handling of rare anomalies (< 20% prevalence) through instance reweighting
2. Robust feature interaction capture via gradient tree boosting
3. Computational efficiency (< 10 s training for 15k instances)
While SVM achieves higher recall (94.6%) and RF offers interpretability, Logistic Boosting's precision (93.5%) and operational reliability make it the preferred choice for industrial settings. Future research should investigate hybrid architectures combining these strengths34, with our results providing benchmark metrics (Figs. 8–12) for smart manufacturing applications12. For comprehensive benchmarking, we compared Logistic Boosting against baseline deep learning approaches (LSTM and 1D-CNN) on our industrial dataset. While the LSTM achieved competitive accuracy (94.2%), it required 3.2× longer training time than Logistic Boosting and showed lower precision (90.7% vs. 93.5%) due to overfitting on rare anomalies. The 1D-CNN performed similarly (93.7% accuracy) but struggled with temporal dependencies in sensor data. These results confirm that while deep learning methods can achieve respectable performance, their computational demands and data requirements make them less practical for typical industrial deployments with moderate-sized datasets.
Figure 12 includes Logistic Boosting, Random Forest, and Support Vector Machines (SVM). The bar chart displays key evaluation metrics—accuracy, precision, recall, and F1-score—highlighting the overall effectiveness of each model in detecting anomalies. The results indicate that Logistic Boosting achieves the highest accuracy, while all three models maintain competitive performance across the other metrics.
