Detection of anomalous activities around telecommunications infrastructure based on YOLOv8s

Model performance analysis

The trained YOLOv8s-modified model demonstrated strong detection capability, notably for anomalies related to fiber optic cables mounted on poles. The model attained a mAP@0.5 of more than 85%, indicating that it can reliably recognize and classify objects such as poles, humans, animals, and climbing activities while keeping false positives and false negatives low. This level of accuracy implies that the model can consistently distinguish between normal and unusual activities, making it well suited to telecom infrastructure monitoring. However, the mAP@0.5:0.95, which averages performance across several intersection-over-union (IoU) thresholds, was lower, with performance declining progressively as the threshold rose. This tendency is expected, since stricter IoU thresholds demand higher localization precision, which makes detection more challenging. Figure 3a illustrates the precision-recall curve, demonstrating that the model maintains a good balance between correctly identified anomalies and erroneous detections. Compared with other state-of-the-art models, YOLOv8s-modified showed significant gains in detection accuracy, resilience, and computational efficiency. Figure 3b shows the precision-confidence curve, which indicates that YOLOv8s-modified maintains a higher degree of precision across different confidence levels, implying that its high-confidence predictions are more dependable and contain fewer false positives. Furthermore, Fig. 3c presents the F1-confidence curve, which assesses the model’s balance of precision and recall. These gains can be ascribed to the improved design of YOLOv8s-modified, its superior feature extraction, and its optimized computing performance, which enable it to handle complicated settings such as cluttered backgrounds and variable illumination conditions commonly found in telecom networks. Figure 3d illustrates another critical aspect of the model’s performance: the recall-confidence curve.

The curve indicates that the improved YOLOv8s maintains a high recall even at moderate confidence thresholds, showing that it detects the majority of anomalies without significantly increasing false alarms. This is particularly useful for fiber optic monitoring applications, where missing an anomaly such as unauthorized climbing or cable tampering could result in costly infrastructure damage or service disruptions. The proposed model provides better generalization, enhanced object localization, and fewer false alarms, making it a more resilient and efficient solution for real-time telecom anomaly detection. Given the time constraints of monitoring fiber optic cables in pole-mounted infrastructure, the improved YOLOv8s’ higher accuracy and inference speed make it the preferable choice for high-volume, real-time anomaly detection tasks in telecom networks.
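
The confidence-based curves discussed above (precision, recall, and F1 against confidence) can be illustrated by sweeping a confidence threshold over scored detections. The following is a minimal sketch rather than the evaluation code used in this study: it assumes every detection has already been matched against the ground truth at a fixed IoU threshold, and the names `sweep_confidence`, `scores`, `is_tp`, and `n_gt` are illustrative.

```python
import numpy as np

def sweep_confidence(scores, is_tp, n_gt, thresholds):
    """Return (threshold, precision, recall, F1) for each confidence threshold.

    scores: confidence of each detection
    is_tp:  True if the detection matched a ground-truth object (assumed precomputed)
    n_gt:   total number of ground-truth objects
    """
    scores = np.asarray(scores, dtype=float)
    is_tp = np.asarray(is_tp, dtype=bool)
    rows = []
    for t in thresholds:
        keep = scores >= t                      # detections that survive the threshold
        tp = int(np.sum(is_tp & keep))
        fp = int(np.sum(~is_tp & keep))
        # Convention chosen here: precision is 1.0 when nothing is predicted.
        precision = tp / (tp + fp) if (tp + fp) else 1.0
        recall = tp / n_gt if n_gt else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        rows.append((t, precision, recall, f1))
    return rows
```

Raising the threshold typically trades recall for precision, which is the behavior the curves in Fig. 3 visualize; the F1 column indicates where the two are best balanced.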

Detection results

The detection results shown in Fig. 4 indicate that the YOLOv8s-modified model can detect anomalies in fiber optic cable systems with high accuracy and dependability. Even in complicated landscapes with varying lighting conditions and partial occlusions, the model effectively detected poles, humans, animals, and climbing activities. YOLOv8s-modified outperformed earlier models in terms of object localization and classification, resulting in fewer false positives and false negatives. The clear bounding boxes and high-confidence detections show the model’s ability to discriminate key anomalies from irrelevant background objects. In addition, the precision-recall and confidence-based curves from the preceding performance evaluation are consistent with these results, confirming that the model maintains a balance of high precision and recall. These findings establish YOLOv8s-modified as a dependable solution for real-time monitoring, allowing telecom operators to promptly identify and respond to possible threats to fiber optic infrastructure.

Impact of augmentation techniques

The augmentation techniques utilized in this study are widely known in the computer vision literature, but their strategic application to the novel domain of fiber optic infrastructure monitoring in outdoor telecom environments was important for improving model performance and dependability. Rather than providing methodological innovation, these strategies were purposefully selected and customized to address domain-specific environmental problems faced during real-world surveillance. Random flipping was employed to simulate orientation variance, allowing the improved YOLOv8s model to recognize humans and animals approaching fiber optic poles from various directions. This is especially relevant in unstructured outdoor contexts where object positioning is not stable. Gaussian noise addition improved the model’s resilience to sensor noise and visual artifacts caused by low-light conditions or compressed transmission feeds, which are typical in real-time field deployments. Brightness adjustment compensated for variations in natural lighting caused by time of day, shadows, and weather changes. Training the model on images with varying brightness levels enhanced detection consistency across conditions ranging from direct sunlight to cloudy or dusk scenarios. Rotation augmentation enabled the model to manage object orientation variability, such as animals climbing poles at unexpected angles or maintenance personnel assuming unusual postures. This increased the model’s capacity to generalize to less predictable field interactions. These augmentations significantly increased training data diversity, reducing overfitting and improving generalization. While not innovative in themselves, these strategies were successfully used to train an improved YOLOv8s model on a specialized telecom anomaly detection dataset. This is consistent with the findings of Han et al.⁶¹, which underscore the importance of environment-specific augmentation.
The essential innovation is the intentional integration of standard methodologies into a domain-adapted pipeline, which allows for robust, real-time anomaly identification in demanding outdoor fiber optic monitoring circumstances.
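
The four augmentations described above can be sketched with NumPy as follows. This is an illustrative sketch, not the pipeline used in this study: the function names, the default noise level, and the nearest-neighbour rotation are assumptions, and the corresponding transformation of bounding-box labels (which a detection pipeline also needs) is omitted for brevity.

```python
import numpy as np

def random_flip(img, rng, p=0.5):
    """Mirror the image left-to-right with probability p."""
    return img[:, ::-1].copy() if rng.random() < p else img

def add_gaussian_noise(img, rng, sigma=10.0):
    """Add zero-mean Gaussian noise (sigma in pixel-intensity units)."""
    noisy = img.astype(np.float32) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def adjust_brightness(img, factor):
    """Scale pixel intensities; factor > 1 brightens, factor < 1 darkens."""
    return np.clip(img.astype(np.float32) * factor, 0, 255).astype(np.uint8)

def rotate(img, angle_deg):
    """Rotate about the image centre with nearest-neighbour sampling.

    Pixels that map outside the source image are filled with zeros.
    """
    h, w = img.shape[:2]
    theta = np.deg2rad(angle_deg)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for each output pixel, find the source pixel.
    src_x = np.cos(theta) * (xs - cx) + np.sin(theta) * (ys - cy) + cx
    src_y = -np.sin(theta) * (xs - cx) + np.cos(theta) * (ys - cy) + cy
    src_x = np.rint(src_x).astype(int)
    src_y = np.rint(src_y).astype(int)
    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.zeros_like(img)
    out[valid] = img[src_y[valid], src_x[valid]]
    return out

def augment(img, rng):
    """Apply the four augmentations with hypothetical parameter ranges."""
    img = random_flip(img, rng)
    img = adjust_brightness(img, rng.uniform(0.6, 1.4))
    img = add_gaussian_noise(img, rng)
    return rotate(img, rng.uniform(-15.0, 15.0))
```

A call such as `augment(image, np.random.default_rng(0))` returns an image of the same shape and dtype; the parameter ranges would need tuning against the lighting and viewpoint variation actually observed in the field.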

The comparison of performance metrics with and without data augmentation techniques

The model’s performance with and without data augmentation is compared in Table 2, which shows that applying augmentation improves all key metrics. Precision increased from 93.4 to 96.9%, while recall climbed from 83.9 to 86.6%, demonstrating higher accuracy and sensitivity in object detection. Additionally, the mAP@50 increased markedly from 87.1 to 97.3%, while mAP@50:95, a stricter and more comprehensive measure, improved from 66.8 to 71.5%. These results show that data augmentation improves the model’s generalization capacity by exposing it to a broader range of training instances, resulting in more robust and reliable detections in a variety of operational environments.

Table 2 Comparison of YOLOv8s-modified performance metrics with and without data augmentation techniques.

The Precision-Recall (PR) curves shown in Fig. 5 for the YOLOv8s-modified model trained with and without data augmentation demonstrate how augmentation affects detection performance. In subplot (a), the model trained with augmentation reaches an all-class mAP@0.5 of 97.3%, compared to 87.1% without augmentation in subplot (b). The PR curve with augmentation shows higher and more consistent precision across all recall levels, indicating that the model is more confident and accurate in detecting objects in various scenarios. This improvement emphasizes the importance of augmentation in improving the model’s robustness, generalization, and ability to maintain high detection quality even under difficult or varied input conditions.
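
The single score summarizing each PR curve is the area under that curve (the average precision, AP), and mAP is its mean over classes. The sketch below uses all-point interpolation in the PASCAL VOC style; this is an assumption made for illustration, and the evaluation toolkit used in this study may interpolate differently and therefore give slightly different values. The function name `average_precision` is illustrative.

```python
import numpy as np

def average_precision(recall, precision):
    """Area under a PR curve using the monotone precision envelope.

    recall must be sorted in ascending order, with precision aligned to it.
    """
    r = np.concatenate(([0.0], np.asarray(recall, dtype=float), [1.0]))
    p = np.concatenate(([1.0], np.asarray(precision, dtype=float), [0.0]))
    # Make precision non-increasing as recall grows (the "envelope").
    p = np.maximum.accumulate(p[::-1])[::-1]
    # Sum rectangle areas wherever recall changes.
    idx = np.where(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))

def mean_average_precision(per_class_curves):
    """Mean AP over a list of (recall, precision) pairs, one per class."""
    return float(np.mean([average_precision(r, p) for r, p in per_class_curves]))
```

A curve that stays close to precision 1.0 until high recall yields an AP near 1.0, which is the pattern reported for the model trained with augmentation.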

Performance comparison of YOLOv8s-modified and other models

This section compares the performance of the improved YOLOv8s, YOLOv8 (baseline), YOLOv7, and YOLOv5, with a focus on their ability to detect anomalies associated with fiber optic cable infrastructure, specifically human climbing activities on pole-mounted systems. Conventional object detection datasets frequently lack annotated examples of such specialized events. As a result, a bespoke dataset was created for this investigation, which included high-resolution photos and annotated bounding boxes for unauthorized human presence, climbing, and other suspicious activity near optical fiber poles. To ensure experimental consistency and fairness, each model was trained independently but under the same settings. The improved YOLOv8s has a lightweight backbone with smaller channel sizes, which increases speed and lowers computational cost. An improved FPN and anchor-free heads increase localization accuracy, particularly in occluded scenes. The model is designed for real-time anomaly detection on edge devices that have limited processing power. The comparative study results are summarized in Table 3. YOLOv8s-modified performed better than all other models on every key metric. It had the highest precision (96.9%), demonstrating the capacity to reduce false positives, and the highest recall (86.6%), suggesting strong detection capability without missing crucial events. Furthermore, it achieved a mAP@50 of 97.3% and a mAP@50:95 of 71.5%, demonstrating its stability across various IoU thresholds and confirming its high localization accuracy.

Table 3 The comparison of Modified YOLOv8s with other models’ performance based on key evaluation metrics.

The improved YOLOv8s also achieved the fastest inference time of 66.8 ms, making it well suited to real-time monitoring applications that require immediate threat identification. In comparison, YOLOv5 demonstrated significant weaknesses, with a lower mAP@50 (66.4%) and recall (67.5%), occasionally failing to detect climbing behaviors, which could result in missed anomalies during live monitoring. Additionally, YOLOv8s-modified behaved more smoothly and consistently on confidence-based metrics, such as the precision-confidence, recall-confidence, and F1-confidence curves shown in Fig. 3. These curves showed stable behavior across detection thresholds and minimal variability, resulting in fewer misclassifications in uncertain environments. The superior performance of YOLOv8s-modified is attributed to its advanced transformer-inspired architecture, adaptive feature fusion algorithms, and increased object tracking consistency between frames. YOLOv8s-modified therefore emerged as the most effective model for detecting anomalies near pole-mounted fiber optic infrastructure. It combines high detection accuracy, rapid inference speed, and robustness against severe environmental conditions, making it a strong choice for implementation in automated telecom monitoring and maintenance systems.

Comparison of YOLOv8s-modified and YOLOv8s models

The confusion matrix comparison of the modified YOLOv8s and the original YOLOv8s, presented in Fig. 6, shows significant improvements in classification accuracy following the model modifications. The improved YOLOv8s has fewer false positives and false negatives, with more predictions correctly falling along the diagonal of the matrix, indicating higher precision and recall across all classes. In contrast, the original YOLOv8s matrix shows more class confusion, especially in overlapping or visually similar activity categories. This improvement indicates how adjustments such as augmentation, attention modules, or architecture tuning can improve detection reliability and resilience in complicated surveillance tasks.
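
The link between the diagonal of a confusion matrix and per-class precision and recall can be made concrete with a short sketch. It assumes rows index the true class and columns the predicted class; plotting tools differ on this convention, so the orientation of Fig. 6 should be checked before applying it. The function name is illustrative.

```python
import numpy as np

def per_class_precision_recall(cm):
    """Per-class precision and recall from a square confusion matrix.

    Assumed layout: cm[i, j] counts objects of true class i predicted as class j.
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)                 # correct predictions lie on the diagonal
    predicted = cm.sum(axis=0)       # column sums: everything predicted as class j
    actual = cm.sum(axis=1)          # row sums: everything that truly is class i
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    return precision, recall
```

Off-diagonal mass in a column lowers that class’s precision (false positives), while off-diagonal mass in a row lowers its recall (false negatives), which is why a more diagonal matrix corresponds to improvements in both.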

Effect of training epochs on model accuracy

The effect of training epochs on model accuracy was investigated by comparing performance at various stages, as shown in Table 4. At 20 epochs, the model’s mAP@0.5 was 78.9%, indicating early learning but with some false positives and missed detections. By 50 epochs, accuracy had increased significantly to 87.5%, indicating better feature extraction and fewer errors. After 100 epochs, the model stabilized at a mAP@0.5 of 97.3%, indicating convergence. Although additional epochs improved accuracy up to this point, the gains tapered off beyond it, implying that further training could lead to overfitting without significant performance gains. These findings emphasize the significance of balancing training duration and computational efficiency, with 50–100 epochs being a suitable range for deep learning-based object detection models such as YOLOv8s.
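
One common way to act on tapering gains is patience-based early stopping, which halts training once the validation score has stopped improving for a set number of epochs. The sketch below is illustrative only and is not described in this study as part of the training procedure; `early_stop_epoch`, `patience`, and `min_delta` are hypothetical names.

```python
def early_stop_epoch(val_scores, patience=3, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None.

    val_scores: validation metric per epoch (higher is better, e.g. mAP@0.5)
    patience:   number of consecutive non-improving epochs tolerated
    min_delta:  minimum increase that counts as an improvement
    """
    best = float("-inf")
    stale = 0
    for epoch, score in enumerate(val_scores, start=1):
        if score > best + min_delta:
            best = score       # new best score: reset the counter
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None
```

With a rule of this kind, the number of epochs is chosen by the validation curve itself rather than fixed in advance.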

Table 4 Effect of training epochs on model accuracy.

Model training and validation loss summary

The improved YOLOv8s model’s training process was evaluated with key loss metrics and performance indicators, as illustrated in Fig. 7. The train/box_loss metric assesses the model’s ability to predict accurate bounding box coordinates; a lower value indicates improved localization. Similarly, train/cls_loss captures classification errors, and train/dfl_loss (distribution focal loss) measures the quality of bounding box regression refinement. On the performance side, metrics/precision(B) and metrics/recall(B) are crucial for assessing the model’s capacity to detect true anomalies correctly. Precision indicates how many predicted anomalies were correct, whereas recall indicates how many real anomalies were detected. metrics/mAP50(B) and metrics/mAP50-95(B) summarize detection accuracy at various IoU thresholds, with mAP@50 being the more lenient measure and mAP@50–95 the more stringent one. The model improved consistently across these metrics during training and converged as training progressed. The high mAP values support the model’s robust generalization and anomaly detection capabilities.
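
The difference between mAP@50 and mAP@50–95 comes down to how strictly a predicted box must overlap the ground truth before it counts as correct. The following minimal sketch computes IoU for two axis-aligned boxes and lists which of the ten thresholds (0.50 to 0.95 in steps of 0.05) a prediction would satisfy; the function names are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# The ten IoU thresholds averaged in mAP@50-95.
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]

def thresholds_passed(box_pred, box_gt):
    """Thresholds at which the prediction counts as a true positive."""
    overlap = iou(box_pred, box_gt)
    return [t for t in IOU_THRESHOLDS if overlap >= t]
```

A prediction with an IoU of 0.78, for example, counts as correct at the six thresholds from 0.50 to 0.75 but as a miss at the four stricter ones, which is why mAP@50–95 is always at most mAP@50.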

Evaluation on external anomaly datasets

The proposed model was tested for generalizability and robustness using public datasets from three sources: Roboflow Universe, Kaggle, and the Open Images Dataset, which together cover a wide range of image types and instances. The Open Images Dataset, whose annotations are not in the YOLOv8 format, was converted before training. The results in Table 5 show consistently high performance: the model achieved its highest precision (93.8%) and mAP scores on the Kaggle dataset, and its highest recall (86.7%) on Open Images, suggesting adaptability to complex imagery. On Roboflow Universe it achieved balanced results, including high precision (93.2%) and mAP@50 (91.7%). Inference times remained within 71–75 ms across datasets, supporting real-time application. Overall, the model displayed resilience and accuracy across multiple datasets, indicating its potential for robust anomaly identification in a variety of surveillance scenarios.
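
The format conversion mentioned for the Open Images Dataset can be sketched as follows. The sketch assumes the Open Images box annotations provide normalized corner coordinates (XMin, XMax, YMin, YMax in the range 0 to 1) and that a YOLO label line has the form `class x_center y_center width height`, also normalized. The function name and the integer `class_id` mapping are hypothetical; the conversion script actually used in this study is not described.

```python
def open_images_to_yolo(class_id, x_min, x_max, y_min, y_max):
    """Convert one normalized corner-format box to a YOLO label line."""
    if not (0.0 <= x_min < x_max <= 1.0 and 0.0 <= y_min < y_max <= 1.0):
        raise ValueError("expected normalized coordinates with min < max")
    x_center = (x_min + x_max) / 2.0
    y_center = (y_min + y_max) / 2.0
    width = x_max - x_min
    height = y_max - y_min
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
```

For instance, a box spanning x from 0.2 to 0.6 and y from 0.1 to 0.5 becomes a label centred at (0.4, 0.3) with width and height of 0.4 each. Mapping the Open Images label identifiers to the class indices used by the model is a separate step that depends on the chosen class list.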

Table 5 The modified YOLOv8s performance on anomaly detection datasets.

Ablation study

To evaluate the contribution of each architectural component to the performance of the modified YOLOv8s, an ablation study was conducted. The results are presented in Table 6. The original YOLOv8s (Experiment A) achieves a mAP@50 of 89.6% and an inference time of 71.9 ms, which indicates a good trade-off between accuracy and computational efficiency for real-time object detection tasks. Introducing early-channel reduction (Experiment B) reduces the number of parameters in the first convolutional layers, resulting in a modest 0.6-point improvement in mAP@50 while also lowering inference time by 1.6 ms. This demonstrates that limiting early-stage overhead enhances efficiency without compromising accuracy. Applying hierarchical channel scaling in deeper backbone stages (Experiment C) yields a further 1.3-point increase in mAP@50 and improves fine-grained multi-scale feature representation, which is especially beneficial for detecting small and irregularly shaped objects. Importantly, inference time remains stable, showing that this modification enhances accuracy at minimal additional cost. Reducing the number of C2f repetitions (Experiment D) achieves a noticeable latency reduction (−4.1 ms) and decreases parameter count, while maintaining accuracy close to baseline. This indicates that redundant backbone depth can be removed without significant degradation in detection quality.
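
The reason channel reduction lowers cost so effectively is that the weight count of a convolution scales with the product of its input and output channel widths. The sketch below counts convolution weights only (normalization parameters are ignored), and the channel widths in the example are hypothetical, not the widths used in the modified model.

```python
def conv_params(c_in, c_out, k=3, bias=False):
    """Number of weights in a k x k convolution layer."""
    return k * k * c_in * c_out + (c_out if bias else 0)

# Hypothetical widths: a stage mapping 64 -> 128 channels versus
# the same stage with both widths halved (32 -> 64).
full = conv_params(64, 128)
reduced = conv_params(32, 64)
print(full, reduced, full / reduced)
```

Halving both the input and output widths of a layer cuts its weights by a factor of four, so modest width reductions in early, high-resolution layers can yield a disproportionate saving in computation.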

Table 6 Ablation experiment results of improved YOLOv8s.

Replacing the original SPPF with a reduced-channel SPPF module (Experiment E) slightly boosts accuracy (+0.8 points mAP@50) while lowering computational cost, confirming the efficiency of streamlined contextual aggregation. The addition of the LSKAttention block (Experiment F) provides the single largest accuracy improvement (+4.0 points mAP@50 compared to baseline). By suppressing background noise and emphasizing informative spatial regions, the large-kernel attention improves generalization on intricate custom datasets. The trade-off is a slightly longer inference time (approximately +0.6 ms). Finally, merging all improvements into the proposed YOLOv8s (modified) (Experiment H) results in the best performance across all metrics, attaining 97.3% mAP@50 and 71.5% mAP@50:95 while reducing average inference time to 66.8 ms. Compared to the original YOLOv8s, this is a significant improvement of 7.7 percentage points in mAP@50 together with a 5.1 ms reduction in latency, showing that the improvements strike a superior balance between accuracy and efficiency.
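
The headline comparison between the baseline and the full model can be checked with simple arithmetic on the reported values. The conversion from latency to frames per second below is an illustration that assumes one image is processed at a time and that the reported inference time covers the full per-image cost; neither assumption is stated in the study.

```python
def fps(latency_ms):
    """Frames per second for sequential single-image inference."""
    return 1000.0 / latency_ms

baseline = {"mAP50": 89.6, "latency_ms": 71.9}   # original YOLOv8s (Experiment A)
modified = {"mAP50": 97.3, "latency_ms": 66.8}   # full modified model

map_gain_points = modified["mAP50"] - baseline["mAP50"]
latency_saved_ms = baseline["latency_ms"] - modified["latency_ms"]
latency_saved_pct = 100.0 * latency_saved_ms / baseline["latency_ms"]

print(f"mAP@50 gain: {map_gain_points:.1f} percentage points")
print(f"latency saved: {latency_saved_ms:.1f} ms ({latency_saved_pct:.1f}%)")
print(f"throughput: {fps(baseline['latency_ms']):.1f} -> {fps(modified['latency_ms']):.1f} FPS")
```

Under these assumptions the reported latencies correspond to roughly 13.9 and 15.0 frames per second, and the 5.1 ms saving is about a 7.1% reduction relative to the baseline.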

Comparative study

The comparative evaluation highlights the performance differences between traditional and recent deep learning models for anomaly detection. Faster R-CNN¹⁹ achieves a precision of 82.3%, recall of 75.4%, and mAP@50 of 84.1%, but its mAP@50:95 is relatively low at 54.2%. With an inference time of 93.5 ms as shown in Table 7, it is the slowest model in the comparison. While it is strong at detecting diverse anomalies due to its region-based architecture, the computational overhead makes it unsuitable for real-time monitoring tasks.

Table 7 Comparative performance of anomaly detection models.

YOLOv5s⁴² provides a lightweight and widely adopted solution with a precision of 82.2% and mAP@50 of 72.6%. However, its recall is substantially lower (63.9%), indicating that it frequently fails to recognize subtle or more complicated anomalies. This shortcoming restricts its utility in infrastructure surveillance applications where sensitivity to uncommon anomalies is essential. YOLOv7⁴³ outperforms YOLOv5s with a precision of 86.4% and recall of 71.3%. Its mAP@50 is 75.9% and its mAP@50:95 is 54.4%, indicating better detection consistency than YOLOv5s, but it still struggles with more difficult anomalous conditions. Moreover, despite its precision advantage, the lack of a reported inference time limits assessment of its real-time practicality.

The original YOLOv8s strikes a stronger balance between accuracy and efficiency. It records 86.9% precision, 79.8% recall, 89.6% mAP@50, and 59.0% mAP@50:95, with an inference time of 71.9 ms, which is faster than Faster R-CNN. This makes it more appropriate for real-time applications, although its performance remains limited when dealing with small-scale or visually complicated anomalies.

In contrast, the updated YOLOv8s proposed in this work has a clear advantage across all criteria. It reduces the inference time to 66.8 ms while achieving 96.9% precision, 86.6% recall, 97.3% mAP@50, and 71.5% mAP@50:95. This performance boost is due to architectural improvements such as lightweight channel reduction, hierarchical scaling, fewer C2f repetitions, a streamlined SPPF, and the introduction of LSKAttention. These enhancements not only improve feature extraction and contextual awareness but also reduce processing costs, allowing for more robust detection in real time. Overall, the updated YOLOv8s model strikes the best balance of accuracy, generalization, and efficiency, making it the most suitable model for identifying anomalies in telecommunications infrastructure.
