Advancing common bean (Phaseolus vulgaris L.) disease detection with YOLO driven deep learning to enhance agricultural AI


Building a diverse dataset for common bean disease detection

Common beans, often described as a nearly perfect food for their nutritional richness, are a linchpin of economic stability, supporting the livelihoods of smallholder farmers worldwide67. Yet the specter of disease looms large over common bean cultivation, presenting a daunting challenge. Detecting and validating these diseases constitutes a major hurdle for pathologists, a complex and time-consuming endeavor that invariably demands expert supervision.

To expedite disease detection and enable timely management interventions, and recognizing the inadequacy of existing public resources such as the PlantVillage dataset for common bean diseases, we developed a comprehensive image dataset in collaboration with CGIAR bean network experts. In total, 9564 original field images were amassed from diverse disease hotspots. A subset of these images formed our annotated image dataset, outlined in Table 1; these images were curated by expert pathologists for precise disease identification.

To ensure heterogeneity, images were captured in real-field settings, documenting the intricacies of actual field conditions and plant interactions across different growth stages (Supplementary Table 2). Additionally, various cameras were used to capture the images, introducing variations in image quality and background complexity. Capturing a realistic spectrum of disease presentations within this agricultural variability reflects the challenges crops encounter under dynamic environmental conditions across growth stages, an essential step in developing a globally beneficial, mobile-assisted disease detection tool35. This strategic preparation equips our model for deployment in the diverse and unpredictable environments where common beans are cultivated.

Micro-annotation and augmentation techniques for enhancing CNN performance

To enhance the performance of our CNN in identifying common bean diseases, we implemented micro-annotations and data augmentation techniques to create a more robust training dataset. Data augmentation techniques, such as flipping and brightness adjustments, were applied strategically to diversify the dataset and mitigate overfitting by generating additional data variations (Table 1). These techniques were applied selectively to the classes with the least data to maximize their impact: the CBMV, Rust, and ANTH classes for whole leaf annotations; the Rust class for micro leaf annotations; and the healthy class for pod annotations. These augmentation strategies enriched the training dataset, introducing diversity into the samples and enhancing the performance and generalization of the deep-learning models used for disease detection.
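
The paper does not give implementation details for these augmentations; as a rough sketch (using plain NumPy rather than whatever pipeline was actually used), the two named operations, horizontal flipping and brightness adjustment, might look like:

```python
import numpy as np

def horizontal_flip(img: np.ndarray) -> np.ndarray:
    """Mirror an H x W x C image left-to-right."""
    return img[:, ::-1, :]

def adjust_brightness(img: np.ndarray, factor: float) -> np.ndarray:
    """Scale pixel intensities by `factor`, clipping to the valid 0-255 range."""
    return np.clip(img.astype(np.float32) * factor, 0, 255).astype(np.uint8)

# Toy 2 x 2 RGB "image" for illustration
img = np.array([[[10, 20, 30], [40, 50, 60]],
                [[70, 80, 90], [100, 110, 120]]], dtype=np.uint8)

flipped = horizontal_flip(img)
brighter = adjust_brightness(img, 1.5)
```

For a horizontal flip of bounding-box annotations, the x-coordinates must be mirrored as well (x' = W - x), which is why augmentation is usually applied through a library that transforms images and labels together.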

Conversely, micro annotations focus on identifying specific disease symptoms at a micro level, which is essential for training highly accurate and sensitive models. While manually annotating each small symptom can be challenging due to resource constraints, micro annotations have the potential to enhance the generalization of models, allowing them to recognize a wider range of disease variations. However, their performance depends heavily on factors such as data complexity, data quantity, and annotation quality.

The dataset was split (70% training, 20% testing, and 10% validation) to ensure representation across different disease classes. Each image underwent rigorous validation by a bean phytopathologist, resulting in a comprehensive set of 44,022 annotations before data augmentation, expanding to 54,264 after augmentation (Table 1). This labor-intensive annotation process, conducted by three experts over 4 months, underpins the dataset’s quality and reliability. This precise level of labeling surpasses the scope of publicly available datasets, making our model more robust against common issues like overfitting and underfitting. Consequently, the system demonstrates greater efficacy and adaptability for real-world disease detection in diverse agricultural settings.
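
The 70/20/10 split described above can be sketched as a simple shuffled partition (the helper name and fixed seed here are illustrative, not from the paper; a stratified split per disease class would be needed to strictly guarantee class representation):

```python
import random

def split_dataset(items, train=0.7, test=0.2, seed=42):
    """Shuffle and partition items into train/test/validation subsets.
    The validation share is whatever remains (here 10%)."""
    items = list(items)
    rng = random.Random(seed)   # fixed seed for a reproducible split
    rng.shuffle(items)
    n = len(items)
    n_train = round(n * train)
    n_test = round(n * test)
    return (items[:n_train],
            items[n_train:n_train + n_test],
            items[n_train + n_test:])

train_set, test_set, val_set = split_dataset(range(1000))
```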

Comprehensive evaluation of YOLO models for common bean disease detection

This study represents a trailblazing effort in evaluating one-stage YOLO object detectors, including YOLOv7, YOLOv8, and YOLO-NAS, specifically for detecting CB diseases. The YOLO series is known for its single-stage detection capability and real-time processing efficiency68,69. Notably, YOLO-NAS stands out within the YOLO family for its advanced performance in detection metrics and its rapid inference speed. We comprehensively assess the performance of our advanced YOLO-based detectors using a range of detailed metrics. The metrics encompass various annotation resolutions (whole and micro) for both leaf and pod datasets. This multifaceted evaluation approach allows us to compare detector performance across different plant parts, providing a comprehensive analysis.

Training loss function

Training loss analysis plays a crucial role in emphasizing the efficiency, adaptability, and stability of the YOLOv7 and YOLOv8 models during the learning process. Both models exhibited a rapid initial decline in loss for both the leaf and pod datasets (see Fig. 5). This rapid decrease signifies the overall effectiveness of the learning and adaptation to the training data. This observation is consistent with prior findings on training duration and loss convergence70, affirming the diverse convergence rates observed during training. The consistent decline in training loss further validates the effectiveness of the model.

Figure 5

Total loss for different models, at whole annotations level; (a) train loss for leaves and pods, (b) validation loss for leaves and pods.

However, the YOLOv8 model displayed an anomaly on the annotated pod dataset, where the loss started to increase around epoch 16 before resuming its downward trend. This could be attributed to the increased complexity of the annotated pod dataset compared to its previous training datasets. Once it overcame this initial hurdle, the model began to learn effectively from the data, signifying stability in the learning process.

Despite this anomaly, both models exhibited a consistent and steady decline in loss over time, indicating stability in their learning process. Lower loss values on the training set compared to the validation set align with expectations. The relative stability of the gap between training and validation losses across epochs indicates the absence of significant overfitting in both models, highlighting effective generalization, a common challenge in model training. This means that the models appear to learn underlying patterns in the data rather than memorizing specific training examples.

On the contrary, while the YOLO-NAS model exhibited similar trends for the leaf dataset (Fig. 6), in both full annotation and micro-annotation levels (Supplementary Fig. 1), its validation losses for the pod dataset displayed significant fluctuations (Fig. 7, Supplementary Fig. 2). These fluctuations suggest potential overfitting, likely stemming from non-representative validation data or inadequate regularization techniques. This behavior could elucidate the lower mAP scores observed for YOLO-NAS in the pod dataset. It underscores the critical importance of careful dataset curation and the potential need for adjusting regularization while training such models.

Figure 6

YOLO-NAS model evaluation indicators (loss and mAP@0.5) during training, for whole leaf annotations.

Figure 7

YOLO-NAS model evaluation indicators (loss and mAP@0.5) during training, for whole pod annotations.

Notably, at the micro annotation level, YOLOv8 training on the pod dataset stopped at epoch 217 (Supplementary Fig. 3) due to the absence of improvement in the validation loss for 50 consecutive epochs. Additionally, YOLOv7 training on the leaf dataset showed a slight overfitting tendency, as the validation loss began trending upward around epoch 40 instead of decreasing, widening the gap between validation and training loss. The individual contributions of each loss to the total loss of the YOLOv7 model can be seen in Supplementary Figs. 4, 5, 6 and 7, and those of the YOLOv8 model in Supplementary Figs. 8, 9, 10 and 11.
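
Stopping at epoch 217 after 50 epochs without validation-loss improvement is a standard early-stopping (patience) criterion. A minimal illustrative sketch of the mechanism, not the authors' actual training code, might look like:

```python
class EarlyStopping:
    """Stop training when validation loss hasn't improved for `patience` epochs."""
    def __init__(self, patience: int = 50):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Tiny demonstration with patience=3: best loss is reached at epoch 2,
# then three epochs pass with no improvement, triggering the stop.
stopper = EarlyStopping(patience=3)
losses = [1.0, 0.8, 0.9, 0.85, 0.81]
stops = [stopper.step(l) for l in losses]
```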

mAP score

To assess the performance of object detection models, particularly focusing on the widely recognized mean Average Precision (mAP) metric35, we evaluated different YOLO models for leaf and pod detection. This metric has been the benchmark in competitions such as PASCAL VOC, ImageNet, and COCO. We complemented this analysis with confusion matrices to gain deeper insights into model performance, with a particular emphasis on unseen data at both whole and micro annotation levels.

The mAP scores of the different YOLO models for leaf and pod detection are detailed in Table 3. YOLO-NAS stands out with a remarkable performance for whole leaf disease annotation, achieving an impressive mAP score of 97%. Nevertheless, for micro annotation on leaves, YOLOv8 excelled with a notable mAP score of 67%. In pod detection tasks, YOLOv8 remained superior, achieving mAP scores of 96% and 87% for whole and micro annotations, respectively (Table 4). It is worth noting that YOLOv7 closely mirrored the performance of YOLOv8, achieving high mAP scores for both leaf and pod datasets.
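
The mAP scores in Tables 3 and 4 are computed at IOU = 0.5; the underlying matching criterion, Intersection-over-Union between a predicted and a ground-truth box, can be sketched as:

```python
def iou(box_a, box_b):
    """Intersection-over-Union for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Coordinates of the intersection rectangle
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two 10 x 10 boxes overlapping in a 5 x 10 strip: IoU = 50 / 150 = 1/3,
# which falls below the 0.5 threshold used in Tables 3 and 4.
overlap = iou((0, 0, 10, 10), (5, 0, 15, 10))
```

Under this criterion a detection counts as a true positive only when its IoU with a ground-truth box of the same class reaches at least 0.5.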

Table 3 mAP metric score per leaf class for different models (IOU = 0.5, Conf = 0.35).
Table 4 mAP metric score per pod class for different models (IOU = 0.5, Conf = 0.35).

Across all classes and models, whole annotations generally yielded better results than micro annotations. Specifically, for the healthy class, YOLOv7 and YOLOv8 achieved high mAP accuracies of 96% and 98%, respectively, for pods across both annotation levels, and for leaves at the micro annotation level. In the case of whole leaf annotations, these models also performed well, with mAP scores of 96% and 85%, respectively. The evolution of the validation mAP during model training is illustrated in Fig. 8, demonstrating a continual increase of the mAP throughout the epochs until it becomes relatively uniform.

Figure 8

mAP@0.5 evaluation metric for different models; (a) whole leaf annotations, (b) whole pod annotations.

These findings provide further context to our study, emphasizing that although all three YOLO versions can achieve high accuracy, their performance nuances become apparent based on the complexity of annotation levels and the specific nature of the tasks. The noteworthy performance of YOLO-NAS under certain conditions and the close competition between YOLOv7 and YOLOv8 highlight the continuous advancements in object detection technologies, showcasing their potential applications in precision agriculture.

The differences in loss patterns and mAP scores among the models suggest that while YOLOv7 and YOLOv8 exhibit robustness across various scenarios, YOLO-NAS may require more specific tuning, especially when confronted with datasets of higher variability or complexity. This insight proves invaluable for future model development and application, particularly in precision agriculture, where precise and reliable disease detection is imperative. These findings underscore the necessity for continuous model evaluation and adjustment to cater to the specific characteristics of diverse datasets and detection tasks.

Performance of YOLO models using confusion matrix

In our study, confusion matrices served as a pivotal tool for assessing the performance of various YOLO model variants. These matrices, delineating true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN), played a crucial role in evaluating disease-specific accuracy and identifying misclassifications (Fig. 9). The analysis revealed instances where class complexity resulted in reduced accuracy, providing insights into areas prone to errors for targeted improvements.

Figure 9

Confusion matrix using whole annotations for different models, (a) YOLO-NAS model, (b) YOLOv7 model, (c) YOLOv8 model.

The YOLO-NAS model demonstrated robustness, with over 97% accuracy in detecting symptoms of leaf diseases across all classes. However, for ALS pod symptoms, YOLOv7 and YOLOv8 outperformed YOLO-NAS with accuracies of 95% and 93%, respectively. YOLO-NAS’s detection rate for these symptoms dropped from 79% for whole annotations to 48% at the micro annotation level, compared to 66% each for YOLOv7 and YOLOv8. Diseases like rust and anthracnose, particularly challenging at micro levels, showed lower accuracies of around 56%. No misclassifications occurred at the micro annotation level (Supplementary Fig. 12), but the number of undetected objects increased considerably.
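
As an illustrative sketch of how per-class figures like these are derived, a confusion matrix can be built from paired true and predicted labels. This is simplified to a classification-style matrix with hypothetical class indices; detection confusion matrices like those in Fig. 9 additionally track a background class for missed and extra detections:

```python
import numpy as np

def confusion_matrix(true_labels, pred_labels, n_classes):
    """Rows index the true class, columns the predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(true_labels, pred_labels):
        cm[t, p] += 1
    return cm

# Hypothetical 3-class example (0 = healthy, 1 = ALS, 2 = rust)
cm = confusion_matrix([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], 3)

# Per-class accuracy: correct predictions on the diagonal,
# divided by the number of true instances in each row.
per_class_acc = cm.diagonal() / cm.sum(axis=1)
```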

Interestingly, the confusion matrices highlighted additional detections made by the models. Although these additional detections slightly reduced precision, they were mostly correct and indicated the models’ effectiveness in identifying objects, a critical factor in complex agricultural scenarios where exhaustive annotation might be challenging. This aspect is significant as it showcases the models’ capability for comprehensive detection despite the inherent difficulty of annotating every detail in a small diseased image. Furthermore, recognizing the widespread occurrence of CB leaf spot disease across diverse regions such as Asia, Latin America, and Africa, we are actively compiling and annotating early-stage symptom images. This endeavor aims to enhance the models’ accuracy and adaptability for real agricultural settings.

Precision and recall

Our study also focused on evaluating the performance of various YOLO models in detecting CB diseases, utilizing precision and recall as key metrics. Precision, indicating the ratio of correctly identified positive cases to all predicted positives, and recall, measuring the ratio of correctly identified positives out of all actual positive cases, are essential metrics for assessing the diagnostic accuracy of the models.
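
These two definitions reduce to simple ratios over the TP/FP/FN counts (the counts below are hypothetical, chosen only to illustrate the formulas):

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted positives that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that are found: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0

# Hypothetical counts: 90 correct detections, 10 false alarms, 5 missed objects
p = precision(90, 10)   # 0.9
r = recall(90, 5)       # ~0.947
```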

The YOLOv7 and YOLOv8 models exhibited excellent performance, achieving precision and recall scores of 0.9 at the whole annotation level for both leaves and pods, as shown in Table 5. However, a decrease in these scores was observed for micro annotations, suggesting a variation in model performance based on annotation detail. YOLO-NAS demonstrated high effectiveness, particularly in whole-leaf annotations, with a precision of 0.7 and an impressive recall of 0.988. Despite its lower precision due to a higher number of extra detections, its high recall confirms its strong detection capability. For whole pod annotations, YOLO-NAS showed a precision of 0.6, primarily impacted by the model misclassifying some diseased pods as healthy (Fig. 9). The model maintained high recall levels even at the micro annotation level, though there was a slight drop compared to whole annotations.

Table 5 Precision and recall metrics score for different models (IOU = 0.5, Conf = 0.35).

In evaluating model performance, a confidence threshold (conf) of 0.35 was set uniformly across all three models. This choice involves a trade-off between precision and recall: a higher confidence threshold tends to increase precision by reducing false positives, but may decrease recall as some true positives with lower confidence are overlooked (Fig. 10a,b, Supplementary Figs. 13–20b,c). This delicate balance is visually represented in the Precision-Recall (PR) curve (Fig. 10c,d, Supplementary Figs. 13–20a), with the area under the PR curve (AUPRC) serving as a comprehensive metric summarizing model performance across all thresholds. Similar behavior is observed at the micro annotation level (Supplementary Fig. 21).
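
The threshold sweep and area-under-curve computation described here can be sketched as follows (a simplified illustration: YOLO evaluation tooling typically uses a specific interpolation scheme, e.g. 101-point COCO interpolation, rather than the plain trapezoidal rule shown):

```python
import numpy as np

def pr_curve(scores, labels, thresholds):
    """Sweep confidence thresholds over detections.
    `labels` marks each detection 1 (matches a true object) or 0 (false positive).
    Returns a list of (recall, precision) points."""
    scores = np.asarray(scores)
    labels = np.asarray(labels)
    total_pos = labels.sum()
    points = []
    for t in thresholds:
        kept = scores >= t                       # detections surviving the threshold
        tp = int((labels[kept] == 1).sum())
        fp = int((labels[kept] == 0).sum())
        prec = tp / (tp + fp) if tp + fp else 1.0
        rec = tp / total_pos if total_pos else 0.0
        points.append((rec, prec))
    return points

def auprc(points):
    """Trapezoidal area under the precision-recall curve."""
    pts = sorted(points)                         # order by increasing recall
    area = 0.0
    for (r0, p0), (r1, p1) in zip(pts, pts[1:]):
        area += (r1 - r0) * (p0 + p1) / 2
    return area

# Toy example: raising the threshold from 0.2 to 0.5 trades recall for precision.
pts = pr_curve([0.9, 0.8, 0.4, 0.3], [1, 1, 0, 1], [0.5, 0.2])
```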

Figure 10

Operation results curve for YOLO-NAS model, using whole leaf and pod annotations; (a) precision-confidence curve, (b) recall-confidence curve, (c) precision-recall curve for leaf, and (d) precision-recall curve for pod.

The standardized dataset employed in our study ensures rigorous evaluation, enabling reliable comparisons of each model’s performance.

Model predictions using unseen images

After analyzing the results obtained from all the metrics mentioned above, Fig. 11 shows the predictions made by the YOLO-NAS model on the leaf dataset at the whole annotation level, Fig. 12 at the micro annotation level, and finally Fig. 13 shows the predictions on the pod dataset at both annotation levels.

Figure 11

Some examples of common bean disease detection results using YOLO-NAS and whole annotations.

Figure 12

Some examples of common bean disease detection results using YOLO-NAS and micro annotations.

Figure 13

Some examples of common bean disease detection results using YOLO-NAS on the pod dataset.

Surprisingly, our analysis found that micro annotation yielded lower performance than whole annotation across all explored YOLO models, regardless of disease class. This result contradicts the hypothesis that micro annotations would improve detection accuracy. The discrepancy suggests that the effectiveness of annotation methods may vary depending on factors such as dataset complexity, disease characteristics, and model requirements. One likely contributor is that we did not fully annotate all lesions in each image, particularly because of the high number of lesions per image. Developing improved annotation techniques could significantly enhance the accuracy and efficiency of these annotations. Further investigation is warranted to understand this finding and optimize annotation strategies for future CB disease detection research.

Conversely, Figs. 14 and 15 show situations where specific models outperform others. For example, in Fig. 14, the YOLOv7 and YOLOv8 models successfully detect the pod within the image, whereas YOLO-NAS does not. This aligns with the mAP, precision, and recall results shown above. However, Fig. 15 demonstrates YOLO-NAS’s ability to identify healthy leaves more accurately, while the YOLOv8 model fails to detect many of them. This aligns with the models’ respective mAP scores for the healthy class.

Figure 14

Example of prediction in the same image. (a) YOLO-NAS model, (b) YOLOv7 model, (c) YOLOv8 model.

Figure 15

Example of prediction in the same image. (a) YOLO-NAS model, (b) YOLOv8 model.

Deployment and testing of the YOLO-NAS model in an AI powered app

To bridge the gap between research and practical application, we seamlessly integrated our promising whole annotation YOLO-NAS models into a user-friendly Android app. The app boasts a straightforward design, allowing users to either upload existing photos from their storage or capture new ones in real time for immediate analysis (Fig. 16). This real-time capability played a pivotal role in evaluating the functional accuracy of the models within the practical context of a real-world app.

Figure 16

Developed mobile application for bean disease detection. (a) Initial screen, (b) image taking and scan, (c) diagnostic screen for leaf, (d) recommendations screen.

To evaluate the real-time performance of the app, we tested real-field images from disease hotspots in Latin America and Africa. Our results demonstrate that the YOLO-NAS model detected almost all classes with outstanding accuracy (Table 6). Specifically, the model achieved close to 100% accuracy with high confidence scores across all disease classes except the pod classes (Table 6). The lower confidence in pod predictions can be attributed to the inherent complexity of the task: in real-field environments, bean pods are often surrounded by diverse background elements that make it difficult to distinguish pods from their surroundings, leading to lower confidence in predictions. Furthermore, factors such as varying lighting conditions, shadows, and cluttered backgrounds add to the difficulty of the task. Despite these limitations, the model successfully detects almost all pods correctly.

Table 6 Real-time disease detection using the Tumaini AI app and validation.

This successful field deployment not only validates the reliability of the app but also underscores its potential as a robust agricultural tool, enabling timely interventions in CB disease management that can significantly enhance crop yields and reduce reliance on pesticides.

Advantages of YOLO-NAS models in precision agriculture

Our extensive analysis revealed the prowess of the YOLO-NAS model in real-time CB disease detection within agricultural settings (Supplementary Fig. 22, Table 6). Engineered through the cutting-edge technique of Neural Architecture Search (NAS), YOLO-NAS adeptly strikes a balance between speed and accuracy amid the challenges of field conditions, a pivotal attribute for prompt and precise disease diagnosis71. Notably, it effectively combines the quick detection characteristic of one-stage detectors with precision akin to two-stage detectors, achieving high efficiency and precision with reduced risk of overfitting72. YOLO-NAS stands out in particular for its high mAP scores on benchmark datasets like COCO and its lower latency compared to counterparts such as YOLOv7 and YOLOv8.

Continuing this exploration of model performance, the YOLOv7 and YOLOv8 models have demonstrated robustness for edge computing applications in agricultural contexts, especially in remote areas with limited internet connectivity. Our findings demonstrated the high accuracy, precision, and recall achieved by these models, proving them reliable tools for rapid and effective CB disease management. Their ability to function independently on local devices empowers farmers to conduct immediate on-site diagnostics. This aligns with findings44 that emphasized the proficiency of the YOLOv7 model in tea leaf disease detection; that research, using key metrics such as precision and recall, recorded values exceeding 90%, reinforcing the effectiveness of YOLOv7 in accurately identifying plant diseases and mirroring our own observations with both YOLOv7 and YOLOv8.

In contrast, while YOLO-NAS exhibits good precision and generalization, its current complexity introduces challenges for offline use in field conditions. This primarily stems from the lack of readily available tools to efficiently convert the model into a lightweight format suitable for mobile devices, rendering YOLO-NAS less viable for immediate field use without cloud support. Nevertheless, ongoing advancements in YOLO models, particularly the online proficiency of YOLO-NAS, paint a promising future. Researchers aim to combine the accuracy of YOLO-NAS with the offline capabilities of YOLOv7 and YOLOv8, expanding the accessibility of AI-powered disease detection tools to agricultural professionals worldwide, irrespective of their internet connectivity. This holds the potential to transform disease management strategies in agriculture, offering a seamless blend of precision and accessibility.
