A Multi-kernel CNN model with attention mechanism for classification of citrus plants diseases

Over the past few decades, automated methods for identifying and categorizing multiple diseases affecting plant leaves have been developed using a wide range of traditional machine learning models. More recently, deep learning methods have proved highly advantageous for the efficient detection and classification of multiple diseases in plant leaves, offering reduced processing time and heightened accuracy. These deep learning approaches excel in extracting optimal feature representations from extensive image datasets, significantly expanding the scope of computer vision within the precision agriculture domain.

Vijayaganth and Krishnamoorthi⁹ introduced a novel ensemble learning model for the multi-disease classification of plant leaves that combines soft computing and deep learning techniques. The ensemble architecture improves classification performance by applying enhanced Oppositional Adaptive Galactic Swarm Optimization (OA-GSO) to fine-tune the parameters of a Convolutional Neural Network (CNN), a Support Vector Machine (SVM), and a Deep Neural Network (DNN). The experimental analysis reveals that the accuracy of the OA-GSO-Ensemble surpasses that of traditional models, with a notable superiority of 6.66% over Decision Trees (DT), 4.34% over SVM, 6.66% over Neural Networks (NN), and 9.09% over the Ensemble model. Additionally, it outperforms the Incremental Deep Neural Network (IDNN) by 5.49%, particularly at a learning percentage of 35%. The high accuracy achieved demonstrates the effectiveness of the proposed model compared to established models in the field.

Srinivasa et al.¹⁰ employed a Convolutional Neural Network (CNN) model to extract features from citrus fruit images and classify the fruits as either healthy or diseased. The final prediction based on the extracted features is obtained using the Random Forest algorithm. This user-friendly system enables prompt disease diagnosis and provides reliable management plans, allowing farmers to upload images of their citrus fruits through web-based or mobile platforms. The system’s design focuses on learning patterns associated with healthy and diseased fruits to improve the model’s performance. Additionally, it incorporates an automated alert system and management plan generation, prompting farmers to take timely measures to control and prevent disease. The primary goal of the system is to offer an efficient and user-friendly tool for diagnosing and treating diseases, leading to improved crop yields and profitability for citrus growers.

Dhimam et al.¹¹ presented an efficient model for predicting diseases in citrus fruits. The model combines CNN and LSTM deep learning architectures with edge computing. It features a refined feature-extraction mechanism, including sub-sampling and feature fusion on edge devices, to enable effective disease recognition while maintaining high accuracy. The study uses 2950 images of citrus fruits from the Plant Village datasets and Kaggle, with disease categories such as scabs, melanosis, black spots, cankers, and greening. Performance metrics including support, recall, precision, and F-measure are evaluated with and without pruning on both the existing model and the proposed model. The experimental analysis is conducted in two phases: first using Magnitude-Based Pruning and then using Magnitude-Based Pruning with Post Quantization. The CNN-LSTM model achieves an accuracy of 97.18% with Magnitude-Based Pruning and 98.25% with Magnitude-Based Pruning with Post Quantization, outperforming the existing CNN method.
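
The two compression steps named above can be illustrated with a minimal NumPy sketch. It is not the cited authors' implementation: the function names, the symmetric 8-bit quantization scheme, the random weight matrix, and the 60% sparsity level are all assumptions chosen for illustration.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero the `sparsity` fraction of weights with the smallest magnitude."""
    magnitudes = np.abs(weights).ravel()
    n_remove = int(round(sparsity * magnitudes.size))
    if n_remove == 0:
        return weights.copy()
    # The n_remove-th smallest magnitude serves as the pruning threshold.
    threshold = np.partition(magnitudes, n_remove - 1)[n_remove - 1]
    return np.where(np.abs(weights) > threshold, weights, 0.0).astype(weights.dtype)

def quantize_int8(weights):
    """Symmetric post-training quantization to int8 codes plus one scale factor."""
    max_abs = float(np.max(np.abs(weights)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return codes, scale

def dequantize(codes, scale):
    """Map int8 codes back to approximate float32 weights."""
    return codes.astype(np.float32) * scale

# Hypothetical layer weights standing in for a trained model.
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)

w_pruned = magnitude_prune(w, sparsity=0.6)
codes, scale = quantize_int8(w_pruned)
w_restored = dequantize(codes, scale)

print("zero fraction:", float(np.mean(w_pruned == 0)))
print("max abs error after quantization:", float(np.max(np.abs(w_restored - w_pruned))))
```

Pruned weights stay exactly zero after quantization, so the two steps compose: pruning makes the weight tensor sparse, and quantization then stores each remaining weight in one byte instead of four.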

The healthy growth of plant production can be promoted by curbing the spread of infection in its early stages through accurate classification and detection of plant leaf diseases. Hosny et al.¹² developed a novel lightweight deep Convolutional Neural Network (CNN) model to extract high-level hidden feature representations. These deep features are combined with traditional handcrafted Local Binary Pattern (LBP) features to capture local texture information in plant leaf images. The proposed model is trained and tested on three publicly available datasets representing Apple Leaf, Tomato Leaf, and Grape Leaf. The achieved validation accuracies are 99%, 96.6%, and 98.5% for the respective datasets, while the test accuracies are 98.8%, 96.5%, and 98.3%. The experimental results indicate that the proposed approach offers an efficient solution for plant disease control.
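
As a rough illustration of combining handcrafted LBP features with deep features, the sketch below computes a basic 3 × 3 LBP histogram and concatenates it with a feature vector. The neighbour ordering, the 256-bin histogram, and the random vector standing in for CNN features are assumptions; the cited work may use a different LBP variant and fusion scheme.

```python
import numpy as np

def lbp_histogram(gray):
    """Basic 3x3 LBP codes for interior pixels, as a normalized 256-bin histogram."""
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    codes = np.zeros(center.shape, dtype=np.uint8)
    # Eight neighbours, clockwise starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the center.
        codes += (neighbour >= center).astype(np.uint8) * np.uint8(1 << bit)
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()

rng = np.random.default_rng(0)
gray = rng.integers(0, 256, size=(32, 32)).astype(np.uint8)  # hypothetical leaf patch
hist = lbp_histogram(gray)

deep_features = rng.normal(size=128)          # placeholder for CNN features
fused = np.concatenate([deep_features, hist])  # simple concatenation fusion
print(fused.shape)
```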

Faisal et al.¹³ introduced an automated system for early detection and classification of citrus plant diseases using a deep learning (DL) model, which not only boosts accuracy but also reduces computational complexity. By leveraging the latest transfer learning techniques on the Citrus Plant Dataset, the study aimed to enhance the accuracy of the classification process. The research evaluated pre-trained Convolutional Neural Network (CNN) models (EfficientNetB3, ResNet50, MobileNetV2, and InceptionV3) for identifying and categorizing citrus plant diseases. Performance evaluation revealed that the transferred EfficientNetB3 model achieved the highest training, validation, and testing accuracies at 99.43%, 99.48%, and 99.58%, respectively. The developed CNN model surpasses other state-of-the-art CNN architectures for citrus plant disease identification and categorization.

Rao et al.¹⁴ introduced a new approach using automated segmentation and classification models for classifying plant diseases. The Solution Index-based Jaya-Krill Herd Optimization (SI-JKHO) technique is incorporated to improve the adaptive leaf abnormality segmentation. The gray-level co-occurrence matrix and a hybrid of local binary patterns with local gradient patterns are used to extract color and shape features. The extracted features undergo feature selection using principal component analysis, and the selected features are then used in the “tuned long short-term memory with recurrent neural network (T-LSRNN)” model. The results show that the enhanced-JKHO-LSRNN method outperforms other models, achieving performance gains of 6.66%, 6.66%, 7.86%, and 4.34% over RNN, long short-term memory, convolutional neural networks, and LSRNN, respectively, at the 35th learning percentage in dataset 3. Performance metrics such as false positive rate, false negative rate, specificity, sensitivity, accuracy, and precision show that the developed model detects plant diseases more accurately than conventional models.

Monitoring of plant leaves is often characterized as error-prone, uncertain, and unreliable. Consequently, various deep learning algorithms have been developed to address this challenge, with many relying on convolutional neural networks (CNNs) designed for restricted-resolution images. Chillakuru et al.¹⁵ presented a plant leaf disease classification (PLDC) structure with an Optimized Nine-Layer (ONL)-CNN model to ensure precise classification results. The abnormality segmentation process involves segmenting leaf images using the “Adaptive Fuzzy C-Means” (A-FCM) technique. The optimized nine-layer CNN is then used to classify plant leaf diseases in the segmented images. In both the FCM and ONL-CNN stages, parameters are optimized to enhance the PLDC model’s performance through a hybrid heuristic optimization algorithm incorporating hybrid leader cat swarm optimization (HLCSO). Exploratory analysis comparing different baseline methods with the developed PLDC model demonstrates its effectiveness, with a noteworthy accuracy rate of 96%. Consequently, the developed model ensures effective performance in terms of accuracy and also aids in the early diagnosis of leaf diseases.
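
For readers unfamiliar with fuzzy clustering, the following is a minimal sketch of standard fuzzy c-means, the base algorithm behind the segmentation step. It does not reproduce the adaptive (A-FCM) variant or the HLCSO parameter tuning of the cited work; the fuzzifier m = 2, the iteration count, and the synthetic pixel intensities are assumptions.

```python
import numpy as np

def fuzzy_c_means(x, n_clusters=2, m=2.0, n_iter=50, seed=0):
    """Standard fuzzy c-means on samples x of shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    u = rng.random((n_clusters, x.shape[0]))
    u /= u.sum(axis=0, keepdims=True)           # memberships sum to 1 per sample
    for _ in range(n_iter):
        um = u ** m
        # Cluster centers are membership-weighted means of the samples.
        centers = (um @ x) / um.sum(axis=1, keepdims=True)
        # Distance from every center to every sample, floored to avoid division by zero.
        dist = np.linalg.norm(x[None, :, :] - centers[:, None, :], axis=2)
        dist = np.maximum(dist, 1e-12)
        # Membership update: inversely related to distance, normalized per sample.
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=0, keepdims=True)
    return centers, u

# Hypothetical pixel intensities: dark lesion pixels and bright healthy pixels.
rng = np.random.default_rng(1)
pixels = np.concatenate([rng.normal(0.2, 0.02, 100), rng.normal(0.8, 0.02, 100)])
centers, memberships = fuzzy_c_means(pixels[:, None], n_clusters=2)
labels = memberships.argmax(axis=0)             # hard segmentation from soft memberships
print(np.sort(centers.ravel()))
```

Unlike k-means, each pixel keeps a degree of membership in every cluster, which is useful near lesion boundaries where the assignment is ambiguous.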

Elfatimi et al.¹⁶ introduced an advanced method for classifying bean leaf diseases, incorporating innovative network optimization and hyperparameter approaches. The study explored various architectures individually to determine their effectiveness and identify the optimal configuration for accurately categorizing bean leaf diseases. Among the architectures assessed, MobileNetV2 emerged as the preferred algorithm due to its faster training speed, simplicity in retraining, and higher accuracy in distinguishing between healthy and unhealthy bean plant leaves.

Dhimam et al.¹¹,¹⁷ presented a precise model for identifying fruit diseases by fusing context data with Faster-CNN in an edge computing environment. The aim is to create an accurate, efficient, and reliable fruit disease detection model, crucial for autonomous food production in a robotic edge platform. The research focuses on four Citrus fruit diseases using CNN models for edge computing solutions. Disease identification for cankers, black spots, greening, scab, melanosis, and healthy citrus fruits is achieved using the proposed sequential model, with and without pruning at various sparsity levels, followed by post quantization. Transfer learning optimizes the model for fruit disease detection using visual inputs from Near-infrared (NIFR) and RGB patterns. Early and late data fusion techniques are evaluated for integrating multi-modal (NIFR and RGB) information. The proposed model achieves high accuracies for various diseases: canker 97%, scab 95%, melanosis 99%, greening 97%, black spot 97%, and healthy 97%. The study compares the proposed model’s performance with different sparsity levels (50–80%, 60–90%, 70–90%, and 80–90%) of pruning and post-quantization. Results show that pruning at 60–90% sparsity can reduce the model size to 47.64% of the baseline model without significant loss of accuracy. Post-quantization further reduces the size from 28.16% to 8.72%. Additionally, this approach is faster to implement for new fruit diseases as it requires bounding box annotation instead of pixel-level annotation.
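
The difference between early and late fusion can be shown in a few lines. The sketch below uses the common textbook meaning of the two terms (stacking inputs versus blending per-modality predictions); the function names, the equal weighting, and the example probabilities are assumptions and are not taken from the cited study.

```python
import numpy as np

def early_fusion(rgb, nir):
    """Stack an RGB image (H, W, 3) and a near-infrared image (H, W) into (H, W, 4)."""
    return np.concatenate([rgb, nir[..., None]], axis=-1)

def late_fusion(probs_rgb, probs_nir, weight_rgb=0.5):
    """Blend class probabilities predicted separately from each modality."""
    fused = weight_rgb * probs_rgb + (1.0 - weight_rgb) * probs_nir
    return fused / fused.sum()

# Early fusion: one network sees all four channels at once.
rgb = np.zeros((4, 4, 3))
nir = np.ones((4, 4))
print(early_fusion(rgb, nir).shape)

# Late fusion: two networks each predict, and the predictions are combined.
probs = late_fusion(np.array([0.7, 0.2, 0.1]), np.array([0.1, 0.8, 0.1]))
print(probs)
```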

Khamparia et al.¹⁸ developed a hybrid approach for crop leaf disease detection using a combination of CNN and autoencoders. This novel method involves utilizing convolutional encoder networks with leaf images to detect diseases in crops. The developed approach was tested on a dataset consisting of over 900 images, with 300 images serving as the testing set and 600 as the training set. The model was designed to identify five types of crop diseases across three different crops. Training the developed network involved differentiating crop diseases based on leaf images, and the architecture demonstrated varying accuracy levels for different convolution filter sizes and numbers of epochs.

Most plant diseases manifest in leaves, but precise identification requires specialized knowledge, posing a challenge for farmers. Hence, there is a critical demand for timely and accurate automatic recognition of plant diseases in agriculture. Sheng et al.¹⁹ introduced an innovative transformer block to model long-range features using the transformer architecture. Additionally, soft split token embedding is used to acquire local information from surrounding pixels and patches. The inception architecture and cross-channel feature learning are also incorporated to enhance information richness, which is particularly beneficial for fine-grained feature learning. The proposed model outperforms previous convolution and vision transformer-based models, achieving remarkable accuracy on standard benchmark datasets. The experimental results highlight the superiority of the proposed model over existing ones in plant disease identification.

Abisha and Bharathi²⁰ proposed a method to predict plant diseases using a hybrid of the Xception architecture and a random forest classifier. The transfer learning models employed for feature extraction are VGG 16, RESNET, Mobile Net, InceptionResNetV2, Inception, and Xception. The developed hybrid model achieved a reported accuracy of 100% in the classification process. The overall implementation cost and the number of parameters are reduced using the depth-wise separable convolution layers of the Xception model. By extracting features through the Xception model and subsequently employing the random forest for classification, the method integrates deep learning and machine learning and achieves a notable improvement in accuracy.
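
The parameter saving from depth-wise separable convolution follows from simple counting, sketched below. Biases are ignored, and the layer sizes (3 × 3 kernel, 128 input channels, 256 output channels) are illustrative assumptions rather than values from the cited model.

```python
def conv_params(kernel, c_in, c_out):
    """Weights in a standard convolution: every output channel sees every input channel."""
    return kernel * kernel * c_in * c_out

def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depth-wise separable convolution."""
    depthwise = kernel * kernel * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out             # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

standard = conv_params(3, 128, 256)
separable = separable_conv_params(3, 128, 256)
ratio = separable / standard             # equals 1/c_out + 1/kernel**2

print(standard, separable, round(ratio, 4))
```

For these sizes the separable layer needs roughly 11.5% of the weights of the standard layer, and the ratio approaches 1/k² as the number of output channels grows.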

Deqin et al.²¹ introduced a method called Texture Reconstruction Loss CycleGAN (TRL-GAN) to enhance the accuracy and generalization of a Citrus Greening Disease recognition algorithm. The TRL-GAN generates realistic diseased leaf data of citrus to increase the diversity of samples. Experimental results on a dataset of 4516 images show that TRL-GAN improves FID by 13.49% and KID by 1.1% compared to the original CycleGAN. The fake data generated by TRL-GAN are deemed similar to real data by experts. T-SNE analysis shows that the real data and generated fake data have similar distributions. The ResNeXt101 classification network achieves 97.45% accuracy, and RCG TRL-GAN improves the average accuracy by 2.76% in recognizing citrus greening disease. This study demonstrates that RCG TRL-GAN effectively enhances the generation and recognition of citrus greening disease phenotype data, providing a method reference for expanding and recognizing complex plant disease phenotype images.

Satya et al.²² implemented a Zero-shot Transfer Learning approach to address the challenge of limited data availability in the target domain for leaf disease classification. By combining advanced CNN models with techniques such as data augmentation, synthetic data generation, and robust discriminative losses, a reliable method is established for transferring knowledge from a well-populated source domain to a less abundant target domain. Through extensive experimentation on plant disease classification datasets, the effectiveness of each model and technique is validated within the developed framework. Experimental results demonstrate significant improvements in classification accuracy, highlighting the effectiveness of the Zero-shot Transfer Learning framework and its potential to enhance disease classification in agricultural contexts.

Attention is crucial in human perception, as our visual system can selectively focus on specific parts of a scene. Inspired by this, researchers have introduced attention mechanisms in deep learning to improve the performance of convolutional neural networks in tasks such as image classification and segmentation. Despite their proficiency in capturing local feature details, CNNs struggle to acquire global feature representations. To address this issue, Siqi et al.²³ proposed a novel model that integrates CNN and Transformer architectures, influenced by Visual Attention Net (VAN) and CoAtNet. The developed model introduces Large Kernel Attention (LKA) to mitigate the limitations of CNNs in capturing global features, and incorporates Transformer blocks that use a self-attention variant with relative position to alleviate detail degradation in local Transformer features. While previous studies have primarily focused on enhancing model structures and fusion techniques, this model centers on refining the attention mechanism, combining self-attention and large kernel attention with a minimal number of parameters to achieve outstanding recognition accuracy. Some research on attention mechanisms indicates that using larger kernels, such as 5 × 5, 7 × 7, and 9 × 9, can lead to improved accuracy. However, if the kernel size continues to expand or matches the input resolution, the network becomes more complex with more parameters, like a fully-connected network, resulting in inferior performance. Therefore, networks relying solely on single or oversized kernels are not optimal solutions. Motivated by these principles and observations, Gaihua et al.²⁴ introduced the Hybrid Kernel Mask R-CNN (HKMask) to address scale variability. The proposed hybrid kernel module comprises kernels of various sizes and depths, each assigned to different groups. This model incorporates an enhanced channel attention module, inspired by Squeeze-and-Excitation Networks, which preserves critical information using shortcut connections.
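
A Squeeze-and-Excitation style channel attention block with a shortcut connection can be sketched as follows. The squeeze, excitation, and rescaling steps follow the general Squeeze-and-Excitation design; the way the shortcut is added (input plus reweighted input), the reduction ratio, and the random weights are assumptions and do not reproduce the enhanced module of the cited work.

```python
import numpy as np

def se_channel_attention(x, w1, w2, shortcut=True):
    """x: feature map (C, H, W); w1: (C // r, C); w2: (C, C // r)."""
    squeeze = x.mean(axis=(1, 2))                   # global average pool, shape (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)          # bottleneck layer with ReLU
    scale = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # sigmoid gate per channel
    reweighted = x * scale[:, None, None]
    # Assumed shortcut form: keep the original features alongside the reweighted ones.
    return x + reweighted if shortcut else reweighted

rng = np.random.default_rng(0)
channels, reduction = 8, 4
features = rng.normal(size=(channels, 6, 6))        # hypothetical feature map
w1 = rng.normal(size=(channels // reduction, channels)) * 0.1
w2 = rng.normal(size=(channels, channels // reduction)) * 0.1
print(se_channel_attention(features, w1, w2).shape)
```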

Recently, there have been several attempts to incorporate attention processing to enhance the performance of CNNs in large-scale classification tasks. Woo et al.²⁵ developed the Convolutional Block Attention Module (CBAM), an attention mechanism designed for feed-forward convolutional neural networks. CBAM operates on intermediate feature maps, generating attention maps along two dimensions, channel and spatial. These maps are then multiplied with the input feature map to refine features adaptively. CBAM is lightweight and versatile, integrating into any CNN architecture with minimal additional computational resources. Moreover, CBAM is trainable end-to-end along with base CNNs.
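
The sequential channel-then-spatial refinement described above can be sketched in NumPy. This is a simplified forward pass for a single feature map, assuming the commonly described CBAM layout (a shared MLP over average- and max-pooled descriptors, then a convolution over channel-pooled maps); the random weights and the plain Python convolution loop are for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """Shared two-layer MLP applied to average- and max-pooled channel descriptors."""
    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)
    avg = x.mean(axis=(1, 2))
    mx = x.max(axis=(1, 2))
    return sigmoid(mlp(avg) + mlp(mx))                    # shape (C,)

def spatial_attention(x, kernel):
    """Convolve channel-wise [avg, max] maps with a (2, k, k) kernel, zero padded."""
    pooled = np.stack([x.mean(axis=0), x.max(axis=0)])    # shape (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1:]
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)                                   # shape (H, W)

def cbam(x, w1, w2, kernel):
    x = x * channel_attention(x, w1, w2)[:, None, None]   # channel refinement first
    return x * spatial_attention(x, kernel)[None, :, :]   # then spatial refinement

rng = np.random.default_rng(0)
feature_map = rng.normal(size=(8, 10, 10))                # hypothetical (C, H, W) input
w1 = rng.normal(size=(2, 8)) * 0.1
w2 = rng.normal(size=(8, 2)) * 0.1
kernel = rng.normal(size=(2, 7, 7)) * 0.1
print(cbam(feature_map, w1, w2, kernel).shape)
```

The output has the same shape as the input, which is what allows such a module to be placed between existing convolutional blocks.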

The ECA-Net proposed by Wang et al.²⁶ utilizes a convolutional kernel that dynamically adjusts its size to effectively capture inter-channel relationships. The success of such networks in analyzing natural images is largely due to attention mechanisms. However, when the spatial and channel attention modules are applied sequentially, there is a risk of degrading crucial features in plant images, resulting in subpar performance. Additionally, using a fixed-size kernel hinders the model’s ability to capture features of varying scales and characteristics. The Residual Attention Network incorporates an encoder-decoder style attention module. This module refines feature maps, leading to strong performance and robustness against noisy inputs. Instead of directly computing a 3D attention map, we decompose the process to learn channel attention and spatial attention independently. This approach significantly reduces computational and parameter overhead, making it convenient as a plug-and-play module for existing CNN architectures.
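
The idea of a kernel size that adapts to the channel count can be sketched as below. The mapping from channels to kernel size and the values gamma = 2 and b = 1 follow the rule as it is commonly given for ECA-Net; they should be treated as assumptions here and checked against the original paper. The 1D kernel weights are random placeholders.

```python
import math
import numpy as np

def eca_kernel_size(channels, gamma=2, b=1):
    """Nearest odd kernel size derived from log2 of the channel count."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 else t + 1

def eca_attention(x, kernel_1d):
    """x: (C, H, W); kernel_1d: (k,) weights slid across the channel descriptor."""
    descriptor = x.mean(axis=(1, 2))                 # global average pool, shape (C,)
    k = kernel_1d.size
    padded = np.pad(descriptor, k // 2)              # zero padding keeps length C
    logits = np.array([np.dot(padded[i:i + k], kernel_1d)
                       for i in range(descriptor.size)])
    weights = 1.0 / (1.0 + np.exp(-logits))          # sigmoid gate per channel
    return x * weights[:, None, None]

rng = np.random.default_rng(0)
channels = 64
k = eca_kernel_size(channels)
features = rng.normal(size=(channels, 4, 4))         # hypothetical feature map
refined = eca_attention(features, rng.normal(size=k))
print(k, refined.shape)
```

Each channel weight depends only on its k neighbouring channels, so the module adds k parameters rather than the two fully connected layers of a Squeeze-and-Excitation block.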

Li et al.²⁷ developed a multibranch CNN with attention (MBCNNA) for plant species recognition. MBCNNA comprises 12 convolutional layers, 4 max pooling layers, and 2 fully connected layers. The initial part of MBCNNA includes an attention block to diminish background influence, while the latter part is a multibranch CNN that extracts multilevel features through multiple channels. Similarly, Zhu et al.²⁸ proposed a plant species recognition method using a two-way attention model based on DCNN. The first attention mechanism aims to identify plant species, while the second focuses on discriminative features by identifying the maximum-sum part of the fully convolutional network heat map.

Attention mechanisms have been extensively used in conjunction with Convolutional Neural Networks (CNNs) to capture spatial and channel-wise information. However, existing methods typically focus on using attention modules to capture local interactions at a single scale. To address this, Bakr et al.²⁹ introduced channel-wise attention across multiple feature scales, demonstrating its ability to outperform limited local and single-scale attention modules empirically. The proposed method, called EMCA, is lightweight and can efficiently incorporate global context. It can easily integrate into any feed-forward CNN architecture and can be trained end-to-end. The novel architecture introduced is validated through extensive experiments on tasks such as image classification, object detection, and instance segmentation, using various backbone networks. The experiments consistently show performance improvements compared to existing methods. Specifically, the EMCA module surpasses other channel attention techniques in terms of accuracy and latency trade-off.

Current research in deep learning has made significant strides in processing images of citrus plants, achieving good results. However, existing studies on citrus plant classification often overlook the role of attention mechanisms in plant image classification. They also fail to capture the complex relationships across channels and spatial dimensions. Previous work on attention mechanisms typically focuses on either spatial or channel attention, or compromises certain features during attention computation. For instance, sequential execution of spatial and channel attention modules can destroy important features, such as patchy consolidations, in plant images, leading to suboptimal results. Additionally, present methods only consider relationships with single-sized kernels for channel or spatial relations, without integrating relationships across multiple kernel sizes. In this work, we address the problem of plant disease diagnosis using deep learning analysis of citrus images. We propose a Multi Kernel-size CNN (MKCNN) with attention mechanism to effectively identify leaf disease in plants while maintaining computational efficiency and performance.
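
The general idea of combining several kernel sizes can be illustrated with parallel convolution branches whose outputs are stacked. This is a generic sketch, not the MKCNN architecture proposed in this work: the kernel sizes (3, 5, 7), the averaging kernels, the single-channel input, and the function names are all assumptions.

```python
import numpy as np

def conv2d_same(image, kernel):
    """Single-channel 2D convolution (cross-correlation) with zero padding."""
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(image, pad)
    h, w = image.shape
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + k, j:j + k] * kernel)
    return out

def multi_kernel_features(image, kernels):
    """Run one branch per kernel size, apply ReLU, and stack the branch outputs."""
    return np.stack([np.maximum(conv2d_same(image, k), 0.0) for k in kernels])

rng = np.random.default_rng(0)
image = rng.random((16, 16))                              # hypothetical grayscale patch
kernels = [np.full((s, s), 1.0 / (s * s)) for s in (3, 5, 7)]
features = multi_kernel_features(image, kernels)
print(features.shape)
```

Small kernels respond to fine lesion texture while larger kernels summarize wider context, and stacking the branches lets later layers, or an attention module, weigh the scales against each other.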

Problem statement and objective of the present work

Conventional approaches frequently use image models that have been trained on large datasets, but their usefulness is limited because of their singular backgrounds. Techniques such as pathogen identification or field observations can be subjective, require specific equipment and training, and have poor accuracy and efficiency. The efficacy of classical machine learning techniques in real-world applications is limited since they are inappropriate for complex image data and multicategory classifications. Although handcrafted feature extraction methods are frequently employed, their recognition accuracy may drastically decline in natural citrus forests. The main objectives of this work are as follows:

To enable prompt action and stop extensive losses in citrus fruit crop output and quality.
To use technology advancements to create automated systems for identifying and categorizing various citrus diseases.
To improve disease classification accuracy through the use of techniques such as deep learning, which have proven to be more successful than conventional image processing methods.
To develop systems that are more useful for real-world applications by providing high accuracy while lowering computational complexity.
To support the economic stability of nations that depend on citriculture and increase fruit production.
To offer thorough evaluations of different methods for identifying and categorizing illnesses in the leaves of citrus plants.
