Transfer learning-enhanced CNN model for integrative ultrasound and biomarker-based diagnosis of polycystic ovarian disease



Polycystic Ovarian Disease (PCOD) remains one of the most common endocrine disorders among women of reproductive age, presenting notable diagnostic complexities due to similarities in clinical symptoms8 and the presence of nuanced morphological differences in ovarian structures identifiable via ultrasound imaging. The advent of artificial intelligence (AI) and deep learning has brought transformative potential to the domain of medical imaging, including gynecological diagnostics. In recent years, several studies have explored AI-powered approaches for PCOD detection, leveraging ultrasound images and multimodal datasets to enhance diagnostic precision and reduce inter-observer variability.

Divekar et al.9 pioneered the use of transfer learning-based deep convolutional neural networks, specifically InceptionV3, for automatic classification of PCOD using ultrasound images. Their model achieved a diagnostic accuracy of over 90%, reinforcing the role of AI in overcoming the limitations of manual interpretation. Complementing this, Cahyono et al. integrated the SMOTE technique to address class imbalance and implemented lightweight CNN models for PCOS prediction, resulting in enhanced classification performance. This approach highlights the importance of class rebalancing through synthetic oversampling in medical AI systems, where pathological classes are often under-represented.

In an extensive review, Di Michele et al.10 traced the evolution of ultrasound as the primary diagnostic tool for PCOD and emphasized the growing influence of AI and 3D imaging in enhancing diagnostic reproducibility and reducing operator bias. Echoing this, Yadav et al. proposed a novel MOT-SF segmentation technique for capturing PCO morphology in ultrasound images, significantly improving detection accuracy of cystic features that might otherwise escape manual scrutiny.

Optimizing AI models for multi-modal data remains a critical focus area. A 2024 study proposed combining clinical features11 and ultrasound images using a hybrid of VGG16 and support vector machine (SVM) classifiers, outperforming conventional models in PCOD classification tasks. In a parallel direction, Lv et al. ventured into an unconventional diagnostic avenue by utilizing scleral images processed via deep learning models to detect PCOD, achieving notable accuracy and suggesting potential alternative diagnostic markers beyond ovarian imaging.

Further enriching this field, Banerjee and Monir12 demonstrated the versatility of EfficientNet architectures by adapting several variants for breast cancer classification from ultrasound images, achieving high diagnostic performance and computational efficiency. This research suggests that EfficientNet models can be adapted to other ultrasound applications, including gynecological imaging for PCOD, where high-resolution detail and computational economy are essential.

A growing body of work has investigated the role of Bayesian optimization for hyperparameter tuning in AI-driven medical applications. Lee et al.13 introduced the BOSS framework, effectively combining Bayesian optimization with self-distillation, enhancing model robustness in various image classification tasks. Tran et al. applied Bayesian optimization to optimize 2D echocardiography segmentation tasks, improving both computational efficiency and diagnostic precision. Onorato (2024) further confirmed the value of Bayesian tuning in neural network-based image classifiers, demonstrating consistent improvements in predictive performance.

Notably, Wang et al.14 applied Bayesian optimization in high-dimensional settings, advancing automated radiation therapy treatment planning, while Chi et al. tailored Bayesian optimization for multi-task hyperparameter tuning in SVM-based pulmonary nodule classifiers, improving diagnostic throughput in complex imaging scenarios. Liao et al. further enhanced hyperparameter selection in dimension reduction techniques, a crucial task in high-dimensional medical image datasets, improving interpretability and visualization.

To address the challenge of small and imbalanced datasets in medical imaging, Sun15 developed MCMC-driven Bayesian neural networks that achieved high-precision classification in limited-sample environments, offering potential solutions for gynecological imaging where annotated data remains scarce. Sitheswaran and Malarkhodi also contributed by designing an automated follicle detection system for ultrasound-based PCOD diagnosis, improving accuracy and clinical workflow efficiency.

Nazarudin et al.16 presented a hybrid segmentation approach for enhanced monitoring of PCOD via ultrasound imaging, increasing segmentation accuracy in complex ovarian structures. Meanwhile, Kaur et al. employed transfer learning for PCOS detection, demonstrating the adaptability of pre-trained CNNs in reproductive health diagnostics. Similarly, Vasavi et al. applied machine learning algorithms to monitor PCOS progression, emphasizing the importance of AI in continuous patient care.

Suha and Islam17 extended machine learning techniques for PCOD detection, improving diagnostic reliability through enhanced feature extraction pipelines. Poorani and Khilar evaluated various machine learning classifiers for ovarian ultrasound image analysis, providing valuable insights into model-specific performance and limitations in clinical deployment.

Collectively, these studies underscore the growing consensus around AI’s indispensable role in PCOD diagnostics, particularly when addressing limitations like small datasets, operator dependence, and complex image characteristics. The consistent success of EfficientNet architectures, transfer learning frameworks, and Bayesian optimization strategies across these studies positions them as promising candidates for integration into future AI-driven diagnostic systems for PCOD.

However, existing research often remains confined to either image-based or clinical data alone. There is a conspicuous gap in multi-modal integration frameworks combining imaging and biochemical markers, which could significantly enhance diagnostic reliability. Furthermore, most studies rely on limited or imbalanced datasets, underscoring the need for large, diverse, and clinically representative datasets for robust AI model training. Additionally, while model accuracy remains the predominant evaluation metric, future work must also rigorously assess models against sensitivity, specificity, interpretability, and computational efficiency, all of which are vital for real-world clinical adoption.
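To make the point about evaluation metrics concrete, the following minimal sketch derives sensitivity and specificity from binary predictions and contrasts them with plain accuracy. The function name and the toy labels are illustrative assumptions and are not drawn from any of the reviewed studies.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, with 1 = PCOD and 0 = normal."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Sensitivity: fraction of true PCOD cases the model detects.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of normal cases the model correctly clears.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical labels for ten scans (four PCOD, six normal).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

sens, spec = sensitivity_specificity(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy={accuracy:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this toy case the three numbers differ, which is why a single accuracy figure can hide a missed-diagnosis rate that matters clinically.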

Our proposed work aims to address these limitations by implementing an Enhanced \(\mathrm {EfficientNet\text {-}B3}\) architecture18 with Bayesian hyperparameter optimization, trained on a balanced, augmented ovarian ultrasound dataset. By building on these foundational studies and integrating robust optimization, we endeavor to contribute a scalable, accurate, and clinically deployable AI system for early, non-invasive PCOD diagnosis.

Polycystic Ovarian Disease (PCOD) represents a complex endocrinological disorder characterized by heterogeneous clinical manifestations and subtle ovarian morphological variations. The diagnostic reliance on ultrasound imaging presents challenges including operator dependency and subjective interpretation. Recent advancements in artificial intelligence (AI), particularly deep learning (DL), have emerged as transformative in medical imaging diagnostics, offering objective, scalable, and precise detection tools.

A significant body of work has concentrated on applying deep convolutional neural networks (CNNs) and transfer learning techniques for PCOD diagnosis using ultrasound images. Divekar et al.9 utilized the InceptionV3 CNN architecture, leveraging pre-trained weights to classify PCOD with an accuracy exceeding 90%, effectively addressing limited dataset sizes common in medical imaging. Similarly, Rajan and Singh19 employed transfer learning with ResNet50 combined with comprehensive data augmentation techniques, thereby improving model robustness and generalization to multi-center datasets. Extending this, Venkatesh et al. demonstrated that transfer learning could support multiclass classification tasks distinguishing PCOD, ovarian cysts, and normal ovaries, highlighting the method’s adaptability for complex diagnostic frameworks.

Complementing these, Gomez et al. and Banerjee and Monir12 successfully adapted EfficientNet variants for ovarian and breast ultrasound image classification, respectively. EfficientNet’s compound scaling of network depth, width, and resolution enables the capture of high-resolution features essential in identifying subtle ovarian morphological changes. The \(\mathrm {EfficientNet\text {-}B3}\) model, in particular, balances computational efficiency with accuracy, making it suitable for resource-limited clinical settings.
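The compound scaling rule mentioned above can be illustrated with a short sketch. It assumes the base coefficients reported in the original EfficientNet paper (alpha = 1.2 for depth, beta = 1.1 for width, gamma = 1.15 for resolution); the function name is hypothetical, and the released B1 to B7 variants use rounded multipliers, so the compound coefficient phi here is illustrative rather than an exact recipe for EfficientNet-B3.

```python
# Base coefficients from the EfficientNet paper's grid search on the B0 baseline (assumed here).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi):
    """Return (depth, width, resolution, approx. FLOPs) multipliers for compound coefficient phi."""
    depth = ALPHA ** phi        # more layers
    width = BETA ** phi         # more channels per layer
    resolution = GAMMA ** phi   # larger input images
    # FLOPs grow roughly with depth * width^2 * resolution^2, which is about 2^phi.
    flops = (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi
    return depth, width, resolution, flops


for phi in range(4):
    d, w, r, f = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, resolution x{r:.2f}, FLOPs x{f:.2f}")
```

Because all three dimensions grow together, each unit increase in phi roughly doubles the compute budget instead of inflating a single dimension.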

Accurate segmentation of ovarian follicles and cystic structures underpins effective PCOD diagnosis. Oliveira et al. introduced an attention-based U-Net segmentation framework, enhancing follicle delineation in ultrasound images and improving subsequent classification accuracy. Yadav et al.20 developed the MOT-SF segmentation technique to effectively capture polycystic ovary morphology, overcoming limitations posed by noisy ultrasound images. These segmentation advancements provide richer features for AI models, which in turn enhance diagnostic sensitivity and specificity.

The scarcity of large, balanced datasets remains a formidable obstacle. Cahyono et al.21 applied SMOTE-based synthetic oversampling to alleviate class imbalance, allowing lightweight CNNs to perform better on minority classes. Gupta et al. further addressed data scarcity by implementing semi-supervised learning, combining labeled and unlabeled ultrasound images to improve PCOD classification in limited-data contexts. Such approaches underscore the importance of data-centric strategies alongside model architecture in medical AI.
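The core idea behind SMOTE-style oversampling can be sketched in a few lines: each synthetic minority sample is placed on the line segment between a real minority sample and one of its nearest minority neighbours. The sketch below is a simplified illustration operating on small feature vectors; the function name, the toy data, and the choice of k are assumptions, and it is not the implementation used by Cahyono et al.

```python
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to the base sample.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Interpolate at a random position between the base sample and the chosen neighbour.
        lam = rng.random()
        synthetic.append([a + lam * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical two-feature minority class with four samples.
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.2], [0.9, 1.9]]
new_points = smote_sketch(minority, n_new=6)
print(new_points)
```

Since every synthetic point is a convex combination of two real samples, the new data stay inside the region already occupied by the minority class, which is also why oversampling cannot add information that the original samples lack.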

Optimizing model parameters is crucial for achieving reliable and generalized performance. Bayesian optimization has gained traction for hyperparameter tuning due to its efficiency and effectiveness. Lee et al.22 proposed the BOSS framework, combining Bayesian optimization with self-distillation, leading to improved generalization. Chen et al. demonstrated that Bayesian optimization can significantly enhance segmentation tasks in medical imaging by efficiently navigating complex hyperparameter spaces.
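As a rough illustration of how Bayesian optimization searches a hyperparameter space, the sketch below fits a Gaussian-process surrogate to the scores observed so far and picks the next trial by maximising expected improvement. Everything here is an assumption made for illustration: the one-dimensional search over log10 learning rate, the synthetic `objective` that stands in for a real train-and-validate run, the kernel length scale, and all function names. It is not the BOSS framework or the procedure of any cited study.

```python
import math
import numpy as np


def rbf_kernel(a, b, length_scale=0.5):
    """Squared-exponential kernel between two 1-D arrays of points."""
    diff = a.reshape(-1, 1) - b.reshape(1, -1)
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior(x_obs, y_obs, x_query, jitter=1e-6):
    """Posterior mean and standard deviation of a zero-mean, unit-variance GP."""
    k_inv = np.linalg.inv(rbf_kernel(x_obs, x_obs) + jitter * np.eye(len(x_obs)))
    k_star = rbf_kernel(x_obs, x_query)
    mean = k_star.T @ k_inv @ y_obs
    var = 1.0 - np.sum(k_star * (k_inv @ k_star), axis=0)
    return mean, np.sqrt(np.clip(var, 1e-12, None))


def expected_improvement(mean, std, best):
    """Expected improvement over the best score so far (maximisation)."""
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + np.array([math.erf(v / math.sqrt(2.0)) for v in z]))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf


def objective(log_lr):
    """Hypothetical validation score; a real run would train and evaluate a model."""
    return -(log_lr + 3.0) ** 2


def bayes_opt(n_init=3, n_iter=10, seed=0):
    rng = np.random.default_rng(seed)
    grid = np.linspace(-5.0, -1.0, 201)          # candidate log10 learning rates
    x_obs = rng.uniform(-5.0, -1.0, size=n_init)  # random initial trials
    y_obs = np.array([objective(x) for x in x_obs])
    for _ in range(n_iter):
        # Standardise scores so the unit-variance GP prior is reasonable.
        y_norm = (y_obs - y_obs.mean()) / (y_obs.std() + 1e-9)
        mean, std = gp_posterior(x_obs, y_norm, grid)
        x_next = grid[np.argmax(expected_improvement(mean, std, y_norm.max()))]
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, objective(x_next))
    best = np.argmax(y_obs)
    return x_obs[best], y_obs[best], len(y_obs)


best_log_lr, best_score, n_evals = bayes_opt()
print(f"best log10(lr)={best_log_lr:.2f}, score={best_score:.4f}, evaluations={n_evals}")
```

The appeal for medical imaging is that each evaluation of the objective is an expensive training run, so spending a little computation on the surrogate to choose trials carefully can be cheaper than grid or random search.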

Other studies have evaluated optimization methods in diverse clinical contexts. Tran et al.23 optimized echocardiography segmentation models using Bayesian methods, improving performance and training efficiency. Onorato (2024) confirmed Bayesian tuning’s superior performance in CNN hyperparameter search, while Chi et al. applied multi-task Bayesian optimization to SVM classifiers in pulmonary nodule detection, highlighting its applicability across different algorithms and medical domains.

Innovations in model architectures further enhance PCOD diagnostic capabilities. Kumar et al.24 proposed a hybrid CNN-LSTM model capturing spatial and temporal ultrasound features, addressing the progressive nature of follicular changes. Singh et al. employed capsule networks to maintain spatial hierarchies in ultrasound images, improving follicle abnormality detection beyond traditional CNNs.

Mahmood et al.25 explored reinforcement learning to dynamically select regions of interest in ultrasound images, enhancing efficiency and diagnostic accuracy. Rodriguez and Smith (2024) similarly integrated CNN-RNN frameworks for temporal analysis of follicular growth, paving the way for personalized PCOD management.

Explainability remains a pivotal concern for clinical AI adoption. Lee and Park26 integrated SHAP (SHapley Additive exPlanations) into PCOD classification models, providing clinicians with interpretable insights into AI decisions and fostering trust. Chen et al. combined explainability with ensemble deep learning, enabling more transparent PCOD diagnosis.

Lightweight and real-time deployable AI systems are critical for broader clinical impact. Kaur et al.27 designed mobile-optimized CNN architectures for real-time PCOD ultrasound screening, crucial for low-resource settings. Park et al. developed noise-resilient CNNs incorporating denoising autoencoders, addressing the common problem of poor-quality ultrasound images.

The integration of clinical biomarkers and imaging data has shown promise in improving PCOD diagnostic accuracy. Zhang et al.28 developed a multimodal fusion framework combining hormonal profiles and ultrasound images, facilitating personalized and precise diagnoses. Nair and Thomas (2024) incorporated 3D ultrasound imaging with CNNs, capturing volumetric ovarian features and surpassing traditional 2D methods.

Despite promising advances, many studies suffer from limitations including small or imbalanced datasets, limited diversity in populations, and narrow focus on accuracy metrics without comprehensive evaluation of sensitivity, specificity, or model uncertainty. Kim et al. (2024) addressed this by applying Bayesian deep learning to quantify prediction uncertainty, critical for risk assessment and clinical decision-making.

There remains a gap in fully integrated, multi-modal, interpretable AI frameworks that seamlessly combine imaging, clinical, and biochemical data for holistic PCOD diagnosis and management. Furthermore, computational complexity and lack of real-time capability hinder clinical deployment of many deep learning models.

Recent years have seen increasing application of deep learning models for medical image classification, particularly in radiology and gynecological imaging domains. Convolutional Neural Networks (CNNs), including architectures like ResNet, DenseNet, and EfficientNet, have demonstrated promising accuracy in medical diagnostic tasks. For instance, prior studies applying CNNs to ultrasound images for ovarian lesion detection reported competitive sensitivity but often suffered from limited dataset size and high inter-observer variability in annotations29.

Despite these advancements, most existing studies focus solely on image-based models, overlooking the potential benefits of integrating clinical biomarker data. A study by Zhou et al.30 proposed a deep learning framework for polycystic ovary syndrome (PCOS) classification using ultrasound images, but acknowledged that the absence of hormone data was a limitation. Similarly, transfer learning-based approaches improved performance on small datasets31, yet struggled with class imbalance and interpretability challenges.

A notable contradiction in the literature is the inconsistent emphasis on explainability tools. While Grad-CAM and saliency maps are frequently recommended, many PCOD-related imaging studies either omit visual explanations entirely or provide limited qualitative analysis, reducing clinical trust in AI-assisted decisions. This gap was also highlighted by Almotiri et al.32, who advocated integrating attention maps for clinical validation.

Moreover, few works have explored optimized hyperparameter tuning via probabilistic methods. Recent optimization frameworks such as Bayesian Optimization have demonstrated superior convergence and performance stability in small-scale biomedical datasets33, but remain underutilized in gynecological imaging AI models.

Importantly, multi-modal data integration remains a significant gap. The majority of PCOD classification studies have either focused exclusively on imaging34 or on biochemical markers without imaging context35. The review by Haque et al.36 emphasized the diagnostic benefits of combining heterogeneous medical data, yet empirical implementations, especially for ovarian disorders, are scarce.

To address these shortcomings, the present study proposes a multi-modal framework integrating transvaginal ultrasound images with clinical biomarkers (LH, FSH, AMH) using an enhanced \(\mathrm {EfficientNet\text {-}B3}\) backbone optimized via Bayesian hyperparameter tuning. This design aims to improve both classification accuracy and interpretability through Grad-CAM visualizations and statistically validated performance comparisons.
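One common way to combine the two modalities is feature-level fusion, in which an image embedding is concatenated with standardised biomarker values before a classification head. The sketch below shows only the data flow under stated assumptions: random vectors stand in for pooled backbone features (1536 dimensions is assumed as the EfficientNet-B3 feature width), the biomarker values are hypothetical, and the head is an untrained logistic layer. This section does not specify the fusion design actually used, so the sketch should not be read as the proposed architecture.

```python
import numpy as np


def fuse_and_score(image_emb, biomarkers, weights, bias):
    """Concatenate image and biomarker features, then apply a logistic head."""
    fused = np.concatenate([image_emb, biomarkers], axis=1)
    logits = fused @ weights + bias
    probs = 1.0 / (1.0 + np.exp(-logits))
    return fused, probs


rng = np.random.default_rng(0)
n_patients, emb_dim = 4, 1536

# Placeholder for pooled CNN features; a real pipeline would run the backbone on each scan.
image_emb = rng.normal(size=(n_patients, emb_dim))

# Hypothetical LH, FSH, AMH values for four patients (illustrative numbers only).
raw_biomarkers = np.array([
    [12.0, 5.0, 7.5],
    [6.0, 6.5, 3.0],
    [14.5, 4.8, 9.1],
    [5.2, 7.0, 2.4],
])
# Standardise each biomarker so its scale is comparable to the embedding features.
biomarkers = (raw_biomarkers - raw_biomarkers.mean(axis=0)) / raw_biomarkers.std(axis=0)

# Untrained head; in practice these weights would be learned jointly or on frozen features.
weights = rng.normal(scale=0.01, size=emb_dim + biomarkers.shape[1])
fused, probs = fuse_and_score(image_emb, biomarkers, weights, bias=0.0)
print(fused.shape, probs.round(3))
```

With only three biomarkers against more than a thousand image features, a trained model may need the biomarker branch to be re-weighted or projected so that it is not drowned out; that choice is left open here.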

Recent advancements in metaheuristic optimization and hybrid AI frameworks have demonstrated notable potential in biomedical data analysis, particularly for disease prediction and feature selection. Karadayı Ataş37 introduced a cluster-based binary grey wolf optimizer approach to tackle feature selection problems, uncovering genetic associations between non-Hodgkin lymphomas and rheumatologic disorders. The study emphasized the effectiveness of nature-inspired optimization for high-dimensional biomedical data. Expanding on hybrid methodologies, Ataş38 proposed a novel AI model that classified brain structure functionality to predict student achievement levels, offering an intriguing cross-domain application of neuroimaging and classification systems. Further, a Harris Hawks Optimization (HHO)-based clustering strategy was presented by Ataş39, capable of revealing genetic associations across osteoarthritis and multiple cancer types, highlighting the versatility of adaptive clustering algorithms in genomic studies.

Recent studies by Pınar Karadayı Ataş and collaborators have demonstrated the effectiveness of advanced metaheuristic optimization and hybrid machine learning models in various biomedical applications. In their 2025 work, Karadayı Ataş proposed a cluster-based binary grey wolf optimizer designed to address feature selection challenges in uncovering genetic relationships between non-Hodgkin lymphomas and rheumatologic disorders, highlighting the importance of efficient feature selection in high-dimensional genomic datasets37. Expanding this optimization strategy, Ataş introduced a Harris Hawks Optimization-based clustering framework for exploring genetic associations across osteoarthritis and multiple cancer types, offering a powerful unsupervised learning approach for complex biological data39. In the area of neuroimaging and cognitive assessment, a 2024 study applied a novel hybrid classification system to evaluate brain structures and their functional attributes for predicting student achievement, demonstrating the adaptability of AI methods beyond clinical environments38. For neurodegenerative disease diagnosis, Karadayi-Ataş et al. developed a Variable Neighborhood Search (VNS) framework to support early identification of Alzheimer’s disease from mild cognitive impairment, achieving notable improvements in diagnostic reliability compared to conventional models40. Additionally, Karadayı Ataş explored the complex molecular interplay between Polycystic Ovary Syndrome (PCOS) and endometrial carcinoma using an ensemble clustering method with hyperparameter tuning, providing valuable insight into potential comorbid mechanisms41. Lastly, a 2023 study introduced a hybrid AI framework for predicting co-occurring diseases associated with Hashimoto’s thyroiditis, demonstrating enhanced diagnostic accuracy for autoimmune comorbidities through multi-modal data integration and advanced clustering techniques42. 
Together, these investigations underscore the versatility and clinical promise of optimization-driven AI models in biomedical informatics.

To contextualize this work within the latest developments, recent studies were also reviewed. Wang et al.43 introduced a deep learning pipeline for ovarian cancer classification but lacked biomarker integration. Singh et al.19 addressed class imbalance with hybrid oversampling, affirming the value of SMOTE and Random-SMOTE variants. The optimization work of Ahmad et al.25 confirmed Bayesian Optimization’s efficacy over grid and random search in biomedical data contexts. Recent works by Lee et al.22 and Park et al.35 further advocate multi-modal AI systems, though applied to cardiovascular imaging.

These recent studies reaffirm both the need for and the contemporary relevance of the integrative, interpretable, and optimally tuned diagnostic framework proposed in this paper.


