This study aims to develop a non-invasive approach to diagnosing prostate cancer using machine learning based on MRI data, a topic of growing research interest18. The random forest algorithm, an ensemble machine learning method, aggregates predictions from multiple decision trees, each trained on a randomly selected subset of the dataset. Aggregating the predictions of all trees reduces the noise that affects individual trees and leads to overfitting, which lowers variance and can improve accuracy19. Donisi et al. demonstrated radiomics-based machine learning models for clinically significant prostate cancer (PCa) using an open, accessible dataset, with the random forest model achieving the highest accuracy20. In this study, the random forest algorithm was applied to ADC sequences integrated with clinical parameters such as TAP and showed favorable performance, achieving an AUC of 0.915. These findings suggest that the proposed model has robust diagnostic capability for distinguishing prostate cancer from benign prostatic hyperplasia.
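The aggregation idea described above can be illustrated with a minimal sketch. It is not the model used in this study: it assumes a single synthetic feature, uses one-split decision stumps in place of full trees, and omits the random feature selection that a random forest adds on top of bootstrap sampling. The function names `fit_stump`, `fit_bagged_stumps`, and `predict_vote` are hypothetical.

```python
import random

def fit_stump(xs, ys):
    """Return the threshold t with the fewest training errors.

    The stump predicts label 1 when x > t and label 0 otherwise.
    """
    best_t, best_err = xs[0], len(xs) + 1
    for t in sorted(set(xs)):
        err = sum((x > t) != bool(y) for x, y in zip(xs, ys))
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def fit_bagged_stumps(xs, ys, n_trees, rng):
    """Train each stump on a bootstrap sample drawn with replacement."""
    n = len(xs)
    thresholds = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        thresholds.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return thresholds

def predict_vote(thresholds, x):
    """Aggregate the stumps by majority vote."""
    votes = sum(x > t for t in thresholds)
    return int(2 * votes > len(thresholds))

# Synthetic data (an assumption for illustration, not study data):
# label 0 is centred at 0.0 and label 1 at 2.0.
rng = random.Random(0)  # fixed seed so the sketch is reproducible
xs = [rng.gauss(0.0, 1.0) for _ in range(100)] + [rng.gauss(2.0, 1.0) for _ in range(100)]
ys = [0] * 100 + [1] * 100

thresholds = fit_bagged_stumps(xs, ys, n_trees=25, rng=rng)
print(predict_vote(thresholds, 3.0), predict_vote(thresholds, -1.0))
```

Each stump sees a slightly different resample, so the individual thresholds vary; the vote averages out that variation, which is the variance reduction the paragraph refers to.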
TAP is a tumor biomarker identified in recent years, and its assessment can indirectly reflect the number of transformed cells and the extent of malignant tumors. It is widely applied in the screening and prognostic evaluation of gastric cancer, bladder cancer, prostate cancer, and other diseases21,22. Over the past few years, there have been significant advances in the diagnostic efficacy of prostate mpMRI and in related radiomics research9,23,24,25. Such radiomics studies focus primarily on biopsy-free diagnosis, the differentiation of aggressive disease phenotypes, and the distinction between clinically significant and clinically insignificant disease26,27,28. Currently, TAP is frequently used alone or in combination with PSA levels for the diagnosis of prostate cancer7. However, few studies have examined the integration of TAP with other diagnostic modalities, and high-quality research on diagnostic models that integrate TAP with radiomics features is lacking.
In mpMRI, T2WI provides a clear depiction of the zonal anatomy of the prostate gland. ADC values, on the other hand, reflect tissue microstructural properties, including cell density and intracellular complexity. These parameters help distinguish malignant lesions, which typically show higher cell density, from the more organized microstructure commonly found in benign prostate tissue. Parameters derived from ADC values serve as proxies for tumor aggressiveness29,30,31,32,33. Furthermore, ADC and T2 values have been shown to correlate negatively with the proportion of nuclear or cytoplasmic regions measured in histological prostate tissue specimens and positively with the proportion of lumen space32. Higher Gleason scores are associated with more chaotic and fragmented lumens34. These results indicate the potential of mpMRI, which integrates morphological, metabolic, and vascular characterization, to detect prostate cancer. Based on these principles, our study was designed and performed using both T2WI and ADC sequences as the primary imaging modalities.
In this study, the findings for the ADC sequence were consistent with those of Hou et al.35. DWI-based ADC sequences accurately measure the diffusion of water molecules and are sensitive to both intracellular and extracellular water. This reflects the microstructural properties of the tissue and the spatial distribution of water molecules. Conversely, T2WI relies on T2 relaxation time measurements to evaluate the transverse relaxation properties of water molecules. This provides information about the rate and extent of free water diffusion within the tissue. Our results show that ADC sequences are more sensitive and specific than T2WI for diagnostic purposes.
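For readers less familiar with how an ADC value is derived from DWI, the standard mono-exponential model is S(b) = S0 · exp(−b · ADC), so that ADC = ln(S_low / S_high) / (b_high − b_low). The sketch below applies this relation to a single voxel. The function name and the b-values (0 and 800 s/mm²) are assumptions chosen for illustration; the acquisition parameters of this study are not restated here.

```python
import math

def adc_from_two_b_values(s_low, s_high, b_low, b_high):
    """Estimate ADC (mm^2/s) from DWI signals at two b-values (s/mm^2).

    Assumes the mono-exponential model S(b) = S0 * exp(-b * ADC).
    """
    if s_low <= 0 or s_high <= 0:
        raise ValueError("signal intensities must be positive")
    if b_high <= b_low:
        raise ValueError("b_high must exceed b_low")
    return math.log(s_low / s_high) / (b_high - b_low)

# Hypothetical voxel: simulate the signal for a known ADC, then recover it.
true_adc = 1.0e-3                       # mm^2/s, illustrative value only
s0 = 1000.0                             # signal at b = 0
s800 = s0 * math.exp(-800 * true_adc)   # signal at b = 800 s/mm^2
print(adc_from_two_b_values(s0, s800, 0, 800))
```

Stronger signal attenuation between the two b-values yields a higher ADC, that is, less restricted water diffusion.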
In our study, the AUC of the random forest model using only ADC sequences was 0.869; Hou et al. also reported results for models using the same ADC sequence30. Additionally, we constructed a random forest model that integrates ADC sequences with TAP and additional clinical features, achieving an improved AUC of 0.915. Interestingly, Hou et al. adopted a similar model combining the ADC sequence and clinical features and achieved an AUC of 0.912. This indicates that our study comprehensively analyzed clinical and radiomics features while applying optimized machine learning methodologies to build predictive models. Importantly, the inclusion of TAP in our model highlights its incremental diagnostic value in distinguishing prostate cancer from benign conditions.
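Because the comparison above rests on AUC values, a short sketch of what the metric measures may help. The AUC equals the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, with ties counted as one half. The function name `auc` and the labels and scores below are hypothetical and are not data from this study.

```python
def auc(labels, scores):
    """Rank-based AUC: P(score of a positive > score of a negative).

    Tied scores between a positive and a negative count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical example: 1 = prostate cancer, 0 = benign hyperplasia.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
print(auc(labels, scores))
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a single-center cohort; a sort-based implementation would be preferable for large datasets.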
However, this study has some limitations. First, it employs a retrospective design with a single-center dataset and lacks external validation. Second, the small sample size may introduce selection bias and limit the generalizability of the results. Third, clinicians' assessment of potential lesions depends on individual clinical expertise and experience, which introduces subjectivity and leads to inter-observer variability and inconsistent interpretations. Future research should address the variability, robustness, and reproducibility of radiomics-based models to enable broad clinical application. Large-scale multi-center testing is essential to minimize selection bias and improve model accuracy. Furthermore, radiomics-based prognostic assessment of PCa patients should be prioritized to improve clinical decision-making and guide adjuvant therapy.
Compared with existing radiomics studies, our study has the following methodological advantages. First, many radiomics studies use nomograms to construct visualizable models36. However, these approaches often lack the computational efficiency needed to handle large-scale data and complex machine learning operations. Second, compared with Bayesian and K-Nearest Neighbor (KNN) algorithms, the random forest model exhibits superior performance in processing high-dimensional and imbalanced datasets. Furthermore, our study provides a visualization of the decision-making process of the random forest model, increasing its transparency and interpretability.
In summary, our study shows that multimodal models integrating radiomics features, TAP, and clinical features may distinguish malignant from benign prostate lesions. Our findings highlight the effectiveness of combining mpMRI with innovative imaging methodology and clinical data for identifying prostate cancer. Additionally, integrating automated machine learning into the mpMRI framework provides a comprehensive approach to increasing cancer detection sensitivity and improving risk stratification strategies.
