Clinical features
Of the 478 eligible patients, 419 did not have omental metastases and 59 had omental metastases, for an incidence of 14.1%. The training cohort consisted of 262 patients, of whom 189 were male and 79 were female; 33 had omental metastases and 229 did not. The test cohort consisted of 112 patients, of whom 83 were male and 29 were female; 13 had omental metastases and 99 did not. The validation cohort consisted of 104 patients, of whom 75 were male and 29 were female; 13 had omental metastases and the remaining 91 did not. No statistically significant differences were found in the clinical characteristics of the training, test, and validation cohorts (Table 1), supporting the comparability of the cohorts and the reliability of the results obtained from the test and validation cohorts.
Selection of radiological and clinical features
We extracted 864 radiomic features from the arterial-phase CT images; the detailed radiomic features are listed in Additional file 1. Features with an intraclass correlation coefficient (ICC) greater than 0.75 were considered stable, and 548 radiomic features were retained; the ICC values of the radiomic features are given in Additional file 2. The radiomic features of the training cohort were then analyzed using LASSO regression. As the lambda value increased, the absolute values of the feature coefficients gradually decreased and eventually approached 0 (Fig. 2A). The cross-validated deviance first decreased and then increased with increasing lambda, and the optimal lambda value corresponds to the minimum deviance (Fig. 2B). The radiomic features selected at the optimal lambda value were diagnostic image original mean (DIOM), original shape maximum 2D diameter slice (OSMDS), original shape maximum 3D diameter (OSMD), original first-order kurtosis (OFK), wavelet-LH first-order kurtosis (WLFK), and wavelet-HLH GLDM large dependence high gray-level emphasis (WHGL). Similarly, LASSO regression was applied to the clinical features of the training cohort, and the clinical features selected at the optimal lambda value were CA125 and clinical N staging (Fig. 2C, D).

Fig. 2 Screening of radiological and clinical features. (A) Relationship between the lambda value and the radiomic feature coefficients. (B) Relationship between the lambda value and the cross-validated deviance of the radiomic features. (C) Relationship between the lambda value and the clinical feature coefficients. (D) Relationship between the lambda value and the cross-validated deviance of the clinical features.
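As a point of reference for this selection step, the following is a minimal Python sketch of LASSO-based feature screening with scikit-learn, assuming the stable (ICC > 0.75) radiomic features are held in a pandas DataFrame `X_train` and the omental-metastasis labels in `y_train`; the variable names, penalty grid, and cross-validation settings are illustrative and may differ from the study's own implementation.

```python
# Illustrative LASSO (L1-penalized logistic regression) feature screening.
# Assumes X_train is a pandas DataFrame of stable radiomic features and
# y_train holds binary omental-metastasis labels; names are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def lasso_select(X_train, y_train, n_lambdas=50):
    """Pick the L1 penalty by cross-validated log-loss (proportional to the
    deviance) and return the features with non-zero coefficients."""
    pipe = make_pipeline(
        StandardScaler(),
        LogisticRegression(penalty="l1", solver="liblinear", max_iter=5000),
    )
    grid = {"logisticregression__C": np.logspace(-3, 2, n_lambdas)}  # C = 1/lambda
    search = GridSearchCV(pipe, grid, scoring="neg_log_loss", cv=10)
    search.fit(X_train, y_train)
    coefs = search.best_estimator_.named_steps["logisticregression"].coef_.ravel()
    return [name for name, coef in zip(X_train.columns, coefs) if coef != 0]

# Example usage: selected = lasso_select(X_train, y_train)
```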
SVM-based predictive model
In the omental metastasis prediction model constructed using SVM, the model achieved the highest accuracy when the number of vectors was 7 (Fig. 3A). When the multidimensional data of the LAGC patients were projected into two-dimensional space, the omental metastasis and non-omental metastasis groups showed clearly concentrated distributions (Fig. 3B). In the training cohort, the AUC of the prediction model was 0.844, the sensitivity and specificity were 0.849 and 0.704, respectively, and the PPV and NPV were 0.292 and 0.790, respectively (Fig. 3C). In the test cohort, the AUC was 0.735, the sensitivity and specificity were 0.769 and 0.622, respectively, and the PPV and NPV were 0.212 and 0.953, respectively (Fig. 3D). In the validation cohort, the AUC was 0.741, the sensitivity and specificity were 0.833 and 0.756, respectively, and the PPV and NPV were 0.385 and 0.974, respectively (Fig. 3E).

Fig. 3 SVM-based capability evaluation of the predictive model. (A) Number of vectors in the SVM versus model accuracy. (B) Distribution of patients with and without omental metastasis in two-dimensional space. (C) ROC curve for the training cohort. (D) ROC curve for the test cohort. (E) ROC curve for the validation cohort.
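A minimal sketch of how such an SVM classifier can be fitted and evaluated, assuming the selected radiomic and clinical features are available as `X_train`/`y_train` and `X_test`/`y_test`; the kernel and cost settings shown here are placeholders rather than the study's tuned values.

```python
# Illustrative SVM classifier with probability outputs for ROC analysis.
# X_train, y_train, X_test, y_test are assumed to hold the selected features
# and labels for the respective cohorts (hypothetical names).
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

svm_model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, probability=True))
svm_model.fit(X_train, y_train)
test_probs = svm_model.predict_proba(X_test)[:, 1]
print("Test AUC:", roc_auc_score(y_test, test_probs))
```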
DT-based prediction model
For the DT, the number of split nodes was determined to be 3 (Fig. 4A). The features of the constructed prediction model were ranked by importance, and the top six were clinical N staging, CA125, DIOM, WLFK, OFK, and OSMD (Fig. 4B). Based on the three split nodes, the top three most important features were selected to construct the DT prediction model (Fig. 4C). In the training cohort, the AUC of the prediction model was 0.759, the sensitivity and specificity were 0.606 and 0.883, respectively, and the PPV and NPV were 0.426 and 0.940, respectively (Fig. 4D). In the test cohort, the AUC was 0.624, the sensitivity and specificity were 0.231 and 0.990, respectively, and the PPV and NPV were 0.227 and 0.910, respectively (Fig. 4E). In the validation cohort, the AUC was 0.658, the sensitivity and specificity were 0.400 and 0.917, respectively, and the PPV and NPV were 0.400 and 0.917, respectively (Fig. 4F).

Fig. 4 DT-based capability assessment of the predictive model. (A) Relationship between the number of split nodes of the DT and the complexity parameter. (B) Importance of clinical and radiological features in the DT-based prediction model. (C) Decision tree diagram. (D) ROC curve for the training cohort. (E) ROC curve for the test cohort. (F) ROC curve for the validation cohort.
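A minimal sketch of a size-limited decision tree restricted to the three most important features named above; the column names are hypothetical, and `max_leaf_nodes=4` (at most three internal split nodes) stands in for the reported setting.

```python
# Illustrative decision tree built on the top three features from the text
# (clinical N stage, CA125, DIOM); column names are hypothetical placeholders.
from sklearn.tree import DecisionTreeClassifier, export_text

top3 = ["clinical_N_stage", "CA125", "DIOM"]  # hypothetical column names
dt_model = DecisionTreeClassifier(max_leaf_nodes=4,      # at most 3 split nodes
                                  class_weight="balanced",
                                  random_state=0)
dt_model.fit(X_train[top3], y_train)
print(export_text(dt_model, feature_names=top3))  # text rendering of the tree
```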
RF-based predictive model
In the RF prediction model, the out-of-bag (OOB) error reached its minimum value of 0.122 when the number of trees was 5 (Fig. 5A). We then determined the number of split nodes; the error reached its minimum value of 0.318 when the number of split nodes was 6 (Fig. 5B). We also conducted a feature importance analysis: among the eight selected features, OSMD, OSMDS, and N staging contributed most to the prediction accuracy, whereas N staging, OSMDS, and OFK contributed most to lowering the Gini coefficient of the prediction model (Fig. 5C). In the training cohort, the AUC of the prediction model was 0.995, the sensitivity and specificity were 0.970 and 0.965, respectively, and the PPV and NPV were 0.800 and 0.995, respectively (Fig. 5D). In the test cohort, the AUC was 0.750, the sensitivity and specificity were 0.769 and 0.663, respectively, and the PPV and NPV were 0.233 and 0.956, respectively (Fig. 5E). In the validation cohort, the AUC was 0.808, the sensitivity and specificity were 0.750 and 0.800, respectively, and the PPV and NPV were 0.308 and 0.964, respectively (Fig. 5F).

Fig. 5 RF-based capability evaluation of the predictive model. (A) Relationship between the number of trees and the OOB error in the RF, where “0” represents the omental metastasis group and “1” represents the non-omental metastasis group. (B) Relationship between the number of split nodes of the trees and the error. (C) Importance of radiological and clinical features in improving model accuracy and reducing the Gini coefficient. (D) ROC curve for the training cohort. (E) ROC curve for the test cohort. (F) ROC curve for the validation cohort.
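A minimal sketch of random forest tuning against the OOB error and of impurity-based feature importance, under the same assumed `X_train`/`y_train`; `max_features=6` stands in for the six split candidates described above, and the tree counts scanned here are illustrative (very small forests, such as the 5 trees reported in the study, can leave some samples without OOB predictions in scikit-learn).

```python
# Illustrative random forest: scan the number of trees against the OOB error
# (cf. Fig. 5A) and inspect impurity-based feature importance (cf. Fig. 5C).
from sklearn.ensemble import RandomForestClassifier

for n_trees in (25, 50, 100, 250, 500):
    rf_model = RandomForestClassifier(n_estimators=n_trees,
                                      max_features=6,   # split candidates per node
                                      oob_score=True,
                                      random_state=0)
    rf_model.fit(X_train, y_train)
    print(n_trees, "trees -> OOB error:", round(1 - rf_model.oob_score_, 3))

# Impurity-based importance of each selected feature in the last fitted forest
for name, importance in sorted(zip(X_train.columns, rf_model.feature_importances_),
                               key=lambda item: -item[1]):
    print(f"{name}: {importance:.3f}")
```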
KNN-based predictive model
Through hyperparameter optimization of the KNN function, we found that the best kernel function for the prediction model was “triangle” and the optimal k value was 14 (Fig. 6A). We constructed a prediction model based on these conditions. In the training cohort, the AUC value of the prediction model was 0.759, the sensitivity and specificity were 0.714 and 0.827, respectively, and the PPV and NPV were 0.370 and 0.952, respectively (Fig. 6C). In the test cohort, the AUC value of the prediction model was 0.797, the sensitivity and specificity were 0.909 and 0.598, respectively, and the PPV and NPV were 0.227 and 0.980, respectively (Fig. 6D). In the validation cohort, the AUC value of the prediction model was 0.611, the sensitivity and specificity were 0.500 and 0.852, respectively, and the PPV and NPV were 0.333 and 0.920, respectively (Fig. 6E).

Fig. 6 KNN- and LR-based capability evaluation of the predictive models. (A) Screening for the optimal kernel function and k value. (B) The best-fitting prediction model and the AIC value for each feature. (C) ROC curve of the training cohort for the LR-based prediction model. (D) ROC curve of the test cohort for the LR-based prediction model. (E) ROC curve of the validation cohort for the LR-based prediction model. (F) ROC curve of the training cohort for the KNN-based prediction model. (G) ROC curve of the test cohort for the KNN-based prediction model. (H) ROC curve of the validation cohort for the KNN-based prediction model.
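A minimal sketch of KNN hyperparameter screening under the same assumed training data; scikit-learn has no built-in triangular kernel weighting, so distance weighting is used here as a stand-in, and the grid values are illustrative.

```python
# Illustrative KNN tuning over k and the weighting scheme. Distance weighting
# substitutes for the kernel-weighted KNN described in the text.
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

knn_pipe = make_pipeline(StandardScaler(), KNeighborsClassifier())
knn_grid = {"kneighborsclassifier__n_neighbors": list(range(3, 31)),
            "kneighborsclassifier__weights": ["uniform", "distance"]}
knn_search = GridSearchCV(knn_pipe, knn_grid, scoring="roc_auc", cv=10)
knn_search.fit(X_train, y_train)
print("Best KNN settings:", knn_search.best_params_)
```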
LR-based predictive model
LR analysis showed that the regression model fitted best when it consisted of N staging, CA125, DIOM, and WHGL, with an AIC value of 155.24; removing any of these features or adding other features increased the AIC (Fig. 6B). In the training cohort, the AUC of the regression prediction model was 0.864, the sensitivity and specificity were 0.879 and 0.774, respectively, and the PPV and NPV were 0.358 and 0.978, respectively (Fig. 6F). In the test cohort, the AUC was 0.808, the sensitivity and specificity were 0.615 and 0.888, respectively, and the PPV and NPV were 0.210 and 0.941, respectively (Fig. 6G). In the validation cohort, the AUC was 0.782, the sensitivity and specificity were 0.692 and 0.846, respectively, and the PPV and NPV were 0.250 and 0.953, respectively (Fig. 6H).
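A minimal sketch of fitting the four-predictor logistic regression and reading its AIC, using statsmodels; the column names are hypothetical, and a stepwise search minimizing the AIC (as implied by Fig. 6B) would wrap this fit in a loop over candidate feature sets.

```python
# Illustrative logistic regression fit with AIC reporting via statsmodels.
# Predictor column names are hypothetical placeholders for the study's features.
import statsmodels.api as sm

predictors = ["clinical_N_stage", "CA125", "DIOM", "WHGL"]  # hypothetical names
design = sm.add_constant(X_train[predictors])               # add intercept term
logit_fit = sm.Logit(y_train, design).fit(disp=0)
print("AIC:", logit_fit.aic)
print(logit_fit.summary())
```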
Comparing the predictive ability of different models
The predictive abilities of the omental metastasis models of LAGC constructed with the different machine learning methods are shown in Table 2. In the training cohort (Fig. 7A), the RF prediction model had higher accuracy, AUC, sensitivity, specificity, PPV, and NPV than the LR, SVM, DT, and KNN models, with a markedly higher PPV than the other four models. The DT prediction model had the lowest sensitivity of the five models, at only 0.606. In the test cohort (Fig. 7B), all five machine learning prediction models showed relatively low PPV. The evaluation metrics of the DT prediction model differed most from those of the other four models, with a sensitivity of 0.231 and a specificity of 0.990. The LR-based prediction model had the lowest PPV of the five models, at 0.210. In the external validation cohort (Fig. 7C), the evaluation metrics of the prediction models were generally similar to those in the test cohort.

Fig. 7 Line charts showing the predictive power of the machine learning-based models. (A) Training cohort. (B) Test cohort. (C) Validation cohort.
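For reference, a minimal sketch of how the reported evaluation metrics (AUC, sensitivity, specificity, PPV, NPV) can be computed from a model's predicted probabilities; the 0.5 cut-off is an assumption rather than necessarily the threshold used in the study.

```python
# Illustrative computation of the evaluation metrics from predicted probabilities.
from sklearn.metrics import confusion_matrix, roc_auc_score

def evaluate(y_true, probs, threshold=0.5):
    """Derive AUC and confusion-matrix metrics at a given probability cut-off."""
    preds = (probs >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, preds).ravel()
    return {
        "AUC": roc_auc_score(y_true, probs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
    }

# Example usage: evaluate(y_test, svm_model.predict_proba(X_test)[:, 1])
```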