Construction and validation of a machine learning-based clinical risk model for screening characteristic factors of lymphovascular invasion in endometrial cancer



Comparison of patient clinical data

Statistically significant differences were observed between the LVSI and non-LVSI groups in myometrial invasion depth, tumor size, pathological type, histological grade, and cervical stromal invasion (P < 0.05, Table 1).

Table 1. Analysis of differences in clinical data between the LVSI and non-LVSI groups in the training group.

Comparison of patient test data

The clinical laboratory values of patients with LVSI were compared with those of patients without LVSI in the training group. Statistically significant differences were observed between the two groups in lymphocyte count (LYM), monocyte count (MONO), albumin (ALB), fibrinogen (FIB), monocyte-to-lymphocyte ratio (MLR), and platelet-to-lymphocyte ratio (PLR) (P < 0.001, Table 2).

Table 2. Comparison of clinical test indicators of patients with and without LVSI in the training group.

Multivariate logistic regression analysis

Values were assigned to the indices that differed in the univariate analysis (Table 3), with the cut-off values used to dichotomize the continuous data [16]. Multivariate logistic regression analysis revealed that myometrial invasion depth (P < 0.001, OR = 17.876, 95% CI 5.546–57.619), cervical stromal invasion (P = 0.003, OR = 8.028, 95% CI 2.037–31.639), LYM (P = 0.001, OR = 0.159, 95% CI 0.055–0.465), MONO (P = 0.001, OR = 0.149, 95% CI 0.048–0.460), ALB (P = 0.002, OR = 0.175, 95% CI 0.059–0.524), and FIB (P < 0.001, OR = 11.853, 95% CI 4.285–32.788) were independent risk factors for LVSI in EC (Table 4).
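
The reported odds ratios and confidence intervals can be related back to the regression coefficients with a short calculation. The sketch below assumes the intervals are standard Wald intervals, so that β = ln(OR) and the standard error is the width of the log-scale interval divided by 2 × 1.96. The function name `wald_from_or` and the dictionary keys are illustrative, not taken from the study.

```python
import math

Z_975 = 1.959964  # standard normal quantile for a two-sided 95% interval

def wald_from_or(odds_ratio, ci_low, ci_high):
    """Recover beta, its standard error, the Wald z and the two-sided p-value
    from an odds ratio and its 95% CI (assumes a Wald-type interval)."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_975)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return beta, se, z, p

# Odds ratios and 95% CIs as reported in the multivariate analysis.
reported = {
    "myometrial invasion depth": (17.876, 5.546, 57.619),
    "cervical stromal invasion": (8.028, 2.037, 31.639),
    "LYM": (0.159, 0.055, 0.465),
    "MONO": (0.149, 0.048, 0.460),
    "ALB": (0.175, 0.059, 0.524),
    "FIB": (11.853, 4.285, 32.788),
}

for name, (or_value, low, high) in reported.items():
    beta, se, z, p = wald_from_or(or_value, low, high)
    print(f"{name}: beta={beta:.3f}, SE={se:.3f}, z={z:.2f}, p={p:.4f}")
```

An odds ratio above 1 gives a positive β (the factor raises the estimated odds of LVSI), while an odds ratio below 1, as for LYM, MONO, and ALB, gives a negative β.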

Table 3. Value assignment table.
Table 4. Multivariate logistic regression analysis.

LASSO regression analysis

In this study, LASSO regression was used to screen characteristic factors leading to LVSI in EC. In LASSO regression, 22 characteristic factors were identified when λ = λmin (0.0046564) and 19 characteristic factors were identified when λ = λ1se (0.010757) (Figure 1A). Considering the generalization performance of the model, λ = λ1se was selected to build the model. A total of 19 characteristic factors were screened out, including age, menarche, menopause, parity, parity, diabetes history, myometrial invasion depth, tumor diameter, pathological type, histological grade, cervical stromal invasion, adnexal metastasis, FIGO stage, NEUT, LYM, MONO, ALB, FIB, and PLR (Figure 1B).
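
The choice between the error-minimizing penalty and a larger, more conservative one can be illustrated with the one-standard-error convention (the rule behind `lambda.min` and `lambda.1se` in the R package glmnet). The sketch below assumes that convention was used here. The cross-validation errors are made-up numbers for illustration only, and `select_lambdas` is a hypothetical helper name.

```python
def select_lambdas(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from a cross-validation curve.

    lambda_min: penalty with the lowest mean cross-validated error.
    lambda_1se: largest penalty whose mean error is within one standard
                error of that minimum (a sparser, more regularized model).
    """
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[i_min] + cv_se[i_min]
    lambda_min = lambdas[i_min]
    lambda_1se = max(lam for lam, err in zip(lambdas, cv_mean) if err <= threshold)
    return lambda_min, lambda_1se

# Hypothetical cross-validation curve (NOT the study's data).
lambdas = [0.001, 0.005, 0.01, 0.05]
cv_mean = [0.30, 0.28, 0.29, 0.40]   # mean CV error at each lambda
cv_se = [0.02, 0.02, 0.02, 0.02]     # standard error of that mean

lam_min, lam_1se = select_lambdas(lambdas, cv_mean, cv_se)
print("lambda_min:", lam_min)
print("lambda_1se:", lam_1se)
```

Because a larger penalty shrinks more coefficients to zero, the one-standard-error choice keeps fewer variables than λmin, which matches the drop from 22 to 19 characteristic factors reported above.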

Figure 1

LASSO regression screening of characteristic factors of lymphovascular space invasion in endometrial cancer. (A) LASSO regression analysis for screening characteristic factors leading to lymphovascular space invasion in endometrial cancer. (B) The 19 characteristic factors retained at λ = λ1se.

Building a risk model

Two risk models were constructed, one from the logistic regression and one from the LASSO regression. The logistic regression model was built from the β coefficients (Table 4), and the LASSO model was built in the same way from its coefficients (Table 5). In both models, the risk scores of patients in the LVSI group were significantly higher than those of patients in the non-LVSI group in the training group (P < 0.001, Figure 2A). The DeLong test showed that the area under the curve (AUC) of the risk model based on logistic regression was significantly lower than that of the LASSO-based model (P < 0.001, Figure 2B).
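
A coefficient-based risk score of this kind is a weighted sum of the dichotomized indicators. The sketch below is an illustration only: it takes β = ln(OR) from the odds ratios reported in the multivariate analysis, assumes each indicator is coded 1 when the patient falls on the side of the cut-off that the odds ratio describes (the actual coding is given in Table 3), and omits the intercept, which is not reported. The key names and the two example patients are hypothetical.

```python
import math

# beta = ln(OR), using the odds ratios reported in the multivariate analysis.
# The key names are illustrative labels, not the study's variable names.
BETAS = {
    "myometrial_invasion_depth": math.log(17.876),
    "cervical_stromal_invasion": math.log(8.028),
    "LYM": math.log(0.159),
    "MONO": math.log(0.149),
    "ALB": math.log(0.175),
    "FIB": math.log(11.853),
}

def risk_score(patient, betas=BETAS, intercept=0.0):
    """Linear predictor: intercept + sum of beta_i * x_i.

    `patient` maps indicator names to 0 or 1; missing indicators count as 0.
    The intercept defaults to 0.0 because it is not reported (assumption).
    """
    return intercept + sum(b * patient.get(name, 0) for name, b in betas.items())

# Two hypothetical patients (not study data).
patient_a = {"myometrial_invasion_depth": 1, "FIB": 1}
patient_b = {"LYM": 1, "ALB": 1}

print("patient A score:", round(risk_score(patient_a), 3))
print("patient B score:", round(risk_score(patient_b), 3))
```

Adding a constant intercept shifts every score equally, so leaving it out does not change how patients rank or the resulting AUC. It would matter only when converting scores to predicted probabilities.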

Table 5. LASSO characteristic variables.
Figure 2

Comparison of risk scores and prediction efficiency between LVSI and non-LVSI patients in the training group. (A) Comparison of patient risk scores in the training group calculated by logistic regression and LASSO regression. (B) ROC curve analysis of the AUC of logistic regression and LASSO regression risk scores for predicting lymphovascular space invasion in endometrial cancer. Note: LVSI: lymphovascular space invasion; ROC: receiver operating characteristic; AUC: area under the curve; C represents the non-LVSI group; P represents the LVSI group.

Model validation

No statistical differences were found in baseline data between the training and validation groups (Table 6). Logistic and LASSO risk scores were then calculated for patients in the validation group. Patients with LVSI in the validation group had higher logistic and LASSO risk scores than non-LVSI patients, with statistically significant differences (P < 0.001, Figure 3A). The DeLong test then revealed that the AUC of the logistic regression risk model was significantly lower than that of the LASSO risk model (P < 0.001, Figure 3B, Tables 7 and 8).
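
The DeLong comparison of two AUCs measured on the same patients can be sketched as follows. This is a minimal standard-library implementation of the usual DeLong variance estimate for two correlated ROC curves, not the authors' code. The labels and scores are made-up values, and the function names are illustrative.

```python
import math

def _psi(x, y):
    """Mann-Whitney kernel: 1 if x > y, 0.5 on ties, 0 otherwise."""
    if x > y:
        return 1.0
    return 0.5 if x == y else 0.0

def _components(pos, neg):
    """Return (AUC, V10, V01): the AUC and its per-case structural components."""
    v10 = [sum(_psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(_psi(x, y) for x in pos) / len(pos) for y in neg]
    return sum(v10) / len(pos), v10, v01

def _cov(a, b):
    """Sample covariance with an n - 1 denominator."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)

def delong_test(labels, scores_a, scores_b):
    """Two-sided DeLong test for two scores evaluated on the same cases.

    labels: 1 for LVSI, 0 for non-LVSI. Returns (auc_a, auc_b, z, p).
    Raises ZeroDivisionError if the estimated variance is zero.
    """
    pos = [i for i, t in enumerate(labels) if t == 1]
    neg = [i for i, t in enumerate(labels) if t == 0]
    auc_a, v10a, v01a = _components([scores_a[i] for i in pos], [scores_a[i] for i in neg])
    auc_b, v10b, v01b = _components([scores_b[i] for i in pos], [scores_b[i] for i in neg])
    # Variance of (auc_a - auc_b), including the covariance between the two AUCs.
    var = ((_cov(v10a, v10a) + _cov(v10b, v10b) - 2 * _cov(v10a, v10b)) / len(pos)
           + (_cov(v01a, v01a) + _cov(v01b, v01b) - 2 * _cov(v01a, v01b)) / len(neg))
    z = (auc_a - auc_b) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))
    return auc_a, auc_b, z, p

# Hypothetical labels and risk scores (NOT the study's data).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
score_a = [0.9, 0.8, 0.4, 0.7, 0.3, 0.5, 0.2, 0.1]
score_b = [0.6, 0.4, 0.5, 0.3, 0.45, 0.55, 0.2, 0.35]

auc_a, auc_b, z, p = delong_test(labels, score_a, score_b)
print(f"AUC A={auc_a:.4f}, AUC B={auc_b:.4f}, z={z:.3f}, p={p:.4f}")
```

Because both scores come from the same patients, the covariance terms matter: ignoring them would treat the two AUCs as independent and would generally misstate the significance of their difference.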

Table 6. Comparison of clinical data between the training and validation groups.
Figure 3

Comparison of risk scores and predictive efficiency between LVSI and non-LVSI patients in the validation group. (A) Comparison of risk scores of patients in the validation group calculated by logistic regression and LASSO regression. (B) ROC curve analysis of the AUC of logistic regression and LASSO regression risk scores for predicting lymphovascular space invasion in endometrial cancer. Note: LVSI: lymphovascular space invasion; ROC: receiver operating characteristic; AUC: area under the curve; C represents the non-LVSI group; P represents the LVSI group.

Table 7. ROC curve parameters.
Table 8. Delong test parameters.


