A machine learning model for the prediction of invasive Klebsiella pneumoniae liver abscess syndrome in diabetes mellitus: a single-center retrospective study. BMC infection

IKPLAS occurs predominantly in Asian populations, which may be related to the fact that the gut is more likely to be colonized with K1/K2 serotype Klebsiella pneumoniae in these populations [29, 30]. Diabetes is considered a significant risk factor for IKPLAS, with up to 63% of patients with bacterial liver abscess in Taiwan having diabetes. This may be related to impaired phagocytosis of K1/K2 Klebsiella pneumoniae in diabetic patients [31] and to increased vascular permeability in diabetics, which facilitates bacterial invasion [11]. These two serotypes are also hypervirulent Klebsiella pneumoniae and show high viscosity in the string test [9]. Although hypervirulent Klebsiella pneumoniae is sensitive to most antibiotics, patients often have a poor prognosis if the infection is not recognized and treated early [32].

This study screened four feature variables: hemoglobin, platelets, D-dimer, and SOFA score. The SHAP package was used to interpret the importance of the model's feature variables, and the SOFA score ranked first.
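The idea behind SHAP-style feature attribution can be illustrated with a minimal sketch that computes exact Shapley values by enumerating every subset of the four variables. This is not the SHAP package and not the study's SVM: the function `toy_risk`, its coefficients, and the `patient` and `baseline` values are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial

FEATURES = ["hemoglobin", "platelets", "d_dimer", "sofa"]


def toy_risk(x):
    """Hypothetical linear risk score; coefficients are made up, not from the study."""
    return (0.30 * x["sofa"] + 0.10 * x["d_dimer"]
            - 0.02 * x["hemoglobin"] - 0.006 * x["platelets"])


def shapley_values(model, x, baseline):
    """Exact Shapley values: features outside a coalition take their baseline value."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_f = {g: x[g] if (g in subset or g == f) else baseline[g]
                          for g in FEATURES}
                without_f = {g: x[g] if g in subset else baseline[g]
                             for g in FEATURES}
                # Marginal contribution of f when added to this coalition
                total += weight * (model(with_f) - model(without_f))
        phi[f] = total
    return phi


# Hypothetical patient and reference values (illustrative only)
patient = {"hemoglobin": 95, "platelets": 80, "d_dimer": 6.0, "sofa": 6}
baseline = {"hemoglobin": 125, "platelets": 200, "d_dimer": 2.0, "sofa": 2}

phi = shapley_values(toy_risk, patient, baseline)
for name, value in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the patient's score and the baseline score, which is the property that makes a SHAP plot readable as a decomposition of one prediction. Exact enumeration is feasible here only because there are four variables; the SHAP package relies on approximations for larger models.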

The SOFA score is a scoring system that measures the degree of impairment of vital organ function in patients with sepsis or suspected sepsis to determine prognosis [33]. Several studies have confirmed its predictive value in the prognosis of infected patients [34, 35]. This study also suggests that the SOFA score is an important predictor of IKPLAS in patients with diabetes. As can be seen from the SHAP plot, the higher the SOFA score, the higher the risk of progression to IKPLAS. The etiology of IKPLAS is currently unknown, but a study by Chen-Guang Zhang et al. shows that most diabetics with IKPLAS are prone to sepsis [11], so bloodborne spread may be one of the more important routes.

The impact of platelets on the SVM model ranked second in the feature importance ranking. Jai Hoon Yoon et al. showed that thrombocytopenia is an independent risk factor for invasive syndrome in diabetic patients with Klebsiella pneumoniae liver abscess [10], which is consistent with the platelet findings of the SVM model established in this study. The mechanism of thrombocytopenia in diabetes combined with IKPLAS may be as follows. When the body is infected, platelets are stimulated and activated and, by inducing the expression of membrane proteins and the production of mediators, they participate in the body's inflammatory response and may play a role in fighting infection and removing pathogens. Activated platelets produce and release pro-inflammatory, anti-inflammatory, chemotactic, antimicrobial, and other mediators to modulate the body's innate or adaptive immune response [36]. Interactions between platelets and pathogens or their products, endothelial cells, and immune cells promote endothelial cell damage and leukocyte activation. As a result, platelet adhesion is promoted, platelets in the blood are continuously activated, and anti-platelet antibodies and macrophage colony-stimulating factors are continuously produced in the body, promoting platelet destruction and consumption [37].

SHAP plots show that hemoglobin is the third most important feature variable, after the SOFA score and platelets, with lower values indicating a higher risk of progression to IKPLAS. Hemoglobin can serve as an index for assessing disease severity in infected patients, presumably because reduced erythropoiesis, increased destruction of red blood cells by hemolysis, and bleeding caused by the systemic inflammatory response reduce the capacity of the blood to carry oxygen and carbon dioxide. The resulting inadequate oxygen supply to the body leads to multiple organ damage [38].

D-dimer is a specific molecular marker of secondary hyperfibrinolysis in vivo and a valid indicator of the coagulation state of the body. The coagulation and fibrinolytic systems are usually closely associated with the development of inflammation. Infection damages vascular endothelial cells and alveolar epithelial cells and stimulates the coagulation system, resulting in impaired coagulation function, abnormal coagulation indices, and an infection-associated elevation of D-dimer [39, 40]. Inflammation and coagulation dysfunction promote each other and form a vicious circle. Immune function is weakened in diabetics, and inflammatory responses are enhanced after infection, so diabetic patients with IKPLAS may show marked D-dimer changes in the early stages. In the SVM model, D-dimer was positively associated with the risk of developing IKPLAS in patients with diabetes, which is consistent with the above findings.

Most research on IKPLAS has focused on its risk factors. One study [41] showed that patients with IKPLAS were more likely to have chronic renal failure, thrombocytopenia, and increased total bilirubin than those without IKPLAS. Wang Hailui et al. [42] used a logistic regression prediction model to predict the incidence of IKPLAS by incorporating clinical and CT features. The validation set had an AUC value of 0.842, but no other predictive models were compared. Unlike many studies, this study initially used seven machine learning models for prediction. After parameter tuning and validation, the SVM model, which performed best with an AUC value of 0.969 and an AP value of 0.890, was selected as a reliable IKPLAS prediction model. In addition, the variables included in this model are clinical indices that are easy to collect, so clinicians can use them to conveniently assess the risk of IKPLAS in a patient with diabetes mellitus complicated by Klebsiella pneumoniae liver abscess.
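For readers less familiar with the two reported metrics, the sketch below shows how AUC and AP (average precision) can be computed from predicted scores. It is a plain-Python illustration under stated assumptions, not the authors' evaluation code: the labels and scores are invented, and `average_precision` assumes there are no tied scores.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Mean of the precision values taken at the rank of each positive case.

    Assumes no tied scores, so the ranking is unambiguous.
    """
    ranked = sorted(zip(scores, y_true), key=lambda t: t[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision among the top `rank` cases
    if hits == 0:
        raise ValueError("need at least one positive label")
    return total / hits


# Invented example: 1 = IKPLAS, 0 = non-IKPLAS; scores are hypothetical model outputs
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
print(f"AUC = {auc:.3f}, AP = {ap:.3f}")
```

AUC summarizes ranking quality across all thresholds, while AP is more sensitive to how positives are ranked near the top, which matters when the positive class (here IKPLAS) is the minority.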

Machine learning algorithms can build complex models that perform well given enough data. However, for certain applications the amount of data is often insufficient, so it is essential to analyze these algorithms and obtain good results with relatively small sample sizes. Although this study used only a small data set of 213 patients, the power analysis was satisfactory, with a calculated power value greater than 0.80. The main reason for the superior performance of the SVM model in this work is that it is a nonlinear learner that is better suited to small samples, can separate samples well, and generalizes better.
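The nonlinearity of an SVM comes from its kernel. The source does not state which kernel was used, so the following is only a sketch that assumes a radial basis function (RBF) kernel, a common choice. Here $\mathbf{x}_i$ are training samples with labels $y_i \in \{-1, +1\}$, $\alpha_i \ge 0$ are the learned coefficients, $b$ is the bias, and $\gamma > 0$ is the kernel width parameter.

```latex
f(\mathbf{x}) = \operatorname{sign}\!\left( \sum_{i=1}^{n} \alpha_i \, y_i \, K(\mathbf{x}_i, \mathbf{x}) + b \right),
\qquad
K(\mathbf{x}_i, \mathbf{x}) = \exp\!\left( -\gamma \, \lVert \mathbf{x}_i - \mathbf{x} \rVert^{2} \right)
```

Only the support vectors have $\alpha_i > 0$, so the decision boundary is determined by a subset of the training samples. This is one commonly cited reason why SVMs can behave well on small data sets, although it does not remove the need for external validation.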

This study still has some limitations. First, it is a single-center retrospective study, and some potential biases cannot be avoided. Second, the sample size is small for machine learning. To further improve the accuracy of the model, we will collect more clinical data and further optimize the parameters.


