Multimodal Deep Learning for COVID-19 Prognosis Prediction in the Emergency Department: A Two-Center Study


This study described the development of an algorithm to predict 30-day mortality and ICU admission for COVID-19 emergency department patients using textual and tabular data.

The use of text data for prognostic purposes was a new aspect of this research.

The main finding was that combining tabular and textual variables improved predictions of 30-day mortality compared to models using only tabular variables. The TensorFlow tabular text model outperformed the tabular model on performance metrics such as specificity, accuracy, F1 score, and MCC for 30-day mortality prediction. For ICU admissions, the combined model had higher precision, specificity, F1 score, and MCC values. Both models were well calibrated and gave low ECE scores.
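The performance metrics named above all derive from the four cells of a binary confusion matrix. A minimal sketch of how they relate is given here; the function name `binary_metrics` and the example labels are illustrative assumptions, not the authors' code or data.

```python
import math


def binary_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary labels (1 = event, 0 = no event)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def safe_div(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": safe_div(tp, tp + fn),
        "specificity": safe_div(tn, tn + fp),
        "precision": safe_div(tp, tp + fp),
        "accuracy": safe_div(tp + tn, len(pairs)),
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),
        # Matthews correlation coefficient uses all four cells.
        "mcc": safe_div(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical labels: 3 events and 5 non-events.
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 1, 0, 0, 0, 0])
```

Because MCC uses true negatives as well as the other three cells, it is less easily inflated by class imbalance than accuracy, which is one reason it is often reported alongside the F1 score.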

Methodologies for analyzing text data, known as natural language processing (NLP), are commonly used for entity recognition, literature-based discovery, and question answering.25,26 However, their potential to predict outcomes in patients with COVID-19 has not been fully explored. For example, Izquierdo et al.24 used NLP to identify and clinically characterize COVID-19 patients from a general cohort. These characteristics were input into a decision tree algorithm, and admission to the ICU was identified as the risk factor most associated with adverse outcomes. Similarly, Graziani et al.27 investigated the impact of COVID-19 on COPD patients and used NLP to identify key characteristics of that population. The identified features were correlated with patient outcomes by multivariate analysis and were associated with increased hospitalization and mortality. Ancochea et al.28 used NLP to identify patient characteristics from EMRs and addressed the effect of gender on COVID-19 incidence and severity. The primary objective of these investigations was to identify risk factors that may have influenced patient outcomes.

In contrast to previous studies, this study used NLP analysis of text data to process latent clinical information and predict 30-day mortality and ICU admission in patients with COVID-19. This highlights the importance of including text data in prognostic prediction of COVID-19. A previous study by Silverman et al.29 also demonstrated the link between text data and outcome prediction by using NLP techniques to extract COVID-19 symptoms and correlate them with in-hospital mortality and ventilator use. These authors observed associations between textual data and outcomes and emphasized the importance of textual contributions. Similarly, the TensorFlow tabular text algorithm developed in this study can handle clinical information such as patient history, physical examinations, specialist consultations, and radiological reports, which may provide important information relevant to clinical outcomes. Previously, optimizing such an approach required manual annotation.30 However, the algorithm developed in this study exploited the predictive power of the textual variables contained in the EMR more efficiently, without requiring manual corrections.
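The idea of feeding free text and tabular values to a single model without manual annotation can be illustrated with a small input-fusion sketch. This is not the authors' TensorFlow architecture: a hashed bag-of-words stands in for whatever learned text encoder they used, the five field names are taken from the tabular predictors listed later in this discussion, and the function names, bucket count, and example values are all hypothetical.

```python
import hashlib
import re

# Field names follow the tabular predictors named in the article;
# the keys themselves are illustrative assumptions.
TABULAR_FIELDS = ("age", "creatinine", "crp", "hemoglobin", "platelets")


def hash_text(note, n_buckets=16):
    """Turn a free-text note into a fixed-length vector of token counts.

    Each token is hashed into one of n_buckets slots, so no vocabulary
    or manual annotation is needed. A learned text encoder would
    normally replace this step.
    """
    vec = [0.0] * n_buckets
    for token in re.findall(r"[a-z]+", note.lower()):
        digest = hashlib.md5(token.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % n_buckets] += 1.0
    return vec


def fuse(record, note, n_buckets=16):
    """Concatenate tabular values and the text vector into one input."""
    tabular = [float(record[name]) for name in TABULAR_FIELDS]
    return tabular + hash_text(note, n_buckets)


# Hypothetical patient record and radiology note.
record = {"age": 70, "creatinine": 1.2, "crp": 85, "hemoglobin": 13.1, "platelets": 210}
features = fuse(record, "Bilateral interstitial pneumonia on chest X-ray")
```

The resulting vector has one slot per tabular predictor followed by the text slots, so a downstream classifier sees both sources at once. In practice the tabular values would also be standardized before fusion.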

According to the performance metrics shown in Table 2, the combined tabular and textual model in TensorFlow was more effective than the model that used only tabular data in predicting 30-day mortality in COVID-19 patients. This finding is consistent with that of Sung et al.,31 who compared the effectiveness of a classical ML model and an NLP-enhanced ML model in predicting acute ischemic stroke prognosis and found that incorporating a textual predictor (i.e., history of present illness) improved the algorithm's AUC. However, it is important to note that stroke is a well-studied disease, whereas COVID-19 was still a new disease at the time of data collection. This may have led medical professionals to use more simplified descriptions in notes and radiology reports, reducing the amount of information suitable for accurate outcome prediction.

The increased specificity observed in the current TensorFlow tabular text model compared with the tabular-only model may have important clinical implications. For example, it may help ED physicians determine the optimal time to initiate positive pressure ventilation32,33 and, based on the predicted prognosis, select the ward that best fits the patient's condition, such as general medicine, semi-intensive care unit, or ICU. In addition, the combined algorithm has higher accuracy values and fewer false positives, which helps to distinguish between patients requiring hospitalization and those eligible for discharge from acute care hospitals, potentially contributing to optimized resource allocation. The F1 score and MCC values, which provide a more comprehensive assessment of the algorithm's predictions, further support these findings. Integrating textual and tabular data may therefore improve the prediction of negative outcomes in COVID-19 patients.

ML algorithms typically use tabular data to predict the outcome of various diseases such as sepsis and upper gastrointestinal bleeding.26,27 Since the advent of SARS-CoV-2, numerous ML models have been developed for the early recognition, diagnosis, and prognosis of COVID-19, ranging from the detection of interstitial pneumonia on chest X-ray images to models based on tabular databases.28,29,31,34,35,36

We found that the average performance of the TensorFlow tabular and XGBoost models was consistent with previously published results.16,37,38,39 The AUC values of the present TensorFlow tabular model were higher than those reported by Vaid et al., who used a cohort from the first outbreak in New York City. However, that study focused on predicting 10-day mortality from COVID-19.32 In comparison with the results of the current study, Gao et al.33 used a neural network-based algorithm similar to ours and obtained better AUC, sensitivity, specificity, and F1 scores, but they used more predictors, which may have led to model overfitting.40

The choice of predictors may explain the different prognostic abilities of different tabular models. In our study, we developed a TensorFlow tabular model using age, plasma creatinine, C-reactive protein, hemoglobin, and platelet values. These predictors were selected on the basis of previous research results41,42,43,44,45,46,47,48 and because they are immediately available in emergency settings. However, other variables such as myoglobin, ferritin, and troponin47,49,50 are also known to influence the prognosis of COVID-19.

Our data were obtained from EDs of two academic hospitals, providing a larger cohort for model training and validation and improving the applicability of the algorithm to other cohorts. However, we had to overcome the challenges of differences in the structure of EMR and radiology reports, as well as slight differences in the language used between the two centres. The latter includes variations in the use of single terms and language syntax.

The results show that the TensorFlow tabular text model outperforms models using only tabular variables when it comes to predicting ICU admissions. It is important to emphasize the complexity of predicting ICU admission because of confounding factors not necessarily related to disease severity, such as the lack of available beds and ventilators, which frequently occurred in the early stages of the initial COVID-19 outbreak in Italy.36 Other studies have considered ICU admission as an adverse outcome after COVID-19 pneumonia. For example, Li et al.51 developed a convolutional neural network with ICU admission as the primary outcome in COVID-19. Their model showed similar but not identical performance compared with the models developed in this study. The differences from our results can be explained by the higher proportion of patients admitted to the ICU and the different predictors used in that study.

Finally, the low ECE scores of both models suggest that their predicted probabilities closely match the observed outcome frequencies. Such calibration is very important for ED physicians, who need reliable risk estimates to make safe decisions about admitting or discharging patients.
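For readers unfamiliar with the metric, the expected calibration error (ECE) can be sketched as the weighted average gap between predicted risk and observed event rate across probability bins. The version below is one common binary formulation with equal-width bins; the article does not state which variant or how many bins the authors used, so the function name, the default of 10 bins, and the example values are assumptions.

```python
def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Binary ECE with equal-width bins over the predicted probability.

    For each bin, compare the mean predicted probability with the
    observed event rate, then weight the gap by the bin's share of
    all samples.
    """
    n = len(y_true)
    bins = [[] for _ in range(n_bins)]
    for label, prob in zip(y_true, y_prob):
        # A probability of exactly 1.0 goes into the last bin.
        index = min(int(prob * n_bins), n_bins - 1)
        bins[index].append((label, prob))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        observed = sum(label for label, _ in members) / len(members)
        predicted = sum(prob for _, prob in members) / len(members)
        ece += (len(members) / n) * abs(observed - predicted)
    return ece


# Hypothetical predictions: low risk for survivors, high risk for deaths.
ece = expected_calibration_error([0, 0, 1, 1], [0.1, 0.1, 0.9, 0.9])
```

An ECE of 0 would mean that, within every bin, the average predicted risk equals the fraction of patients who actually experienced the outcome. A low ECE says nothing about discrimination, so it complements rather than replaces metrics such as AUC or MCC.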

Limitations and future prospects

A limitation of the present study is that external validation is required to confirm the reliability of the results beyond the two centers where data were collected. Furthermore, the inherent nature of neural networks limits model interpretability and makes it impossible to establish causal relationships between predictors and outcomes.

Furthermore, because the current dataset was collected during the early stages of the pandemic in northern Italy, predictions of outcomes are limited to the specific characteristics of the circulating virus and the treatments available at that time. In addition, the absence of vaccinated patients in the dataset means that the potential protective effect of the vaccine is not reflected in the results.

From a clinical perspective, applying the same algorithm to data collected during different waves of COVID-19 could provide valuable information about vaccine efficacy and disease progression due to new variants. In the future, combining textual and tabular data analysis could help clarify the clinical progression of COVID-19 in hospitalized patients infected with new variants and optimize antiviral therapy.

Finally, algorithms that combine text and tabular predictors are not limited to COVID-19 prognosis prediction. As new data become available, language models can be trained to analyze EMRs associated with other conditions commonly encountered in the ED, such as heart failure, sepsis, pneumonia, COPD, and syncope. When dealing with these diseases, emergency physicians could then better stratify patient risk and make more informed decisions about admission or discharge. Ultimately, this could lead to better utilization of hospital resources and a more efficient healthcare system.

Conclusion

The results of this study suggest that a combined analysis of tabular and textual data is effective in predicting 30-day mortality and ICU admission in SARS-CoV-2-infected patients presenting to the emergency department. The combination of tabular and textual data is, at least in part,24 novel in clinical studies dealing with the COVID-19 outbreak.

Our findings suggest that NLP may also have potential prognostic applications for other common diseases found in emergency settings, as pointed out in a recent systematic review.26

Future studies will be needed to improve the accuracy of the algorithm and confirm its generalizability through externally validated cohorts, especially as the epidemiology of the pandemic changes.


