NASHmap: The Clinical Utility of Machine Learning Models to Identify Patients at Risk for NASH in Real-World Settings

Machine Learning

Performance of NASHmap in NIDDK patients with biopsy-confirmed NASH status

The cohort used to test the performance of NASHmap in the NIDDK NAFLD Adult Database consisted of 281 patients, including 181 with biopsy-confirmed NASH and 100 with biopsy-confirmed non-NASH (Table 1).

Table 1 Clinical characteristics of study cohorts by clinical status.

NASHmap performance, as assessed by area under the curve (AUC), was 0.82. Accuracy was 75% (210/281), sensitivity was 81% (147/181), precision, or positive predictive value (PPV), was 80% (147/184), and the negative predictive value (NPV) was 65% (63/97) [8]. When the key feature HbA1c was treated as missing and imputed for the test patients (N = 281), the AUC was 0.79 using K-nearest neighbor (kNN) imputation, 0.77 using the median dataset value, and 0.76 using the mean dataset value.
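To make the three imputation strategies concrete, here is a minimal sketch using only the Python standard library. The reference rows, the choice of AST and ALT as the observed features, the Euclidean distance, and `k=2` are all assumptions made for illustration; the source does not describe how the kNN imputation was configured, and the toy values are not NIDDK data.

```python
import math
import statistics

# Hypothetical toy rows: (AST U/L, ALT U/L, HbA1c %).
# Invented for illustration only; NOT taken from the NIDDK cohort.
reference = [
    (30.0, 28.0, 5.4),
    (35.0, 40.0, 5.6),
    (70.0, 95.0, 6.8),
    (80.0, 110.0, 7.2),
    (60.0, 85.0, 6.5),
]

def impute_mean(rows):
    """Replace a missing HbA1c with the dataset mean."""
    return statistics.mean(r[2] for r in rows)

def impute_median(rows):
    """Replace a missing HbA1c with the dataset median."""
    return statistics.median(r[2] for r in rows)

def impute_knn(rows, ast, alt, k=2):
    """Average the HbA1c of the k rows closest in (AST, ALT) space.

    Assumption: unscaled Euclidean distance on two features. A real
    pipeline would usually scale features and use more of them.
    """
    nearest = sorted(rows, key=lambda r: math.dist((r[0], r[1]), (ast, alt)))[:k]
    return statistics.mean(r[2] for r in nearest)

# A patient with elevated enzymes and an unmeasured HbA1c.
print(f"mean:   {impute_mean(reference):.2f}")
print(f"median: {impute_median(reference):.2f}")
print(f"kNN:    {impute_knn(reference, ast=75.0, alt=100.0):.2f}")
```

Mean and median imputation assign every patient the same value, whereas kNN borrows from patients with a similar profile. That would be consistent with kNN preserving slightly more of the AUC in the results above, although the source does not give this explanation.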

NASHmap showed good performance in correctly classifying patients according to their clinical status. Eighty-one percent (147/181) of NASH patients and 63% (63/100) of non-NASH patients were correctly classified as NASH and non-NASH, respectively (Table 2). Good performance was also achieved in cohorts stratified by T2DM status. NASH was correctly predicted in 86% (72/84) of NASH patients with T2DM compared with 77% (75/97) of NASH patients without T2DM (Table 3).

Table 2 Performance of NASHmap in classifying cohort patients with known status from NIDDK.
Table 3 Performance of NASHmap in classifying patients in the NIDDK cohort stratified by T2DM status.
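The classification counts reported above (147 of 181 NASH patients and 63 of 100 non-NASH patients correctly classified) are enough to rebuild the confusion matrix and recompute the threshold-based metrics. The sketch below does that arithmetic. The variable names are illustrative, and specificity is derived here rather than quoted from the source.

```python
# Counts from the text for the NIDDK test cohort (N = 281).
tp = 147            # NASH patients predicted as NASH
fn = 181 - tp       # NASH patients predicted as non-NASH
tn = 63             # non-NASH patients predicted as non-NASH
fp = 100 - tn       # non-NASH patients predicted as NASH

metrics = {
    "accuracy": (tp + tn) / (tp + tn + fp + fn),   # 210/281
    "sensitivity": tp / (tp + fn),                 # 147/181
    "specificity": tn / (tn + fp),                 # 63/100 (derived)
    "ppv": tp / (tp + fp),                         # 147/184
    "npv": tn / (tn + fn),                         # 63/97
}

# Sensitivity stratified by T2DM status, from the counts in the text.
sensitivity_t2dm = 72 / 84
sensitivity_no_t2dm = 75 / 97

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
print(f"sensitivity (T2DM): {sensitivity_t2dm:.1%}")
print(f"sensitivity (no T2DM): {sensitivity_no_t2dm:.1%}")
```

Note that the PPV denominator is all patients predicted as NASH (147 true positives plus 37 false positives), not the 181 patients who actually have NASH.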

To investigate potential sources of NASHmap errors, we compared the mean of each of the 14 features across classification groups. In the biopsy-confirmed NASH subcohort, patients correctly classified by NASHmap (true positives) had a clinical profile consistent with NASH (Table 4). For example, the mean ± SD values for AST and ALT in this group were 75.88 ± 49.64 U/L and 104.09 ± 52.77 U/L, respectively, well above the upper limit of normal of 35 U/L. Misclassified (false-negative) patients had a clinical profile closer to non-NASH for several features, with significantly lower mean HbA1C, AST, ALT, total protein, and albumin than true-positive NASH patients (Table 4 and Supplementary Fig. 1). Table 4 lists the clinical variables in order of predictive importance in NASHmap. Of the five features with the highest predictive value (HbA1C, AST, ALT, total protein, and AST/ALT ratio), three (AST, ALT, and total protein) showed a statistically significant difference (Table 4).

Table 4 Characteristics of NIDDK patients correctly classified and misclassified by NASHmap.

Similarly, the biopsy-confirmed non-NASH patients misclassified as NASH (false positives) had clinical profiles consistent with the NASHmap predictions, with significantly higher mean HbA1C, AST, ALT, and total protein than correctly classified patients (true negatives). There was a statistically significant difference between true negatives and false positives for all five features with the highest predictive value (Table 4).

Metabolic comorbidities common in NASH, such as hypertension and obesity [1,9,10], did not appear to correlate with patient misclassification by NASHmap. BMI, triglyceride levels, and hypertension rates were all similar between true positives and false negatives. Interestingly, the rate of hypertension was lowest in the false-positive group (Table 4).

Performance in real-world Optum electronic medical records

In the Optum database, 13.72 million patients met the inclusion and exclusion criteria, and 3.14 million had data available for all 14 NASHmap features. Among them, 1,016 patients had biopsy-confirmed NASH, 21,930 had an ICD-10 diagnosis of NASH with no documented liver biopsy, and 2,886,653 were undiagnosed (Table 1). Patients with biopsy-confirmed NASH and ICD-10 NASH had similar clinical profiles, but undiagnosed patients had a slightly lower BMI, a lower rate of type 2 diabetes, a lower mean HbA1C, and an older mean age than the NASH cohorts. Non-NASH and undiagnosed patient subcohorts from NIDDK and Optum appeared to have a lower mean BMI and fewer NASH-related comorbidities (Table 1).

Of the Optum patients with biopsy-confirmed NASH, 72% (727/1016) were correctly classified by NASHmap (Table 5), a lower rate than the 81% (147/181) of NASH patients correctly classified in the NIDDK test cohort (Table 2). The AUC was also slightly lower, at 0.76 [8]. Irregular capture of features and reliance on medical coding for diagnosis in real-world databases may explain some of this difference. The reasons for NASHmap misclassification of Optum patients were consistent with those for NIDDK patients. False-negative patients had a clinical profile with more values within the normal range than true positives. For example, the mean ± SD values for the false-negative group were 27.63 ± 19.85 U/L for AST and 30.55 ± 22.93 U/L for ALT, while the values for the true-positive group were 66.63 ± 81.18 U/L and 72.22 ± 59.19 U/L, respectively. Fewer values outside the normal range were found in the false-negative group than in the true-positive group (% difference, % true positives − % false negatives): HbA1C (32%), AST (53%), and ALT (51%) (Table 5).

Table 5 Clinical characteristics of Optum real-world database patients predicted as NASH or non-NASH by NASHmap.
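As a rough illustration of how far apart the true-positive and false-negative Optum groups are, the reported means and SDs can be put through a Welch two-sample t statistic computed from summary statistics alone. This is a sketch under stated assumptions: the source does not say which statistical test the authors used, the group sizes (727 and 1016 − 727 = 289) assume every patient contributes an AST and ALT value, and the function name `welch_t` is illustrative.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom,
    computed from group means, standard deviations, and sizes."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2   # squared standard errors
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

n_tp = 727           # biopsy-confirmed NASH predicted as NASH
n_fn = 1016 - n_tp   # biopsy-confirmed NASH predicted as non-NASH

# Mean and SD values (U/L) as reported in the text.
t_ast, df_ast = welch_t(66.63, 81.18, n_tp, 27.63, 19.85, n_fn)
t_alt, df_alt = welch_t(72.22, 59.19, n_tp, 30.55, 22.93, n_fn)

print(f"AST: t = {t_ast:.2f}, df = {df_ast:.0f}")
print(f"ALT: t = {t_alt:.2f}, df = {df_alt:.0f}")
```

Both statistics come out large (on the order of 12 for AST and 16 for ALT), consistent with the clear separation described in the text. Because the SDs are close to or exceed the means, the underlying distributions are probably right-skewed, so treat a t-based comparison as indicative only.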

NASHmap prediction of NASH in undiagnosed patients in the Optum database

All undiagnosed patients in the Optum subcohort selected for this study had possible NASH due to the presence of comorbidities, but none had an ICD code for a NASH or NAFLD diagnosis in their electronic medical records (see Methods and Supplementary Methods). NASHmap was therefore used to determine the number of predicted NASH patients among these undiagnosed patients. Approximately 31% (883,867 of 2,886,653) were predicted to have NASH (Table 5). Unlike the predicted non-NASH patients, these predicted NASH patients had mean AST and ALT levels above the normal range (41.21 ± 144.28 U/L and 42.44 ± 99.86 U/L), and 66% (585,272/883,867) had T2DM, in contrast to 53% (1,532,744/2,886,653) of the total undiagnosed population.

In the subcohort of undiagnosed patients with T2DM, 38% (585,272 of 1,532,744) were predicted to have NASH, a slightly higher percentage than in the overall undiagnosed patient population (Table 6). Within the T2DM cohort, there were few clinical differences between patients with predicted and biopsy-confirmed NASH. Of the NASHmap-predicted patients, 51% were female, whereas 66% of the biopsy-confirmed NASH patients were female. We found fewer values outside the normal range for AST and ALT in NASH-predicted patients compared to patients with biopsy-confirmed NASH (% difference, % true positives − % false negatives): AST (27%) and ALT (21%). However, the AST/ALT ratio was outside the normal range in a slightly higher proportion of patients with predicted NASH (43%) than of patients with biopsy-confirmed NASH (37%).

Table 6 Profiles of patients with biopsy-confirmed and predicted NASH in the Optum database T2DM cohort.
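The counts quoted for the undiagnosed Optum cohort can be cross-checked with a few lines of arithmetic. The sketch below recomputes the reported percentages. It also derives the predicted-NASH rate among undiagnosed patients without T2DM, a figure the source does not report, which is obtained here by subtraction and assumes the T2DM and non-T2DM groups partition the cohort.

```python
# Counts as reported in the text.
undiagnosed_total = 2_886_653
undiagnosed_t2dm = 1_532_744
predicted_nash_total = 883_867
predicted_nash_t2dm = 585_272

# Derived by subtraction (not reported in the source).
undiagnosed_no_t2dm = undiagnosed_total - undiagnosed_t2dm
predicted_nash_no_t2dm = predicted_nash_total - predicted_nash_t2dm

rate_overall = predicted_nash_total / undiagnosed_total
rate_t2dm = predicted_nash_t2dm / undiagnosed_t2dm
rate_no_t2dm = predicted_nash_no_t2dm / undiagnosed_no_t2dm

# Share of patients with T2DM: predicted NASH vs. all undiagnosed.
t2dm_share_predicted = predicted_nash_t2dm / predicted_nash_total
t2dm_share_overall = undiagnosed_t2dm / undiagnosed_total

print(f"Predicted NASH, all undiagnosed: {rate_overall:.1%}")
print(f"Predicted NASH, with T2DM:       {rate_t2dm:.1%}")
print(f"Predicted NASH, without T2DM:    {rate_no_t2dm:.1%}")
print(f"T2DM share among predicted NASH: {t2dm_share_predicted:.1%}")
print(f"T2DM share among all undiagnosed: {t2dm_share_overall:.1%}")
```

On these counts, the predicted-NASH rate without T2DM works out to about 22%, against 38% with T2DM. That makes the contrast between the groups clearer than the comparison with the pooled 31% alone.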
