Evaluation of the use of body mass index changes as a proxy for anorexia nervosa: a machine learning perspective | Journal of eating disorders



The use of ΔBMI as the sole outcome measure of therapeutic efficacy in anorexia nervosa (AN) warrants careful consideration. From a practical standpoint, the rationale appears well grounded. In severe cases (mean BMI at hospitalization in this study = 14.13), maintaining weight and achieving small gains within a short hospital stay can represent meaningful clinical milestones. This view is supported by previous findings suggesting that short-term stabilization may be an appropriate proximal target in very low-weight patients [2, 3].

However, equating BMI stabilization (ΔBMI = 0) with successful treatment can oversimplify the complexity of recovery. While weight restoration is indeed a key component of treatment, it does not cover the broader psychological, behavioral, and functional aspects of recovery. For example, two patients with similar BMI gains may differ significantly in cognitive rigidity, fear of weight gain, social functioning, or body image disturbance [4].

Furthermore, the actual magnitude of the BMI change observed in this study (mean difference = 0.35) is clinically modest, corresponding to a weight gain of approximately 1 kg. This raises concerns about whether such a change is robust enough to reflect a true therapeutic response, or whether it instead represents natural fluctuation or changes in hydration. In addition, the literature emphasizes sustained psychological and behavioral improvements, including normalized eating behavior, improved emotional regulation, and enhanced social reintegration, as key markers of recovery [5, 6]. Reliance on a weight-based classification may omit these domains and either overestimate treatment efficacy or fail to detect partial responders.
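
The "approximately 1 kg" figure follows from the definition BMI = weight / height², so a BMI change converts to a weight change as ΔBMI × height². The short sketch below works through that arithmetic. The heights are assumptions chosen for illustration only, since the commentary does not report patient heights, and `weight_change_kg` is a hypothetical helper name.

```python
def weight_change_kg(delta_bmi, height_m):
    """Convert a BMI change (kg/m^2) into a weight change (kg) for a given height."""
    return delta_bmi * height_m ** 2


MEAN_DELTA_BMI = 0.35  # mean difference quoted in the commentary

# Assumed heights (metres); illustrative values, not data from the study.
for height_m in (1.55, 1.65, 1.75):
    gain = weight_change_kg(MEAN_DELTA_BMI, height_m)
    print(f"height {height_m:.2f} m: about {gain:.2f} kg")
```

Across this assumed range the gain stays close to 1 kg, which is small enough that day-to-day fluctuation in fluid balance could plausibly account for a meaningful share of it.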

Another limitation is the lack of follow-up data after discharge. The authors openly acknowledge this, but it still limits the usefulness of the predictive models for inferring long-term outcomes. This matters because relapse rates are high, and a central concern is that weight gained during hospitalization is often not maintained [7]. Therefore, using short-term ΔBMI as a proxy for treatment success does not fully capture meaningful recovery trajectories. Future models should aim to integrate multidimensional outcomes such as psychometric scores and relapse rates.

Opportunities and challenges in applying machine learning to eating disorder treatment

The incorporation of ML methods into eating disorder research by Brizzi et al. is consistent with a broader trend towards personalized, data-driven healthcare. In contrast to traditional hypothesis-driven statistics, ML excels at capturing complex, nonlinear interactions among high-dimensional variables [8]. The authors appropriately employ cross-validation, hyperparameter tuning, and class stratification to support robustness and generalizability, which is essential for small clinical datasets. The reported performance metrics suggest promising clinical utility, particularly the high recall (94%) and accuracy (79%) for classifying "improvement" cases.
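
To make the stratification step concrete, the sketch below deals samples into folds class by class so that each fold keeps roughly the same improved-to-worsened ratio as the full sample. It is a minimal pure-Python illustration, not the authors' pipeline: the class sizes (53 and 19) come from the commentary, while the choice of 5 folds, the seed, and the helper name `stratified_folds` are assumptions. In practice a library routine such as scikit-learn's `StratifiedKFold` would normally be used instead.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign sample indices to k folds while keeping class ratios similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal this class round-robin; the offset keeps fold sizes balanced
        # when class sizes are not multiples of k.
        for i, idx in enumerate(indices):
            folds[(i + offset) % k].append(idx)
        offset += len(indices)
    return folds


# Class sizes reported in the commentary; k = 5 is an assumption.
labels = ["improved"] * 53 + ["worsened"] * 19
folds = stratified_folds(labels, k=5)

for fold_id, fold in enumerate(folds):
    n_worsened = sum(labels[i] == "worsened" for i in fold)
    print(f"fold {fold_id}: {len(fold)} patients, {n_worsened} worsened")
```

With only 19 worsened cases, each test fold contains just three or four of them, so per-fold estimates for that class rest on very few patients even when stratification is done correctly.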

Notably, the study also draws on the growing field of explainable artificial intelligence (XAI) by integrating SHapley Additive exPlanations (SHAP) to interpret individual-level predictions. This is a laudable move toward transparency and user trust, two important factors in the clinical acceptance of artificial intelligence models. SHAP allows clinicians to see which features contribute most to a particular prediction, increasing the interpretability of the model and mitigating the "black box" criticism often leveled at ML applications [9].
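
The idea behind SHAP can be illustrated with an exact Shapley value computation on a toy model. The sketch below is an assumption-laden illustration, not the `shap` package and not the authors' model: `toy_model`, its coefficients, the three feature names, and the patient and baseline values are all hypothetical. Each feature's attribution is its average marginal contribution over all subsets of the other features, with absent features held at a baseline value.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for one prediction (feasible for few features)."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the patient's values; the rest stay at baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi[i] += weight * (value(members | {i}) - value(members))
    return phi


def toy_model(features):
    """Hypothetical risk score with one interaction term (illustration only)."""
    anxiety_score, admission_bmi, age_scaled = features
    return 0.5 * anxiety_score - 0.2 * admission_bmi + 0.1 * anxiety_score * age_scaled


patient = [2.0, 14.0, 1.0]    # hypothetical patient
baseline = [0.0, 15.0, 0.0]   # hypothetical reference values

phi = shapley_values(toy_model, patient, baseline)
for name, contribution in zip(["anxiety_score", "admission_bmi", "age_scaled"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The attributions sum to the difference between the patient's score and the baseline score, which is the additive property that lets a clinician read a prediction as a set of per-feature contributions. The interaction term is split between the two features involved, a reminder that attributions describe the model's behavior rather than independent effects.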

However, several caveats deserve emphasis. First, the sample size (n = 72) remains relatively small for high-dimensional ML modeling. Despite the use of repeated stratified k-fold cross-validation, the limited sample size restricts the model's ability to generalize to, among others, outpatient settings, different age groups, or cross-cultural contexts [10, 11]. External validation using multicenter data is essential before the model can be confidently deployed in clinical practice [12].

Second, class imbalance (53 improved vs. 19 worsened) raises another methodological concern. Imbalanced datasets can distort model performance, particularly specificity [13]. The authors report a specificity of only 33% for the worsened class, indicating that the model struggled to identify patients at risk of worsening. In a real clinical setting, such missed cases (false negatives for deterioration) are particularly troublesome because they may lead clinicians to underestimate the need for intervention.
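
The effect of the imbalance on headline metrics can be seen with a small calculation. In the sketch below, only the class sizes (53 and 19) come from the commentary. The `illustrative` confusion counts are hypothetical values chosen to sit near the reported rates; they are not the authors' confusion matrix. The `majority` case is a baseline that labels every patient as improved, and `summarize` is a hypothetical helper.

```python
def summarize(confusion):
    """Compute metrics from counts keyed by (true_label, predicted_label)."""
    tp = confusion[("improved", "improved")]
    fn = confusion[("improved", "worsened")]
    tn = confusion[("worsened", "worsened")]
    fp = confusion[("worsened", "improved")]
    recall = tp / (tp + fn)          # improved patients correctly identified
    specificity = tn / (tn + fp)     # worsened patients correctly identified
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "recall_improved": recall,
        "specificity": specificity,
        "balanced_accuracy": (recall + specificity) / 2,
    }


# Baseline that predicts "improved" for everyone (class sizes from the commentary).
majority = {
    ("improved", "improved"): 53, ("improved", "worsened"): 0,
    ("worsened", "worsened"): 0, ("worsened", "improved"): 19,
}

# Hypothetical counts near the reported rates; NOT the study's actual results.
illustrative = {
    ("improved", "improved"): 50, ("improved", "worsened"): 3,
    ("worsened", "worsened"): 6, ("worsened", "improved"): 13,
}

for name, confusion in (("majority baseline", majority), ("illustrative", illustrative)):
    metrics = summarize(confusion)
    print(name, {key: round(val, 3) for key, val in metrics.items()})
```

Because about three quarters of the sample improved, a rule that never predicts worsening already reaches roughly 74% accuracy, so accuracy alone says little about how well at-risk patients are detected. Balanced accuracy, which averages recall and specificity, makes the weakness on the minority class visible.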

Third, SHAP offers local interpretability but does not in itself support causal inference. Identifying a variable (e.g., physical anxiety) as a key feature for prediction does not establish it as a modifiable treatment target. Therefore, careful clinical validation and, ideally, prospective testing are required to translate predictive insights into practical treatment recommendations [14].

Finally, integrating ML models into everyday clinical workflows remains an important challenge. The model must be embedded in an electronic health record system, interfaced with real-time patient data, and presented in a clinician-friendly way. Additionally, issues related to data privacy, regulatory compliance, and algorithmic bias must be addressed carefully. The authors are appropriately cautious in positioning ML as a decision-support tool rather than a decision-making authority.


