Article Highlights | April 14, 2026
Comparative analysis of simulated and real psychiatric ward videos reveals clinical limitations of AI behavioral recognition models
Professor Hyun Ghang Jeong, Department of Psychiatry, Korea University College of Medicine (Korea University Guro Hospital)
Credit: KU Medicine
A research team led by Professor Hyun Ghang Jeong of the Department of Psychiatry at Korea University College of Medicine (Korea University Guro Hospital), in collaboration with researchers from GeoVision, has announced the results of a large-scale validation study on the feasibility of using artificial intelligence (AI) for early detection of self-harm in psychiatric wards. The study was published in the international journal Scientific Reports.
Early detection of self-harm in closed psychiatric wards is essential to patient safety. Continuous human monitoring, however, faces structural constraints such as limited staffing and blind spots in observation. To address this, the research team evaluated two questions: how accurately video-based AI behavioral recognition can detect self-harm in a real clinical setting, and whether an AI model trained in a controlled laboratory environment maintains its performance when applied to real hospital footage.
The research team produced 1,120 simulated self-harm video samples in a studio environment designed to closely replicate conditions in a psychiatric ward. In addition, 118 real clinical video samples collected from the closed psychiatric ward of Korea University Guro Hospital were used as validation data. All clinical video data were fully anonymized before analysis, and clinical reliability was ensured by cross-checking the footage against medical records. The team then trained and evaluated six state-of-the-art deep learning-based behavioral recognition models under identical conditions and compared their self-harm detection performance.
The results showed that although the AI models performed relatively well in the simulated studio environment, their performance degraded significantly when applied to real clinical video data. Even the latest transformer-based models struggled to generalize because of the variability of real-world situations, including diverse behavioral patterns, occlusions, and irregular motion. In particular, subtle and repetitive self-harm behaviors such as scratching and skin picking were identified as the most difficult types for existing AI models to detect.
Professor Hyun Ghang Jeong said, “The most important contribution of this study is that we have quantitatively demonstrated both the potential and the limitations of AI-based self-harm detection technology. By systematically comparing simulated and real clinical datasets, we were able to clearly identify what improvements are needed for clinical implementation of AI models at the current level of technology. Furthermore, the studio-based self-harm dataset established through this study is an excellent resource for psychiatry and medical AI, and we hope that it will contribute to the advancement of research.”
The study, titled “Benchmarking behavioral recognition models for self-harm detection in studio and real-world datasets,” was published in the current issue of Scientific Reports.
