image:
Overview of the BehaVERT pipeline. The diagram shows how BehaVERT analyzes animal behavior from video. First, a web-based tool is used to label the skeleton's keypoints and movements. The skeletal coordinates of each video frame are then converted into 768-dimensional “tokens” and fed into a BERT-based transformer model. The model can classify both the behavior in individual frames and the overall state of the entire sequence. Tokens from the final layer are further used for unsupervised clustering and attention analysis, allowing researchers to visualize along a timeline which behaviors the model focused on when making its decisions.
Credit: KAIST
KAIST researchers have developed an artificial intelligence model that can read and interpret animal behavior like a language. The research team created BehaVERT, an AI model that learns from behavioral data much as a language model learns from text. The model independently identified social behavioral deficits in a mouse model of autism, opening new avenues for interpretable neuroscience.
KAIST (President Kwang Hyung Lee) announced that a research team led by Professor Dae-Soo Kim of the Department of Brain and Cognitive Sciences has developed an AI model that interprets animal movements as a kind of behavioral language.
The researchers transformed the mouse’s skeletal movements into tokens that resemble words in natural language and trained a transformer-based model to learn the meaning of the actions. The resulting model, named BehaVERT, successfully identified core social behavioral abnormalities in autism mouse models without being provided with any prior biological knowledge.
The study introduces a novel AI framework for analyzing animal behavior through language-like representations. The model goes beyond simple behavioral classification to reveal biologically meaningful patterns, and it may serve as the basis for next-generation behavioral foundation models applicable to drug discovery, psychiatric research, and behavioral genetics.
Inspired by the idea that animal behavior may have a language-like structure, the researchers trained a BERT-based transformer architecture by representing the positions of a mouse’s nose, ears, spine, limbs, and tail as behavioral tokens.
As a result, BehaVERT learned not only to classify behaviors but also to capture their contextual meaning over time, much as a language model infers meaning from a sequence of words.
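The tokenization step can be pictured with a minimal sketch. It assumes each frame is a dictionary of 2D keypoints and that a single linear projection maps the flattened coordinates to a 768-dimensional token. The keypoint subset, the centering step, the random weights, and the names `make_projection` and `frame_to_token` are illustrative assumptions, not the authors' implementation.

```python
import random

KEYPOINTS = ["nose", "left_ear", "right_ear", "spine", "tail_base"]  # hypothetical subset
TOKEN_DIM = 768  # token width stated in the figure caption


def make_projection(in_dim, out_dim, seed=0):
    """Random weight matrix standing in for a learned linear embedding layer."""
    rng = random.Random(seed)
    scale = in_dim ** -0.5
    return [[rng.gauss(0.0, scale) for _ in range(in_dim)] for _ in range(out_dim)]


def frame_to_token(frame, weights):
    """Centre one frame's (x, y) keypoints, flatten them, and project to a token."""
    xs = [frame[name][0] for name in KEYPOINTS]
    ys = [frame[name][1] for name in KEYPOINTS]
    cx, cy = sum(xs) / len(xs), sum(ys) / len(ys)
    flat = []
    for name in KEYPOINTS:
        x, y = frame[name]
        flat.extend([x - cx, y - cy])  # coordinates relative to the body centre
    return [sum(w * v for w, v in zip(row, flat)) for row in weights]


# Made-up pixel coordinates for a single video frame.
frame = {"nose": (120.0, 80.0), "left_ear": (112.0, 90.0), "right_ear": (128.0, 90.0),
         "spine": (120.0, 130.0), "tail_base": (120.0, 170.0)}
weights = make_projection(in_dim=2 * len(KEYPOINTS), out_dim=TOKEN_DIM)
token = frame_to_token(frame, weights)
print(len(token))  # 768
```

Centering on the body makes the token independent of where the animal stands in the arena, which is one plausible design choice; a real system would learn the projection weights during training.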
The model achieved state-of-the-art performance across five international benchmark datasets covering social interaction, multi-animal behavior, three-dimensional movement analysis, and autism-related behavioral assessment.
Importantly, BehaVERT is also interpretable: researchers can visualize which behavioral cues influenced the model's decisions.
In experiments distinguishing Shank3B knockout autism model mice from healthy control mice, the model consistently focused on mouth-to-mouth contact behavior. This finding is consistent with previous biological studies showing that autism model mice exhibit a lack of social interaction despite maintaining normal approach behavior.
In other words, the AI independently rediscovered important biological properties solely from behavioral observations, without explicit biological instructions.
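One simple way to read such an attention analysis is to average the attention each behavior receives across a sequence and rank the behaviors. The sketch below assumes one behavior label and one attention weight per frame; the labels, the weights, and the function name `attention_by_behavior` are made up for illustration and are not taken from the study.

```python
from collections import defaultdict


def attention_by_behavior(labels, attention):
    """Average the attention each behavior label received, highest first."""
    if len(labels) != len(attention):
        raise ValueError("labels and attention must have the same length")
    totals, counts = defaultdict(float), defaultdict(int)
    for label, weight in zip(labels, attention):
        totals[label] += weight
        counts[label] += 1
    means = {label: totals[label] / counts[label] for label in totals}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Made-up per-frame labels and attention weights for one sequence.
labels = ["approach", "approach", "mouth_contact", "mouth_contact", "rearing", "walking"]
attention = [0.10, 0.10, 0.30, 0.30, 0.10, 0.10]
ranking = attention_by_behavior(labels, attention)
print(ranking[0][0])  # mouth_contact
```

In practice the per-frame weights would come from the transformer's attention maps, typically pooled over heads and layers; how the authors pooled them is not stated in this release.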
The researchers further discovered that the model’s internal representation space organizes behavioral characteristics such as mobility, attention, and social engagement into structured patterns. This suggests that animal behavior, like language, may have an underlying semantic structure.
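To give a sense of how structure in a representation space can be probed, the sketch below clusters toy embeddings with plain k-means (Lloyd's algorithm). The release does not say which clustering method was applied to the final-layer tokens, so k-means, the 2D toy points, and the initial centroids are all assumptions for illustration.

```python
import math


def kmeans(points, initial_centroids, iterations=10):
    """Plain Lloyd's k-means; returns final centroids and a cluster index per point."""
    centroids = list(initial_centroids)  # copy so the caller's list is untouched
    assignment = []
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        # Move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignment


# Toy 2D "embeddings": two well-separated groups standing in for 768-dim tokens.
points = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centroids, assignment = kmeans(points, [points[0], points[3]])
print(assignment)  # [0, 0, 0, 1, 1, 1]
```

Real embeddings would usually be reduced in dimension first, and the number of clusters would have to be chosen or estimated rather than fixed in advance.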
The study also highlights an unusual interdisciplinary achievement. Lead author Dr. Soonjae Shin and other members of the research team were trained primarily in biology rather than artificial intelligence. By teaching themselves the Transformer architecture and deep learning techniques, they designed specialized models and training strategies tailored to behavioral analysis.
Professor Kim’s lab has long pursued AI-based behavioral analysis and previously developed AVATAR, a technology that reconstructs rodent behavior in virtual environments, which led to the founding of Actnova Inc.
“The project started with a simple question: Do animal movements contain language-like structures?” said Dr. Soonjae Shin, lead author of the study.
The team also employed a self-supervised learning framework that allows the AI to learn directly from behavioral data without manual annotation. Furthermore, a model trained on rat behavior was successfully transferred to mouse behavioral analysis, demonstrating the feasibility of a cross-species behavioral foundation model.
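Self-supervised learning of this kind is often implemented as BERT-style masked prediction: some tokens are hidden and the model is trained to recover them from context, so no human labels are needed. The release does not specify the objective the team used, so the sketch below is only an assumed example of the masking step. The string tokens, the 15% ratio, and the name `mask_sequence` are illustrative; real pose tokens would be continuous vectors.

```python
import random

MASK = "[MASK]"


def mask_sequence(tokens, mask_ratio=0.15, seed=0):
    """Hide a random subset of tokens; return the masked sequence and the targets."""
    if not tokens:
        return [], {}
    rng = random.Random(seed)
    n_mask = max(1, round(len(tokens) * mask_ratio))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    masked = list(tokens)
    targets = {}
    for pos in positions:
        targets[pos] = masked[pos]  # what the model must reconstruct
        masked[pos] = MASK
    return masked, targets


# Symbolic stand-ins for 20 consecutive pose tokens.
tokens = [f"pose_{i}" for i in range(20)]
masked, targets = mask_sequence(tokens)
print(len(targets))  # 3
```

The training loss would then compare the model's predictions at the masked positions with the stored targets, which is what lets the data supervise itself.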
“BehaVERT goes beyond classifying behaviors and enables interpretation of their meaning,” said Professor Dae-Soo Kim. “We expect this to become an important research tool for discovering new insights in drug development, psychiatric disorders, behavioral genetics, and many other areas of the life sciences.”
The study was published on March 24, 2026 in the International Journal of Computer Vision (IJCV), one of the world’s leading journals in computer vision.
Paper information
- title: BehaVERT: A transformer-based motion language model for decoding mouse behavioral semantics.
- journal: International Journal of Computer Vision (IJCV)
- DOI: 10.1007/s11263-026-02834-y
Funding
This research was supported by the Mid-Career Researcher Program and the Brain Convergence Technology Development Program through the National Research Foundation of Korea (NRF), funded by the Korean Ministry of Science and ICT (MSIT).
Research method
meta-analysis
Research theme
not applicable
