AI reconstructs ‘high-quality’ video directly from brain readings, study finds

Researchers used generative AI to reconstruct “high-quality” videos from brain activity, a new study reports.

Researchers Jiaxin Qing, Zijiao Chen, and Juan Helen Zhou from the National University of Singapore and the Chinese University of Hong Kong used fMRI data and the text-to-image AI model Stable Diffusion to create a model called MinD-Video, which reconstructs video from brain readings. A paper describing the work was posted to the arXiv preprint server last week.

A demonstration on the paper’s accompanying website shows the similarities between the videos shown to subjects and the AI-generated videos reconstructed from their brain activity. The differences between each pair are minor; the two videos mostly share subject matter and color palettes.

The researchers describe MinD-Video as “a two-module pipeline designed to bridge the gap between image brain decoding and video brain decoding.” To train the system, they used a publicly available dataset containing videos and the fMRI brain readings of subjects who watched them. The two-module pipeline consists of a trained fMRI encoder and a fine-tuned version of Stable Diffusion, a widely used image-generation AI model.
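
The paper lays out MinD-Video’s exact architecture, but the general shape of such a two-module pipeline can be sketched in a few lines: an fMRI encoder compresses voxel activity into embeddings formatted like the text conditioning that Stable Diffusion’s cross-attention layers already accept. The minimal PyTorch sketch below illustrates that idea only; the FMRIEncoder class, its dimensions, and the voxel count are hypothetical stand-ins, not the authors’ code.

```python
# Illustrative sketch only -- names, dimensions, and architecture are
# assumptions for exposition, not the MinD-Video authors' implementation.
import torch
import torch.nn as nn

class FMRIEncoder(nn.Module):
    """Hypothetical module 1: map fMRI voxel activity to a sequence of
    embeddings shaped like the text embeddings Stable Diffusion expects."""
    def __init__(self, n_voxels=4096, n_tokens=77, embed_dim=768):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(n_voxels, 2048),
            nn.GELU(),
            nn.Linear(2048, n_tokens * embed_dim),
        )
        self.n_tokens, self.embed_dim = n_tokens, embed_dim

    def forward(self, fmri):                  # fmri: (batch, n_voxels)
        x = self.proj(fmri)
        return x.view(-1, self.n_tokens, self.embed_dim)

encoder = FMRIEncoder()
fmri_scan = torch.randn(1, 4096)              # stand-in for one fMRI frame
conditioning = encoder(fmri_scan)             # shape: (1, 77, 768)

# Module 2 (not shown): a fine-tuned Stable Diffusion denoiser would take
# `conditioning` through its cross-attention layers, exactly where text
# embeddings normally go, so generation is guided by brain activity.
print(conditioning.shape)
```

The design point this illustrates is why a fine-tuned image model can be reused at all: if the encoder’s output matches the (77, 768) shape of Stable Diffusion’s usual CLIP text embeddings, the generator can be conditioned on brain activity with fine-tuning rather than training from scratch.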

A video released by the researchers shows an original clip of a horse in a field alongside a reconstructed video of the horse in more vibrant colors. In another, a car drives through a wooded area, and the reconstructed video shows a first-person view of someone driving down a winding road. The researchers judged the reconstructed videos to be of “high quality” as defined by their motion and scene dynamics, and reported a video accuracy of 85 percent, an improvement over previous approaches.

“From neuroscience to brain-computer interfaces, we believe there are promising applications in this field as large models are developed,” the authors write.

Specifically, they say, the results reveal three major findings. One is the dominance of the visual cortex, confirming this part of the brain as a major component of visual perception. Another is that the fMRI encoder works hierarchically, starting with structural information and progressing to more abstract visual features in deeper layers. Finally, the authors found that the fMRI encoder evolved through each stage of learning, developing the ability to pick up more nuanced information as training continued.
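
That hierarchy finding is the kind of behavior typically probed by reading out activations layer by layer. As a generic illustration of the technique (not the authors’ analysis code, and with a toy stacked-Linear encoder as a stand-in), forward hooks in PyTorch capture each layer’s output so that early structural representations can be compared with deeper, more abstract ones:

```python
# Toy layer-by-layer probe -- a generic technique for inspecting how
# representations change with depth, not the MinD-Video authors' analysis.
import torch
import torch.nn as nn

encoder = nn.Sequential(
    nn.Linear(4096, 2048), nn.GELU(),   # early layers: structural info
    nn.Linear(2048, 1024), nn.GELU(),   # deeper layers: more abstract
    nn.Linear(1024, 768),
)

activations = {}

def save_activation(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

# Register a forward hook on every layer to record its output.
for i, layer in enumerate(encoder):
    layer.register_forward_hook(save_activation(f"layer_{i}"))

encoder(torch.randn(1, 4096))           # one stand-in fMRI frame

for name, act in activations.items():
    print(name, tuple(act.shape))       # compare per-layer representations
```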

This research builds on previous advances in the field of using AI to read people’s minds. Previously, researchers at Osaka University reconstructed high-resolution images from brain activity using a method that combined fMRI data with Stable Diffusion.

The augmented Stable Diffusion model used in this new study allows for more accurate visualizations. “One major advantage of our Stable Diffusion model over other generative models, such as GANs, is its ability to generate videos that are not only of high quality but also a better match to the original neural activity,” the researchers wrote.


