Meta on Monday released a new artificial intelligence (AI) model that can perform complex computer vision tasks. Called Segment Anything Model 2 (SAM 2), it succeeds the original Segment Anything Model, which was released last year and is built into Instagram's Backdrop and Cutouts tools. The company says the new model can identify and track segments in videos as well as in images. Like most of Meta's large language models (LLMs), SAM 2 is an open-source AI model.
In a newsroom post, Meta announced the new AI model, which focuses primarily on video segmentation and also improves on image segmentation. Highlighting the achievements of its predecessor, Meta said that the earlier model powers Instagram's Backdrop and Cutouts features and has also been used to “segment sonar images to analyze coral reefs, analyze satellite imagery for disaster relief, and in the medical field to segment cellular images to help detect skin cancer.”
SAM 2 can segment objects in images and videos and track them in real time across the frames of a video. It can keep tracking and segmenting objects even when they move at high speed, change appearance, or are obscured by other objects or by scene changes.
The underlying model for prompt-based visual segmentation is built on a simple transformer architecture. It is equipped with streaming memory, allowing it to process videos in real time. The company also says the model was trained on SA-V, which it describes as the largest video segmentation dataset.
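To give a rough sense of what prompt-based tracking with a streaming memory means, here is a toy Python sketch. It is not Meta's implementation and does not use the SAM 2 code or any neural network: the `StreamingTracker` class, the `component` flood-fill helper, and the tiny binary frames are all hypothetical stand-ins made up for illustration. The sketch assumes the user clicks on the object in the first frame, after which frames are processed one at a time while a small, bounded memory of recent masks is used to find the object again, including after a frame in which it is hidden.

```python
from collections import deque


def component(frame, seed):
    """Return the set of nonzero pixels 4-connected to `seed` (empty if seed is off)."""
    rows, cols = len(frame), len(frame[0])
    if not frame[seed[0]][seed[1]]:
        return frozenset()
    seen, stack = {seed}, [seed]
    while stack:
        r, c = stack.pop()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and frame[nr][nc] and (nr, nc) not in seen):
                seen.add((nr, nc))
                stack.append((nr, nc))
    return frozenset(seen)


class StreamingTracker:
    """Toy tracker: one frame in, one mask out, with a bounded memory of past masks."""

    def __init__(self, memory_size=3):
        # Only the most recent non-empty masks are kept, so memory use stays
        # constant no matter how long the video is.
        self.memory = deque(maxlen=memory_size)

    def step(self, frame, prompt=None):
        if prompt is not None:
            # Prompted frame: segment whatever the click landed on.
            mask = component(frame, prompt)
        else:
            # Unprompted frame: look for a remembered pixel that is still "on",
            # checking the newest remembered mask first.
            mask = frozenset()
            for past in reversed(self.memory):
                seeds = sorted(p for p in past if frame[p[0]][p[1]])
                if seeds:
                    mask = component(frame, seeds[0])
                    break
        if mask:
            self.memory.append(mask)  # empty (occluded) frames do not overwrite memory
        return mask


def make_frame(cols_on, rows=3, cols=6):
    """Build a small binary frame with the object on the middle row."""
    return [[1 if r == 1 and c in cols_on else 0 for c in range(cols)]
            for r in range(rows)]


# The object drifts right; in frame 2 it is completely hidden.
frames = [make_frame({0, 1}), make_frame({1, 2}), make_frame(set()), make_frame({2, 3})]
tracker = StreamingTracker()
masks = [tracker.step(f, prompt=(1, 0) if i == 0 else None)
         for i, f in enumerate(frames)]
for i, m in enumerate(masks):
    print(i, sorted(m))
```

In this toy run the hidden frame yields an empty mask, and the object is picked up again in the next frame because the memory still holds its last known position. A real model would store learned features rather than raw pixel sets, but the frame-by-frame loop with a bounded memory is the part the sketch is meant to illustrate.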
Meta said the AI model will not only simplify the process of video editing and AI-based video generation, but also help enable new experiences in its mixed reality ecosystem. The company added that the object tracking capability in videos will help speed up the annotation of visual data for training other computer vision systems.
Because this is an open-source AI model, the company is hosting the weights on its GitHub page, where anyone interested can download and test the AI model. Notably, the model is released under the Apache 2.0 license, a permissive license that allows research, academic, and commercial use.