Historically, motion estimation algorithms have followed two main methodologies: sparse feature tracking and dense optical flow. Both are successful in their respective applications, but neither representation fully captures the motion in a video: sparse tracking cannot describe the motion of all pixels, while pairwise optical flow cannot capture motion trajectories over long time spans. To narrow this gap, many methods have been proposed for predicting dense, long-range pixel trajectories in videos, ranging from simply chaining two-frame optical flow fields to more sophisticated algorithms that directly predict per-pixel trajectories across multiple frames.
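To make the baseline concrete, here is a minimal sketch (not from the paper) of the naive flow-chaining approach described above: per-pixel trajectories are built by repeatedly sampling consecutive two-frame flow fields at each trajectory's current position. The rounding/resampling in the loop is exactly where drift accumulates.

```python
import numpy as np

def chain_flow(flows):
    """Chain two-frame optical flow fields into long-range trajectories.

    flows: list of (H, W, 2) arrays; flows[t][y, x] is the displacement
    of the pixel at (x, y) from frame t to frame t+1.
    Returns an array of shape (T+1, H, W, 2) holding each pixel's
    position in every frame.
    """
    H, W, _ = flows[0].shape
    xs, ys = np.meshgrid(np.arange(W), np.arange(H))
    pos = np.stack([xs, ys], axis=-1).astype(np.float64)  # (H, W, 2)
    traj = [pos]
    for flow in flows:
        # Sample the flow at each trajectory's current (rounded) position.
        # The rounding error introduced here compounds frame after frame,
        # producing the cumulative drift discussed above.
        xi = np.clip(np.round(pos[..., 0]).astype(int), 0, W - 1)
        yi = np.clip(np.round(pos[..., 1]).astype(int), 0, H - 1)
        pos = pos + flow[yi, xi]
        traj.append(pos)
    return np.stack(traj)
```

Note that this sketch also has no notion of occlusion: once a point disappears behind another object, the chained flow simply drags it somewhere wrong, which motivates the global approach below.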
However, all of these approaches ignore information outside a limited temporal or spatial context when estimating motion. This locality can lead to spatio-temporal inconsistencies and to errors that accumulate over long trajectories. Earlier techniques that did consider long-range context operated only in the 2D domain, so they lose track of points during occlusion. Producing dense, long-range trajectories thus still poses several challenges: tracking points through occlusion, maintaining spatial and temporal coherence, and staying accurate over long durations. In this study, researchers from Cornell University, Google Research, and the University of California, Berkeley present a comprehensive method that uses all available information in a video to estimate the full-length motion trajectory of every pixel.
Their approach, called OmniMotion, uses a quasi-3D representation in which a collection of local canonical bijections maps between a canonical 3D volume and per-frame local volumes. These bijections capture the combination of camera and scene motion as a flexible relaxation of dynamic multi-view geometry. The representation keeps track of every pixel, even occluded ones, and guarantees cycle consistency ("everything, everywhere"). It is optimized per video to jointly solve for the motion of the entire video "all at once." Once optimized, the representation can be queried at any continuous coordinate in the video to obtain a motion trajectory spanning the whole video.
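The core idea can be illustrated with a toy sketch (my own simplification, not the authors' code): each frame has an invertible map into a shared canonical volume, so a point lifted in frame i can be carried to frame j by composing frame i's map with the inverse of frame j's. In OmniMotion these bijections are learned invertible networks; here simple 3D translations stand in for them.

```python
import numpy as np

def make_bijection(offset):
    """Toy per-frame bijection: a 3D translation into canonical space.

    Stand-in for OmniMotion's learned invertible mappings; the offset
    is a hypothetical parameter chosen for illustration only.
    """
    offset = np.asarray(offset, dtype=float)
    to_canonical = lambda x: np.asarray(x, dtype=float) + offset
    from_canonical = lambda u: np.asarray(u, dtype=float) - offset
    return to_canonical, from_canonical

def correspond(point_i, bij_i, bij_j):
    """Map a 3D point from frame i's local volume to frame j's.

    Routing every correspondence through the shared canonical volume
    makes cross-frame matches cycle-consistent by construction.
    """
    to_canon_i, _ = bij_i
    _, from_canon_j = bij_j
    return from_canon_j(to_canon_i(point_i))
```

Because every correspondence is a composition of bijections, mapping a point from frame i to frame j and back recovers the original point exactly, which is the cycle-consistency property the representation guarantees.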
In summary, the method can process real-world videos with any combination of camera and scene motion, and it can:
- Generate globally consistent, full-length motion trajectories for all points in the video.
- You can track points through occlusion.
They demonstrate these strengths quantitatively on the TAP-Vid tracking benchmark, achieving state-of-the-art performance and significantly outperforming all prior techniques. Demo videos are available on the project website, and the code is to be released soon.
As the motion paths above show, their technique computes full-length motion trajectories for every pixel in every frame of a video. Although the method estimates the motion of every pixel, only sparse trails on foreground objects are shown for clarity. The approach produces accurate, coherent long-range motion and reliably tracks through occlusion, even for fast-moving objects, as the dog and swing examples demonstrate. For context, the second row shows the moving objects at several points in time.
Check out the paper and project page. If you have any questions regarding the article above or think we missed something, feel free to email us at Asif@marktechpost.com.
Aneesh Tickoo is a consulting intern at MarktechPost. He is currently pursuing a Bachelor of Science in Data Science and Artificial Intelligence at the Indian Institute of Technology (IIT) Bhilai. He spends most of his time on projects aimed at harnessing the power of machine learning. His research interest is image processing, and he is passionate about building solutions around it. He loves connecting with people and collaborating on interesting projects.