This AI model provides an intuitive understanding of how the physical world works

The original version of this story appeared in Quanta Magazine.

Here is a test for infants. Show them a glass of water on a desk. Hide it behind a wooden board. Then move the board toward the glass. Are they surprised if the board passes through the glass as if it weren't there? By the age of 6 months, many children have an intuitive concept of object permanence, learned through observation, and by the age of 1, almost all of them do. Now, some artificial intelligence models do too.

Researchers have developed an AI system that learns about the world through videos and shows a form of “surprise” when presented with information that contradicts the knowledge it has gathered.

The model, created by Meta and called Video Joint Embedding Predictive Architecture (V-JEPA), makes no assumptions about the physics of the world contained in the videos. Nevertheless, it can begin to make sense of how the world works.

“Their argument is deductively very plausible, and the results are very interesting,” says Micha Heilbron, a cognitive scientist at the University of Amsterdam who studies how the brain and artificial systems understand the world.

Higher-level abstraction

As engineers who build self-driving cars know, it can be difficult to ensure that AI systems understand what they see. Most systems designed to “understand” videos, whether to classify their content (e.g., “people playing tennis”) or to identify the contours of objects (e.g., a car in front of you), operate in so-called “pixel space.” Such a model essentially treats every pixel in a video as equally important.

However, these pixel-space models have limitations. Imagine trying to make sense of a suburban street. If the scene contains cars, traffic lights, and trees, the model may focus too much on irrelevant details such as the movement of leaves. It may miss the color of the traffic light or the location of nearby cars. “When you work with images and video, you don't want to work in [pixel] space, because there are too many details that you don't want to model,” said Randall Balestriero, a computer scientist at Brown University.
