Bringing AI vision into sharper focus

AI News


Today's AI vision systems are effective at recognizing simple images in isolation, such as buildings, cars, and people. But when scenes become more complex, their accuracy grows questionable. This is one of the challenges facing autonomous-vehicle technology: AI vision systems must correctly identify every building, car, and person, in all cases, in fluid environments such as busy intersections.

“Can we develop learning algorithms that process data directly from the world as we experience it, rather than from simple images on a computer screen?” asks Mengye Ren, assistant professor at NYU's Courant Institute of Mathematical Sciences and Center for Data Science.

Ren and his colleagues have built an algorithm that does exactly that, allowing AI systems to learn to identify their surroundings directly from the environments they observe: streets, oceans, and even other planets.

Their method, PooDLe, is inspired by how humans and animals make sense of cluttered scenes. It uses “optical flow,” information about how pixels move between video frames, to capture both foreground objects (a pedestrian crossing the street) and background context (the intersecting street in the distance). Over time, this lets the system identify which regions of a pair of frames contain the same object, such as a pedestrian moving from the curb into the crosswalk and down a crowded street.
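The core idea behind optical flow, estimating how pixels move between two frames, can be illustrated with a toy sketch. This is not PooDLe's implementation; the `estimate_shift` function, its brute-force search, and the synthetic frames are all illustrative assumptions, estimating only a single global translation rather than the dense per-pixel flow field real methods compute.

```python
import numpy as np

def estimate_shift(frame_a, frame_b, max_shift=5):
    """Find the (dy, dx) translation that best aligns frame_a to frame_b
    by brute-force search minimizing the sum of squared differences.
    A crude, global stand-in for dense optical flow."""
    h, w = frame_a.shape
    m = max_shift
    best, best_err = (0, 0), np.inf
    for dy in range(-m, m + 1):
        for dx in range(-m, m + 1):
            # Compare the interior of frame_a against frame_b offset by (dy, dx)
            a = frame_a[m:h - m, m:w - m]
            b = frame_b[m + dy:h - m + dy, m + dx:w - m + dx]
            err = np.sum((a - b) ** 2)
            if err < best_err:
                best_err, best = err, (dy, dx)
    return best

# Synthetic demo: frame_b is frame_a shifted down 2 pixels and right 3 pixels.
rng = np.random.default_rng(0)
frame_a = rng.random((32, 32))
frame_b = np.roll(frame_a, shift=(2, 3), axis=(0, 1))
print(estimate_shift(frame_a, frame_b))  # (2, 3)
```

Real optical-flow estimators (e.g. OpenCV's Farnebäck method) produce a per-pixel motion vector field rather than one global shift, which is what allows foreground objects and background to be separated by how differently they move.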

“PooDLe combines the best of existing AI vision tools by recognizing both large and small objects,” explains Ren. “Our goal is to keep enhancing the tool so it can recognize a wider variety of objects and scenes, such as roads, traffic lights, and cyclists.”

Ren explains his team's progress in this video:



