Researchers use AI to identify similar materials in images

A robot that manipulates objects while working in a kitchen, for example, would benefit from understanding which items are made of the same material. With this knowledge, the robot would know it can exert the same amount of force whether it picks up a small pat of butter from a shadowy corner of the counter or an entire stick from inside the brightly lit refrigerator.

Identifying objects in a scene that are made of the same material, known as material selection, is an especially difficult problem for machines because a material's appearance can vary dramatically with an object's geometry and the lighting conditions.

Scientists at MIT and Adobe Research have taken a step toward solving this challenge. They developed a technique that can identify all the pixels in an image representing a given material, which is shown in a pixel the user has selected.

The method is accurate even when objects have different shapes and sizes, and the machine learning model they developed isn’t fooled by shadows and lighting conditions that can make the same material look different.

Even though the researchers trained their model using only “synthetic” data, created by a computer that modifies 3D scenes to produce many varying images, the system works effectively on real indoor and outdoor scenes it has never seen before. The approach can also be used for videos: once the user identifies a pixel in the first frame, the model can identify objects made of the same material throughout the rest of the video.

The researchers’ technique can also be used to select similar materials in a video. Once the user identifies a pixel in the first frame (the red dot on the yellow fabric in the leftmost image), the system automatically identifies objects made of the same material throughout the rest of the video.

Image: Courtesy of the researchers

Beyond its uses in scene understanding for robotics, the method could also be applied to image editing, incorporated into computational systems that deduce the parameters of materials in images, or used in material-based web recommendation systems. (For instance, a shopper may want to find clothing made from a particular type of fabric.)

“Knowing what material you are working with is often quite important. Two objects may look similar, but they can have different material properties. Our method can facilitate the selection of all the other pixels in an image that are made of the same material,” says Prafull Sharma, a graduate student in electrical engineering and computer science and lead author of a paper on the technique.

Sharma’s co-authors include Adobe Research fellows Julien Philip and Michael Gharbi; senior authors William T. Freeman, the Thomas and Gerd Perkins Professor of Electrical Engineering and Computer Science and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL), and Fredo Durand, a professor of electrical engineering and computer science and a member of CSAIL; and Valentin Deschaintre, a researcher at Adobe Research. The research will be presented at the SIGGRAPH 2023 conference.

A new approach

Existing methods for material selection have difficulty accurately identifying all pixels representing the same material. For example, some methods focus on the object as a whole, but an object can consist of multiple materials, such as a chair with wooden armrests and a leather seat. Other methods may utilize a pre-determined set of materials, often with a broad label like “wood” even though there are thousands of different types of wood.

Instead, Sharma and his collaborators developed a machine-learning approach that dynamically evaluates every pixel in an image to determine the material similarities between a user-selected pixel and all other regions of the image. If an image contains a table and two chairs, and the chair legs and tabletop are made of the same type of wood, the model can accurately identify those similar regions.

Before the researchers could develop an AI method to learn how to select similar materials, they had to overcome several hurdles. First, no existing dataset contained materials labeled finely enough to train their machine-learning model, so the researchers rendered their own synthetic dataset of indoor scenes, which included 50,000 images and more than 16,000 materials randomly applied to each object.

“We wanted a dataset where each type of material was marked separately,” says Sharma.

With this synthetic dataset in hand, they trained a machine-learning model for the task of identifying similar materials in real images, but it failed. The researchers found that distribution shift was the culprit: this occurs when a model is trained on synthetic data but fails when tested on real-world data, which can differ significantly from the training set.

To solve this problem, they built their model on top of a pretrained computer vision model that has seen millions of real images, leveraging the visual features that model had already learned.

“In machine learning, when you are using a neural network, usually it is learning the representation and the process of solving the task together,” Sharma says. With a pretrained model supplying the representation, he explains, the neural network can just focus on solving the task.
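
As a rough illustration of what reusing a pretrained model’s representation can look like in practice, the sketch below extracts dense per-pixel features from an off-the-shelf ImageNet-pretrained backbone; the specific backbone, layer, and image size are arbitrary choices for the example rather than details from the paper.

```python
import torch
from torchvision import models
from torchvision.models.feature_extraction import create_feature_extractor

# Off-the-shelf backbone pretrained on millions of real photographs.
# ResNet-50 and the "layer3" node are illustrative choices only.
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
extractor = create_feature_extractor(backbone, return_nodes={"layer3": "feat"})
extractor.eval()

image = torch.rand(1, 3, 384, 512)   # stand-in for a real RGB image tensor
with torch.no_grad():
    feat = extractor(image)["feat"]  # (1, 1024, 24, 32) dense feature map

# Upsample so that every image pixel gets a reusable feature vector.
per_pixel = torch.nn.functional.interpolate(
    feat, size=image.shape[-2:], mode="bilinear", align_corners=False)
print(per_pixel.shape)               # torch.Size([1, 1024, 384, 512])
```
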

Solving for similarity

The researchers’ model transforms these generic, pretrained visual features into material-specific features, and it does so in a way that is robust to object geometry and varied lighting conditions.
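
One plausible way to realize such a transformation, assuming a small convolutional head on top of the backbone features (the authors’ actual architecture and training procedure are not reproduced here), is a learned projection whose outputs are normalized embeddings that can be compared directly:

```python
import torch
import torch.nn as nn

class MaterialHead(nn.Module):
    """Small head mapping general backbone features to embeddings that are
    meant to be close for pixels of the same material. Layer sizes here are
    placeholders, not the published design."""

    def __init__(self, in_dim: int = 1024, emb_dim: int = 128):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Conv2d(in_dim, 256, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, emb_dim, kernel_size=1),
        )

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (B, in_dim, H, W) backbone features; the output is
        # L2-normalized so cosine similarity reduces to a dot product.
        return nn.functional.normalize(self.proj(feat), dim=1)

head = MaterialHead()
features = torch.rand(1, 1024, 96, 128)   # dummy backbone features
embeddings = head(features)               # (1, 128, 96, 128)
```
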

The system the researchers developed to identify similar materials is robust to changing lighting conditions, as shown in this example of a burning match head.

Image: Courtesy of the researchers

The model can then compute a material similarity score for every pixel in the image. When a user clicks a pixel, the model determines how similar in appearance every other pixel is to the query, producing a map in which each pixel is ranked on a similarity scale from 0 to 1.

“If the user clicks one pixel, the model automatically selects all areas with the same material,” he says.
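
A minimal sketch of that per-pixel scoring step, assuming cosine similarity over normalized per-pixel embeddings, which is an illustrative formulation rather than the paper’s published one:

```python
import numpy as np

def similarity_map(embeddings: np.ndarray, query_rc: tuple) -> np.ndarray:
    """embeddings: (H, W, D) array of L2-normalized per-pixel embeddings.
    query_rc: (row, col) of the pixel the user clicked.
    Returns an (H, W) map of similarity scores rescaled to [0, 1]."""
    q = embeddings[query_rc]        # (D,) embedding of the clicked pixel
    scores = embeddings @ q         # cosine similarity, in [-1, 1]
    return (scores + 1.0) / 2.0     # rescale to the 0-to-1 range

# Toy example with random (but normalized) embeddings standing in for a model.
emb = np.random.randn(240, 320, 64).astype(np.float32)
emb /= np.linalg.norm(emb, axis=-1, keepdims=True)
scores = similarity_map(emb, (120, 160))
print(scores.shape)                 # (240, 320)
```
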

Because the model outputs a similarity score for each pixel, the user can fine-tune the results by setting a threshold, such as 90 percent similarity, and receive a map of the image with those regions highlighted. The method also works for cross-image selection: the user can select a pixel in one image and find the same material in a different image.
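
Continuing the same toy setup, thresholding the score map and reusing a query embedding from one image against a second image’s embeddings might look like this (again a sketch, not the released implementation):

```python
import numpy as np

def select_at_threshold(scores: np.ndarray, threshold: float = 0.9) -> np.ndarray:
    """Boolean mask of pixels whose similarity to the query meets the
    user-chosen threshold (e.g. 0.9 for '90 percent similarity')."""
    return scores >= threshold

def cross_image_selection(emb_a, query_rc, emb_b, threshold=0.9):
    """Click a pixel in image A and find the matching material in image B
    by scoring A's query embedding against all of B's pixel embeddings."""
    q = emb_a[query_rc]
    scores_b = (emb_b @ q + 1.0) / 2.0
    return scores_b >= threshold

# Toy usage with random normalized embeddings standing in for two images.
def random_embeddings(h, w, d=64):
    e = np.random.randn(h, w, d).astype(np.float32)
    return e / np.linalg.norm(e, axis=-1, keepdims=True)

emb_a, emb_b = random_embeddings(240, 320), random_embeddings(240, 320)
mask = cross_image_selection(emb_a, (120, 160), emb_b)
print(mask.shape, mask.dtype)       # (240, 320) bool
```
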

In experiments, the researchers found that their model could predict regions of an image containing the same material more accurately than other methods. When they measured how well the predictions matched the ground truth, the actual regions of the image made up of the same material, their model matched with about 92 percent accuracy.
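
For a concrete sense of what such an agreement measure can look like, here is a simple per-pixel accuracy computation on made-up masks; the paper’s exact evaluation protocol may differ:

```python
import numpy as np

def pixel_accuracy(predicted: np.ndarray, ground_truth: np.ndarray) -> float:
    """Fraction of pixels where the predicted selection mask agrees with
    the ground-truth 'same material' mask (both boolean (H, W) arrays)."""
    return float(np.mean(predicted == ground_truth))

# Tiny made-up example, not the paper's data.
pred = np.zeros((4, 4), dtype=bool); pred[:2] = True
truth = np.zeros((4, 4), dtype=bool); truth[:2, :3] = True
print(pixel_accuracy(pred, truth))   # 0.875
```
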

In the future, the researchers want to enhance the model so it can better capture fine details of the objects in an image, which would improve the accuracy of their approach.

“Rich materials contribute to the functionality and beauty of the world we live in. But computer vision algorithms typically ignore materials, focusing instead on objects. This paper makes an important contribution in recognizing materials in images and videos under challenging conditions,” says Kavita Bala, dean of the Cornell Bowers College of Computing and Information Science and a professor of computer science, who was not involved with this research. “This technology is very useful for end consumers and designers alike. You can envision and feel more confident in your design choices based on these visualizations.”


