KAIST breakthrough: robots now imitate human decisions

<(From left) Professor Chang D. Yoo, Mr. Tung M. Luu (doctoral candidate, first author), center back, and Mr. Hwanhee Kim (master's degree candidate, second author) front row right>

“Robots that can make decisions in the same way as humans will appear sooner than we think.” A fundamental technology has been developed in South Korea that will accelerate the era in which robots understand human intentions and choose the right actions on their own. KAIST researchers have solved a key challenge in the commercialization of physical AI by developing a technology that allows AI to learn human decision criteria on its own from just a few videos. On June 10, KAIST announced that a research team led by Professor Chang D. Yoo of the School of Electrical Engineering has developed, for the first time in the world, a new technology called VOTP (Video-based Optimal TransPort Preference) that allows AI to learn human intentions and judgment criteria from just a few preference videos instead of thousands to tens of thousands of human evaluation data points.

The research team’s paper was accepted at ICML (International Conference on Machine Learning) 2026, the world’s most prestigious AI conference, to be held at COEX in Seoul in July of this year. In recognition of the excellence of the research, the paper was selected for an oral presentation, an honor given to only the top 0.7% (168 papers) of the 23,918 papers submitted. ICML is considered one of the most influential international conferences in the field of AI and machine learning.

In recent years, AI technology has rapidly evolved from generative AI that writes text and draws pictures to an era of “physical AI” that runs on real machines and acts in the real world. Typical examples include robots that perform dangerous tasks in place of humans in factories, self-driving cars that judge road conditions on their own, and medical robots that perform delicate surgeries.

However, hurdles remain before physical AI can be put into practical use. The central problem is learning human-level evaluation criteria for determining whether the actions performed by a machine match human intentions and which actions are more desirable. For example, when a surgical robot performs sutures or a self-driving car navigates a complex intersection, AI must choose the most appropriate action from among many options. This requires a “reward function” that reflects human preferences and judgment criteria. Until now, however, humans had to directly evaluate and label thousands to tens of thousands of action data points, which took a huge amount of time and money.

The research team focused on how humans learn new tasks after seeing just a few demonstrations. VOTP lets AI learn human behavior patterns on its own from just a few videos containing good and bad examples. Instead of humans having to evaluate vast amounts of data one by one, as in the past, AI can grasp human judgment criteria and extend what it has learned to a variety of situations. The core idea of this research is that intelligent machines, such as robots and self-driving cars, can quickly understand human intentions from just a few videos that contain human preferences. The algorithm developed for this purpose has demonstrated its effectiveness and generalization performance in extensive experiments across different environments and tasks.

This method can significantly reduce the human feedback and data construction costs required for physical AI development. Robots, self-driving cars, industrial machinery, and similar systems can learn behaviors that meet human expectations from a small number of examples, which is expected to significantly shorten development time and lower costs. The technology can be widely applied to robot arm control, humanoid robots, self-driving cars, smart factories, drones, and surgical robots, as well as AI agents that directly operate computers. In particular, it is expected to serve as a core fundamental technology for all physical AI systems that need to learn human intentions and satisfaction.
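To make the idea of a “reward function” learned from human preferences more concrete, here is a minimal sketch of the Bradley-Terry preference model that is commonly used in preference-based reinforcement learning. It is not the VOTP algorithm, whose internals this article does not describe. The linear reward, the two-dimensional feature vectors, and the names `segment_return`, `preference_probability`, and `train` are all assumptions made for illustration.

```python
import math

def segment_return(weights, segment):
    """Total reward of a segment under an assumed linear reward r(s) = w . s."""
    return sum(sum(w * x for w, x in zip(weights, step)) for step in segment)

def preference_probability(weights, seg_a, seg_b):
    """Bradley-Terry probability that segment A is preferred over segment B."""
    diff = segment_return(weights, seg_a) - segment_return(weights, seg_b)
    return 1.0 / (1.0 + math.exp(-diff))

def train(pairs, dim, lr=0.1, epochs=200):
    """Fit reward weights by gradient ascent on the log-likelihood of
    labelled (preferred, rejected) segment pairs."""
    weights = [0.0] * dim
    for _ in range(epochs):
        for good, bad in pairs:
            p = preference_probability(weights, good, bad)
            for i in range(dim):
                # d/dw_i of log p = (1 - p) * (feature sum of good - feature sum of bad)
                feat_diff = sum(s[i] for s in good) - sum(s[i] for s in bad)
                weights[i] += lr * (1.0 - p) * feat_diff
    return weights

# Hypothetical toy data: feature 0 is "desirable", feature 1 is "undesirable".
good_segment = [[1.0, 0.0], [1.0, 0.0]]
bad_segment = [[0.0, 1.0], [0.0, 1.0]]
learned = train([(good_segment, bad_segment)], dim=2)
```

After training on the single labelled pair, `learned` assigns a positive weight to the first feature and a negative weight to the second, so the model rates the preferred segment higher. In this standard setup every pair needs a human label, which is the cost the article says VOTP is designed to reduce.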

Professor Chang D. Yoo said, “The core of physical AI is to make machines understand human intentions and choose the correct actions,” adding, “VOTP is a core technology that will accelerate the era in which robots make human-like decisions because it lets them learn human judgment standards from a small amount of footage.”

This research, with Tung M. Luu, a doctoral student in the School of Electrical Engineering, as lead author, was selected for an oral presentation at ICML (International Conference on Machine Learning) 2026.

* Paper title: Video-Based Optimal Transport for Feedback-Efficient Offline Preference-Based Reinforcement Learning
* Paper file: https://sanctusfactory.com/data/file/publications/202606091714078906.pdf

This research was supported by the Institute of Information and Communication Technology Planning and Evaluation (IITP) and the National Research Foundation of Korea (NRF), with funding from the Ministry of Science and ICT.
