
Human dexterity relies heavily on the ability to manipulate objects, not just grasp them. Pushing, flipping, toppling, and sliding are examples of non-prehensile (non-grasping) manipulation, which is important in a variety of tasks, such as handling hard-to-grasp objects or working in cluttered workspaces. Robots, however, still struggle with non-prehensile manipulation.
Existing non-prehensile manipulation techniques face challenges in handling object geometry, contact, and sequential decision-making. As a result, previous studies have shown success only with a narrow range of objects and simple motions, such as planar pushing or manipulating articulated objects with a few degrees of freedom.
Researchers at Carnegie Mellon University and Meta AI have proposed an approach that performs complex non-prehensile manipulation tasks and generalizes across object geometries with flexible interactions. They present a reinforcement learning (RL) method called Hybrid Actor-Critic Maps for Manipulation (HACMan) for non-prehensile manipulation informed by point cloud observations.
HACMan's first technical contribution is a temporally abstracted, spatially grounded, object-centric action representation. The agent decides where to make contact and then selects a set of motion parameters that determine how it moves after contact. The contact location is chosen from the observed object point cloud, which grounds the action spatially. This representation isolates the contact-rich part of the action for learning, and as a side effect it makes the robot's decisions more temporally abstract.
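The action representation described above can be pictured as a small data structure. The following is a minimal sketch, not the authors' code: the names `ContactAction`, `contact_location`, and `post_contact_target` are hypothetical, and treating the motion parameters as a 3D end-effector displacement is an assumption made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass
class ContactAction:
    """One temporally abstracted action: where to touch, then how to move."""
    contact_index: int  # discrete part: index into the object point cloud
    motion: Point       # continuous part: assumed 3D displacement after contact


def contact_location(object_points: List[Point], action: ContactAction) -> Point:
    """Look up the 3D contact location selected on the object point cloud."""
    return object_points[action.contact_index]


def post_contact_target(object_points: List[Point], action: ContactAction) -> Point:
    """Where the end effector should end up after executing the motion."""
    cx, cy, cz = contact_location(object_points, action)
    dx, dy, dz = action.motion
    return (cx + dx, cy + dy, cz + dz)


# Hypothetical two-point "object" and an action that touches the second point.
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.05)]
action = ContactAction(contact_index=1, motion=(0.0, 0.02, 0.0))
```

Because the contact is an index into the observed point cloud, the action is tied to the object's surface rather than to absolute workspace coordinates, which is what "spatially grounded" and "object-centric" refer to here.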
HACMan's second technical contribution is an actor-critic RL framework that implements the proposed action representation. The action space is hybrid discrete-continuous: the contact location is defined over a discrete action space (choosing a contact point among the points of the object point cloud), whereas the motion parameters are defined over a continuous action space. The critic network predicts a Q-value for each point of the object point cloud, and the actor network produces continuous motion parameters for each point. Unlike in common continuous-action RL algorithms, the per-point Q-values are used both to update the actor and as scores for selecting the contact location. The researchers modify the update rules of a standard off-policy RL algorithm to account for this hybrid action space. They evaluate HACMan on a 6D object pose alignment task with random initial and goal poses and a variety of object shapes. In simulation, the policy reaches a 79% success rate on unseen non-flat objects, indicating that it generalizes well to unseen object categories.
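To make the hybrid selection step concrete, here is a minimal sketch of how per-point Q-values and per-point motion parameters might be combined into one action. It is an illustration under stated assumptions, not the published implementation: the function names, the greedy choice at evaluation time, and the softmax sampling with a temperature during exploration are all assumptions, and the Q-values and motions are plain lists standing in for network outputs.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple

Motion = Tuple[float, float, float]


def softmax(scores: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over per-point scores."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_hybrid_action(
    q_per_point: Sequence[float],
    motion_per_point: Sequence[Motion],
    explore: bool = False,
    temperature: float = 1.0,
    rng: Optional[random.Random] = None,
) -> Tuple[int, Motion]:
    """Pick a contact point (discrete) and return its motion (continuous)."""
    if not q_per_point or len(q_per_point) != len(motion_per_point):
        raise ValueError("need one Q-value and one motion per point")
    indices = range(len(q_per_point))
    if explore:
        # Assumed exploration rule: sample a contact point in proportion
        # to the softmax of the per-point Q-values.
        rng = rng or random.Random()
        probs = softmax(q_per_point, temperature)
        idx = rng.choices(indices, weights=probs, k=1)[0]
    else:
        # Assumed evaluation rule: take the highest-scoring contact point.
        idx = max(indices, key=lambda i: q_per_point[i])
    return idx, motion_per_point[idx]


# Stand-ins for critic and actor outputs on a three-point cloud.
q_values = [0.1, 0.9, 0.3]
motions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
best_index, best_motion = select_hybrid_action(q_values, motions)
```

The point of the sketch is that the critic's output does double duty: it scores each candidate contact point, and the motion attached to the chosen point becomes the continuous half of the action.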
In addition, HACMan's action representation yields a training success rate more than three times that of the best baseline. The researchers also conduct real-robot experiments with zero-shot sim2real transfer, demonstrating dynamic object interactions across unseen objects of various shapes and non-planar goals.
Limitations of the method include its reliance on point cloud registration to estimate the object-goal transformation, the need for reasonably accurate camera calibration, and the restriction of contact locations to the visible part of the object. The research team highlights the potential for extending the approach to more manipulation tasks; for example, it could be expanded to cover both grasping and non-grasping behaviors. Taken together, the proposed method and experimental results show promise for advancing the state of the art in robotic manipulation across a wider range of objects.
Tanushree Shenwai is a consulting intern at MarktechPost. She is currently pursuing her bachelor's degree at the Indian Institute of Technology (IIT), Bhubaneswar. She is a data science enthusiast with a keen interest in the range of applications of artificial intelligence in various fields. She is passionate about exploring new advances in technology and their practical applications.