1. Revisiting Energy Based Models as Policies: Ranking Noise Contrastive Estimation and Interpolating Energy Models (arXiv)
Authors: Sumeet Singh, Stephen Tu, Vikas Sindhwani
Summary: A key design decision for any robot learning pipeline is the choice of policy representation: what type of model should generate the next sequence of robot actions? The inherently multimodal nature of many robotic tasks, coupled with recent successes in generative modeling, has led researchers to adopt state-of-the-art probabilistic models such as diffusion models for policy representation. This work revisits the choice of energy-based models (EBMs) as a policy class. We show that the common folklore, that energy models are impractical to train in high-dimensional continuous spaces, is false. We develop a practical training objective and algorithm for energy models that combines three key ingredients: (i) ranking noise-contrastive estimation (R-NCE), (ii) learnable negative samplers, and (iii) non-adversarial joint training. We prove that the proposed objective is asymptotically consistent and quantify its limiting variance. In contrast, we show that the implicit behavioral cloning (IBC) objective is biased even at the population level, which gives a mathematical explanation for the poor performance of IBC-trained energy policies reported in several independent follow-up studies. We further extend the algorithm to learn a continuous stochastic process that bridges noise and data, modeling this process with a family of EBMs indexed by a scale variable. In doing so, we demonstrate that the core idea behind recent advances in generative modeling is compatible with EBMs. Altogether, the proposed training algorithms make it possible to train energy-based models as policies that compete with, and even outperform, diffusion models and other state-of-the-art approaches on several challenging multimodal benchmarks: obstacle-avoidance path planning and contact-rich block pushing.
2. Fully Variational Noise-Contrastive Estimation (arXiv)
Author: Christopher Zach
Abstract: Using the underlying theory of proper scoring rules, we design a family of noise-contrastive estimation (NCE) methods that are tractable for latent variable models. Both terms in the underlying NCE loss, the one using data samples and the one using noise samples, can be lower-bounded as in variational Bayes, so we call this family of losses fully variational noise-contrastive estimation. Variational autoencoders are a particular example of this family and can therefore also be understood as separating real data from synthetic samples using an appropriate classification loss. We further discuss other instances of this family of fully variational NCE objectives and show how their empirical behavior differs.
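
Note: both entries above build on noise-contrastive estimation. As a rough illustration only, the sketch below shows the two textbook loss shapes the abstracts allude to: a binary data-versus-noise classification loss and a ranking (softmax) loss over one positive and several negatives. It is not taken from either paper. The function names, the choice of one noise sample per data sample, and the definition of a "logit" as model log-density (or negative energy) minus noise log-density are all assumptions made for this example.

```python
import math


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def logsumexp(xs):
    # Numerically stable log(sum(exp(x) for x in xs)).
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def binary_nce_loss(data_logits, noise_logits):
    """Binary NCE loss (hypothetical helper, equal data/noise ratio assumed).

    Each logit is assumed to be log p_model(x) - log p_noise(x).
    Data samples should be classified as "real", noise samples as "noise".
    """
    data_term = -sum(log_sigmoid(g) for g in data_logits) / len(data_logits)
    noise_term = -sum(log_sigmoid(-g) for g in noise_logits) / len(noise_logits)
    return data_term + noise_term


def ranking_nce_loss(pos_logit, neg_logits):
    """Ranking NCE loss for one positive and K negatives (hypothetical helper).

    Each logit is assumed to be -energy(x) - log q(x), with q the negative
    sampler. The loss is the cross-entropy of picking the positive out of
    the K + 1 candidates.
    """
    return logsumexp([pos_logit] + list(neg_logits)) - pos_logit


if __name__ == "__main__":
    print(binary_nce_loss([2.0, 1.5], [-1.0, -2.5]))
    print(ranking_nce_loss(3.0, [0.0, -1.0, 0.5]))
```

When every logit is zero the classifier cannot tell data from noise: the binary loss equals 2 log 2 and the ranking loss with K negatives equals log(K + 1). Both losses fall as the positive logits rise relative to the negatives.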
