Despite significant progress over the past decade, 3D face reconstruction from a single unconstrained image remains an active and important challenge in computer vision. Its applications are now numerous and diverse, including human digitization for virtual and augmented reality, social media and gaming, synthetic dataset generation, and health applications. However, recent methods often fall short of accurately reproducing the identities of different people, even though applications increasingly require assets that can be used for photorealistic rendering.
3D Morphable Models (3DMMs) are a popular approach for recovering facial shape and appearance from a single “in-the-wild” image. The difficulty of the problem can be attributed to several factors: the need for comprehensive datasets of scanned human geometry and reflectance, the limited and entangled information contained in a single facial image, and the limitations of current statistical and machine learning methods. The original 3DMM work used principal component analysis (PCA) to model facial shape and appearance across different identities and expressions, learned from over 200 participants.
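To make the PCA formulation concrete, here is a minimal NumPy sketch of how a linear morphable model turns identity and expression coefficients into a face mesh. The dimensions, bases, and random initialization below are illustrative stand-ins, not the actual learned models.

```python
import numpy as np

# Hypothetical precomputed PCA model: mean shape plus identity and
# expression bases, each flattened to (3 * num_vertices) coordinates.
num_vertices = 5_000                        # illustrative, kept small for the sketch
mean_shape = np.zeros(3 * num_vertices)     # stand-in for the learned mean shape
id_basis = np.random.randn(3 * num_vertices, 80) * 1e-3    # identity components
expr_basis = np.random.randn(3 * num_vertices, 64) * 1e-3  # expression components

def morphable_shape(id_coeffs: np.ndarray, expr_coeffs: np.ndarray) -> np.ndarray:
    """Linear 3DMM: shape = mean + B_id @ alpha + B_expr @ beta."""
    flat = mean_shape + id_basis @ id_coeffs + expr_basis @ expr_coeffs
    return flat.reshape(num_vertices, 3)    # (N, 3) vertex positions

# Sampling coefficients from a standard normal yields plausible new faces,
# since PCA coefficients are modeled as zero-mean Gaussian variables.
vertices = morphable_shape(np.random.randn(80), np.random.randn(64))
```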
Since then, more elaborate models built from thousands of subjects have been developed, including LSFM, the Basel Face Model, and FaceScape. 3DMMs of the entire human head, as well as of other facial parts such as the ears and tongue, have also recently been introduced. Subsequent work has extended direct regression of 3DMM parameters to nonlinear models. However, such models cannot be textured with photorealistic fidelity. Meanwhile, deep generative models have come a long way in the last decade: progressive GAN architectures, in particular, produce excellent results in learning the distribution of high-resolution 2D images of human faces.
More recently, style-based progressive generation networks have been used to learn meaningful latent spaces that can be traversed to reconstruct and control various aspects of the generated samples. Techniques such as UV mapping provide a convenient 2D representation of 3D facial features. Given a 3D face generated by a 3DMM, a rendering function can then produce a 2D facial image, and iterative optimization requires this rendering process to be differentiable. Photorealistic differentiable rendering of such assets has recently become possible thanks to differentiable rasterization, photorealistic face shading, and rendering libraries.
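The analysis-by-synthesis idea behind this is easiest to see as an optimization loop: render the current parameter estimate, compare it with the photo, and backpropagate through the renderer. Below is a minimal PyTorch sketch in which `differentiable_render` is a toy linear stand-in (a real system would rasterize and shade a mesh differentiably); the parameter sizes and the loss are illustrative only.

```python
import torch

# Toy stand-in renderer: a fixed random linear map from parameters to pixels.
# A real pipeline would decode a mesh and textures, rasterize, and shade.
W = torch.randn(3 * 64 * 64, 80 + 512 + 6) * 1e-2

def differentiable_render(shape_coeffs, texture_latent, camera):
    params = torch.cat([shape_coeffs, texture_latent, camera])
    return (W @ params).reshape(3, 64, 64).sigmoid()

target = torch.rand(3, 64, 64)               # the input photograph, as a tensor
shape_coeffs = torch.zeros(80, requires_grad=True)
texture_latent = torch.zeros(512, requires_grad=True)
camera = torch.zeros(6, requires_grad=True)  # e.g., rotation + translation

opt = torch.optim.Adam([shape_coeffs, texture_latent, camera], lr=1e-2)
for step in range(200):                      # small fitting budget for speed
    rendered = differentiable_render(shape_coeffs, texture_latent, camera)
    loss = (rendered - target).abs().mean()  # photometric L1; real systems
                                             # add identity and landmark losses
    opt.zero_grad()
    loss.backward()                          # gradients flow through the renderer
    opt.step()
```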
Unfortunately, the Lambertian shading model used in most 3DMM work falls short of capturing the complexity of facial reflectance: realistic facial rendering requires multiple RGB textures encoding different reflectance components. There have been recent attempts to simplify such capture setups, but the resulting datasets remain few and small, making them difficult to obtain. High-fidelity, relightable facial reflectance reconstruction has been made possible by several modern methods, including infrared-based ones, but such reconstructions are still hard to come by. Moreover, while deep models can powerfully capture facial appearance, they have not been shown to support single- or multi-image reconstruction.
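To see why a single Lambertian albedo is insufficient, consider a simple shading function that consumes separate diffuse-albedo, specular-albedo, and normal maps. The NumPy sketch below uses a Blinn-Phong specular term purely as an illustrative choice; it is far simpler than the shading models used in this line of work.

```python
import numpy as np

def shade(diffuse_albedo, specular_albedo, normals, light_dir, view_dir,
          shininess=32.0):
    """Per-pixel shading from separate reflectance maps of shape (H, W, 3).

    Lambertian-only models keep just the diffuse term; skin also needs the
    specular term, hence the extra textures.
    """
    n = normals / np.linalg.norm(normals, axis=-1, keepdims=True)
    l = light_dir / np.linalg.norm(light_dir)
    v = view_dir / np.linalg.norm(view_dir)
    h = (l + v) / np.linalg.norm(l + v)             # Blinn-Phong half vector

    n_dot_l = np.clip(n @ l, 0.0, None)[..., None]  # diffuse cosine term
    n_dot_h = np.clip(n @ h, 0.0, None)[..., None]  # specular highlight term

    diffuse = diffuse_albedo * n_dot_l
    specular = specular_albedo * (n_dot_h ** shininess)
    return np.clip(diffuse + specular, 0.0, 1.0)

# Example: flat normals and uniform maps stand in for decoded textures.
H = W = 4
out = shade(np.full((H, W, 3), 0.6), np.full((H, W, 3), 0.3),
            np.tile([0.0, 0.0, 1.0], (H, W, 1)),
            np.array([0.0, 0.3, 1.0]), np.array([0.0, 0.0, 1.0]))
```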
An alternative modern paradigm relies on learned neural rendering and captures an avatar’s appearance and shape through implicit representations. Despite their good performance, such implicit representations cannot be used by standard renderers and are usually not relightable. The recent Albedo Morphable Model (AlbedoMM) also captures facial reflectance and shape with a linear PCA model, but its per-vertex color and normal reconstruction is too low-resolution for photorealism. AvatarMe++ can reconstruct a high-resolution facial reflectance texture map from a single “in-the-wild” photo; however, its three-step process (reconstruction, upsampling, reflectance) cannot be optimized directly against the input image.
Researchers at Imperial College London introduce FitMe, a fully renderable 3DMM with high-resolution facial reflectance texture maps that can be fitted to unconstrained facial images using accurate differentiable rendering. FitMe achieves striking identity similarity and produces highly realistic, fully renderable reconstructions ready for use in commercial rendering engines. Its texture model is built as a multimodal style-based progressive generator that simultaneously produces the face’s diffuse albedo, specular albedo, and surface normals, while a carefully designed branched discriminator enables smooth training across modalities with different statistics.
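The branching idea, a shared synthesis trunk whose features are decoded into spatially aligned diffuse-albedo, specular-albedo, and normal maps, can be sketched in a few lines of PyTorch. This is a simplified illustration of the concept, not FitMe’s actual architecture; the layer sizes and resolutions are hypothetical.

```python
import torch
import torch.nn as nn

class BranchedTextureGenerator(nn.Module):
    """Shared synthesis trunk with one output branch per modality, so the
    three maps stay spatially aligned in UV space."""
    def __init__(self, latent_dim=512, feat=64, res=256):
        super().__init__()
        self.trunk = nn.Sequential(                  # latent -> shared UV features
            nn.Linear(latent_dim, feat * 8 * 8),
            nn.Unflatten(1, (feat, 8, 8)),
            nn.Upsample(scale_factor=res // 8, mode="bilinear"),
            nn.Conv2d(feat, feat, 3, padding=1), nn.LeakyReLU(0.2),
        )
        def head():                                  # per-modality decoder branch
            return nn.Conv2d(feat, 3, 1)
        self.to_diffuse, self.to_specular, self.to_normals = head(), head(), head()

    def forward(self, z):
        f = self.trunk(z)
        return {
            "diffuse_albedo": torch.sigmoid(self.to_diffuse(f)),
            "specular_albedo": torch.sigmoid(self.to_specular(f)),
            "normals": torch.tanh(self.to_normals(f)),  # unit-range normal map
        }

maps = BranchedTextureGenerator()(torch.randn(1, 512))
```

A branched discriminator would similarly score the concatenated modalities through partially shared layers, which is one way to keep training stable when each map has very different statistics.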
To build the texture model, the authors fine-tuned AvatarMe++ on the public MimicMe dataset, creating a capture-quality facial reflectance dataset of 5,000 subjects, with further augmentation to balance skin-tone representation. For the shape, they use interchangeable PCA models of the face and head trained on sizable geometry datasets. On top of these, they design single- and multi-image fitting approaches based on style-based generator projection and 3DMM fitting. For efficient iterative fitting (within a minute), the rendering function must be differentiable and fast, which rules out approaches such as path tracing; previous work therefore relied on slow optimization or simpler shading models (such as Lambertian).
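Putting the pieces together, single-image fitting amounts to jointly optimizing the PCA shape coefficients and the generator’s latent codes through the differentiable renderer. The sketch below uses an “extended” per-layer latent space, a common GAN-inversion technique, with toy linear stand-ins for the generator and renderer; FitMe’s actual losses (e.g., identity and landmark terms) and renderer are more elaborate, and all sizes here are scaled down for illustration.

```python
import torch

num_layers, latent_dim = 14, 64            # StyleGAN-like sizes, scaled down
photo = torch.rand(3, 32, 32)              # the target photograph

# Toy stand-ins for the texture generator and differentiable renderer; the
# real system decodes reflectance maps and rasterizes a mesh.
G = torch.randn(3 * 32 * 32, num_layers * latent_dim) * 1e-2
R = torch.randn(3 * 32 * 32, 80) * 1e-2

def render(w_plus, shape_coeffs):
    pix = G @ w_plus.flatten() + R @ shape_coeffs
    return pix.reshape(3, 32, 32).sigmoid()

# Extended latent space: one latent code per generator layer rather than a
# single shared code, which greatly increases fitting flexibility.
w_plus = torch.zeros(num_layers, latent_dim, requires_grad=True)
shape_coeffs = torch.zeros(80, requires_grad=True)   # PCA face/head shape

opt = torch.optim.Adam([w_plus, shape_coeffs], lr=5e-3)
for step in range(100):                    # small budget, as fitting must be fast
    rendered = render(w_plus, shape_coeffs)
    loss = (rendered - photo).abs().mean()           # photometric term
    loss = loss + 1e-3 * w_plus.var(dim=0).mean()    # keep per-layer codes coherent
    opt.zero_grad()
    loss.backward()
    opt.step()
```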
FitMe builds on this prior work by adding more authentic-looking shading with convincing diffuse and specular rendering, so that the recovered shape and reflectance can be rendered photorealistically in common rendering engines (Fig. 1). Thanks to the generator’s extended latent space and the flexibility of photorealistic fitting, FitMe reconstructs high-fidelity facial reflectance, accurately capturing diffuse albedo, specular albedo, and normal features, while achieving striking identity similarity.
Figure 1: FitMe uses reflectance models and differentiable rendering to reconstruct relightable shape and reflectance maps of facial avatars from single (left) or multiple (right) unconstrained facial photographs. The results can be rendered photorealistically in common rendering engines.
Overall, the contributions of this work are:
• The first 3DMM capable of generating high-resolution facial reflectance and shape, with an increasing level of detail, that can be rendered photorealistically
• A technique to acquire and augment a large-scale facial reflectance dataset
• The first branched multimodal style-based progressive generator of high-resolution 3D facial assets (diffuse albedo, specular albedo, and normals), together with a suitable multimodal branched discriminator
Check out the Paper and Project Page. Don’t forget to join our 22,000+ ML SubReddit, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more. If you have any questions regarding the above article or if we missed anything, feel free to email us at Asif@marktechpost.com.
🚀 Check out 100s of AI Tools at the AI Tools Club
Aneesh Tickoo is a consulting intern at MarktechPost. He is currently pursuing his Bachelor of Science in Data Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. He spends most of his time working on projects aimed at harnessing the power of machine learning. His research interest is image processing, and he is passionate about building solutions around it. He loves connecting with people and collaborating on interesting projects.