Although their output is ultimately consumed by human viewers, 3D Gaussian splatting (3DGS) methods often rely on ad hoc combinations of pixel-level losses, which results in blurry renderings. To address this, we systematically explore perceptual optimization strategies for 3DGS by evaluating different sets of distortion losses. We conduct a first-of-its-kind large-scale human subjective study of 3DGS, comprising 39,320 pairwise evaluations across multiple datasets and 3DGS frameworks. The normalized version of Wasserstein Distortion, which we call WD-R, emerges as the clear winner, excelling at restoring fine textures without increasing the number of splats. Raters prefer WD-R by a factor of more than 2.3x over the original 3DGS loss and more than 1.5x over the current best method, Perceptual-GS. WD-R also consistently achieves state-of-the-art LPIPS, DISTS, and FID scores across a variety of datasets, and it generalizes to recent frameworks such as Mip-Splatting and Scaffold-GS. In both frameworks, replacing the original loss with WD-R consistently improves perceived quality within a similar resource budget (number of splats for Mip-Splatting, model size for Scaffold-GS), and human raters prefer the resulting reconstructions by factors of 1.8x and 3.6x, respectively. We also find that these gains carry over to 3DGS scene compression, yielding approximately 50% bitrate savings at comparable perceptual metric performance.
- † New York University (Tandon School of Engineering)
- ‡ Equal contribution

Figure 1: Optimized 3DGS representation and compression framework using 2D distortion and rate-distortion objectives. Perceptual loss is included as part of the training framework.

Figure 2: Bayesian Elo scores of 3DGS representation methods for indoor scenes (Deep Blending, indoor Mip-NeRF 360), outdoor scenes (Tanks and Temples, outdoor Mip-NeRF 360, BungeeNeRF), and all scenes combined. WD-R and WD achieve the highest scores in all settings (within 95% confidence intervals).
