Sensible gradient coherence allows robust generalization

Scientists are increasingly relying on neural emulators to approximate solutions to complex physical systems, but ensuring that these models reliably capture the underlying symmetries remains a major challenge. Los Alamos National Laboratory’s James Amarel, Robyn Miller, and Nicolas Hegartner, along with colleagues Migliori, Castleton, and Skurikhin, have developed a new diagnostic method that goes beyond simple tests of forward-pass behavior to assess how well neural networks internalize physical symmetries. Their work introduces an “influence-based” method to measure how parameter updates propagate between symmetry-related states, effectively exploring the local geometry of loss situations. By applying this technique to a fluid emulator, the research team demonstrates that “gradient coherence” is important for learning to generalize across symmetry transformations and identify basins compatible with those symmetries, providing a powerful new way to evaluate surrogate models and ensure that you have truly learned the physics.

This innovative approach goes beyond a simple test for forward-pass homoscedasticity and directly examines learning dynamics to determine whether the model is truly learning the underlying physics or just memorizing patterns in the training data. This research focuses on understanding how neural networks represent physical systems governed by equations such as the Navier-Stokes equations that exhibit symmetries such as translation, rotation, and reflection. These symmetries define equivalence classes of solutions, or “trajectories.” This means that physically equivalent states should be treated similarly by a well-trained emulator.

The researchers hypothesized that a model that truly internalized the solution operators would seamlessly propagate information across these trajectories, as evidenced by the alignment of the slopes of the loss functions when evaluated with symmetry-related inputs. This consistency indicates that training updates constructively influence each other and prevent separation of physically equivalent configurations, an important indicator of robust generalization. Measuring this mutual influence therefore provides a diagnostic tool that goes beyond standard forward pass checks and reveals how physically consistent training updates are. This study contributes to three main areas of machine learning research: interpretability, generalization theory, and scientific machine learning.
By leveraging influence functions, the team is advancing interpretability techniques to explore the dynamics of training beyond simple output analysis. Furthermore, this study frames symmetry learning as a basin selection problem within a loss landscape dominated by orbital gradient coherence, informing generalization theory. Within the realm of scientific machine learning, this provides a principled diagnostic for assessing whether a neural emulator has truly learned the symmetries inherent in the solution operators. This innovative approach provides a concise framework for assessing the consistent behavior of symmetries through both forward pass consistency and probing learning dynamics, providing a powerful tool for evaluating and improving the generalization capabilities of neural emulators.

Gradient coherence reveals symmetric learning in emulators

This work pioneers a method to quantify a model’s ability to generalize across symmetric trajectories by framing symmetry learning as a basin selection problem within a loss landscape. The researchers designed a symmetry-aware gradient diagnostic that leverages influence functions to directly investigate training dynamics, extending previous explainability frameworks beyond simply analyzing forward-pass behavior. This approach uses the Lie derivative of the cost function along the gradient direction caused by each individual test example, expressed as Vμ = −χμν∂νCx, to define a vector field representing the contribution of each example to the loss. To extend the scope, the analysis was extended to models trained on velocity fields generated by the Navier-Stokes equations using NS-BB, NS-Gauss, and NS-Sines initial conditions representing qualitatively different feature spaces. The model was trained in distributed mode on two 40GB A100 GPUs using Lux. jl and zygot. jl has three seeds that control initialization and partitioning of the dataset, and results are reported with quantile range bars to capture variation.

Gradient coherence reveals symmetry internalization in emulators

Scientists have achieved a new method for evaluating symmetry internalization in neural emulators of partial differential equation solving operators. Experiments revealed that the proposed trajectory-wise gradient coherence is a local property of the loss landscape of the trained model, providing a new technique for determining whether a surrogate model internalizes the symmetry properties of known solution operators. The results show that symmetry neglect can manifest not only in the representation space but also in the local geometry and can prevent a consistent update structure across symmetry-related inputs. Each state snapshot consisted of a 128 × 128 grid of mass density, Cartesian momentum density, and energy density.

The data shows that despite fewer parameters, ViT consistently outperforms UNets on test metrics, suggesting a more efficient learning process. Measurements confirm that the influence function, expressed as the Lie derivative of the cost along the gradient direction, effectively quantifies whether a gradient update caused by one example reduces or increases the loss of another example. When applied to symmetry-related inputs, the neural tangent kernel acts as an analog of Fisher information and measures whether the learning signal propagates coherently along the symmetry trajectory. The researchers evaluated the influence function across six test mini-batches and standardized the resulting influence matrix by normalizing with respect to the empirical variance, establishing uniformity as a baseline for unstructured stochastic variation, and deviations indicating influence beyond random noise.

Symmetric learning linked to gradient coherence improves generalization

This finding proves that homoscedastic error at testing is related to how training dynamics distribute effects across symmetric trajectories, and reveals the difference between models that truly learn shared structure and those that simply assemble local estimators. Although symmetry-agnostic architectures can achieve low errors, they may allocate influences unevenly and may result in accurate interpolation without internalizing the underlying physics, leading to situations where high prediction accuracy does not guarantee robustness under symmetric transformations. Conversely, apparently equivariant layers like UNets promote data efficiency and generalization through uniform gradient coupling, but this can slow down optimization. The authors acknowledge that their diagnosis characterizes symmetry learning related to the training distribution, meaning that biases in the data representation can influence the measured coherence.

They also note that while the transformer is scalable, it can converge rapidly with special gradients that sacrifice faithful symmetry representations. Future research should focus on developing approximate or relaxed symmetric mechanisms that balance the need for generalization with the flexibility required for efficient optimization, potentially combining the strengths of transformer and equivariant modeling. This study provides valuable diagnostics for building confidence in scientific machine learning systems, especially when robustness under symmetry is paramount.

👉 More information
🗞 Loss landscape geometry and learning symmetries: Or what influence functions reveal about robust generalization.
🧠ArXiv: https://arxiv.org/abs/2601.20172

Source link

打开Binance账户 commented on Venture capital is opening the gates for defense tech: Can you be more specific about the content of your
注册 commented on Apple Stops Human Support on X: Your point of view caught my eye and was very inte
god of كازينو commented on Apple and Salesforce respond to YouTube video complaints: Hello Dear, are you actually visiting this web pag
创建免费账户 commented on CX Decoded Podcast Episode 2: AI Empowered CX: Real Conversations, Real Results: Shri Nandan, Comcast: Thank you for your sharing. I am worried that I la
开设Binance账户 commented on Driving Innovation & Making a Lasting Impact: Can you be more specific about the content of your

Sensible gradient coherence allows robust generalization

Gradient coherence reveals symmetric learning in emulators

Gradient coherence reveals symmetry internalization in emulators

Symmetric learning linked to gradient coherence improves generalization

RECENT POSTS

SK Hynix’s stock price aims to reach record high again, driven by AI boom

What’s holding back AI for private insurers?

AI cannot run on its own. So will your Quant team.

Gradient coherence reveals symmetric learning in emulators

Gradient coherence reveals symmetry internalization in emulators

Symmetric learning linked to gradient coherence improves generalization

Related Posts