MTL: POMSI framework optimizations resolve gradient conflicts and scale imbalances

Machine Learning


Deep multitask learning (MTL) is a powerful paradigm in which a single neural network produces multiple predictions by sharing an internal representation, which is important for applications such as autonomous driving and robotics. However, jointly optimizing these tasks is notoriously difficult due to two major interference problems: “gradient competition,” where updates from different tasks conflict with each other, and “scale imbalance,” where tasks with large gradient magnitudes dominate the learning process. These challenges often lead to suboptimal performance, with some tasks neglected or the model failing to converge efficiently.
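As an illustrative sketch (not taken from the paper), both interference problems can be quantified on a pair of per-task gradient vectors: a negative dot product signals gradient competition, and a large norm ratio signals scale imbalance. The gradient values below are made up for illustration.

```python
import numpy as np

g_seg = np.array([1.0, 2.0, -1.0])    # hypothetical gradient of task A
g_depth = np.array([-3.0, 0.5, 4.0])  # hypothetical gradient of task B

# "Gradient competition": a negative dot product means the two task
# updates pull the shared parameters in conflicting directions.
conflict = float(g_seg @ g_depth) < 0.0

# "Scale imbalance": a large norm ratio means one task's gradient
# dominates a naive sum of the task gradients.
imbalance = float(np.linalg.norm(g_depth) / np.linalg.norm(g_seg))

print(conflict, round(imbalance, 2))
```

Here the dot product is -6, so the tasks conflict, and task B's gradient is roughly twice as large as task A's, so it would dominate a plain summed update.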

To overcome these limitations, a team from City University of Hong Kong and Southern University of Science and Technology developed the POMSI (Project Competing Gradients and Reduce Scale Imbalance) method. POMSI approaches the optimization problem from both a geometric and a magnitude perspective. It first detects directional discrepancies between task gradients and applies projection so that each task contributes positively to the shared parameters. At the same time, it adopts a learnable adjustment factor based on gradient similarity that rescales each task's loss, preventing a single task from overshadowing the others.
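A minimal sketch of the two ingredients described above, assuming a PCGrad-style projection for the geometric step; the similarity-based weight is a hypothetical stand-in for POMSI's learnable adjustment factor, whose exact form is not given in this summary.

```python
import numpy as np

def project_conflict(g_i, g_j):
    """If g_i conflicts with g_j (negative dot product), remove g_i's
    component along g_j so the residual no longer opposes g_j."""
    dot = float(g_i @ g_j)
    if dot < 0.0:  # directional discrepancy detected
        g_i = g_i - (dot / float(g_j @ g_j)) * g_j
    return g_i

def similarity_weight(g_i, g_ref):
    """Hypothetical rescaling factor: map cosine similarity from [-1, 1]
    into [0, 1] so strongly dissimilar tasks are down-weighted."""
    cos = float(g_i @ g_ref) / float(np.linalg.norm(g_i) * np.linalg.norm(g_ref))
    return 0.5 * (1.0 + cos)

g_a = np.array([1.0, 0.0])   # hypothetical task-A gradient
g_b = np.array([-1.0, 1.0])  # hypothetical task-B gradient (conflicting)

g_a_proj = project_conflict(g_a, g_b)
w_a = similarity_weight(g_a, g_b)
print(g_a_proj, w_a)
```

After projection, the modified task-A gradient is orthogonal to task B's, so it no longer degrades task B, while the similarity weight shrinks the contribution of the more dissimilar task.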

Extensive evaluations on datasets such as NYUDv2 and PASCAL-Context reveal that POMSI consistently outperforms state-of-the-art methods such as GradNorm and PCGrad. POMSI achieves higher generalization accuracy and more stable training by effectively balancing diverse visual tasks, from semantic segmentation to depth estimation. This research provides a robust and flexible solution for modern MTL architectures and advances the development of highly integrated AI recognition systems.

https://link.springer.com/article/10.1007/s11704-024-40632-2




