Corpus Aware Training (CAT), commonly known in the literature as the "tagging" approach, has been found effective: it injects corpus information into each training example, leveraging valuable corpus metadata during training. CAT-trained models inherently learn the quality, domains, and nuances between corpora directly from the data, allowing them to easily switch between different inference behaviors. However, to achieve the best evaluation results, a CAT model requires pre-defined groups of high-quality data before training begins, which is error-prone and inefficient. In this work, we propose Optimal Corpus Aware Training (OCAT), which fine-tunes a CAT-trained model by freezing most of its parameters and adjusting only a small set of corpus-related parameters. OCAT is lightweight, resilient to overfitting, and effective in increasing model accuracy. On the WMT23 English to Chinese and English to German translation tasks, OCAT achieves +3.6 and +1.8 chrF improvements over vanilla training, respectively. Furthermore, our approach is not very sensitive to hyperparameter settings and performs on par with or slightly better than other state-of-the-art fine-tuning techniques.
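The fine-tuning recipe described above, freezing most model parameters and updating only corpus-related ones, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the toy model, its module names (`tag_emb`), and the name-based freezing rule are all assumptions introduced for illustration.

```python
import torch
import torch.nn as nn

class TinyCATModel(nn.Module):
    """Toy model standing in for a CAT-trained translation model.

    The corpus tag is embedded and added to the token representations,
    mimicking the "tagging" approach described in the abstract.
    """

    def __init__(self, vocab=1000, n_tags=4, dim=32):
        super().__init__()
        self.token_emb = nn.Embedding(vocab, dim)  # regular model weights
        self.tag_emb = nn.Embedding(n_tags, dim)   # corpus-tag embeddings (assumed name)
        self.out = nn.Linear(dim, vocab)

    def forward(self, tokens, tag):
        # tokens: (batch, seq_len); tag: (batch,) corpus-tag ids
        h = self.token_emb(tokens) + self.tag_emb(tag).unsqueeze(1)
        return self.out(h)

model = TinyCATModel()

# OCAT-style freezing: only parameters whose name marks them as
# corpus-related remain trainable; everything else is frozen.
for name, p in model.named_parameters():
    p.requires_grad = "tag_emb" in name

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)  # only the corpus-tag embedding is left trainable

# The frozen model still runs a normal forward pass.
logits = model(torch.randint(0, 1000, (2, 5)), torch.tensor([1, 2]))
print(logits.shape)
```

In a real setup, the optimizer would then be built only over `filter(lambda p: p.requires_grad, model.parameters())`, which keeps the fine-tuning step lightweight and limits the capacity available for overfitting.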
