Supervised Fine-Tuning (SFT) on domain-specific datasets is a common approach to adapting Large Language Models (LLMs) to specialized tasks, but it is often believed to degrade their general capabilities. In this work, we revisit this trade-off and present both empirical and theoretical insights. First, we show that SFT does not always hurt: using a smaller learning rate can substantially mitigate general-performance degradation while preserving comparable target-domain performance. We then provide a theoretical analysis that explains these phenomena and further motivates a new method, Token-Adaptive Loss Reweighting (TALR). Building on this, and recognizing that smaller learning rates alone do not fully eliminate general-performance degradation in all cases, we evaluate a range of strategies for reducing general-capability loss, including L2 regularization, LoRA, model averaging, FLOW, and our proposed TALR. Experimental results demonstrate that while no method completely eliminates the trade-off, TALR consistently outperforms these baselines in balancing domain-specific gains and general capabilities. Finally, we distill our findings into practical guidelines for adapting LLMs to new domains: (i) use a small learning rate to achieve a favorable trade-off, and (ii) when a stronger balance is desired, adopt TALR as an effective strategy.
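To make the general idea of token-level loss reweighting concrete, the sketch below reweights the per-token cross-entropy during causal-LM fine-tuning. The specific weighting rule (scaling each token's loss by the model's current probability of the correct token) and the helper name `token_adaptive_loss` are illustrative assumptions for exposition only; they are not the paper's exact TALR formulation.

```python
# Minimal sketch of token-adaptive loss reweighting for causal-LM fine-tuning.
# NOTE: the weighting rule (down-weighting low-probability tokens) is an
# illustrative assumption, not necessarily the TALR scheme from the paper.
import torch
import torch.nn.functional as F

def token_adaptive_loss(logits, labels, ignore_index=-100):
    """Cross-entropy where each token's loss is reweighted by the model's
    current probability of the correct token (a hypothetical choice)."""
    # Standard causal-LM shift: tokens < n predict token n.
    logits = logits[:, :-1, :].contiguous()
    labels = labels[:, 1:].contiguous()

    vocab = logits.size(-1)
    per_token = F.cross_entropy(
        logits.view(-1, vocab), labels.view(-1),
        ignore_index=ignore_index, reduction="none",
    )  # shape: (batch * seq_len,)

    mask = (labels.view(-1) != ignore_index).float()
    with torch.no_grad():
        # exp(-loss) = p(correct token); harder tokens receive smaller weight,
        # limiting how strongly they pull the model away from its prior behavior.
        weights = torch.exp(-per_token)

    weighted = per_token * weights * mask
    return weighted.sum() / mask.sum().clamp(min=1.0)
```

In such a scheme, the weights are detached from the computation graph so that only the weighted loss, not the weighting rule itself, contributes gradients.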