Neural conditional language generation models achieve the state of the art in Neural Machine Translation (NMT) but are highly dependent on the quality of the parallel training data. When trained on low-quality datasets, these models are prone to various error types, including hallucinations, i.e. outputs that are fluent but unrelated to the source sentence. Such errors are particularly dangerous because, on the surface, the translation can be perceived as correct, especially if the reader does not understand the source language. We present a case study focusing on model understanding and regularisation to reduce hallucinations in NMT. We first use feature attribution methods to study the behaviour of an NMT model that produces hallucinations. We then leverage these methods to propose a novel loss function that substantially reduces hallucinations and does not require retraining the model from scratch.
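To make the feature-attribution step concrete, below is a minimal, hedged sketch of gradient×input saliency over source tokens for an encoder-decoder NMT model, which can be used to check whether a generated translation is actually grounded in the source (hallucinated outputs typically show weak, diffuse source attributions). The checkpoint name, the `source_saliency` helper, and the reliance on the encoder's `embed_tokens` attribute in Hugging Face transformers are illustrative assumptions, not the paper's exact setup.

```python
# Sketch only: gradient x input saliency over source tokens for a seq2seq NMT
# model (assumes a Hugging Face Marian/BART-style encoder-decoder; the
# checkpoint name is a placeholder example, not the model studied in the paper).
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "Helsinki-NLP/opus-mt-en-de"  # assumed example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
model.eval()

def source_saliency(src_text: str, tgt_text: str):
    """Return (source token, saliency score) pairs for a source/target pair."""
    enc = tokenizer(src_text, return_tensors="pt")
    labels = tokenizer(text_target=tgt_text, return_tensors="pt")["input_ids"]

    captured = []  # encoder embedding outputs captured via a forward hook

    def grab(module, inputs, output):
        output.retain_grad()  # keep .grad on this non-leaf tensor
        captured.append(output)

    # Hook the encoder's token embedding so gradients w.r.t. the source
    # embeddings are available after the backward pass.
    handle = model.get_encoder().embed_tokens.register_forward_hook(grab)
    try:
        out = model(
            input_ids=enc["input_ids"],
            attention_mask=enc["attention_mask"],
            labels=labels,
        )
        out.loss.backward()  # NLL of the target sequence
    finally:
        handle.remove()

    src_embeds = captured[0]
    # Gradient x input, reduced to one score per source token.
    scores = (src_embeds.grad * src_embeds).norm(dim=-1).squeeze(0)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"].squeeze(0))
    return list(zip(tokens, scores.tolist()))

if __name__ == "__main__":
    for token, score in source_saliency("The cat sat on the mat.",
                                        "Die Katze saß auf der Matte."):
        print(f"{token}\t{score:.4f}")
```

In this kind of diagnostic, a fluent output whose saliency is low and nearly uniform across all source tokens is the detached behaviour that attribution analysis is meant to surface; the specific attribution methods and the loss function proposed in the paper are described in the main text.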