Multihop Question Answering (QA) requires systems to identify and synthesize information from multiple text passages. While most prior retrieval methods assist in identifying relevant passages for QA, further assessing the utility of those passages can help remove redundant ones, which may otherwise introduce noise and inaccuracies into the generated answers. Existing utility prediction approaches model passage utility independently, overlooking a critical aspect of multihop reasoning: the utility of a passage can be context-dependent, influenced by its relation to other passages, whether it provides complementary information or forms a crucial link in conjunction with others. In this paper, we propose a lightweight approach to modeling contextual passage utility that accounts for inter-passage dependencies. We fine-tune a small transformer-based model to predict passage utility scores for multihop QA. We leverage the reasoning traces of an advanced reasoning model to capture the order in which passages are used to answer a question and to obtain synthetic training data. Through comprehensive experiments, we demonstrate that our utility-based scoring of retrieved passages leads to improved reranking and downstream QA performance compared to relevance-based reranking methods.
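The contextual-utility idea described above can be sketched as greedy reranking: at each step, the passage with the highest utility *given the passages already selected* is picked next, so a passage that merely repeats selected evidence is demoted. This is a minimal illustration, not the paper's method; `toy_utility` is a hypothetical stand-in for the fine-tuned transformer scorer, using novel question-relevant token overlap as a crude proxy for contextual utility.

```python
def toy_utility(question, passage, selected):
    # Hypothetical stand-in for the fine-tuned utility model: scores a
    # passage by how many question tokens it covers that the already
    # selected passages do NOT cover (context-dependent utility).
    q = set(question.lower().split())
    p = set(passage.lower().split())
    seen = {w for s in selected for w in s.lower().split()}
    return len((p & q) - seen)

def rerank(question, passages, utility):
    # Greedy contextual reranking: repeatedly pick the passage with the
    # highest utility conditioned on the passages chosen so far.
    selected, remaining = [], list(passages)
    while remaining:
        best = max(remaining, key=lambda p: utility(question, p, selected))
        selected.append(best)
        remaining.remove(best)
    return selected

question = "where was the director of inception born"
a = "Inception was directed by Christopher Nolan"
b = "Christopher Nolan was born in London"
c = "Inception was directed by Christopher Nolan in 2010"  # redundant with a

ranked = rerank(question, [a, c, b], toy_utility)
```

Here a relevance-only scorer would rank the redundant passage `c` at least as high as `b`, but once `a` is selected, `c` contributes no new question-relevant information, so the contextual scorer pushes it below `b`, which supplies the missing birthplace hop.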