Grounding Long-Context Reasoning with Contextual Normalization for Retrieval-Augmented Generation

Retrieval-Augmented Generation (RAG) has become an essential approach for extending the reasoning and knowledge capacity of large language models (LLMs). While prior research has primarily focused on retrieval quality and prompting strategies, the influence of how the retrieved documents are framed, i.e., context format, remains underexplored. We show that seemingly superficial choices, such as delimiters or structural markers in key-value extraction, can induce substantial shifts in accuracy and stability, even when semantic content is identical. To systematically investigate this effect, we design controlled experiments that vary context density, delimiter styles, and positional placement, revealing the underlying factors that govern performance differences. Building on these insights, we introduce Contextual Normalization, a lightweight strategy that adaptively standardizes context representations before generation. Extensive experiments on both controlled and real-world RAG benchmarks across diverse settings demonstrate that the proposed strategy consistently improves robustness to order variation and strengthens long-context utilization. These findings underscore that reliable RAG depends not only on retrieving the right content, but also on how that content is presented, offering both new empirical evidence and a practical technique for better long-context reasoning.
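
The abstract does not say how Contextual Normalization is implemented, so the following is only a minimal sketch of the general idea under stated assumptions. It renders the same key-value content with three different delimiter styles, then rewrites any of them into one canonical format before the text would be placed in a prompt. The style names, the `STYLES` table, and the functions `render`, `parse`, and `normalize` are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import re

# Hypothetical delimiter styles: (pair template, record separator).
# The abstract does not list the styles actually studied.
STYLES = {
    "colon": ("{k}: {v}", "\n"),
    "equals": ("{k}={v}", "; "),
    "arrow": ("{k} -> {v}", " | "),
}


def render(pairs, style):
    """Serialize (key, value) pairs using one of the delimiter styles."""
    template, sep = STYLES[style]
    return sep.join(template.format(k=k, v=v) for k, v in pairs)


def parse(text):
    """Recover (key, value) pairs from text in any of the styles above.

    Assumes keys and values contain none of the delimiter characters.
    """
    pairs = []
    for record in re.split(r"[\n;|]", text):
        record = record.strip()
        if not record:
            continue
        parts = re.split(r"\s*(?:->|:|=)\s*", record, maxsplit=1)
        if len(parts) != 2:
            raise ValueError(f"no key-value delimiter in record: {record!r}")
        pairs.append((parts[0].strip(), parts[1].strip()))
    return pairs


def normalize(text, style="colon"):
    """Rewrite a context into a single canonical delimiter style."""
    return render(parse(text), style)


if __name__ == "__main__":
    pairs = [("a1f3", "9c2e"), ("b7d0", "44aa")]
    for name in STYLES:
        context = render(pairs, name)
        # Same semantic content, different surface form before normalization.
        print(repr(context), "->", repr(normalize(context)))
```

In this sketch the canonical style is fixed in advance. The abstract describes the strategy as adaptive, which suggests the target format may be selected per model or per task; that selection step is not modeled here.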
