Multi-hop question answering over knowledge graphs remains computationally challenging due to the combinatorial explosion of possible reasoning paths. Recent approaches rely on expensive Large Language Model (LLM) inference for both entity linking and path ranking, limiting their practical deployment. Additionally, LLM-generated answers often lack verifiable grounding in structured knowledge. We present two complementary hybrid algorithms that address both efficiency and verifiability: (1) LLM-Guided Planning, which uses a single LLM call to predict relation sequences executed via breadth-first search, achieving near-perfect accuracy (micro-F1 > 0.90) while ensuring all answers are grounded in the knowledge graph, and (2) Embedding-Guided Neural Search, which eliminates LLM calls entirely by fusing text and graph embeddings through a lightweight 6.7M-parameter edge scorer, achieving a speedup of over 100× with competitive accuracy. Through knowledge distillation, we compress planning capability into a 4B-parameter model that matches large-model performance at zero API cost. Evaluation on MetaQA demonstrates that grounded reasoning consistently outperforms ungrounded generation, and that structured planning transfers better than direct answer generation. Our results show that verifiable multi-hop reasoning does not require massive models at inference time, but rather the right architectural inductive biases combining symbolic structure with learned representations.
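The core mechanism of LLM-Guided Planning can be illustrated with a minimal sketch: the LLM's single call yields an ordered relation sequence, which is then executed hop by hop via breadth-first frontier expansion over the graph, so every returned answer is reachable (and hence grounded) by construction. The triples, entity names, and `execute_plan` helper below are hypothetical illustrations, not the paper's actual implementation or MetaQA data.

```python
# Toy knowledge graph as (head, relation, tail) triples.
# All names here are illustrative, not from MetaQA.
TRIPLES = [
    ("Inception", "directed_by", "Christopher Nolan"),
    ("Christopher Nolan", "directed", "Dunkirk"),
    ("Christopher Nolan", "directed", "Inception"),
    ("Dunkirk", "release_year", "2017"),
]

# Index edges by (head, relation) for constant-time expansion per hop.
ADJ = {}
for h, r, t in TRIPLES:
    ADJ.setdefault((h, r), set()).add(t)

def execute_plan(seeds, relation_path):
    """Execute a predicted relation sequence by breadth-first expansion.

    `relation_path` stands in for the output of the single LLM planning
    call: an ordered list of relations to traverse from the seed
    entities. Every answer returned is reachable in the graph, which is
    what makes the final result verifiably grounded.
    """
    frontier = set(seeds)
    for rel in relation_path:
        nxt = set()
        for ent in frontier:
            nxt |= ADJ.get((ent, rel), set())
        frontier = nxt  # advance one hop
    return frontier

# 2-hop question: "Which movies did the director of Inception direct?"
answers = execute_plan({"Inception"}, ["directed_by", "directed"])
# → {"Dunkirk", "Inception"}
```

Because the plan is a short relation sequence rather than a free-form answer, the expensive model is invoked once per question, and the cheap graph traversal does the rest.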