Socratic Students: Teaching Language Models to Learn by Asking Questions

Large Language Models (LLMs) excel at static interactions, where they answer user queries by retrieving knowledge encoded in their parameters. However, in many real-world settings, such as educational tutoring or medical assistance, relevant information is not directly available and must be actively acquired through dynamic interactions. An interactive agent would recognize its own uncertainty, ask targeted questions, and retain new knowledge efficiently. Prior work has primarily explored effective ways for a teacher to instruct the student, where the teacher identifies student gaps and provides guidance. In this work, we shift the focus to the student and investigate effective strategies to actively query the teacher in seeking useful information. Across math and coding benchmarks, where baseline student models begin with near-zero performance, we show that student-led approaches consistently yield absolute Pass@k improvements of at least 0.5 over static baselines. To improve question quality, we train students using Direct Preference Optimization (DPO) with guidance from either self or stronger students. We find that this guided training enables smaller models to learn how to ask better questions, further enhancing learning efficiency.
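To make the reported metric concrete, here is a minimal sketch of how a Pass@k score and an absolute improvement over a baseline could be computed. It assumes the widely used unbiased estimator, 1 - C(n-c, k) / C(n, k), for n sampled solutions of which c are correct; the abstract does not say which estimator the authors use. The function name `pass_at_k` and the sample counts are hypothetical and chosen only for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k samples is correct.

    n: total samples drawn for a problem
    c: number of those samples that are correct
    k: evaluation budget (1 <= k <= n)

    Assumption: this is the standard unbiased estimator,
    not necessarily the exact procedure used in the paper.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k incorrect samples: every size-k subset contains a correct one.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# Illustrative numbers only (not results from the paper):
# a baseline student that solves nothing versus a student that,
# after querying the teacher, gets 5 of 10 samples right.
baseline = pass_at_k(n=10, c=0, k=1)
student_led = pass_at_k(n=10, c=5, k=1)
absolute_gain = student_led - baseline
print(f"baseline={baseline:.2f} student_led={student_led:.2f} gain={absolute_gain:.2f}")
```

An "absolute improvement of at least 0.5" in this sense means the difference between the two Pass@k values is at least 0.5, rather than a relative (percentage) change.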

