We explore the use of small language models (SLMs) for automatic question generation as a complement to the prevalent use of their large counterparts in learning analytics research. We present a novel question generation pipeline that leverages both the text generation and the probabilistic reasoning abilities of SLMs to produce high-quality questions. Adopting a "generate-then-validate" strategy, our pipeline first performs expansive generation to create an abundance of candidate questions, then refines them through selective validation based on a novel probabilistic reasoning step. We conducted two evaluation studies, one with seven human experts and the other with a large language model (LLM), to assess the quality of the generated questions. Most judges (human or LLM) agreed that the generated questions had clear answers and generally aligned well with the intended learning objectives. Our findings suggest that an SLM can effectively generate high-quality questions when guided by a well-designed pipeline that leverages its strengths.
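The generate-then-validate strategy can be illustrated with a minimal sketch. Note that all function names, the stubbed scoring rule, and the confidence threshold below are illustrative assumptions, not the paper's actual implementation: in a real pipeline, an SLM would sample the candidate questions and its per-token answer log-probabilities would drive the validation filter.

```python
import math

def generate_candidates(passage: str, n: int) -> list[str]:
    """Expansive generation: an SLM would sample many candidate questions
    from the passage. Stubbed here with templated placeholders."""
    return [f"What does the passage say about topic {i}?" for i in range(n)]

def answer_logprobs(question: str) -> list[float]:
    """Stand-in for the per-token log-probabilities the SLM assigns to its
    own answer. A real pipeline would read these from the model's output
    scores; here later candidates are simulated as less probable."""
    rank = int(question.split("topic ")[1].rstrip("?"))
    return [-0.05 * (rank + 1)] * 4

def confidence(question: str) -> float:
    """Geometric-mean token probability of the model's answer: a common
    proxy for whether the question has a clear, well-supported answer."""
    lps = answer_logprobs(question)
    return math.exp(sum(lps) / len(lps))

def generate_then_validate(passage: str, n: int = 10,
                           threshold: float = 0.7) -> list[str]:
    """Selective validation: keep only candidates whose answer the model
    is sufficiently confident about."""
    return [q for q in generate_candidates(passage, n)
            if confidence(q) >= threshold]
```

Under this stub, over-generating ten candidates and filtering at a 0.7 confidence threshold retains only the higher-confidence questions, mirroring the abstract's expansive-generation-plus-selective-validation design.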