As Artificial Intelligence (AI) becomes increasingly integrated into daily life, there is a growing need to equip the next generation with the ability to apply, interact with, evaluate, and collaborate with AI systems responsibly. Prior research highlights an urgent demand from K-12 educators to teach students the ethical and effective use of AI for learning. To address this need, we designed a Large Language Model (LLM)-based module to teach prompting literacy. The module includes scenario-based deliberate practice activities involving direct interaction with intelligent LLM agents, aiming to foster secondary school students' responsible engagement with AI chatbots. We conducted two iterations of classroom deployment in 11 authentic secondary education classrooms and evaluated 1) the AI-based auto-grader's capability; 2) students' prompting performance and changes in their confidence toward using AI for learning; and 3) the quality of the learning and assessment materials. Results indicated that the AI-based auto-grader could grade student-written prompts with satisfactory quality. In addition, the instructional materials supported students in improving their prompting skills through practice and led to positive shifts in their perceptions of using AI for learning. Furthermore, data from Study 1 informed assessment revisions in Study 2. Analyses of item difficulty and discrimination in Study 2 showed that True/False and open-ended questions measured prompting literacy more effectively than multiple-choice questions for our target learners. These promising outcomes highlight the potential for broader deployment and underscore the need for larger-scale studies to assess learning effectiveness and assessment design.