Early identification of at-risk students is critical for effective intervention in online learning environments. This study extends temporal prediction analysis to Week 20 (50% of the course duration), comparing Decision Tree and Long Short-Term Memory (LSTM) models across six temporal snapshots. Our analysis reveals that the performance metric that matters most depends on the intervention stage: high recall is critical for early intervention (Weeks 2-4), balanced precision and recall is important for mid-course resource allocation (Weeks 8-16), and high precision becomes paramount in later stages (Week 20). We show that static demographic features dominate predictions (68% importance), enabling early prediction without assessment data. The LSTM model achieves 97% recall at Week 2, making it well suited to early intervention, while the Decision Tree provides stable, balanced performance (78% accuracy) mid-course. By Week 20, both models converge to similar recall (68%), but the LSTM achieves higher precision (90% vs. 86%). Our findings suggest that model selection should depend on intervention timing, and that early signals (Weeks 2-4), drawn primarily from demographic and pre-enrollment information, are sufficient for reliable initial prediction.
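The stage-dependent choice of metric described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the study's implementation: the `Confusion` container, the `stage_metric` and `score` helpers, the use of F1 as a stand-in for "balanced precision-recall", the week boundaries between the stated ranges, and the example counts, which are invented and are not results from the paper.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Hypothetical confusion-matrix counts for the at-risk (positive) class."""
    tp: int  # at-risk students correctly flagged
    fp: int  # students flagged who were not at risk
    fn: int  # at-risk students the model missed
    tn: int  # students correctly left unflagged


def precision(c: Confusion) -> float:
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def recall(c: Confusion) -> float:
    at_risk = c.tp + c.fn
    return c.tp / at_risk if at_risk else 0.0


def f1(c: Confusion) -> float:
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def stage_metric(week: int) -> str:
    """Map a snapshot week to the metric emphasised at that stage.

    Assumed cut-offs: the abstract names Weeks 2-4, 8-16, and 20;
    how weeks between those ranges are treated is a guess.
    """
    if week <= 4:
        return "recall"      # early intervention: miss as few students as possible
    if week <= 16:
        return "f1"          # mid-course: balance precision and recall
    return "precision"       # late stage: avoid false alarms


def score(c: Confusion, week: int) -> float:
    """Evaluate a model with the metric appropriate to the given week."""
    metrics = {"recall": recall, "f1": f1, "precision": precision}
    return metrics[stage_metric(week)](c)


if __name__ == "__main__":
    example = Confusion(tp=30, fp=10, fn=20, tn=40)  # made-up counts
    for week in (2, 12, 20):
        print(week, stage_metric(week), round(score(example, week), 3))
```

Comparing two candidate models with `score` at a given week would then favor the one that does best on the metric relevant to that intervention stage, which is the selection logic the abstract argues for.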