Ensemble techniques in recommender systems have demonstrated accuracy improvements of 10-30%, yet their environmental impact remains largely unmeasured. While producing a single paper on deep learning recommendation algorithms can emit up to 3,297 kg CO2, the energy consumption of ensemble methods has not been sufficiently evaluated. This thesis investigates how ensemble techniques influence environmental impact compared to single optimized models. We conducted 93 experiments across two frameworks (Surprise for rating prediction, LensKit for ranking) on four datasets spanning 100,000 to 7.8 million interactions. We evaluated four ensemble strategies (Average, Weighted, Stacking/Rank Fusion, Top Performers) against simple baselines and optimized single models, measuring energy consumption with a smart plug. The results revealed a non-linear relationship between accuracy and energy. Ensemble methods achieved accuracy improvements of 0.3-5.7% while consuming 19-2,549% more energy, depending on dataset size and strategy. The Top Performers ensemble showed the best efficiency: a 0.96% RMSE improvement with 18.8% energy overhead on MovieLens-1M, and a 5.7% NDCG improvement with 103% overhead on MovieLens-100K. Exhaustive averaging strategies consumed 88-270% more energy for comparable gains. On the largest dataset (Anime, 7.8M interactions), the Surprise ensemble consumed 2,005% more energy (0.21 Wh vs. 0.01 Wh) for a 1.2% accuracy improvement, producing 53.8 mg CO2 versus 2.6 mg CO2 for the single model. This research provides one of the first systematic measurements of the energy consumption and carbon footprint of ensemble recommender systems, demonstrates that selective strategies are more efficient than exhaustive averaging, and identifies scalability limitations at industrial scale. These findings enable informed decisions about sustainable algorithm selection in recommender systems.
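The overhead percentages quoted above are relative increases over the single-model baseline. As a minimal sketch of that arithmetic, the following Python snippet recomputes the overhead for the Anime dataset from the rounded figures in this abstract. The function names are illustrative, not taken from the thesis code. The carbon intensity shown is only the value implied by the rounded figures; the abstract does not state which emission factor was used.

```python
def overhead_pct(ensemble: float, baseline: float) -> float:
    """Relative increase of `ensemble` over `baseline`, in percent."""
    return (ensemble - baseline) / baseline * 100.0


def implied_carbon_intensity(co2_mg: float, energy_wh: float) -> float:
    """Carbon intensity in g CO2 per kWh implied by an emission/energy pair.

    mg per Wh is numerically equal to g per kWh, so no unit factor is needed.
    """
    return co2_mg / energy_wh


# Rounded values quoted in the abstract (Anime dataset, Surprise framework).
ensemble_wh, single_wh = 0.21, 0.01
ensemble_mg, single_mg = 53.8, 2.6

print(f"energy overhead: {overhead_pct(ensemble_wh, single_wh):.0f}%")
print(f"implied intensity (ensemble): "
      f"{implied_carbon_intensity(ensemble_mg, ensemble_wh):.0f} g CO2/kWh")
print(f"implied intensity (single): "
      f"{implied_carbon_intensity(single_mg, single_wh):.0f} g CO2/kWh")
```

With the rounded inputs, the energy overhead comes out at about 2,000%, close to the reported 2,005%. The small gap is presumably due to the reported figure being computed from unrounded measurements.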