Clustered Federated Learning (CFL) has emerged as a powerful approach for addressing data heterogeneity and preserving privacy in large-scale distributed IoT environments. By clustering clients and training cluster-specific models, CFL provides personalized models tailored to groups of heterogeneous clients. However, conventional CFL approaches suffer from fragmented learning: they train an independent model for each cluster and fail to take advantage of collective cross-cluster insights. This paper advocates a shift to hierarchical CFL, in which bi-level aggregation trains cluster-specific models at the edge and a unified global model in the cloud. This shift improves training efficiency, although it may introduce communication challenges. To address these challenges, we propose CFLHKD, a novel personalization scheme that integrates hierarchical cluster knowledge into CFL. Built upon multi-teacher knowledge distillation, CFLHKD enables inter-cluster knowledge sharing while preserving cluster-specific personalization, and it adopts bi-level aggregation to bridge the gap between local and global learning. Extensive evaluations on standard benchmark datasets demonstrate that CFLHKD outperforms representative baselines in both cluster-specific and global model accuracy, achieving performance improvements of 3.32--7.57\%.
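To make the two core ingredients named above concrete, the following is a minimal sketch, assuming PyTorch, of (i) a multi-teacher knowledge-distillation loss in which several cluster models act as teachers for a student, and (ii) a FedAvg-style weighted average reused at both aggregation levels. All function names (`multi_teacher_kd_loss`, `weighted_average`) and hyperparameters (`temperature`, `alpha`) are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def multi_teacher_kd_loss(student_logits, teacher_logits_list, labels,
                          temperature=2.0, alpha=0.5):
    """Cross-entropy on ground-truth labels plus KL distillation toward
    the averaged softened predictions of several cluster teachers.
    (Illustrative sketch; not the paper's exact objective.)"""
    # Supervised loss on the student's own predictions.
    ce = F.cross_entropy(student_logits, labels)
    # Average the teachers' temperature-softened probability distributions.
    teacher_probs = torch.stack(
        [F.softmax(t / temperature, dim=-1) for t in teacher_logits_list]
    ).mean(dim=0)
    # KL divergence between the student's softened distribution and the
    # averaged teacher distribution, rescaled by T^2 as in standard KD.
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        teacher_probs,
        reduction="batchmean",
    ) * temperature ** 2
    return alpha * ce + (1.0 - alpha) * kd

def weighted_average(state_dicts, weights):
    """FedAvg-style weighted parameter average over model state dicts."""
    total = float(sum(weights))
    return {
        key: sum(w * sd[key] for sd, w in zip(state_dicts, weights)) / total
        for key in state_dicts[0]
    }

# Bi-level aggregation sketch (variable names are hypothetical):
# 1) Edge: each cluster head averages its clients' local updates.
#    cluster_model = weighted_average(client_states, client_sample_counts)
# 2) Cloud: the server averages the cluster models into one global model.
#    global_model = weighted_average(cluster_states, cluster_sample_counts)
```

A quick usage check: with `student_logits = torch.randn(8, 10)`, three teacher tensors of the same shape, and `labels = torch.randint(0, 10, (8,))`, the loss evaluates to a scalar tensor. Averaging the teachers' softened outputs is one common multi-teacher strategy; weighted or attention-based teacher combinations are equally plausible design choices.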