Federated learning (FL) enables multiple clients to jointly train a model by sharing only gradient updates for aggregation rather than raw data. Because many clients each transmit very high-dimensional gradient updates, FL is known to suffer from a communication bottleneck. Meanwhile, the gradients shared by clients, as well as the trained model, can be exploited to infer clients' private local datasets, so privacy remains a critical concern in FL. We present Clover, a novel system framework for communication-efficient, secure, and differentially private FL. To tackle the communication bottleneck, Clover follows a standard and widely used approach, top-k gradient sparsification: each client sparsifies its gradient update so that only the k largest gradients (measured by magnitude) are kept for aggregation. Clover provides a tailored mechanism built on an increasingly popular distributed-trust setting involving three servers, which efficiently aggregates multiple sparse vectors (top-k-sparsified gradient updates) into a dense vector while hiding the values and indices of the non-zero elements in each sparse vector. This mechanism outperforms a baseline built on the general distributed ORAM technique by several orders of magnitude in server-side communication and runtime, while also incurring lower client communication cost. We further integrate this mechanism with a lightweight distributed noise generation mechanism to provide differential privacy (DP) guarantees for the trained model. To harden Clover against a malicious server, we devise a series of lightweight mechanisms for checking the integrity of server-side computation. Extensive experiments show that Clover achieves utility comparable to vanilla FL with central DP, with promising performance.
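To make the top-k sparsification step concrete, the following is a minimal plaintext sketch (function names are illustrative, not from the paper): each client keeps only the k largest-magnitude entries of its gradient as (index, value) pairs, and the aggregation functionality sums several such sparse updates into one dense vector. In Clover this aggregation is performed obliviously across three servers; this sketch only illustrates the functionality being computed.

```python
def top_k_sparsify(gradient, k):
    """Return the k entries of `gradient` with the largest absolute value,
    as a list of (index, value) pairs sorted by index."""
    # Rank indices by magnitude and keep the top k.
    top = sorted(range(len(gradient)),
                 key=lambda i: abs(gradient[i]), reverse=True)[:k]
    return sorted((i, gradient[i]) for i in top)

def aggregate_sparse(updates, dim):
    """Sum several sparse updates into a single dense vector of length dim.
    (Plaintext stand-in for Clover's oblivious three-server aggregation.)"""
    dense = [0.0] * dim
    for update in updates:
        for i, v in update:
            dense[i] += v
    return dense
```

For example, sparsifying `[0.1, -2.0, 0.3, 1.5]` with k = 2 keeps the entries at indices 1 and 3; aggregating that with another client's sparse update fills in a dense sum.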
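The distributed noise generation idea can also be sketched under a simple assumption (the concrete mechanism and parameters here are illustrative, not the paper's): if each of the three servers independently adds Gaussian noise with variance sigma^2 / 3, the sum of the three shares is distributed as N(0, sigma^2), matching the noise a single trusted aggregator would add under central DP.

```python
import random

def server_noise_share(dim, sigma, n_servers=3):
    """One server's additive noise share for a dim-length aggregate.
    Each share is N(0, sigma^2 / n_servers) per coordinate, so the
    n_servers shares sum to N(0, sigma^2) noise."""
    scale = sigma / n_servers ** 0.5
    return [random.gauss(0.0, scale) for _ in range(dim)]

def add_noise(aggregate, shares):
    """Add every server's noise share to the aggregated dense update."""
    noisy = list(aggregate)
    for share in shares:
        for i, v in enumerate(share):
            noisy[i] += v
    return noisy
```

Because no single server sees the full noise, no server can subtract it out, while the released model still carries the full central-DP noise level.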