Heavy-tailed noise has attracted growing attention in nonconvex stochastic optimization, as numerous empirical studies suggest it provides a more realistic model of gradient noise than the standard bounded-variance assumption. In this work, we investigate nonconvex-PL minimax optimization under heavy-tailed gradient noise in federated learning. We propose two novel algorithms: Fed-NSGDA-M, which integrates normalized gradients, and FedMuon-DA, which leverages the Muon optimizer for local updates. Both algorithms are designed to effectively handle heavy-tailed noise in federated minimax optimization under a milder condition. We theoretically establish that both algorithms achieve a convergence rate of $O\big(1/(TNp)^{\frac{s-1}{2s}}\big)$. To the best of our knowledge, these are the first federated minimax optimization algorithms with rigorous theoretical guarantees under heavy-tailed noise. Extensive experiments further validate their effectiveness.
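To give a rough sense of the normalized-gradient idea, a minimal sketch of one momentum-based normalized local primal step is shown below; the symbols $m_t^i$, $\beta$, $\eta$, and $\xi_t^i$ are illustrative notation for this sketch only and are not taken from the algorithms' formal definitions:
\[
m_t^i = \beta\, m_{t-1}^i + (1-\beta)\,\nabla_x F_i\big(x_t^i, y_t^i; \xi_t^i\big),
\qquad
x_{t+1}^i = x_t^i - \eta\,\frac{m_t^i}{\lVert m_t^i\rVert},
\]
where normalizing by $\lVert m_t^i\rVert$ caps the step length even when an individual heavy-tailed stochastic gradient is arbitrarily large.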