Frequent parameter exchanges between clients and the edge server incur substantial communication overhead, posing a critical bottleneck in federated learning (FL). By exploiting the superposition property of wireless waveforms, over-the-air (OTA) computation enables simultaneous analog aggregation of local updates, thereby reducing communication latency and improving spectrum efficiency. However, its scalability is constrained by the limited number of available orthogonal waveform resources, which are typically far fewer than the model dimension. To address this, we propose AgeTop-$k$, an age-aware gradient sparsification strategy that performs compression through a two-stage selection process. Specifically, the edge server first selects candidate gradient entries based on their magnitudes, and then further prioritizes them according to the Age of Information (AoI), which quantifies the staleness of updates. AoI tracking is achieved efficiently by maintaining an age vector at the edge server. We derive theoretical convergence guarantees for non-convex loss functions and demonstrate the efficacy of AgeTop-$k$ through extensive simulations.
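The two-stage selection described above can be sketched as follows. This is a minimal illustration under assumed interfaces, not the paper's implementation: the function name `agetop_k`, the candidate-set size `m`, and the age-update rule (reset selected entries to zero, increment the rest) are our assumptions for exposition.

```python
import numpy as np

def agetop_k(gradient, age, k, m):
    """Hypothetical sketch of the two-stage AgeTop-k selection.

    gradient : 1-D array of gradient entries at the edge server
    age      : AoI vector tracking staleness of each entry (assumed update rule)
    m        : number of magnitude-based candidates (m >= k)
    k        : number of entries finally selected, limited by the
               available orthogonal waveform resources
    """
    # Stage 1: keep the m entries with the largest magnitudes.
    candidates = np.argsort(np.abs(gradient))[-m:]
    # Stage 2: among the candidates, prioritize the k stalest entries
    # (largest Age of Information).
    selected = candidates[np.argsort(age[candidates])[-k:]]
    # Maintain the age vector: unselected entries grow one step staler,
    # selected entries are refreshed.
    age += 1
    age[selected] = 0
    return selected, age
```

For instance, with `gradient = [0.1, 5.0, 3.0, 0.2, 4.0]`, `age = [0, 2, 1, 5, 0]`, `m = 3`, and `k = 2`, stage 1 keeps entries 1, 2, and 4 by magnitude, and stage 2 then favors entries 1 and 2, whose updates are the stalest among the candidates.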