We study the classic problem of prediction with expert advice under the constraint of local differential privacy (LDP). In this setting, we first show that a classical algorithm naturally satisfies LDP and then design two new algorithms that improve upon it: RW-AdaBatch and RW-Meta. For RW-AdaBatch, we exploit the limited-switching behavior induced by LDP to provide a novel form of privacy amplification that grows stronger on easier data, analogous to the shuffle model in offline learning. Drawing on the theory of random walks, we prove that this improvement carries essentially no utility cost. For RW-Meta, we develop a general method for privately selecting between experts that are themselves non-trivial learning algorithms, and we show that under LDP this selection incurs no extra privacy cost. In contrast, prior work has only considered data-independent experts. We also derive formal regret bounds that scale inversely with the degree of independence between experts. We supplement our analysis with an evaluation on real-world data reported by hospitals during the COVID-19 pandemic; RW-Meta outperforms both the classical baseline and a state-of-the-art \textit{central} DP algorithm by 1.5--3$\times$ on the task of predicting which hospital will report the highest density of COVID patients each week.
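As a brief formal backdrop for the setting above, the following is a minimal sketch of prediction with expert advice; identifying the unnamed classical algorithm with randomized weighted majority (exponential weights) is an assumption suggested only by the RW- prefix, not stated in the abstract. At each round $t = 1, \dots, T$ the learner randomly selects an expert $I_t \in [N]$, after which the losses $\ell_{t,i} \in [0,1]$ are revealed; the standard selection rule and the regret quantity bounded in this line of work are
\[
  \Pr[I_t = i] \;=\; \frac{\exp\!\big(-\eta \sum_{s<t} \ell_{s,i}\big)}{\sum_{j=1}^{N} \exp\!\big(-\eta \sum_{s<t} \ell_{s,j}\big)},
  \qquad
  \mathrm{Regret}_T \;=\; \sum_{t=1}^{T} \ell_{t,I_t} \;-\; \min_{i \in [N]} \sum_{t=1}^{T} \ell_{t,i}.
\]
The LDP constraint, roughly, requires that the learner's observable behavior be a differentially private function of each individual data point; the algorithms above operate within this constraint.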