Multiple testing is widely applied across scientific fields, particularly in genomic and health data analysis, where protecting sensitive personal information is imperative. However, developing private multiple testing algorithms for super uniform $p$-values remains an open problem, because privacy mechanisms introduce intricate dependence among the peeled $p$-values and disrupt their super uniformity, which complicates post-selection inference. To address this, we introduce a general Super Uniform Private (SUP) multiple testing framework with three key components. First, we develop a novel $p$-value transformation that is compatible with diverse privacy regimes while retaining super uniformity. Second, we design a reversed peeling algorithm that reduces the privacy budget while facilitating inference. Third, we provide rejection thresholds that are privacy-parameter-free and tailored to different Type-I error criteria, including the family-wise error rate (FWER) and the false discovery rate (FDR). Building on these components, we develop adaptive techniques to determine the peeling number and to boost the thresholds. Theoretically, we propose a technique that overcomes the post-selection obstacle to Type-I error control, quantify the privacy-induced power loss of SUP relative to its non-private counterpart, and demonstrate that SUP surpasses existing private methods in terms of power. Extensive simulations and a real data application support our theoretical results.
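
For readers unfamiliar with the terminology, the standard textbook definitions behind the quantities named above can be written as follows. The symbols $t$, $V$, and $R$ are notation assumed here for illustration and are not taken from the paper: $V$ denotes the number of true null hypotheses that are rejected and $R$ the total number of rejections.

\[
\begin{aligned}
&\text{Super uniformity of a null } p\text{-value:} && \Pr(p \le t) \le t \quad \text{for all } t \in [0,1], \\
&\text{Family-wise error rate:} && \mathrm{FWER} = \Pr(V \ge 1), \\
&\text{False discovery rate:} && \mathrm{FDR} = \mathbb{E}\!\left[\frac{V}{\max(R,1)}\right].
\end{aligned}
\]

Under these definitions, a uniformly distributed null $p$-value is the boundary case in which the first inequality holds with equality. Super uniform $p$-values are therefore conservative, which is why a procedure calibrated for uniform $p$-values typically remains valid for them as long as the super uniformity is not disrupted.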