We propose a general framework for testing the equality of conditional distributions in the two-sample problem, a task closely tied to covariate shift and causal discovery. The framework combines neural network-based generative methods with sample splitting to transform the conditional testing problem into an unconditional one. To illustrate it, we introduce the generative classification accuracy-based conditional distribution equality test (GCA-CDET). We establish the convergence rate of the learned generator by deriving new results on the recently developed offset Rademacher complexity, and we prove the testing consistency of GCA-CDET under mild conditions. Empirically, numerical studies on synthetic datasets and two real-world datasets demonstrate the effectiveness of our approach. Additional discussion of the optimality of the proposed framework is provided in the online supplementary material.
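To make the reduction from a conditional to an unconditional test concrete, the following is a minimal sketch of one plausible reading of the idea, not the authors' algorithm. It assumes scalar covariates and responses, replaces the neural generator with a least-squares linear-Gaussian generator, and replaces the classifier with a small logistic regression on hand-picked features. All function names, the feature map, the particular splitting scheme, and the normal approximation used for the p-value are illustrative assumptions.

```python
import math
import numpy as np

def fit_generator(x, y):
    # Assumption: linear-Gaussian stand-in for the neural conditional generator.
    design = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    scale = np.std(y - design @ coef)
    def generate(x_new, rng):
        noise = rng.standard_normal(x_new.shape)
        return coef[0] * x_new + coef[1] + scale * noise
    return generate

def features(x, y):
    # Hypothetical feature map for the classifier.
    return np.column_stack([x, y, x * y, y ** 2])

def fit_logistic(phi, labels, steps=300, lr=0.5):
    # Plain gradient descent on standardized features.
    mu, sd = phi.mean(axis=0), phi.std(axis=0) + 1e-12
    z = np.column_stack([np.ones(len(phi)), (phi - mu) / sd])
    w = np.zeros(z.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-np.clip(z @ w, -30, 30)))
        w -= lr * z.T @ (p - labels) / len(labels)
    def predict(phi_new):
        z_new = np.column_stack([np.ones(len(phi_new)), (phi_new - mu) / sd])
        return (z_new @ w > 0).astype(float)
    return predict

def conditional_equality_test(x1, y1, x2, y2, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: learn the conditional law of Y given X from sample 1.
    generate = fit_generator(x1, y1)
    # Step 2: split sample 2; keep real pairs on one half and
    # generated pairs on the other, so both share the X-marginal.
    m = len(x2) // 2
    real = features(x2[:m], y2[:m])
    fake = features(x2[m:], generate(x2[m:], rng))
    phi = np.vstack([real, fake])
    labels = np.concatenate([np.ones(len(real)), np.zeros(len(fake))])
    # Step 3: unconditional two-sample test via held-out accuracy.
    perm = rng.permutation(len(labels))
    half = len(labels) // 2
    train, test = perm[:half], perm[half:]
    predict = fit_logistic(phi[train], labels[train])
    acc = float(np.mean(predict(phi[test]) == labels[test]))
    # Under the null, accuracy is roughly N(1/2, 1/(4 n_test)).
    z_stat = (acc - 0.5) / math.sqrt(0.25 / len(test))
    p_value = 0.5 * math.erfc(z_stat / math.sqrt(2))
    return acc, p_value

# Synthetic illustration: the X-marginals differ (covariate shift).
rng = np.random.default_rng(1)
x1 = rng.standard_normal(2000)
y1 = 2 * x1 + 0.5 * rng.standard_normal(2000)
x2 = rng.standard_normal(2000) + 1.0
y2_null = 2 * x2 + 0.5 * rng.standard_normal(2000)   # same conditional law
y2_alt = -2 * x2 + 0.5 * rng.standard_normal(2000)   # different conditional law

acc_null, p_null = conditional_equality_test(x1, y1, x2, y2_null)
acc_alt, p_alt = conditional_equality_test(x1, y1, x2, y2_alt)
print(f"null: acc={acc_null:.3f}, p={p_null:.3g}")
print(f"alt:  acc={acc_alt:.3f}, p={p_alt:.3g}")
```

The key design point in this sketch is that the generated responses are drawn at covariates from the second sample, so the real and generated pairs share the same covariate distribution and can differ only through the conditional law of the response. A held-out accuracy close to 1/2 is then consistent with equal conditional distributions, while an accuracy well above 1/2 is evidence against it.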