Beyond One-Size-Fits-All: Neural Networks for Differentially Private Tabular Data Synthesis

In differentially private (DP) tabular data synthesis, the consensus is that statistical models outperform neural network (NN)-based methods. However, we argue that this conclusion is incomplete: it overlooks densely correlated datasets, where intricate dependencies can overwhelm statistical models. In such scenarios, neural networks are more suitable because they can fit complex distributions by learning directly from samples. Despite this potential, existing NN-based algorithms still suffer from significant limitations. We therefore propose MargNet, which incorporates successful algorithmic designs from statistical models into neural networks. MargNet applies an adaptive marginal selection strategy and trains the neural network to generate data that conforms to the selected marginals. On sparsely correlated datasets, our approach achieves utility close to that of the best statistical method while offering an average 7× speedup over it. More importantly, on densely correlated datasets, MargNet establishes a new state of the art, reducing fidelity error by up to 26% compared to the previous best. We release our code on GitHub: https://github.com/KaiChen9909/margnet
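To make "data that conforms to the selected marginals" concrete, here is a minimal sketch of the generic building block that marginal-based DP synthesis relies on: measuring a marginal with Gaussian noise and scoring synthetic data against it. It is not taken from the MargNet repository. The names `noisy_marginal` and `marginal_error`, the choice of the Gaussian mechanism, and the total variation score are all assumptions made for illustration, and privacy accounting (how `sigma` maps to a privacy budget) is left out.

```python
import random
from collections import Counter
from itertools import product


def noisy_marginal(rows, cols, domains, sigma, rng):
    """Count joint values of `cols`, then add Gaussian noise to every cell.

    Hypothetical helper: assumes add/remove-one-record neighbors, so one
    record changes exactly one cell count by 1.
    """
    counts = Counter(tuple(row[c] for c in cols) for row in rows)
    cells = product(*(domains[c] for c in cols))
    # Noise goes on every cell of the domain, including empty ones;
    # skipping empty cells would reveal which combinations are absent.
    return {cell: counts.get(cell, 0) + rng.gauss(0.0, sigma) for cell in cells}


def marginal_error(synthetic_rows, noisy_target, cols):
    """Total variation distance between the synthetic marginal and the
    clipped, normalized noisy target (an illustrative fidelity score)."""
    clipped = {cell: max(v, 0.0) for cell, v in noisy_target.items()}
    total = sum(clipped.values()) or 1.0
    counts = Counter(tuple(row[c] for c in cols) for row in synthetic_rows)
    n = len(synthetic_rows) or 1
    return 0.5 * sum(
        abs(counts.get(cell, 0) / n - clipped[cell] / total) for cell in clipped
    )


if __name__ == "__main__":
    rng = random.Random(0)
    real = [(0, 1), (0, 1), (1, 0), (1, 1)]   # toy records with two binary columns
    domains = {0: [0, 1], 1: [0, 1]}          # allowed values per column index
    target = noisy_marginal(real, (0, 1), domains, sigma=1.0, rng=rng)
    candidate = [(0, 1), (1, 0), (1, 1), (1, 1)]
    print(marginal_error(candidate, target, (0, 1)))
```

A generator trained to match marginals would, under these assumptions, be pushed toward a low score on each selected marginal; which marginals to select and how to train against them is the part the abstract attributes to MargNet and is not reproduced here.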

