A central challenge in machine learning is to understand how noise or measurement errors affect low-rank approximations, particularly in the spectral norm. This question is especially important in differentially private low-rank approximation, where one aims to preserve the top-$p$ structure of a data-derived matrix while ensuring privacy. Prior work often analyzes Frobenius norm error or changes in reconstruction quality, but these metrics can overestimate or underestimate the true subspace distortion. The spectral norm, by contrast, captures worst-case directional error and provides the strongest utility guarantees. We establish new high-probability spectral-norm perturbation bounds for symmetric matrices that refine the classical Eckart--Young--Mirsky theorem and explicitly capture interactions between a matrix $A \in \mathbb{R}^{n \times n}$ and an arbitrary symmetric perturbation $E$. Under mild eigengap and norm conditions, our bounds yield sharp estimates for $\|(A + E)_p - A_p\|$, where $A_p$ is the best rank-$p$ approximation of $A$, with improvements of up to a factor of $\sqrt{n}$. As an application, we derive improved utility guarantees for differentially private PCA, resolving an open problem in the literature. Our analysis relies on a novel contour bootstrapping method from complex analysis and extends it to a broad class of spectral functionals, including polynomials and matrix exponentials. Empirical results on real-world datasets confirm that our bounds closely track the actual spectral error under diverse perturbation regimes.
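To make the central quantity concrete, the following minimal NumPy sketch computes $\|(A + E)_p - A_p\|$ in the spectral norm and, for comparison, in the Frobenius norm. It is an illustration only, not the paper's method or experimental setup: the function names `best_rank_p` and `low_rank_errors`, the synthetic rank-$p$ matrix, and the scaled symmetric Gaussian perturbation are all assumptions chosen for the example.

```python
import numpy as np

def best_rank_p(A, p):
    """Best rank-p approximation of a symmetric matrix A.

    For symmetric A the singular values are the absolute eigenvalues, so we
    keep the p eigenpairs of largest magnitude.
    """
    eigvals, eigvecs = np.linalg.eigh(A)
    top = np.argsort(np.abs(eigvals))[::-1][:p]
    # Scale the selected eigenvectors by their eigenvalues and recombine.
    return (eigvecs[:, top] * eigvals[top]) @ eigvecs[:, top].T

def low_rank_errors(A, E, p):
    """Return (spectral, Frobenius) norms of (A + E)_p - A_p."""
    diff = best_rank_p(A + E, p) - best_rank_p(A, p)
    return np.linalg.norm(diff, 2), np.linalg.norm(diff, "fro")

# Hypothetical test instance (not from the paper): a rank-p symmetric matrix
# with eigenvalues 30, 20, 10, so consecutive eigengaps are 10.
rng = np.random.default_rng(0)
n, p = 200, 3
U, _ = np.linalg.qr(rng.standard_normal((n, p)))
A = U @ np.diag([30.0, 20.0, 10.0]) @ U.T

# Symmetric Gaussian perturbation, scaled so its spectral norm is roughly 2,
# i.e. well below the eigengap.
G = rng.standard_normal((n, n))
E = (G + G.T) / np.sqrt(2 * n)

spec_err, fro_err = low_rank_errors(A, E, p)
print(f"||E||_2                = {np.linalg.norm(E, 2):.3f}")
print(f"||(A+E)_p - A_p||_2    = {spec_err:.3f}")
print(f"||(A+E)_p - A_p||_F    = {fro_err:.3f}")
```

Because the difference of two rank-$p$ matrices has rank at most $2p$, the Frobenius error can exceed the spectral error by at most a factor of $\sqrt{2p}$; the two printed values show how far apart the metrics are on this particular instance.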