Text embedding inversion attacks reconstruct original sentences from latent representations, posing severe privacy threats in collaborative inference and edge computing. We propose TextCrafter, an optimization-based adversarial perturbation mechanism that combines geometry-aware noise injection, learned via reinforcement learning (RL) and orthogonal to user embeddings, with cluster priors and personally identifiable information (PII) signal guidance to suppress inversion while preserving task utility. Unlike prior defenses, which are either non-learnable or agnostic to perturbation direction, TextCrafter provides a directional protective policy that balances privacy and utility. Under a strong privacy setting, TextCrafter maintains 70% classification accuracy on four datasets and consistently outperforms Gaussian and local differential privacy (LDP) baselines across lower privacy budgets, demonstrating a superior privacy-utility trade-off.
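To make the idea of noise injected orthogonally to a user embedding concrete, the following is a minimal sketch in plain Python. It is not TextCrafter's learned policy: the direction is drawn at random and then projected onto the orthogonal complement of the embedding, whereas the method described above learns the direction with RL and additionally uses cluster priors and PII signal guidance, both omitted here. The function names `orthogonal_noise` and `perturb` and the `scale` parameter are hypothetical, introduced only for illustration.

```python
import math
import random


def orthogonal_noise(embedding, scale, rng):
    """Sample Gaussian noise and remove its component along `embedding`.

    Illustrative assumption: a random direction stands in for the
    learned, geometry-aware direction described in the text.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in embedding]
    norm_sq = sum(x * x for x in embedding)
    if norm_sq == 0.0:
        raise ValueError("embedding must be non-zero")
    # Subtract the projection of the noise onto the embedding
    # (a single Gram-Schmidt step), leaving an orthogonal vector.
    coeff = sum(n * x for n, x in zip(noise, embedding)) / norm_sq
    ortho = [n - coeff * x for n, x in zip(noise, embedding)]
    # Rescale so the perturbation has Euclidean norm `scale`.
    ortho_norm = math.sqrt(sum(o * o for o in ortho))
    if ortho_norm == 0.0:
        return [0.0] * len(embedding)
    return [scale * o / ortho_norm for o in ortho]


def perturb(embedding, scale, seed=0):
    """Return the embedding plus an orthogonal perturbation of norm `scale`."""
    rng = random.Random(seed)
    delta = orthogonal_noise(embedding, scale, rng)
    return [x + d for x, d in zip(embedding, delta)]
```

Because the perturbation is orthogonal to the embedding, the component along the original embedding is left unchanged, and the squared norm of the perturbed vector equals the original squared norm plus `scale` squared. In this sketch `scale` plays the role of a noise magnitude; how it would map to a formal privacy budget is not specified here.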