Text embedding inversion attacks reconstruct original sentences from latent representations, posing severe privacy threats in collaborative inference and edge computing. We propose TextCrafter, an optimization-based adversarial perturbation mechanism that combines geometry-aware noise injection, learned via reinforcement learning (RL) and orthogonal to user embeddings, with cluster priors and personally identifiable information (PII) signal guidance to suppress inversion while preserving task utility. Unlike prior defenses, which are either non-learnable or agnostic to perturbation direction, TextCrafter provides a directional protective policy that balances privacy and utility. Under a strong privacy setting, TextCrafter maintains 70% classification accuracy on four datasets and consistently outperforms Gaussian-noise and local differential privacy (LDP) baselines across lower privacy budgets, demonstrating a superior privacy-utility trade-off.
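To make the geometric idea concrete, the following is a minimal sketch of noise injected orthogonally to a user embedding. It is not TextCrafter's learned policy: the perturbation direction here is a random Gaussian draw rather than an RL-selected one, and cluster priors and PII guidance are omitted. The function names `orthogonal_noise` and `perturb`, the `scale` parameter, and the toy embedding are all hypothetical and chosen only for illustration; the embedding is assumed to be nonzero.

```python
import numpy as np


def orthogonal_noise(embedding, scale, rng):
    """Sample Gaussian noise and remove its component along `embedding`.

    Assumption: a random direction stands in for the learned policy.
    """
    noise = rng.normal(size=embedding.shape)
    unit = embedding / np.linalg.norm(embedding)  # assumes a nonzero embedding
    # Subtract the projection onto the embedding direction, leaving only
    # the part of the noise that is orthogonal to the embedding.
    noise = noise - np.dot(noise, unit) * unit
    # Rescale so the perturbation has L2 norm equal to `scale`.
    return scale * noise / np.linalg.norm(noise)


def perturb(embedding, scale=0.5, seed=0):
    """Return the embedding plus an orthogonal perturbation of norm `scale`."""
    rng = np.random.default_rng(seed)
    return embedding + orthogonal_noise(embedding, scale, rng)


embedding = np.array([1.0, 2.0, 2.0])  # toy 3-dimensional embedding
protected = perturb(embedding)
delta = protected - embedding
```

In this sketch `delta` has no component along `embedding`, so the perturbation leaves the projection of the protected vector onto the original direction unchanged while moving it by a fixed distance `scale`. How the direction within the orthogonal subspace is chosen is exactly where a learned policy would replace the random draw.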