Concept erasure, which fine-tunes diffusion models to remove undesired or harmful visual concepts, has become a mainstream approach to mitigating unsafe or illegal image generation in text-to-image models. However, existing methods typically adopt a unidirectional erasure strategy, either suppressing the target concept or reinforcing safe alternatives, which makes it difficult to balance concept removal against generation quality. To address this limitation, we propose Bidirectional Image-Guided Concept Erasure (Bi-Erasing), a framework that performs concept suppression and safety enhancement simultaneously. Specifically, building on the joint representation of text prompts and their corresponding images, Bi-Erasing introduces two decoupled image branches: a negative branch that suppresses harmful semantics and a positive branch that provides visual guidance toward safe alternatives. By jointly optimizing these complementary directions, our approach balances erasure efficacy and generation usability. In addition, we apply mask-based filtering to the image branches to prevent irrelevant content from interfering with the erasure process. In extensive experimental evaluations, Bi-Erasing outperforms baseline methods in balancing concept removal effectiveness and visual fidelity.
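The abstract does not state the training objective, so the following is only a minimal sketch of how a two-branch, mask-filtered loss of this kind might be combined. Everything in it is an assumption made for illustration: the function names `masked_mse` and `bidirectional_loss`, the negative-guidance form of the suppression target, the weights `eta` and `lam`, and the use of plain NumPy arrays in place of a diffusion model's noise predictions.

```python
import numpy as np


def masked_mse(pred, target, mask):
    """Mean squared error over the positions where mask is nonzero."""
    mask = mask.astype(pred.dtype)
    denom = mask.sum()
    if denom == 0:
        return 0.0  # nothing selected by the mask, so no contribution
    return float((mask * (pred - target) ** 2).sum() / denom)


def bidirectional_loss(eps_student, eps_uncond, eps_neg, eps_pos,
                       neg_mask, pos_mask, eta=1.0, lam=1.0):
    """Hypothetical combination of a negative and a positive branch.

    eps_student: prediction of the model being fine-tuned
    eps_uncond:  unconditional prediction of a frozen reference model
    eps_neg:     frozen prediction conditioned on the harmful concept
    eps_pos:     frozen prediction conditioned on the safe alternative
    neg_mask, pos_mask: binary masks that filter out irrelevant regions
    """
    # Negative branch (assumed form): steer away from the harmful direction.
    neg_target = eps_uncond - eta * (eps_neg - eps_uncond)
    loss_neg = masked_mse(eps_student, neg_target, neg_mask)

    # Positive branch (assumed form): pull toward the safe alternative.
    loss_pos = masked_mse(eps_student, eps_pos, pos_mask)

    # Both directions are optimized jointly; lam trades one against the other.
    return loss_neg + lam * loss_pos, loss_neg, loss_pos


# Toy usage with 2x2 "noise predictions".
student = np.zeros((2, 2))
uncond = np.zeros((2, 2))
harmful = np.ones((2, 2))
safe = 2.0 * np.ones((2, 2))
neg_mask = np.ones((2, 2))
pos_mask = np.array([[1.0, 0.0], [0.0, 0.0]])

total, l_neg, l_pos = bidirectional_loss(
    student, uncond, harmful, safe, neg_mask, pos_mask, eta=1.0, lam=0.5
)
print(total, l_neg, l_pos)
```

Setting `lam` to zero in this sketch reduces it to suppression only, and dropping the negative term leaves only safe-alternative guidance, which are the two unidirectional strategies the abstract contrasts with the joint objective.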