Preserving Task-Relevant Information Under Linear Concept Removal

Floris Holstege,Shauli Ravfogel,Bram Wouters

From arXiv. Published at NeurIPS 2025.

Modern neural networks often encode unwanted concepts alongside task-relevant information, leading to fairness and interpretability concerns. Existing post-hoc approaches can remove undesired concepts but often degrade useful signals. We introduce SPLINCE (Simultaneous Projection for LINear concept removal and Covariance prEservation), which eliminates sensitive concepts from representations while exactly preserving their covariance with a target label. SPLINCE achieves this via an oblique projection that 'splices out' the unwanted direction yet protects important label correlations. Theoretically, it is the unique solution that removes linear concept predictability and maintains target covariance with minimal embedding distortion. Empirically, SPLINCE outperforms baselines on benchmarks such as Bias in Bios and WinoBias, removing protected attributes while minimally damaging main-task information.
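
The two constraints named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' SPLINCE projection, which additionally minimizes embedding distortion; it is only one oblique projection P that satisfies P·Cov(x, z) = 0 (no linear covariance with the concept z) and P·Cov(x, y) = Cov(x, y) (covariance with the label y preserved). It assumes a single scalar concept and a single scalar label, and the helper names `cross_cov` and `oblique_projection`, along with the synthetic data, are hypothetical.

```python
import numpy as np

def cross_cov(X, v):
    """Empirical cross-covariance between the columns of X (n, d) and a scalar variable v (n,)."""
    Xc = X - X.mean(axis=0)
    vc = v - v.mean()
    return Xc.T @ vc / len(v)

def oblique_projection(a, b):
    """Return a matrix P with P @ a = 0 and P @ b = b.

    Illustrative construction only (an assumption, not the paper's formula):
    P = I - a u^T / (u^T a), where u is the part of a orthogonal to b.
    Requires that a and b are not parallel.
    """
    u = a - (a @ b) / (b @ b) * b      # u is orthogonal to b, so P leaves b untouched
    denom = u @ a
    if np.isclose(denom, 0.0):
        raise ValueError("concept and label directions are (nearly) parallel")
    return np.eye(len(a)) - np.outer(a, u) / denom

# Synthetic data (assumed for illustration): z is the concept, y the task label.
rng = np.random.default_rng(0)
n, d = 2000, 8
z = rng.integers(0, 2, size=n).astype(float)
y = (rng.random(n) < 0.3 + 0.4 * z).astype(float)   # y is correlated with z
X = rng.normal(size=(n, d))
X[:, 0] += 2.0 * z                                   # concept leaks into feature 0
X[:, 1] += 1.5 * y                                   # label signal in feature 1

sigma_xz = cross_cov(X, z)
sigma_xy = cross_cov(X, y)
P = oblique_projection(sigma_xz, sigma_xy)

# Apply the projection to the centered embeddings.
X_clean = (X - X.mean(axis=0)) @ P.T

cov_z_after = cross_cov(X_clean, z)   # should be numerically zero
cov_y_after = cross_cov(X_clean, y)   # should match sigma_xy
```

Because P is not symmetric, it is an oblique rather than an orthogonal projection: it annihilates the concept direction while moving points along a direction chosen so that the label covariance is left unchanged.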

