Many modern machine learning algorithms mitigate bias by enforcing fairness constraints across coarsely-defined groups related to a sensitive attribute like gender or race. However, these algorithms seldom account for within-group heterogeneity and biases that may disproportionately affect some members of a group. In this work, we characterize Social Norm Bias (SNoB), a subtle but consequential type of algorithmic discrimination that may be exhibited by machine learning models, even when these systems achieve group fairness objectives. We study this issue through the lens of gender bias in occupation classification. We quantify SNoB by measuring how an algorithm's predictions are associated with conformity to inferred gender norms. When predicting if an individual belongs to a male-dominated occupation, this framework reveals that "fair" classifiers still favor biographies written in ways that align with inferred masculine norms. We compare SNoB across algorithmic fairness methods and show that it is frequently a residual bias, and post-processing approaches do not mitigate this type of bias at all.
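The quantification described above (associating a classifier's predictions with conformity to inferred gender norms) can be illustrated with a minimal sketch. The snippet below is an assumption-laden toy example, not the paper's exact pipeline: it uses a simple TF-IDF + logistic-regression gender model as a proxy for "conformity to inferred masculine norms" and a Spearman rank correlation as the association measure. All variable names, the toy biographies, and the classifier scores are hypothetical.

```python
# Hedged sketch: quantify SNoB-style association between an occupation
# classifier's scores and a proxy for conformity to masculine norms.
# Everything here (data, proxy model, names) is illustrative only.

import numpy as np
from scipy.stats import spearmanr
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy biographies with gender labels (1 = male, 0 = female) and the audited
# "fair" classifier's predicted probability for a male-dominated occupation.
bios = [
    "Led the surgical team and published extensively on cardiac procedures.",
    "She balances her clinical practice with mentoring junior residents.",
    "He pioneered a minimally invasive technique adopted across hospitals.",
    "Volunteers at community clinics and focuses on patient communication.",
]
gender = np.array([1, 0, 1, 0])
clf_scores = np.array([0.91, 0.42, 0.88, 0.35])  # scores from the audited classifier

# Step 1: infer "gender norms" from text by fitting a simple gender classifier;
# its predicted probability serves as a conformity-to-masculine-norms proxy.
vec = TfidfVectorizer()
X = vec.fit_transform(bios)
norm_model = LogisticRegression().fit(X, gender)
masc_conformity = norm_model.predict_proba(X)[:, 1]

# Step 2: measure the association between occupation-classifier scores and
# masculine-norm conformity via Spearman rank correlation.
rho, p = spearmanr(clf_scores, masc_conformity)
print(f"Spearman rho = {rho:.2f} (p = {p:.2f})")
```

In this sketch a large positive correlation would indicate that, even under a group-fairness constraint, the classifier favors biographies written in ways that the proxy model scores as more aligned with masculine norms.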