The need to assess LLMs for bias and fairness is critical, yet current evaluations are often narrow and lack a broad categorical view. In this paper, we propose evaluating the bias and fairness of LLMs through a group fairness lens, using a novel hierarchical schema that characterizes diverse social groups. Specifically, we construct a dataset, GFAIR, encapsulating target-attribute combinations across multiple dimensions. Moreover, we introduce statement organization, a new open-ended text generation task, to uncover complex biases in LLMs. Extensive evaluations of popular LLMs reveal inherent safety concerns. To mitigate the biases of LLMs from a group fairness perspective, we pioneer a novel chain-of-thought method, GF-THINK. Experimental results demonstrate its efficacy in mitigating bias and achieving fairness in LLMs. Our dataset and code are available at https://github.com/surika/Group-Fairness-LLMs.