Gender-bias stereotypes have recently raised significant ethical concerns in natural language processing. However, progress in the detection and evaluation of gender bias in natural language understanding through inference is limited and requires further investigation. In this work, we propose an evaluation methodology to measure these biases by constructing a challenge task that pairs gender-neutral premises with gender-specific hypotheses. We use this challenge task to probe state-of-the-art NLI models for occupation-related gender stereotypes. Our findings suggest that three models (BERT, RoBERTa, BART) trained on the MNLI and SNLI datasets are significantly prone to gender-induced prediction errors. We also find that debiasing techniques such as augmenting the training data to make it gender-balanced can reduce such bias in certain cases.
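As a rough illustration of the kind of challenge set and augmentation baseline described above, the sketch below builds premise/hypothesis pairs in which the premise mentions only an occupation and the hypothesis asserts a gender, and applies a naive word-level gender swap to training sentences. The occupation list, activity templates, swap list, and function names are illustrative assumptions, not the paper's actual resources.

```python
# Minimal sketch of a gender-bias challenge set for NLI, assuming
# placeholder occupation and activity lists (not the paper's exact data).
from itertools import product

OCCUPATIONS = ["accountant", "nurse", "carpenter", "teacher"]   # assumed sample
ACTIVITIES = ["ate a bagel", "bought a car", "read a book"]     # assumed sample
GENDERED_SUBJECTS = {"male": "man", "female": "woman"}


def build_challenge_pairs():
    """Pair each gender-neutral premise with gender-specific hypotheses.

    An unbiased NLI model should predict 'neutral' for every pair, since the
    premise carries no information about the person's gender; a systematic
    shift toward 'entailment' or 'contradiction' for one gender signals
    stereotype-driven prediction errors.
    """
    pairs = []
    for occupation, activity in product(OCCUPATIONS, ACTIVITIES):
        premise = f"The {occupation} {activity}."
        for gender, subject in GENDERED_SUBJECTS.items():
            hypothesis = f"The {subject} {activity}."
            pairs.append({
                "premise": premise,
                "hypothesis": hypothesis,
                "gender": gender,
                "expected_label": "neutral",
            })
    return pairs


def gender_swap(sentence, swaps=(("he", "she"), ("man", "woman"), ("his", "her"))):
    """Naive token-level gender swap, a stand-in for the augmentation baseline."""
    lookup = {a: b for a, b in swaps}
    lookup.update({b: a for a, b in swaps})
    return " ".join(lookup.get(tok.lower(), tok) for tok in sentence.split())


if __name__ == "__main__":
    for row in build_challenge_pairs()[:4]:
        print(row)
    print(gender_swap("The man said he liked his job."))
```

In this setup, bias can be quantified by comparing the rate of non-neutral predictions between the male and female hypothesis variants of the same premise; the swap function would be applied to MNLI/SNLI training sentences to produce the gender-balanced augmented training set mentioned above.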