We introduce GANDiff FR, the first synthetic-face framework that precisely controls demographic and environmental factors to measure, explain, and reduce bias with reproducible rigor. GANDiff FR unifies StyleGAN3-based identity-preserving generation with diffusion-based attribute control, enabling fine-grained manipulation of pose (around 30 degrees), illumination (four directions), and expression (five levels) under ceteris paribus conditions. We synthesize 10,000 demographically balanced faces across five cohorts, validated for realism via automated detection (98.2%) and human review (89%), to isolate and quantify bias drivers. Benchmarking ArcFace, CosFace, and AdaFace at matched operating points shows that AdaFace reduces inter-group TPR disparity by 60% (2.5% vs. 6.3%), with illumination accounting for 42% of the residual bias. Cross-dataset evaluation on RFW, BUPT, and CASIA-WebFace confirms strong synthetic-to-real transfer (r ≈ 0.85). Despite roughly 20% computational overhead relative to pure GANs, GANDiff FR yields three times more attribute-conditioned variants, establishing a reproducible, regulation-aligned (EU AI Act) standard for fairness auditing. Code and data are released to support transparent, scalable bias evaluation.
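The "matched operating points" comparison above can be made concrete: fix one similarity threshold at a target false-positive rate on pooled impostor scores, then measure each cohort's true-positive rate at that shared threshold; the inter-group disparity is the max-min TPR gap. The sketch below is an illustrative reconstruction, not the paper's released code; all cohort names, score distributions, and the `target_fpr` value are invented for demonstration.

```python
# Hedged illustration of inter-group TPR disparity at a matched
# operating point. Scores are synthetic; real audits would use
# verification scores from a face-recognition model.
import numpy as np

def tpr_disparity(genuine_by_group, impostor_pooled, target_fpr=1e-3):
    """Max-min TPR gap across cohorts at a threshold matched to target_fpr."""
    # Choose the threshold so that target_fpr of pooled impostor
    # scores fall at or above it (the shared operating point).
    thr = np.quantile(impostor_pooled, 1.0 - target_fpr)
    tprs = {g: float(np.mean(s >= thr)) for g, s in genuine_by_group.items()}
    return max(tprs.values()) - min(tprs.values()), tprs

rng = np.random.default_rng(0)
# Hypothetical genuine-pair score distributions for two cohorts.
genuine = {
    "cohort_A": rng.normal(0.70, 0.08, 5000),
    "cohort_B": rng.normal(0.66, 0.08, 5000),
}
# Pooled impostor-pair scores across all cohorts.
impostor = rng.normal(0.30, 0.08, 50000)

gap, tprs = tpr_disparity(genuine, impostor, target_fpr=1e-3)
print(f"TPR disparity at matched FPR: {gap:.3f}")
```

Matching the threshold globally before comparing groups is what makes disparities attributable to the model rather than to per-group threshold choices.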