Recent advances enable Large Language Models (LLMs) to generate AI personas, yet their lack of deep contextual, cultural, and emotional understanding poses a significant limitation. This study quantitatively compared human responses with those of eight LLM-generated social personas (e.g., Male, Female, Muslim, Political Supporter) in a low-resource environment, Bangladesh, using culturally specific questions. Results show that human responses significantly outperform all LLMs both in answering the questions and across all metrics of persona perception, with particularly large gaps in empathy and credibility. Furthermore, LLM-generated content exhibited a systematic bias consistent with the ``Pollyanna Principle'', scoring measurably higher in positive sentiment ($\Phi_{avg} = 5.99$ for LLMs vs. $5.60$ for humans). These findings suggest that LLM personas do not accurately reflect the authentic experiences of real people in resource-scarce environments. It is essential to validate LLM personas against real-world human data to ensure their alignment and reliability before deploying them in social science research.