Since generative artificial intelligence (AI) tools such as OpenAI's ChatGPT became widely available, researchers have used them in the writing process. The consensus of the academic publishing community is that such usage must be declared in the published article. Academ-AI documents examples of suspected undeclared AI usage in the academic literature, identified primarily by the appearance in research papers of idiosyncratic verbiage characteristic of large language model (LLM)-based chatbots. This analysis of the first 768 examples collected reveals that the problem is widespread, penetrating the journals, conference proceedings, and textbooks of highly respected publishers. Undeclared AI use appears disproportionately in journals with higher citation metrics and higher article processing charges (APCs), precisely the outlets that should, in theory, have the resources and expertise to avoid such oversights. Only an extremely small minority of cases are corrected post-publication, and the corrections are often insufficient to rectify the problem. The 768 examples analyzed here likely represent a small fraction of the undeclared AI present in the academic literature, much of which may be undetectable. Publishers must enforce their policies against undeclared AI usage in cases that are detectable; this is the best defense currently available to the academic publishing community against the proliferation of undisclosed AI. This is an updated version of a previous preprint.