Peer-review venues have increasingly adopted open reviewing policies that publicly release anonymized reviews and permit public commenting. These policies vary widely across venues, and debate continues about their benefits and drawbacks. To inform this debate, we surveyed 2,385 reviewers, authors, and other peer-review participants in machine learning to understand their experiences and opinions. Our key findings are: (a) Preferences: Over 80% of respondents support releasing reviews for accepted papers and allowing public comments. However, only 27.1% support releasing rejected manuscripts. (b) Benefits: Respondents cite improved public understanding (75.3%), reviewer education (57.8%), increased fairness (56.6%), and stronger incentives for high-quality reviews (48.0%). (c) Challenges: The top concern is resubmission bias, where a paper's rejection history biases future reviewers (ranked the top impact of open reviewing by 41% of respondents, and mentioned in over 50% of free responses). Other challenges include fear of reviewer de-anonymization (33.2%) and potential commenting abuse. (d) AI and open peer review: Participants believe open policies deter "AI slop" submissions (71.9%) and AI-generated reviews (38.9%). Respondents are split on whether peer-review venues should generate official AI reviews, with 56.0% opposed and 44.0% supportive. Finally, we use AI to annotate 4,244 reviews from ICLR (fully open) and NeurIPS (partially open). We find that the fully open venue (ICLR) has higher levels of correctness and completeness than the partially open venue (NeurIPS); both differences are statistically significant, with a small effect size for correctness and a very small effect size for completeness. We find no statistically significant difference in the level of substantiation. We release the full dataset at https://github.com/justinpayan/OpenReviewAnalysis.