
Artificial Intelligence as Author: Can Scientific Reviewers Recognize GPT-4o–Generated Manuscripts?

by Pinky Sharma • November 4, 2025


CLINICAL QUESTION


Can peer reviewers and editors reliably detect AI-generated manuscripts in scientific publishing?

BOTTOM LINE

Most reviewers could not tell that a manuscript was written entirely by GPT-4o. This highlights an urgent need for clear disclosure policies, reviewer training, and robust AI detection tools.

BACKGROUND: Generative AI tools such as ChatGPT are increasingly used to support academics in scientific writing. While they may streamline drafting, data analysis, and editing, concerns persist regarding plagiarism, fabricated data, ethical issues, and inaccurate content. Although AI detection tools exist, no definitive mechanism ensures the detection of AI-generated data. The ability (or inability) of reviewers to recognize AI-generated text has direct implications for the credibility of peer review.

STUDY DESIGN: The study was conducted between November 1 and December 1, 2024. GPT-4o was instructed to generate a full retrospective study manuscript on predictors of survival and return of spontaneous circulation in out-of-hospital cardiac arrest. The model created a synthetic dataset of nearly 1,000 patients, performed statistical analyses, and drafted a manuscript (~1,500 words, 20 references) that was revised through multiple AI-prompted iterations to ensure CONSORT compliance. Fourteen experienced SCI-E journal reviewers (H-index ≥5) assessed the manuscript as if serving as editors and reviewers. They were told in advance that the manuscript might have been AI-generated and were later asked whether they had recognized this.
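
The paper does not reproduce its prompts or code, but the workflow it describes (dataset synthesis, statistical analysis, drafting, and iterative revision, all performed by GPT-4o) can be sketched. The sketch below is a minimal illustration using the OpenAI Python SDK; the prompt wording, the ask() helper, and the number of revision rounds are assumptions for illustration, not the authors' actual protocol.

    # Hypothetical sketch of the study's workflow; prompts are illustrative.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def ask(prompt: str, history: list) -> str:
        """Send one prompt in a running conversation and return the reply."""
        history.append({"role": "user", "content": prompt})
        reply = client.chat.completions.create(model="gpt-4o", messages=history)
        text = reply.choices[0].message.content
        history.append({"role": "assistant", "content": text})
        return text

    history = [{"role": "system",
                "content": "You are a medical researcher preparing a retrospective study."}]
    ask("Create a synthetic dataset of ~1,000 out-of-hospital cardiac arrest "
        "patients and analyze predictors of survival and return of "
        "spontaneous circulation.", history)
    draft = ask("Draft a ~1,500-word manuscript with 20 references reporting "
                "those results.", history)
    for _ in range(3):  # number of revision rounds is an assumption
        draft = ask("Revise the manuscript for CONSORT compliance.", history)
    print(draft)

Because each call re-sends the full conversation history, the model keeps its fabricated dataset and earlier draft in context across revisions, mirroring the iterative prompting the authors describe.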

SETTING: Hitit University Erol Olçok Education and Research Hospital (Turkey) and Hamad Medical Corporation (Qatar), with international reviewers.

SYNOPSIS: At the editorial stage, 42.9% rejected the manuscript, 42.9% forwarded it to review, and 14.3% accepted it outright. As peer reviewers, 42.9% recommended rejection, 28.6% suggested major revisions, and 28.6% recommended acceptance after minor revisions (no outright acceptances). Notably, 78.6% did not recognize the manuscript as AI-generated. Those who did cited template-like phrasing, superficial discussion, repetitive language, and unusual statistical formatting. Limitations of this study included a small sample size, an English-only context, reviewers’ variable familiarity with AI, and a possible Hawthorne effect: Knowing AI might be involved may have made reviewers more cautious. Lastly, the authors emphasized vulnerabilities in peer review and recommended mandatory AI disclosure policies, reviewer training, and deployment of AI detection tools such as “Gotcha GPT,” which has reported 97% to 99% accuracy.
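
The review does not detail how Gotcha GPT works internally. As a rough illustration of the general technique behind such detectors (supervised classification of text as human- or AI-written), the toy scikit-learn pipeline below trains on placeholder labeled sentences. It is a minimal sketch, not Gotcha GPT, and its output implies nothing about real detector accuracy.

    # Toy sketch of supervised AI-text detection: TF-IDF features feeding a
    # logistic-regression classifier. This is NOT Gotcha GPT; real detectors
    # are trained on large corpora of verified human and model-written text.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Placeholder training data: 1 = AI-generated, 0 = human-written.
    texts = [
        "In conclusion, these findings underscore the multifaceted implications of the results.",
        "We pulled the charts by hand because half the records were on paper.",
        "Furthermore, it is important to note that the aforementioned analysis demonstrates robustness.",
        "The night shift nurse flagged two cases we would otherwise have missed.",
    ]
    labels = [1, 0, 1, 0]

    detector = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
    detector.fit(texts, labels)

    # Score a new passage; with toy data this number is illustrative only.
    prob_ai = detector.predict_proba(
        ["Moreover, the results highlight significant implications."])[0, 1]
    print(f"P(AI-generated) = {prob_ai:.2f}")

Production detectors apply the same idea at scale, with far larger training corpora and richer features than word n-grams, which is how tools in this class can report the high accuracies cited above.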

CITATION: Öztürk A, et al. Artificial intelligence as author: can scientific reviewers recognize GPT-4o-generated manuscripts? Am J Emerg Med. 2025;97:216-219. doi: 10.1016/j.ajem.2025.07.034.

COMMENT: This study examined the ability of peer reviewers to detect AI-generated content. The findings revealed that 78.6% of the reviewers did not realize the manuscript had been generated by an artificial intelligence model, and many of them passed the manuscript on toward acceptance. This study suggests that editors need to look into AI detection, but studies of AI detection software show that it can be easily evaded.—Eric Gantwerker, MD

Filed Under: AI, Literature Reviews, Technology. Tagged With: recognizing the use of AI. Issue: November 2025

You Might Also Like:

  • TRIO Meeting: Recognizing Excellence in Otolaryngology
  • ChatGPT-Generated “Fake” References in Academic Manuscripts Is a Problem
  • Artificial Intelligence Helps Otolaryngologists Give Excellent Patient Care
  • Can ChatGPT Be Used for Patient Education?


ENTtoday is a publication of The Triological Society.


