Acta Neurochirurgica, vol. 167, no. 1, 2025 (SCI-Expanded)
Objective: This study evaluates the reliability and accuracy of AI-generated text detection tools in distinguishing human-authored academic content from AI-generated text, and highlights the challenges and ethical considerations of applying them within the scientific community.
Methods: The detectability of AI-generated academic content was analyzed using abstracts and introductions created by ChatGPT versions 3.5, 4, and 4o, alongside human-written originals from the pre-ChatGPT era. Articles were sourced from four high-impact neurosurgery journals, and the texts were divided into four groups: human-written originals and texts generated by ChatGPT 3.5, ChatGPT 4, and ChatGPT 4o. Three AI-output detectors (GPTZero, ZeroGPT, Corrector App) were used to classify 1,000 texts as human- or AI-generated. Additionally, plagiarism checks were performed on the AI-generated content to evaluate its uniqueness.
Results: A total of 250 human-authored articles and 750 ChatGPT-generated texts were analyzed using the three AI-output detectors (Corrector App, ZeroGPT, GPTZero). Human-authored texts consistently had the lowest AI likelihood scores, while AI-generated texts had significantly higher scores across all versions of ChatGPT (p < 0.01). Plagiarism detection revealed high originality for ChatGPT-generated content, with no significant differences among versions (p > 0.05). ROC analysis showed that the AI-output detectors effectively distinguished AI-generated content from human-written texts, with areas under the curve (AUC) ranging from 0.75 to 1.00 for all models. However, none of the detectors achieved 100% reliability in distinguishing AI-generated content.
Conclusions: While models like ChatGPT enhance content creation and efficiency, they raise ethical concerns, particularly in fields demanding trust and precision. AI-output detectors exhibit moderate to high success in distinguishing AI-generated texts, but false positives pose risks to researchers. Improving detector reliability and establishing clear policies on AI usage are critical to mitigate misuse while fully leveraging AI’s benefits.
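For readers unfamiliar with the two quantities the abstract relies on, the following is a minimal illustrative sketch of how an AUC and a false-positive rate can be computed from detector scores. It is not the authors' analysis code: the function names, the 0.3 threshold, and every score value are hypothetical and chosen only for illustration. AUC is computed here through its rank interpretation, that is, the probability that a randomly chosen AI-generated text receives a higher AI-likelihood score than a randomly chosen human-written text.

```python
def roc_auc(human_scores, ai_scores):
    """AUC via pairwise comparison: share of (AI, human) pairs in which
    the AI-generated text scores higher; ties count as half a win."""
    wins = 0.0
    for a in ai_scores:
        for h in human_scores:
            if a > h:
                wins += 1.0
            elif a == h:
                wins += 0.5
    return wins / (len(ai_scores) * len(human_scores))


def false_positive_rate(human_scores, threshold):
    """Fraction of human-written texts flagged as AI-generated,
    i.e. scoring at or above the decision threshold."""
    flagged = sum(1 for h in human_scores if h >= threshold)
    return flagged / len(human_scores)


# Hypothetical AI-likelihood scores in [0, 1]; NOT data from the study.
human = [0.05, 0.10, 0.40, 0.20]
ai = [0.90, 0.35, 0.80, 0.95]

print(roc_auc(human, ai))               # 15 of 16 pairs ranked correctly
print(false_positive_rate(human, 0.3))  # 1 of 4 human texts flagged
```

In this toy example the detector ranks the two groups well (AUC of 0.9375) yet still flags one of the four human-written texts at the assumed threshold, which shows how a high AUC can coexist with false positives.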