Optimizing AI for surgical aftercare: Claude 3.5 Sonnet outperforms ChatGPT-5.0 in otoplasty

Sari, Elif; Ahmadov, Natig; Muradova, Antiga

doi:10.1186/s43163-026-01038-y

Sari E., Ahmadov N., Muradova A.

Egyptian Journal of Otolaryngology, vol.42, no.1, 2026 (ESCI, Scopus)

Publication Type: Article / Full Article
Volume: 42 Issue: 1
Publication Date: 2026
DOI: 10.1186/s43163-026-01038-y
Journal Name: Egyptian Journal of Otolaryngology
Journal Indexes: Emerging Sources Citation Index (ESCI), Scopus
Keywords: Artificial intelligence, ChatGPT-5.0, Claude 3.5 Sonnet, Otoplasty, Postoperative care
Ankara University Affiliated: Yes

Abstract

Background: Artificial intelligence (AI) language models are increasingly used in surgical aftercare, yet their performance varies across platforms. This study compares the effectiveness of two large language models, Claude 3.5 Sonnet and ChatGPT-5.0, in providing accurate, clinically relevant guidance for postoperative otoplasty care.

Methods: Ten commonly encountered postoperative otoplasty questions were presented to both models. The generated answers were independently assessed by ten ENT specialists using structured Likert-based instruments and predefined clinical evaluation. To evaluate reliability and inter-model differences, a range of statistical techniques was applied, including t-tests, effect size calculations, sensitivity and specificity analyses, mixed-effects models, and regression-based modeling.

Results: Claude 3.5 Sonnet outperformed ChatGPT-5.0 across all evaluation metrics (p < 0.001). Mixed-effects modeling showed a positive model effect (β = 0.752), question-level ROC analysis demonstrated complete separation (AUC = 1.00), PCA supported a dominant single factor explaining 70.86% of variance in clinician ratings, and inter-rater agreement was higher for Claude 3.5 Sonnet.

Conclusion: The Claude 3.5 Sonnet model exhibited higher accuracy and clinical relevance in postoperative otoplasty management, with robust statistical validation supporting its reliability in surgical aftercare.
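As a reading aid for the reported AUC of 1.00, the sketch below shows what "complete separation" means at the question level: every score for one model exceeds every score for the other. The ratings are hypothetical placeholders, not the study's data, and the helper names `auc` and `cohens_d` are illustrative assumptions rather than the authors' analysis code.

```python
from statistics import mean, stdev


def auc(pos, neg):
    """Probability that a random score in `pos` exceeds one in `neg` (ties count half).

    This pairwise definition is equivalent to the area under the ROC curve.
    """
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / pooled_var ** 0.5


# HYPOTHETICAL per-question mean ratings on a 5-point scale (not from the paper).
model_a = [4.6, 4.4, 4.7, 4.5, 4.8, 4.3, 4.6, 4.5, 4.7, 4.4]
model_b = [3.8, 3.6, 3.9, 3.7, 4.0, 3.5, 3.8, 3.9, 3.6, 3.7]

# The lowest model_a score (4.3) is above the highest model_b score (4.0),
# so every pairwise comparison is a win and the AUC is exactly 1.0.
print(auc(model_a, model_b))       # 1.0
print(cohens_d(model_a, model_b))  # positive: model_a rated higher on average
```

With only ten questions per model, an AUC of 1.00 requires just that the two score ranges do not overlap, so it is best read alongside the effect size and the mixed-effects estimate rather than on its own.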