Diagnostic Accuracy of a Machine Learning-Derived Appendicitis Score in Children: A Multicenter Validation Study


Aydın E., Sarnıç T. E., Türkmen İ. U., Khanmammadova N., ATEŞ U., Öztan M. O., ...

Children, vol.12, no.7, 2025 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 12 Issue: 7
  • Publication Date: 2025
  • DOI Number: 10.3390/children12070937
  • Journal Name: Children
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, CINAHL, Directory of Open Access Journals
  • Keywords: appendicitis, clinical decision support, diagnosis, machine learning, pediatrics, random forest
  • Ankara University Affiliated: Yes

Abstract

Background: Accurate diagnosis of acute appendicitis in children remains challenging due to variable presentations and limitations of existing clinical scoring systems. While machine learning (ML) offers a promising approach to enhance diagnostic precision, most prior studies have been limited by small sample sizes, single-center data, or a lack of external validation.

Methods: This prospective, multicenter study included 8586 pediatric patients to develop a machine learning-based diagnostic model using routinely available clinical and hematological parameters. A separate, prospectively collected external validation cohort of 3000 patients was used to assess model performance. The Random Forest algorithm was selected based on its superior performance during model comparison. Diagnostic accuracy, sensitivity, specificity, area under the curve (AUC), and calibration metrics were evaluated and compared with traditional scoring systems such as the Pediatric Appendicitis Score (PAS), the Alvarado score, and the Appendicitis Inflammatory Response Score (AIRS).

Results: The ML model outperformed traditional clinical scores in both development and validation cohorts. In the external validation set, the Random Forest model achieved an AUC of 0.996, accuracy of 0.992, sensitivity of 0.998, and specificity of 0.993. Feature-importance analysis identified white blood cell count, red blood cell count, and mean platelet volume as key predictors.

Conclusions: This large, prospectively validated study demonstrates that a machine learning-based scoring system using commonly accessible data can significantly improve the diagnosis of pediatric appendicitis. The model offers high accuracy and clinical interpretability and has the potential to reduce diagnostic delays and unnecessary imaging.
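For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy, sensitivity, specificity, and AUC are conventionally computed. It is not the authors' code: the class and function names, the confusion counts, and the scores are hypothetical toy values chosen for illustration and have no relation to the study data.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Hypothetical container for a 2x2 diagnostic confusion matrix."""
    tp: int  # appendicitis cases correctly flagged
    fp: int  # non-cases incorrectly flagged
    tn: int  # non-cases correctly cleared
    fn: int  # appendicitis cases missed


def sensitivity(c: ConfusionCounts) -> float:
    # Fraction of true cases the model detects.
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    # Fraction of non-cases the model correctly clears.
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    # Fraction of all patients classified correctly.
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def auc(scores: list[float], labels: list[int]) -> float:
    # AUC as the probability that a randomly chosen positive receives a
    # higher score than a randomly chosen negative (ties count as 0.5).
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy values only (assumed for illustration, not taken from the study).
toy_counts = ConfusionCounts(tp=45, fp=5, tn=95, fn=5)
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0]

print(sensitivity(toy_counts), specificity(toy_counts), accuracy(toy_counts))
print(auc(toy_scores, toy_labels))
```

The pairwise AUC above is quadratic in the number of patients and is meant only to make the definition concrete; for a cohort of several thousand patients a rank-based implementation from a statistics library would be the more practical choice.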