Medical Ultrasonography, vol. 28, no. 1, pp. 16-26, 2026 (SCI-Expanded, Scopus)
Aims: We present a computer-aided diagnosis (CAD) system for the automated ultrasound evaluation of developmental dysplasia of the hip (DDH). The system integrates deep learning (DL) for anatomical segmentation with α and β angle calculation according to the Graf method. A custom image processing method excludes the curvature of the inferior ilium when defining the baseline, which improves accuracy and replicates radiologists' real-world workflow.
Materials and methods: The dataset comprised 452 raw images from 370 newborns. A total of 136 images were reserved for validation (n_v = 91) and testing (n_te = 45) and were never augmented. The remaining 316 images were augmented to n_tr = 632 training images using random brightness manipulation (0% to 25%). In total, 768 images (632 + 136) were annotated and split into training, validation, and test sets of 632, 91, and 45 images (82%, 12%, and 6%), respectively. U-Net, Mask R-CNN, YOLOv8, and YOLOv11 were used for segmentation. The α and β angles were measured using Method-I (centroid/orientation) and Method-II (Hough transform). An extended set of performance metrics (Precision, Recall, IoU, Dice, mAP) was calculated. Bland-Altman and intraclass correlation coefficient (ICC) analyses compared CAD outputs with expert measurements.
Results: YOLOv11 showed the best segmentation performance (Precision: 0.990, Recall: 0.993, IoU: 0.983, Dice: 0.990, mAP: 0.991). ICCα and ICCβ were 0.895 and 0.907 with Method-I and 0.929 and 0.952 with Method-II, so Method-II outperformed Method-I.
Conclusion: We introduce a clinically aligned CAD system that integrates anatomical segmentation with α and β angle measurement, a combination rarely addressed in the literature. By providing a comprehensive and standardized set of metrics, this work addresses a common bottleneck in DL studies, namely heterogeneity in metric reporting, and enables better cross-study comparison. Following curvature exclusion, the obtained ICCs exceeded those reported in previous studies, indicating improved inter-rater reliability and strong agreement with expert radiologists, and supporting both the technical robustness and the clinical applicability of the system in DDH assessment.
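The abstract names its evaluation measures without defining them. As a minimal sketch, assuming the standard textbook definitions, IoU and Dice for a pair of binary masks and the Bland-Altman bias with 95% limits of agreement for paired angle measurements could be computed as follows. The function names `iou_and_dice` and `bland_altman` and all numbers in the toy data are hypothetical and chosen for illustration only; this is not the authors' implementation, and none of the values come from the study.

```python
import numpy as np


def iou_and_dice(pred, truth):
    """Return (IoU, Dice) for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    intersection = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    total = pred.sum() + truth.sum()
    # Convention assumed here: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0
    return float(intersection / union), float(2 * intersection / total)


def bland_altman(cad, expert):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diff = np.asarray(cad, dtype=float) - np.asarray(expert, dtype=float)
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation of the differences
    return float(bias), float(bias - 1.96 * sd), float(bias + 1.96 * sd)


# Toy data, invented for illustration only (not from the study).
pred_mask = np.array([[1, 1, 0], [0, 1, 0]])
true_mask = np.array([[1, 1, 0], [0, 0, 0]])
iou, dice = iou_and_dice(pred_mask, true_mask)
print(f"IoU={iou:.3f}, Dice={dice:.3f}")

cad_alpha = [62.0, 58.5, 65.0, 55.0]      # hypothetical CAD alpha angles (degrees)
expert_alpha = [61.0, 59.5, 64.0, 56.0]   # hypothetical expert alpha angles (degrees)
bias, lower, upper = bland_altman(cad_alpha, expert_alpha)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

A bias near zero with narrow limits of agreement would indicate that CAD and expert measurements differ little on average and case by case; the ICC reported in the abstract is a complementary reliability measure and is not computed in this sketch.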