AI-Generated vs. Traditional STL Models in CBCT Imaging: A Pilot Study on Measurements Accuracy and Reliability


Ersalici I., Aksoy S., Kamiloglu B., ORHAN K.

JOURNAL OF IMAGING INFORMATICS IN MEDICINE, 2025 (SCI-Expanded)

Abstract

This study assessed the reliability of linear measurements obtained from STL models and three-dimensional hard-tissue models of the maxilla and mandible, both derived from CBCT images. The STL models were generated with a software program and with a web-based AI diagnostic tool, and the measurements taken on them were compared with those from the hard-tissue models. One hundred CBCT scans were included. DICOM files were imported into Maxilim® software to create hard-tissue models, and an AI algorithm and Mimics software were used to generate the STL models. Five mandibular and three maxillary measurements were taken. Pairwise comparisons were made with the Tukey test, and absolute agreement among the three programs was assessed with the intraclass correlation coefficient (ICC). Repeated measurements demonstrated high reliability for mandibular measurements (ICC: 0.902-0.999), while maxillary measurements showed more variability (ICC: 0.456-0.997), with poor reliability for DFPM using Mimics-STL (p = 0.071). ICC and Pearson correlation values were moderate for DIM, while the others were good to excellent. Maxillary distances were less reliable, particularly DFPM (Mimics-STL vs. Maxilim) and DSN (Mimics-STL vs. AI-STL). ANOVA revealed significant differences in DCP, DSN, DFI, DMF, and DFPM, with Maxilim yielding the highest mean values for all of these except DMF. Overall, 3D hard-tissue models provided higher measurement values than STL models. The significant variability observed in STL maxillary measurements suggests that anatomical complexity and segmentation algorithms influence measurement consistency. These findings highlight the importance of carefully selecting segmentation methodologies in clinical and research settings.
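As an illustration of the absolute-agreement ICC mentioned in the abstract, the sketch below computes a single-measure, two-way ICC for absolute agreement (often written ICC(A,1)) from a table with one row per scan and one column per program. The abstract does not state which ICC form or software the authors used, so the choice of model, the function name `icc_absolute_agreement`, and the numbers in the usage example are all assumptions made for illustration; none of the values are study data.

```python
def icc_absolute_agreement(table):
    """Single-measure, two-way ICC for absolute agreement (ICC(A,1)).

    `table` is a list of rows; each row holds one subject's measurement
    from each of k methods. Assumed model, not taken from the study.
    """
    n = len(table)        # number of subjects (e.g. scans)
    k = len(table[0])     # number of methods (e.g. programs)
    grand = sum(sum(row) for row in table) / (n * k)

    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in table for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    # Systematic offsets between methods (ms_cols) lower the coefficient,
    # which is what distinguishes absolute agreement from consistency.
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


if __name__ == "__main__":
    # Hypothetical distances in mm: columns stand for three programs.
    example = [
        [41.2, 40.8, 40.9],
        [38.5, 38.1, 38.0],
        [44.0, 43.5, 43.7],
        [39.9, 39.2, 39.6],
    ]
    print(round(icc_absolute_agreement(example), 3))
```

Because the between-method mean square enters the denominator, a method that reads consistently higher than the others reduces this coefficient even when the methods rank the scans identically. This is relevant to the abstract's finding that one program tended to yield higher mean values.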