Deep Learning Based Multi Modal Approach for Pathological Sounds Classification (Turkish title: Patolojik Seslerin Siniflandirilmasi Amaciyla Derin Ogrenme Temelli Coklu Model Yaklasimi)

ANKIŞHAN H., Kocoglu A.

28th Signal Processing and Communications Applications Conference, SIU 2020, Gaziantep, Türkiye, 5-7 October 2020 (Full-Text Paper)

Publication Type: Conference Paper / Full-Text Paper
DOI: 10.1109/siu49456.2020.9302067
City of Publication: Gaziantep
Country of Publication: Türkiye
Keywords: decision level fusion, deep learning based multi modal, feature level fusion, pathological sounds classification
Affiliated with Ankara University: No

Abstract

Automatic detection of voice disorders is important because it makes diagnosis simpler, cheaper, and less time-consuming. The literature contains many studies that analyze voice disorders from the characteristics of the voice and classify the results of that analysis. In general, these studies extract features on the frequency, time, or hybrid axis and then use classifiers to separate sounds into pathological and normal subgroups. In contrast to existing approaches, this study proposes a multiple deep learning model that uses feature-level fusion to distinguish pathological from normal sounds. First, a feature vector (HOV) on the hybrid axis is obtained from the raw sound data. Two CNN models are then used: the first takes the raw audio data as input, and the second takes the HOV. The feature data at the SoftMax layers of both models are obtained as matrices, and canonical correlation analysis (CCA) is applied to them for feature-level fusion. The resulting fused feature vector is used as input to multiple support vector machine (M-SVM), decision tree (DTC), and naive Bayes (NBC) classifiers. The experimental results show that the new multi-model deep learning architecture achieves superior performance in classifying pathological sound data. These results suggest that the proposed system can be used to detect and classify patients' voice pathologies automatically.
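
The CCA-based feature-level fusion step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `cca_fuse`, the regularization term `reg`, the feature dimensions, and the choice to fuse by concatenating the two sets of canonical variates are all assumptions made for illustration, and the two input matrices are synthetic stand-ins for the per-model feature matrices, not data from the study.

```python
import numpy as np

def cca_fuse(X, Y, n_components, reg=1e-6):
    """Fuse two feature matrices (rows = samples) via canonical correlation analysis.

    Hypothetical helper: returns the concatenated canonical variates and the
    corresponding canonical correlations.
    """
    # Center both views.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Covariance matrices; a small ridge (assumption) keeps them invertible.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    # Whiten each view with its Cholesky factor: T = Lx^{-1} Sxy Ly^{-T}.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    T = np.linalg.solve(Lx, Sxy)
    T = np.linalg.solve(Ly, T.T).T

    # Singular values of T are the canonical correlations.
    U, s, Vt = np.linalg.svd(T, full_matrices=False)

    # Map the singular vectors back to projection weights for each view.
    Wx = np.linalg.solve(Lx.T, U[:, :n_components])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :n_components])

    # Fuse by concatenation (one common choice; summation is another).
    fused = np.hstack([Xc @ Wx, Yc @ Wy])
    return fused, s[:n_components]

# Synthetic stand-ins for the feature matrices taken from the two CNN models.
rng = np.random.default_rng(0)
n_samples = 200
labels = rng.integers(0, 2, size=n_samples)  # 0 = normal, 1 = pathological
X = labels[:, None] * 1.0 + rng.normal(scale=0.5, size=(n_samples, 4))
Y = labels[:, None] * 1.0 + rng.normal(scale=0.5, size=(n_samples, 3))

fused, corrs = cca_fuse(X, Y, n_components=2)
print(fused.shape)  # (200, 4)
print(corrs)
```

Under these assumptions, the `fused` matrix plays the role of the new feature vector and would be passed, together with `labels`, to a downstream classifier such as an SVM, a decision tree, or a naive Bayes model.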