Unsupervised Learning for Characterizing Type IV Secreted Effectors


AÇICI K., Asuroglu T.

4th International Conference on Applied Artificial Intelligence, ICAPAI 2024, Halden, Norway, 16 April 2024

  • Publication Type: Conference Paper / Full-Text Paper
  • DOI: 10.1109/icapai61893.2024.10541181
  • City of Publication: Halden
  • Country of Publication: Norway
  • Keywords: clustering, feature reduction, feature selection, T4SE, unsupervised learning
  • Affiliated with Ankara Üniversitesi: Yes

Abstract

Type IV secretion systems (T4SSs) are employed by pathogenic bacteria to inject proteins known as Type IV secreted effectors (T4SEs) into both prokaryotic and eukaryotic cells. These effectors play a crucial role in bacterial virulence by disrupting host cell functions and immune responses. While extensive research has focused on classifying T4SEs, the application of unsupervised learning techniques in this domain remains unexplored. In this study, we applied six unsupervised machine learning algorithms to a dataset of T4SEs and non-effectors to identify distinct clusters. Our findings suggest that unsupervised learning holds potential for gaining a deeper understanding of T4SS mechanisms and the diverse properties of T4SEs. Among the clustering algorithms used in this study, the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm generally achieved the best performance in terms of the average silhouette score and Davies-Bouldin Index (DBI) evaluation metrics. The Laplacian score feature selection algorithm and Principal Component Analysis (PCA) had a positive effect on performance when used in conjunction with amino acid composition (AAC) as the feature extraction method. Additionally, glycine (G) and lysine (K) were found to be informative amino acids in the clustering algorithms that lead to the formation of two clusters.
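
For readers unfamiliar with amino acid composition (AAC), the following is a minimal Python sketch of how such a feature vector is commonly computed: the fraction of each of the 20 standard amino acids in a protein sequence. It is not the authors' implementation. The function name `amino_acid_composition`, the residue ordering in `AMINO_ACIDS`, and the toy sequence are all assumptions made for illustration, and the record does not state how non-standard residues were handled in the study (here they are simply skipped).

```python
from collections import Counter

# The 20 standard amino acids by one-letter code.
# The ordering is an assumption; any fixed order works.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-element list with the fraction of each standard amino acid."""
    seq = sequence.upper()
    # Count only standard residues; skipping others is an assumption of this sketch.
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy sequence, not taken from the study's dataset.
toy_sequence = "MKGKGLKA"
features = amino_acid_composition(toy_sequence)

# Look up the two residues the abstract reports as informative.
glycine_fraction = features[AMINO_ACIDS.index("G")]
lysine_fraction = features[AMINO_ACIDS.index("K")]
```

Each sequence in a dataset would yield one such 20-dimensional vector, which could then be passed to a feature selection or reduction step and a clustering algorithm.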