Prediction of tumour pathological subtype from genomic profile using sparse logistic regression with random effects


KAYMAZ Ö., Alqahtani K., Wood H. M., Gusnanto A.

JOURNAL OF APPLIED STATISTICS, vol.48, no.4, pp.605-622, 2021 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 48 Issue: 4
  • Publication Date: 2021
  • DOI: 10.1080/02664763.2020.1738358
  • Journal Name: JOURNAL OF APPLIED STATISTICS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, ABI/INFORM, Aerospace Database, Business Source Elite, Business Source Premier, CAB Abstracts, Veterinary Science Database, zbMATH
  • Pages: pp.605-622
  • Keywords: Tumour, lung cancer, pathological subtype, logistic regression, sparse solution, hierarchical likelihood, VARIABLE SELECTION, SHRINKAGE, SIZE
  • Affiliated with Ankara University: Yes

Abstract

The purpose of this study is to highlight the application of sparse logistic regression models to the prediction of tumour pathological subtypes from lung cancer patients' genomic information. We consider sparse logistic regression models to deal with the high dimensionality and correlation between genomic regions. In the hierarchical likelihood (HL) method, the random effects are assumed to follow a normal distribution whose variance is in turn assumed to follow a gamma distribution. This formulation includes the ridge and lasso penalties as special cases. We extend the HL penalty to include a ridge penalty (called 'HLnet'), on the same principle as the elastic net penalty, which is constructed from the lasso penalty. The results indicate that the HL penalty produces sparser estimates than the lasso penalty with comparable prediction performance, while the HLnet and elastic net penalties have the best prediction performance on real data. We illustrate the methods in a lung cancer study.
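
As a rough illustration of the baseline penalties named in the abstract (lasso, ridge and elastic net), the following minimal sketch fits a penalised logistic regression by proximal gradient descent. It is not the authors' HL or HLnet method and does not use their data: the synthetic data, the function names (`fit_sparse_logistic`, `soft_threshold`), the penalty parameterisation lam * (alpha * ||b||_1 + (1 - alpha) / 2 * ||b||_2^2) and all tuning values are assumptions made only for this example.

```python
import math
import random


def soft_threshold(z, t):
    """Proximal operator of t * |z|: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_sparse_logistic(X, y, lam=0.05, alpha=1.0, step=0.5, n_iter=300):
    """Minimise mean negative log-likelihood
    + lam * (alpha * ||b||_1 + (1 - alpha) / 2 * ||b||_2^2).

    alpha = 1 gives the lasso, alpha = 0 gives ridge, values in between
    give an elastic net. The intercept b0 is not penalised.
    """
    n, p = len(X), len(X[0])
    b0, b = 0.0, [0.0] * p
    for _ in range(n_iter):
        # Residuals: fitted probability minus observed label.
        resid = [
            sigmoid(b0 + sum(bj * xj for bj, xj in zip(b, row))) - yi
            for row, yi in zip(X, y)
        ]
        # Gradient of the smooth part (log-likelihood plus ridge term).
        g0 = sum(resid) / n
        grad = [
            sum(r * row[j] for r, row in zip(resid, X)) / n
            + lam * (1.0 - alpha) * b[j]
            for j in range(p)
        ]
        b0 -= step * g0
        # Gradient step followed by soft-thresholding for the L1 term.
        b = [
            soft_threshold(bj - step * gj, step * lam * alpha)
            for bj, gj in zip(b, grad)
        ]
    return b0, b


# Synthetic data (assumption): 100 samples, 20 features, only the first
# two features carry signal about the binary label.
rng = random.Random(0)
n_samples, n_features = 100, 20
X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
y = [1 if rng.random() < sigmoid(2.0 * row[0] - 2.0 * row[1]) else 0 for row in X]

_, b_lasso = fit_sparse_logistic(X, y, alpha=1.0)  # lasso
_, b_enet = fit_sparse_logistic(X, y, alpha=0.5)   # elastic net
_, b_ridge = fit_sparse_logistic(X, y, alpha=0.0)  # ridge

for name, coef in [("lasso", b_lasso), ("elastic net", b_enet), ("ridge", b_ridge)]:
    n_nonzero = sum(1 for v in coef if v != 0.0)
    print(f"{name}: {n_nonzero} non-zero coefficients out of {n_features}")
```

In this sketch the soft-thresholding step is what sets coefficients exactly to zero, so the ridge fit (alpha = 0) applies no thresholding, while the lasso and elastic net fits can drop features entirely.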