Robust model selection in linear regression models using information complexity


GÜNEY Y., Bozdogan H., ARSLAN O.

JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS, vol.398, 2021 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 398
  • Publication Date: 2021
  • DOI: 10.1016/j.cam.2021.113679
  • Journal Name: JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Aerospace Database, Applied Science & Technology Source, Communication Abstracts, Computer & Applied Sciences, INSPEC, MathSciNet, Metadex, zbMATH, DIALNET, Civil Engineering Abstracts
  • Keywords: Information complexity criterion, Regression analysis, Robust model selection, CRITERION, ESTIMATORS
  • Ankara University Affiliated: Yes

Abstract

In recent years, robust model selection methods for linear regression have received increasing attention for datasets that contain even a small fraction of outliers. Outliers can seriously affect statistical inference and the choice of models made by model selection criteria. Most existing robust information-based model selection methods are confined to robust AIC and robust AIC-type criteria. However, the penalty function used in AIC-type criteria is insufficient to measure overall model complexity in the presence of outliers and when the model is misspecified. Furthermore, the literature pays little attention to overall model complexity despite its importance. To overcome these problems, this paper proposes a robust version of the information-theoretic measure of complexity (ICOMP) criterion due to Bozdogan, based on robust alternatives to the maximum likelihood (ML) method. Unlike AIC, ICOMP penalizes not only the number of free parameters but also the profusion of model complexity, in order to reduce the effect of outliers. The proposed robust ICOMP criteria are based on robust M, S, and MM estimation methods. A large-scale Monte Carlo simulation is used to study the performance of the proposed criteria. In addition, a real-data example is provided to detect outliers and to choose the best subset of predictors. The results show the flexibility and versatility of the proposed approach. © 2021 Elsevier B.V. All rights reserved.
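
To illustrate what "penalizing model complexity" can mean, the following is a minimal Python sketch of the classical, non-robust ICOMP idea for a Gaussian linear model. It is not the robust criterion proposed in the paper, whose exact form is not given in this abstract. The sketch uses Bozdogan's first-order complexity measure C1(Σ) = (s/2) log(tr(Σ)/s) − (1/2) log det(Σ), applied to the estimated inverse Fisher information matrix. The function names `c1_complexity` and `icomp_gaussian_ols` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def c1_complexity(cov):
    """First-order entropic complexity C1 of a covariance matrix.

    Assumes `cov` is symmetric positive definite (full rank), so s = dim(cov).
    C1 is zero when cov is proportional to the identity and positive otherwise.
    """
    cov = np.asarray(cov, dtype=float)
    s = cov.shape[0]
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("cov must be positive definite")
    return 0.5 * s * np.log(np.trace(cov) / s) - 0.5 * logdet


def icomp_gaussian_ols(X, y):
    """Classical (non-robust) ICOMP-style score for a least-squares fit.

    Lower is better. This is an illustrative sketch under Gaussian errors,
    not the robust M/S/MM-based criteria described in the abstract.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, q = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / n  # ML estimate of the error variance
    # -2 * maximized Gaussian log-likelihood (lack-of-fit term)
    neg2loglik = n * np.log(2 * np.pi) + n * np.log(sigma2) + n
    # Estimated inverse Fisher information, block-diagonal in (beta, sigma^2)
    inv_fisher = np.zeros((q + 1, q + 1))
    inv_fisher[:q, :q] = sigma2 * np.linalg.inv(X.T @ X)
    inv_fisher[q, q] = 2.0 * sigma2**2 / n
    # Complexity penalty: depends on the covariance structure, not only on q
    return neg2loglik + 2.0 * c1_complexity(inv_fisher)
```

In this sketch the penalty grows when the parameter estimates have very unequal variances or are strongly correlated, whereas the AIC penalty depends only on the number of parameters. A robust variant would presumably replace the least-squares fit and the covariance estimate with robust counterparts, but that step is an assumption here and is not reproduced from the paper.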