Fuzzy regression functions with a noise cluster and the impact of outliers on mainstream machine learning methods in the regression setting


Chakravarty S., Demirhan H., Başer F.

APPLIED SOFT COMPUTING, vol.96, 2020 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 96
  • Publication Date: 2020
  • DOI: 10.1016/j.asoc.2020.106535
  • Journal Name: APPLIED SOFT COMPUTING
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Applied Science & Technology Source, Compendex, Computer & Applied Sciences, INSPEC
  • Keywords: Artificial neural network, Fuzzy regression function, Outlier, Robustness, Support vector machine, SUPPORT VECTOR MACHINE, ALGORITHM, MODEL
  • Affiliated with Ankara University: Yes

Abstract

The presence of outliers in the dependent and/or independent features distorts predictions made with machine learning techniques and may lead to erroneous conclusions. To make reliable predictions, it is important to implement methods that are robust against outliers and to know the accuracy of existing methods when data are contaminated with outliers. The focus of this study is to propose a fuzzy regression functions (FRFN) approach that is robust against outliers and to evaluate the performance of the proposed and several mainstream machine learning approaches in the presence of outliers for the regression problem. The proposed FRFN approach is based on fuzzy k-means clustering with a noise cluster. We compare the accuracy of Artificial Neural Networks (ANN), Support Vector Machines (SVM), and the proposed FRFN approaches with different training algorithms and kernel functions via simulated and real benchmark datasets. In total, the accuracies of 36 ANN, SVM, and FRFN implementations with different training algorithms and kernel and loss functions have been evaluated and compared with each other on samples containing outliers in a Monte Carlo simulation setting. It is observed in both the Monte Carlo simulations and the applications with benchmark datasets that FRFN with ANN trained with the Bayes regularization algorithm and FRFN with SVM with a Gaussian kernel outperform the classical implementations of ANN and SVM in the presence of outliers. The proposed noise cluster implementation considerably increases the robustness of fuzzy regression functions against outliers. (C) 2020 Elsevier B.V. All rights reserved.
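
The abstract does not give the clustering equations, so the following is only a minimal Python sketch of the general idea of fuzzy clustering with a noise cluster, not the authors' implementation. It assumes the common formulation in which an extra "noise" cluster sits at a constant distance from every point, so that points far from all regular centres place most of their membership in the noise cluster and barely influence the centre updates. The function names `noise_memberships` and `noise_fuzzy_cluster`, the parameter `delta` (the noise distance), the fuzzifier `m`, and the toy data are all illustrative assumptions.

```python
import numpy as np


def noise_memberships(X, centers, m=2.0, delta=2.0):
    """Memberships to each regular cluster plus one noise cluster (last column)."""
    # Squared Euclidean distances to the regular centres, shape (n, c).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)  # avoid division by zero at a centre
    # Assumption: the noise cluster is at constant distance delta from every point.
    noise = np.full((X.shape[0], 1), delta ** 2)
    inv = np.hstack([d2, noise]) ** (-1.0 / (m - 1.0))
    # Normalise so that each row (regular + noise memberships) sums to one.
    return inv / inv.sum(axis=1, keepdims=True)


def noise_fuzzy_cluster(X, init_centers, m=2.0, delta=2.0, n_iter=50):
    """Alternate membership and centre updates; return (centers, memberships)."""
    X = np.asarray(X, dtype=float)
    centers = np.asarray(init_centers, dtype=float)
    c = centers.shape[0]
    for _ in range(n_iter):
        U = noise_memberships(X, centers, m, delta)
        # Only regular-cluster memberships move the centres; the noise
        # column has no centre of its own.
        W = U[:, :c] ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
    return centers, noise_memberships(X, centers, m, delta)


# Toy data (hypothetical): two tight groups and one gross outlier in the last row.
X = np.array([
    [0.0, 0.0], [0.2, 0.1], [-0.1, 0.2], [0.1, -0.2],
    [5.0, 5.0], [5.2, 4.9], [4.9, 5.1], [5.1, 5.2],
    [20.0, -20.0],
])
centers, U = noise_fuzzy_cluster(X, init_centers=[[1.0, 1.0], [4.0, 4.0]])
print(np.round(U[-1], 3))  # memberships of the outlier; last entry is the noise cluster
```

In a fuzzy regression functions setting, the regular-cluster memberships from such a step would then be passed on to the per-cluster regression models (for example ANN or SVM); how exactly the paper does this is not stated in the abstract.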