Bias in human data: A feedback from social sciences


TAKAN S., Ergün D., Getir Yaman S., Kılınççeker O.

WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, vol.13, no.4, 2023 (SCI-Expanded)

  • Publication Type: Article / Review
  • Volume: 13 Issue: 4
  • Publication Date: 2023
  • DOI Number: 10.1002/widm.1498
  • Journal Name: WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, Computer & Applied Sciences, INSPEC, Library, Information Science & Technology Abstracts (LISTA)
  • Keywords: artificial intelligence, cultivation theory, data bias, fairness, machine learning, new media, social computing, social science, TELEVISION
  • Ankara University Affiliated: Yes

Abstract

The fairness of human-related software has become critical as such software is widely used in daily life, where life-changing decisions are made. However, the use of these systems has produced many erroneous results, and technologies have started to be developed to tackle these unexpected results. In seeking a solution, companies generally focus on algorithm-oriented errors, and the resulting solutions usually work only for some algorithms, because the cause of the problem is not just the algorithm; it is also the data itself. For instance, deep learning cannot establish cause-effect relationships quickly. In addition, the boundaries between statistical and heuristic algorithms are unclear. An algorithm's fairness may vary depending on the data and its context. From this point of view, our article focuses on how the data should be, which is not a matter of statistics. To this end, the problem is illustrated through a scenario specific to "vulnerable and disadvantaged" groups, one of the most fundamental problems today. With the joint contribution of computer science and the social sciences, the article aims to predict the possible social dangers that may arise from artificial intelligence algorithms, using the clues obtained in this study. To highlight the potential social and mass problems caused by data, Gerbner's "cultivation theory" is reinterpreted. To this end, we conduct an experimental evaluation on popular algorithms and their data sets, such as Word2Vec, GloVe, and ELMo. The article stresses the importance of a holistic approach combining the algorithm, data, and an interdisciplinary assessment. This article is categorized under: Algorithmic Development > Statistics