Standardized Variable Distances: A distance-based machine learning method

Elen, Abdullah; Avuclu, Emre

dc.authorid	Elen, Abdullah/0000-0003-1644-0476
dc.contributor.author	Elen, Abdullah
dc.contributor.author	Avuclu, Emre
dc.date.accessioned	2024-09-29T15:55:04Z
dc.date.available	2024-09-29T15:55:04Z
dc.date.issued	2021
dc.department	Karabük Üniversitesi	en_US
dc.description.abstract	Machine learning algorithms, which can analyze and model data in virtually any field, are an important research area today. Information obtained through machine learning methods helps researchers and planners to understand and review systematic problems in their current strategies. It is therefore very important that these methods work reliably in every field that facilitates human life, such as early and correct diagnosis, correct choice, and fully functioning autonomous systems. In this paper, a novel machine learning algorithm for multiclass classification is presented. The proposed method is based on the Minimum Distance Classifier (MDC) algorithm. The MDC is variance-insensitive because it classifies input vectors by calculating their distances/similarities with respect to class centroids (the average of the input vectors of a class). Real-world data is known to contain a certain proportion of noise, which negatively affects the performance of the MDC. To overcome this problem, we developed a variance-sensitive model, which we call Standardized Variable Distances (SVD), that takes the standard deviation and z-score (standardized variable) into account. To evaluate the accuracy of the SVD, we used the Wisconsin Breast Cancer Original (WBCO) and LED Display Domain (led7digit) datasets, obtained from the UCI machine learning repository, with 5-fold cross validation. We compared the classification performance of the SVD with that of Decision Tree (DT), Random Forest (RF), k-Nearest Neighbor (k-NN), Multinomial Logistic Regression (MLR), Naive Bayes (NB), Support Vector Machine (SVM), and the Minimum Distance Classifier (MDC), which are well known in the literature. We also compared it with thirteen different studies that used the same datasets over the past five years. Our experimental results show that the SVD classifies better than the traditional and state-of-the-art methods compared in this study. The proposed method reached over 97% classification accuracy (CACC), F-measure (FM) and area under the curve (AUC) on the WBCO dataset. On the led7digit dataset, approximately 74% CACC, 75.1% FM and 82.2% AUC scores were obtained. The classification scores obtained with the SVD were higher than those of the other ML algorithms used in the experimental studies. (C) 2020 Elsevier B.V. All rights reserved.	en_US
dc.identifier.doi	10.1016/j.asoc.2020.106855
dc.identifier.issn	1568-4946
dc.identifier.issn	1872-9681
dc.identifier.scopus	2-s2.0-85095567192	en_US
dc.identifier.scopusquality	Q1	en_US
dc.identifier.uri	https://doi.org/10.1016/j.asoc.2020.106855
dc.identifier.uri	https://hdl.handle.net/20.500.14619/4428
dc.identifier.volume	98	en_US
dc.identifier.wos	WOS:000603365800011	en_US
dc.identifier.wosquality	Q1	en_US
dc.indekslendigikaynak	Web of Science	en_US
dc.indekslendigikaynak	Scopus	en_US
dc.language.iso	en	en_US
dc.publisher	Elsevier	en_US
dc.relation.ispartof	Applied Soft Computing	en_US
dc.relation.publicationcategory	Article - International Peer-Reviewed Journal - Institutional Faculty Member	en_US
dc.rights	info:eu-repo/semantics/closedAccess	en_US
dc.subject	Machine learning	en_US
dc.subject	Multiclass classifier	en_US
dc.subject	Distance-based classifier	en_US
dc.title	Standardized Variable Distances: A distance-based machine learning method	en_US
dc.type	Article	en_US
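
The abstract contrasts a centroid-only Minimum Distance Classifier with a variance-sensitive model built on the standard deviation and z-score, but this record does not give the SVD formula. The sketch below is therefore only an illustration of that contrast, not the authors' algorithm: `fit`, `predict_mdc`, `predict_standardized`, the `eps` guard, and the toy data are all assumptions introduced here. It scores each class by the Euclidean norm of the per-feature z-scores of the input with respect to that class.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def fit(X, y):
    """Return {label: (per-feature means, per-feature std devs)}."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    model = {}
    for label, rows in groups.items():
        cols = list(zip(*rows))  # transpose rows into feature columns
        model[label] = ([mean(c) for c in cols], [pstdev(c) for c in cols])
    return model


def predict_mdc(model, x):
    """Minimum Distance Classifier: pick the nearest class centroid."""
    return min(model, key=lambda c: math.dist(x, model[c][0]))


def predict_standardized(model, x, eps=1e-9):
    """Assumed variance-sensitive variant: distance measured in z-scores.

    eps is a hypothetical guard against zero standard deviation.
    """
    def score(c):
        mu, sigma = model[c]
        return math.sqrt(sum(((xi - m) / (s + eps)) ** 2
                             for xi, m, s in zip(x, mu, sigma)))
    return min(model, key=score)


# Toy data (made up): class "a" is tight around 0, class "b" is spread around 4.
X = [[-0.1], [0.0], [0.1], [1.0], [4.0], [7.0]]
y = ["a", "a", "a", "b", "b", "b"]
model = fit(X, y)

print(predict_mdc(model, [1.5]))           # a (closer to the centroid 0)
print(predict_standardized(model, [1.5]))  # b (within about one std dev of 4)
```

On this toy data the point 1.5 is nearer to the centroid of class "a", yet it lies many standard deviations outside that class and only about one standard deviation from the centroid of class "b", which is the kind of case a variance-sensitive distance is meant to handle differently.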

Collection

WoS Indexed Publications Collection
Scopus Indexed Publications Collection
