DETECTION OF THYROID DISEASE USING MACHINE LEARNING MODELS

ALSAADAWI, MUNTADHER ADNAN WAHEED

dc.contributor.author	ALSAADAWI, MUNTADHER ADNAN WAHEED
dc.date.accessioned	2023-02-01T14:11:06Z
dc.date.available	2023-02-01T14:11:06Z
dc.date.issued	2023-01
dc.identifier.uri	http://acikerisim.karabuk.edu.tr:8080/xmlui/handle/123456789/2444
dc.description.abstract	ABSTRACT Disease diagnosis and prognosis are among the most important applications of machine learning (ML) models, and in recent years ML models have played a crucial and convincing role in disease diagnosis and classification. Thyroid disease warrants attention because the thyroid gland regulates human metabolism and plays a key role in overall health. This thesis presents a method for classifying thyroid disease using traditional ML models (K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Decision Tree (DT), Naive Bayes (NB), Logistic Regression (LR), and Multi-Layer Perceptron (MLP)) and ensemble models (Random Forest (RF), XGBoost, Soft Vote, Stacking, and Bagging). The models were trained and tested in two steps: first using all features of the dataset, and then using the best-correlated features selected by Recursive Feature Elimination (RFE). With all features, the highest accuracy (ACC) among the traditional models was obtained by DT and MLP, at 99.92% and 97.30%, respectively; among the ensemble models, XGBoost and Bagging obtained 100% ACC. With the RFE-selected features, DT and NB achieved 100% and 98.06% ACC, respectively; among the ensemble models, XGBoost and Bagging again achieved 100% ACC, and Stacking achieved 99.53% ACC. The proposed ensemble models outperformed the traditional models in sensitivity, specificity, precision, F1 score, and Matthews Correlation Coefficient (MCC), as well as ACC. The models were checked for overfitting using feature selection, cross-validation, and a comparison of training and test ACC. The time spent on training and prediction was found to be reasonable.
ÖZET (translated from Turkish) Disease diagnosis and prediction are among the most important application areas of machine learning models. In recent years, machine learning models have taken on an important and convincing role in this area. Because the thyroid gland regulates human metabolism and plays an important role in human health, thyroid disease is a problem that requires attention. This thesis presents a method for classifying thyroid disease using the traditional machine learning models K-Nearest Neighbor, Support Vector Machine, Decision Tree, Naive Bayes, Logistic Regression, and Multi-Layer Perceptron, together with the ensemble learning models Random Forest, XGBoost, Soft Vote, Stacking, and Bagging. The proposed method was trained and tested in two steps, first using all features of the dataset and then using the best-correlated features selected by recursive feature elimination. With all features, the highest accuracies among the traditional models were obtained by the decision tree and the multi-layer perceptron, at 99.92% and 97.30%, respectively. Among the ensemble models, XGBoost and Bagging achieved 100% accuracy. After recursive feature elimination was applied to the dataset, the decision tree and Naive Bayes models achieved 100% and 98.06% accuracy, respectively. Among the ensemble models, XGBoost and Bagging achieved 100% accuracy and the Stacking model achieved 99.53% accuracy. The proposed ensemble models outperformed the traditional models in sensitivity, specificity, precision, F1 score, and Matthews Correlation Coefficient, as well as accuracy. The proposed models were tested for overfitting using feature elimination and cross-validation, together with a comparison of training and test accuracies. The time spent on training and prediction was judged to be reasonable.	en_EN
dc.language.iso	en	en_EN
dc.subject	Machine Learning, Ensemble models, Thyroid disease, Hypothyroidism, Hyperthyroidism.	en_EN
dc.subject	Machine learning, Ensemble learning, Thyroid disease, Hypothyroidism, Hyperthyroidism (translated from Turkish).	en_EN
dc.title	DETECTION OF THYROID DISEASE USING MACHINE LEARNING MODELS	en_EN
dc.title.alternative	MAKİNE ÖĞRENİMİ MODELLERİ KULLANILARAK TİROİD HASTALIĞININ TESPİTİ	en_EN
dc.type	Thesis	en_EN