WINDOWS OS VULNERABILITY CLASSIFICATION USING MACHINE LEARNING TECHNIQUES

AL-SARRAY, NOORALHUDA ABDULHASAN HADI

URI: http://acikerisim.karabuk.edu.tr:8080/xmlui/handle/123456789/3377

Date: 2024-01

Abstract:

The rapid development of technology and communication systems has created many challenges, particularly in data protection and information security. This area, now known as cybersecurity, comprises a set of procedures and techniques for keeping data secure. In this study, we use machine learning to improve the cybersecurity of the Windows operating system. Five machine learning classification algorithms (Random Forest, Logistic Regression, Naive Bayes, K-Nearest Neighbors, and SVM) were used to classify Windows vulnerabilities. The dataset was collected from two sites: Exploit-DB and NIST (National Institute of Standards and Technology). Several parameters were calculated during the study. The Random Forest algorithm achieved the best results (accuracy: 0.97, precision: 0.97, recall: 0.97, F1-score: 0.97, ROC AUC score: 0.99), that is, an accuracy of 97%. These results highlight the ability of the Random Forest algorithm to solve the vulnerability classification problem.
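The scores reported in the abstract are fractions between 0 and 1, so an accuracy of 0.97 corresponds to 97%. As a minimal sketch of how such metrics are conventionally defined for a binary case, the following computes accuracy, precision, recall, and F1-score from true and predicted labels. The function name `binary_metrics` and the label lists are hypothetical illustrations, not data or code from the thesis, and the thesis may have used a multi-class or averaged variant.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 as fractions in [0, 1]."""
    pairs = list(zip(y_true, y_pred))
    # Count the cells of the confusion matrix.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels for illustration only (not from the thesis dataset).
y_true = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

scores = binary_metrics(y_true, y_pred)
for name, value in scores.items():
    # Show each score both as a fraction and as a percentage.
    print(f"{name}: {value:.2f} ({value:.0%})")
```

Printing both forms side by side makes the distinction explicit: the fraction is what libraries typically return, and the percentage is what is usually quoted in prose.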
