Fast Text Classification with Naive Bayes Method on Apache Spark
Date
2017
Publisher
IEEE
Access Rights
info:eu-repo/semantics/closedAccess
Abstract
The growing number of devices and users online, driven by the transition to the Internet of Things (IoT), is increasing the volume of data exponentially. Classifying this growing data, deleting irrelevant data, and extracting meaning have become vitally important. Text data can be analyzed in various ways, such as text classification, spam analysis, and personality analysis. In this study, fast text classification was performed with machine learning on Apache Spark using the Naive Bayes method. To provide fast storage and analysis of data, the Spark architecture uses a distributed in-memory data collection instead of the distributed data structure used in the Hadoop architecture. Analyses were carried out with the Naive Bayes method on comment data from Reddit, an open-source social news site. The results are presented in tables and graphs.
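As a rough illustration of the classification technique named in the abstract, the following is a minimal single-machine sketch of multinomial Naive Bayes with add-one (Laplace) smoothing. It is not the authors' Spark implementation: the function names `train` and `predict`, the whitespace tokenizer, and the toy documents are all hypothetical, chosen only to show how class priors and word likelihoods combine into a prediction.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Count documents per class and words per class.

    docs: list of (label, text) pairs. Tokenization is a plain
    lowercase whitespace split (an assumption made for brevity).
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for label, text in docs:
        tokens = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log posterior score."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        # Log prior: share of training documents in this class.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing keeps unseen (class, word) pairs above zero.
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, not taken from the Reddit data set in the study.
TOY_DOCS = [
    ("spam", "win free money now"),
    ("spam", "free prize click now"),
    ("ham", "meeting notes for the project"),
    ("ham", "project schedule and notes"),
]
model = train(TOY_DOCS)
print(predict(model, "free money"))
print(predict(model, "project notes"))
```

In a distributed setting the per-class counting step is the part that would be spread across workers; the sketch keeps everything in local dictionaries to stay self-contained.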
Description
25th Signal Processing and Communications Applications Conference (SIU) -- MAY 15-18, 2017 -- Antalya, TURKEY
Keywords
Machine learning, Text mining, Big data, Apache Spark, Classification, Naive Bayes
Source
2017 25th Signal Processing and Communications Applications Conference (SIU)
WoS Q Value
N/A