Yazar "Hakdagli, O." seçeneğine göre listele
Listeleniyor 1 - 2 / 2
Sayfa Başına Sonuç
Sıralama seçenekleri
Öğe Fast text classification with Naive Bayes method on Apache Spark(Institute of Electrical and Electronics Engineers Inc., 2017) Ogul, I.U.; Ozcan, C.; Hakdagli, O.The increase in the number of devices and users online with the transition of Internet of Things (IoT), increases the amount of large data exponentially. Classification of ascending data, deletion of irrelevant data, and meaning extraction have reached vital importance in today's standards. Analysis can be done in various variations such as Classification of text on text data, analysis of spam, personality analysis. In this study, fast text classification was performed with machine learning on Apache Spark using the Naive Bayes method. Spark architecture uses a distributed in-memory data collection instead of a distributed data structure presented in Hadoop architecture to provide fast storage and analysis of data. Analyzes were made on the interpretation data of the Reddit which is open source social news site by using the Naive Bayes method. The results are presented in tables and graphs. © 2017 IEEE.Öğe Stream text data analysis on twitter using apache spark streaming(Institute of Electrical and Electronics Engineers Inc., 2018) Hakdagli, O.; Özcan, C.; Ogul, I.U.With today's developing technology, people's access to information and its production have reached a very fast level. These generated and obtained information are instantly created, entered into data systems and updated. Sources of streaming data can be transformed into valuable analysis results when they are handled with targeted methods. In this study, a text data field is determined to perform analysis on instantaneous generated data and Twitter, the richest platform for instant text data, is used. Twitter instantly generates a variety of data in large quantities and it presents it as open source using an API. A machine learning framework Apache Spark's stream analysis environment is used to analyze these resources. Situation analysis was performed using Support Vector Machine, Decision Trees and Logistic Regression algorithms presented under this environment. The results are presented in tables. © 2018 IEEE.