Fast text classification with Naive Bayes method on Apache Spark

Ogul, I.U.; Ozcan, C.; Hakdagli, O.

Date

2017

Authors

Ogul, I.U.

Ozcan, C.

Hakdagli, O.

Publisher

Institute of Electrical and Electronics Engineers Inc.

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

The growing number of devices and users online, driven by the shift to the Internet of Things (IoT), is increasing the volume of big data exponentially. Classifying this growing data, removing irrelevant data, and extracting meaning from it have become vitally important. Text data supports many kinds of analysis, such as text classification, spam analysis, and personality analysis. In this study, fast text classification was performed with machine learning on Apache Spark using the Naive Bayes method. To store and analyze data quickly, the Spark architecture uses a distributed in-memory data collection instead of the distributed data structure of the Hadoop architecture. The analyses were run on comment data from Reddit, an open-source social news site, using the Naive Bayes method. The results are presented in tables and graphs. © 2017 IEEE.
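
The abstract names the Naive Bayes method but this record does not include the authors' code. As a rough illustration of the classifier itself, the following is a minimal single-machine sketch of multinomial Naive Bayes with additive (Laplace) smoothing in plain Python. It does not use Apache Spark and is not the authors' implementation. The function names, the whitespace tokenizer, the smoothing value `alpha=1.0`, and the toy "spam"/"ham" sentences are all assumptions made for the example, not data or settings from the paper.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train_naive_bayes(labeled_docs, alpha=1.0):
    """Fit multinomial Naive Bayes; alpha is the additive smoothing constant."""
    class_doc_counts = Counter()        # number of documents per class
    word_counts = defaultdict(Counter)  # per-class token frequencies
    vocabulary = set()
    for text, label in labeled_docs:
        class_doc_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocabulary.add(token)

    total_docs = sum(class_doc_counts.values())
    # log P(class), estimated from document frequencies
    log_priors = {c: math.log(n / total_docs) for c, n in class_doc_counts.items()}

    # log P(token | class), smoothed so unseen (token, class) pairs are nonzero
    log_likelihoods = {}
    for c in class_doc_counts:
        denom = sum(word_counts[c].values()) + alpha * len(vocabulary)
        log_likelihoods[c] = {
            w: math.log((word_counts[c][w] + alpha) / denom) for w in vocabulary
        }
    return log_priors, log_likelihoods


def predict(model, text):
    """Return the class with the highest posterior log-score for the text."""
    log_priors, log_likelihoods = model
    scores = {}
    for c, prior in log_priors.items():
        score = prior
        for token in tokenize(text):
            # Tokens outside the training vocabulary are ignored.
            if token in log_likelihoods[c]:
                score += log_likelihoods[c][token]
        scores[c] = score
    return max(scores, key=scores.get)


# Hypothetical toy data, invented for this sketch (not Reddit comments).
TRAIN = [
    ("win free prize now", "spam"),
    ("free money offer", "spam"),
    ("meeting schedule for monday", "ham"),
    ("project meeting notes", "ham"),
]
model = train_naive_bayes(TRAIN)
```

Calling `predict(model, "free prize offer")` scores the text against each class and returns the label with the larger log-posterior. Working in log space avoids numeric underflow when many small probabilities are multiplied. A distributed version would compute the same per-class counts in parallel across partitions, which is the part a framework such as Spark takes over.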

Description

25th Signal Processing and Communications Applications Conference, SIU 2017 -- 15 May 2017 through 18 May 2017 -- Antalya -- 128703

Keywords

Apache Spark, Big data, Classification, Machine learning, Naive Bayes, Text mining

Source

2017 25th Signal Processing and Communications Applications Conference, SIU 2017

Scopus Q Value

N/A

Link

https://doi.org/10.1109/SIU.2017.7960721
https://hdl.handle.net/20.500.14619/9261

Collection

Scopus Indexed Publications Collection
