SentiDariPers: Sentiment Analysis of Dari-Persian Tweets Based on People’s Views and Opinion
Date
2023
Publisher
Springer Science and Business Media Deutschland GmbH
Access Rights
info:eu-repo/semantics/closedAccess
Abstract
Sentiment analysis research has a noticeable gap when it comes to the Dari dialect of Persian. To bridge this gap, our research aimed to curate a comprehensive dataset of people’s opinions in this language variant. This paper presents the development of a benchmark sentiment-annotated dataset for the Dari dialect of Persian, an official language of Afghanistan. The dataset, named “SentiDariPers”, comprises 43,089 tweets posted between August 2021 and April 2023. It has been manually annotated with four sentiment classes: Negative, Positive, Neutral, and Mixed. We applied a range of models, including Support Vector Machine (SVM), Long Short-Term Memory (LSTM), Bi-directional Long Short-Term Memory (Bi-LSTM), Gated Recurrent Unit (GRU), and Convolutional Neural Network (CNN). Additionally, we developed an ensemble model that combines different sets of sentiment classes for each system. We present a detailed comparative analysis of the results obtained from these models. Experimental findings demonstrate that the ensemble model achieves the highest accuracy, 91%. We provide insights into the data collection and annotation process, offer relevant dataset statistics, discuss the experimental results, and provide further analysis of the data. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
Description
9th International Conference on Technologies and Innovation, CITI 2023 -- 13 November 2023 through 16 November 2023 -- Guayaquil -- 303389
Keywords
Dari-Persian, Dataset Creation, Deep Learning, Sentiment Analysis
Source
Communications in Computer and Information Science
Scopus Q Value
Q4
Volume
1873 CCIS