Sentiment Analysis on Twitter Data: Comparative Study on Different Approaches

Full Text (PDF, 292KB), PP.1-13


Author(s)

Abdur Rahman 1,*, Mobashir Sadat 2, Saeed Siddik 2

1. Centre for Advanced Research in Sciences, University of Dhaka, Dhaka, Bangladesh

2. Institute of Information Technology, University of Dhaka, Dhaka, Bangladesh

* Corresponding author.

DOI: https://doi.org/10.5815/ijisa.2021.04.01

Received: 8 Jan. 2021 / Revised: 15 Mar. 2021 / Accepted: 29 Apr. 2021 / Published: 8 Aug. 2021

Index Terms

Sentiment Analysis, Machine Learning, Twitter Data, Comparative Analysis

Abstract

Social media has become extremely popular for communicating with friends and sharing opinions. According to statistics, almost 2.22 billion people used social media in 2016, which is roughly one third of the world population and three times the entire population of Europe. On social media, people share their likes, dislikes, opinions, interests, and so on, so it is possible to learn a person’s thoughts about a specific topic from the data they share. Since Twitter is one of the most popular social media platforms in the world, it is a very good source for opinion mining and sentiment analysis on different topics. In this research, SVM with different kernel functions and AdaBoost are evaluated with CPD (Categorical Proportional Difference) and Chi-square feature extraction techniques to find the best sentiment classification model. The reported average accuracies of AdaBoost with Chi-square and CPD are 70.2% and 66.9%, respectively. The SVM radial basis kernel and polynomial kernel with Chi-square n-grams reported average accuracies of 73.73% and 68.67%, respectively. Among the experiments performed, the SVM sigmoid kernel with Chi-square n-grams provided the maximum accuracy, 74.4%.

Cite This Paper

Abdur Rahman, Mobashir Sadat, Saeed Siddik, "Sentiment Analysis on Twitter Data: Comparative Study on Different Approaches", International Journal of Intelligent Systems and Applications (IJISA), Vol.13, No.4, pp.1-13, 2021. DOI: 10.5815/ijisa.2021.04.01
