Augmenting Sentiment Analysis Prediction in Binary Text Classification through Advanced Natural Language Processing Models and Classifiers

PP. 16-31


Author(s)

Zhengbing Hu 1, Ivan Dychka 2, Kateryna Potapova 3, Vasyl Meliukh 3,*

1. School of Computer Science, Hubei University of Technology, Wuhan, China

2. Computer Systems Software Department, Igor Sikorsky Kyiv Polytechnic Institute, Kyiv, Ukraine

3. Department of System Programming and Specialized Computer Systems, Igor Sikorsky Kyiv Polytechnic Institute, Kyiv, Ukraine

* Corresponding author.

DOI: https://doi.org/10.5815/ijitcs.2024.02.02

Received: 3 Jan. 2024 / Revised: 15 Feb. 2024 / Accepted: 10 Mar. 2024 / Published: 8 Apr. 2024

Index Terms

Binary Text Classification, Natural Language Processing, Deep Learning, Ensemble Methods, Neural Networks

Abstract

Sentiment analysis is a critical component of natural language processing (NLP) applications, particularly text classification. By employing state-of-the-art techniques such as ensemble methods, transfer learning, and deep learning architectures, our methodology significantly enhances the robustness and precision of sentiment predictions. We systematically investigate the impact of various NLP models, including recurrent neural networks and transformer-based architectures, on sentiment classification tasks. Furthermore, we introduce a novel ensemble method that combines the strengths of multiple classifiers to improve the predictive ability of the overall system. The results demonstrate the potential of integrating state-of-the-art NLP models with ensemble classifiers to advance sentiment analysis, laying the foundation for a deeper understanding of textual sentiment across diverse applications.
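The combination step described above can be illustrated with a minimal, self-contained sketch: three toy rule-based binary sentiment classifiers combined by hard majority voting. The lexicons and classifiers here are illustrative stand-ins, not the trained models evaluated in the paper.

```python
from collections import Counter

# Toy sentiment lexicons (illustrative only, not from the paper).
POSITIVE = {"good", "great", "excellent", "love", "amazing"}
NEGATIVE = {"bad", "terrible", "awful", "hate", "boring"}

def lexicon_clf(text):
    """Vote positive (1) when positive words outweigh negative ones."""
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return 1 if score >= 0 else 0

def negation_clf(text):
    """Like lexicon_clf, but a negator flips the polarity of the next word."""
    score, flip = 0, 1
    for t in text.lower().split():
        if t in {"not", "never", "no"}:
            flip = -1
            continue
        if t in POSITIVE:
            score += flip
        elif t in NEGATIVE:
            score -= flip
        flip = 1
    return 1 if score >= 0 else 0

def ratio_clf(text):
    """Vote by the share of positive words among all polarity-bearing words."""
    tokens = text.lower().split()
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return 1 if pos + neg == 0 or pos / (pos + neg) >= 0.5 else 0

def ensemble_predict(text, classifiers):
    """Hard majority vote over the individual classifiers' binary labels."""
    votes = Counter(clf(text) for clf in classifiers)
    return votes.most_common(1)[0][0]
```

In the paper's setting, the base classifiers would instead be trained models (e.g., a recurrent network and a transformer encoder), but the combination step follows the same voting principle.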

Cite This Paper

Zhengbing Hu, Ivan Dychka, Kateryna Potapova, Vasyl Meliukh, "Augmenting Sentiment Analysis Prediction in Binary Text Classification through Advanced Natural Language Processing Models and Classifiers", International Journal of Information Technology and Computer Science (IJITCS), Vol. 16, No. 2, pp. 16-31, 2024. DOI: 10.5815/ijitcs.2024.02.02
