Emoji Prediction Using Emerging Machine Learning Classifiers for Text-based Communication

Author(s)

Sayan Saha 1,*, Kakelli Anil Kumar 1

1. School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India, 632014.

* Corresponding author.

DOI: https://doi.org/10.5815/ijmsc.2022.01.04

Received: 1 Jun. 2021 / Revised: 25 Jun. 2021 / Accepted: 18 Jul. 2021 / Published: 8 Feb. 2022

Index Terms

Google Gboard, Twitter, n-gram, Natural Language Processing (NLP), Sentiment Analysis, Machine Learning, Classifiers

Abstract

We aim to extract the emotional components of a statement in order to identify the emotional state of the writer and assign an emoji related to that emotion. Emoji have become a staple of everyday text-based communication, and it is now common to construct an entire response using emoji alone. It comes as no surprise, therefore, that effort is being put into the automatic prediction and selection of emoji appropriate for a text message. Major companies such as Apple and Google have made considerable strides in this area and have already deployed such systems in production (for example, Google Gboard). The proposed work addresses automatic emoji selection for a given text message. Machine learning classification algorithms categorize the tone of a message, which is further segregated through n-grams into one of seven distinct categories. Based on the output of the classifier, one of the more appropriate emoji is selected from a predefined list using natural language processing (NLP) and sentiment analysis techniques. The corpus is extracted from Twitter. The result is a plain text message made livelier by annotation with appropriate emoji.
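
The pipeline described in the abstract (n-gram features, a tone classifier, then a lookup in a predefined emoji list) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the classifiers or the seven categories, so the sketch uses a small multinomial Naive Bayes model as a stand-in, three hypothetical categories (`joy`, `sadness`, `anger`), a hand-written toy corpus in place of the Twitter data, and a hypothetical `EMOJI` mapping.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n_max=2):
    """Return lowercase word n-grams (unigrams up to n_max-grams) of a message."""
    tokens = text.lower().split()
    feats = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            feats.append(" ".join(tokens[i:i + n]))
    return feats


class NaiveBayes:
    """Multinomial Naive Bayes over n-gram counts with add-one smoothing.

    Assumption: a stand-in classifier; the source does not specify which
    classifiers were used.
    """

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            for feat in ngrams(text):
                self.feat_counts[label][feat] += 1
                self.vocab.add(feat)
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            denom = sum(self.feat_counts[label].values()) + len(self.vocab)
            for feat in ngrams(text):
                if feat in self.vocab:  # ignore n-grams never seen in training
                    score += math.log((self.feat_counts[label][feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy corpus and categories (the paper uses Twitter data and
# seven categories that the abstract does not list).
TRAIN = [
    ("i am so happy today", "joy"),
    ("what a wonderful surprise", "joy"),
    ("i feel so sad and alone", "sadness"),
    ("this is a terrible loss", "sadness"),
    ("i am so angry at you", "anger"),
    ("this makes me furious", "anger"),
]

# Hypothetical predefined emoji list, one emoji per category.
EMOJI = {"joy": "😄", "sadness": "😢", "anger": "😠"}

model = NaiveBayes().fit([t for t, _ in TRAIN], [l for _, l in TRAIN])


def annotate(message):
    """Append the emoji mapped to the predicted tone category."""
    return f"{message} {EMOJI[model.predict(message)]}"


if __name__ == "__main__":
    print(annotate("so happy about the surprise"))
```

A real system would replace the toy corpus with labelled tweets and could compare several classifiers on the same n-gram features; the lookup step stays the same regardless of which classifier produces the category.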

Cite This Paper

Sayan Saha, Kakelli Anil Kumar, "Emoji Prediction Using Emerging Machine Learning Classifiers for Text-based Communication", International Journal of Mathematical Sciences and Computing (IJMSC), Vol.8, No.1, pp. 37-43, 2022. DOI: 10.5815/ijmsc.2022.01.04
