Efficient Feature Extraction in Sentiment Classification for Contrastive Sentences

Pages: 54-62

Author(s)

Sonu Lal Gupta 1,* Anurag Singh Baghel 1

* Corresponding author.

DOI: https://doi.org/10.5815/ijmecs.2018.05.07

Received: 29 Jan. 2018 / Revised: 10 Mar. 2018 / Accepted: 10 Apr. 2018 / Published: 8 May 2018

Index Terms

Sentiment analysis, Sentiment classification, Contrastive sentences, Review subjectivity, Polarity detection, Machine learning, Lexicon

Abstract

Sentiment classification is a specific task within sentiment analysis in which a text document is assigned to a category such as positive, negative, or neutral on the basis of the subjective information it contains. This subjective information, referred to as sentiment features, largely determines how effective sentiment classification is. Feature extraction is therefore an essential task for sentiment classification at any level. This study explores the most relevant and crucial features for sentiment classification and groups them into seven categories: basic features, seed word features, TF-IDF, punctuation-based features, sentence-based features, n-grams, and POS lexicons. The paper proposes two new sentence-based features that help in assigning the overall sentiment of contrastive sentences, and on the basis of these features two algorithms are developed to find the sentiment of contrastive sentences. The proposed features are evaluated on a TripAdvisor dataset. The results are compared with several state-of-the-art studies that use various features on the same dataset, and the proposed approach achieves superior performance.
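
To make the idea of a contrastive sentence concrete, the following is a minimal sketch of one common heuristic: when a sentence contains a contrastive conjunction such as "but", the clause after the conjunction is taken to carry the overall sentiment. This is not the pair of algorithms proposed in the paper, which the abstract does not specify. The seed word lists, the set of conjunctions, and the function names are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical seed lexicons, for illustration only (not taken from the paper).
POSITIVE = {"good", "great", "clean", "friendly", "excellent"}
NEGATIVE = {"bad", "dirty", "rude", "noisy", "poor"}
# Assumed contrastive markers where the clause AFTER the marker dominates.
CONTRAST = {"but", "however"}


def clause_score(tokens):
    """Seed-word score: number of positive words minus number of negative words."""
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def contrastive_polarity(sentence):
    """Label a sentence, letting the clause after the last contrast marker decide."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    # Position of the last contrastive marker, or None if there is none.
    idx = max((i for i, t in enumerate(tokens) if t in CONTRAST), default=None)
    score = clause_score(tokens)  # whole-sentence score as the fallback
    if idx is not None:
        tail = clause_score(tokens[idx + 1:])
        if tail != 0:
            # The clause after the marker overrides the whole-sentence score.
            score = tail
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(contrastive_polarity("The room was clean but the staff were rude."))
print(contrastive_polarity("The hotel was noisy, but the food was excellent."))
```

In the first example the plain seed-word count is zero (one positive word, one negative word), so a classifier that ignores the contrast would call the sentence neutral, whereas the clause after "but" is clearly negative. That gap is the kind of case sentence-based features for contrastive sentences are meant to address.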

Cite This Paper

Sonu Lal Gupta, Anurag Singh Baghel, "Efficient Feature Extraction in Sentiment Classification for Contrastive Sentences", International Journal of Modern Education and Computer Science (IJMECS), Vol. 10, No. 5, pp. 54-62, 2018. DOI: 10.5815/ijmecs.2018.05.07
