A public opinion classification algorithm based on micro-blog text sentiment intensity: Design and implementation

Full Text (PDF, 317KB), PP.48-54

Author(s)

Xin Mingjun 1,*, Wu Hanxiang 1, Li Weimin 1, Niu Zhihua 1

1. School of Computer Engineering and Science, Shanghai University, Shanghai 20072, China

* Corresponding author.

DOI: https://doi.org/10.5815/ijcnis.2011.03.07

Received: 5 Aug. 2010 / Revised: 10 Dec. 2010 / Accepted: 17 Jan. 2011 / Published: 8 Apr. 2011

Index Terms

Micro-blog, sentiment intensity, public opinion classification algorithm

Abstract

Given the short content and near-real-time broadcasting of micro-blog information, our lab constructed a public opinion corpus named the MPO Corpus. Based on an analysis of the current state of network public opinion, this paper proposes an approach that calculates sentiment intensity at three levels: words, sentences, and documents. Furthermore, drawing on the MPO Corpus, the HowNet knowledge base, and a sentiment analysis set, the semantic information of feature words is incorporated into the traditional vector space model to represent micro-blog documents. The documents are then classified by subject and sentiment intensity. The experimental results indicate that the proposed method improves the efficiency and accuracy of micro-blog content classification and of public opinion characteristics analysis and supervision. It thus provides better technical support for content auditing and public opinion monitoring on micro-blog platforms.
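The abstract does not give the formulas used, so the following is only a minimal sketch of what a three-level sentiment intensity calculation combined with a sentiment-weighted term vector could look like. The lexicon values, the function names, the averaging rules, and the `1 + |intensity|` weighting are all assumptions made for illustration; they are not the paper's method. The sketch also splits on whitespace, whereas real micro-blog text (especially Chinese) would need proper word segmentation.

```python
from collections import Counter

# Hypothetical toy lexicon: word -> sentiment intensity in [-1, 1].
# The paper builds on HowNet and the MPO Corpus; these values are invented.
LEXICON = {"good": 0.6, "great": 0.9, "bad": -0.6, "terrible": -0.9}


def word_intensity(word):
    """Word level: look up the intensity, defaulting to neutral (0.0)."""
    return LEXICON.get(word.lower(), 0.0)


def sentence_intensity(sentence):
    """Sentence level (assumed rule): mean over sentiment-bearing words."""
    scores = [word_intensity(w) for w in sentence.split()]
    scores = [s for s in scores if s != 0.0]
    return sum(scores) / len(scores) if scores else 0.0


def document_intensity(sentences):
    """Document level (assumed rule): mean of the sentence intensities."""
    if not sentences:
        return 0.0
    return sum(sentence_intensity(s) for s in sentences) / len(sentences)


def weighted_vector(sentences):
    """Term-frequency vector with sentiment words boosted by 1 + |intensity|.

    This stands in for bringing semantic information into a vector space
    model; the actual weighting scheme in the paper may differ.
    """
    counts = Counter(w.lower() for s in sentences for w in s.split())
    return {w: c * (1.0 + abs(word_intensity(w))) for w, c in counts.items()}


if __name__ == "__main__":
    post = ["great phone", "bad battery"]
    print(document_intensity(post))
    print(weighted_vector(post))
```

A classifier could then use the weighted vector for subject classification and the document-level score to bin posts by sentiment intensity, which mirrors the two classification axes the abstract mentions.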

Cite This Paper

Xin Mingjun, Wu Hanxiang, Li Weimin, Niu Zhihua, "A public opinion classification algorithm based on micro-blog text sentiment intensity: Design and implementation", International Journal of Computer Network and Information Security (IJCNIS), vol.3, no.3, pp.48-54, 2011. DOI: 10.5815/ijcnis.2011.03.07