Natural Language Processing based Hybrid Model for Detecting Fake News Using Content-Based Features and Social Features

Author(s)

Shubham Bauskar 1,*, Vijay Badole 1, Prajal Jain 1, Meenu Chawla 1

1. Department of Computer Science and Engineering, Maulana Azad National Institute of Technology, Bhopal, 462003, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijieeb.2019.04.01

Received: 24 Jan. 2019 / Revised: 13 Mar. 2019 / Accepted: 22 Apr. 2019 / Published: 8 Jul. 2019

Index Terms

Fake News Detection, Machine Learning Classifier, Natural Language Processing, Probabilistic Classifiers.

Abstract

The Internet is a highly effective medium for the proliferation and diffusion of fake news. Information quality on the Internet is an important issue, but the web-scale volume of data hinders experts’ ability to correct much of the inaccurate or fake content on these platforms. A new safeguard is therefore needed. Traditional fake news detection systems rely on content-based features (i.e., analyzing the content of the news), whereas more recent models focus on social features (i.e., how the news diffuses through the network). This paper aims to build a novel machine learning model based on Natural Language Processing (NLP) techniques that detects ‘fake news’ using both content-based and social features. The proposed model achieves an average accuracy of 90.62% and an F1 score of 90.33% on a standard dataset.
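
As a rough illustration of what using both feature families can mean in practice, the minimal Python sketch below concatenates content-based term counts with numeric social signals into a single feature vector that a classifier could consume. It is not the model described in the paper: the vocabulary, the social signal names (`shares`, `likes`), and the helper functions `content_features` and `hybrid_features` are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def content_features(text, vocabulary):
    """Count how often each vocabulary term occurs in the lower-cased text."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


def hybrid_features(text, social, vocabulary, social_keys):
    """Concatenate content-based counts with numeric social signals."""
    social_part = [float(social.get(key, 0)) for key in social_keys]
    return content_features(text, vocabulary) + social_part


# Hypothetical vocabulary and social signals (assumptions, not from the paper).
VOCABULARY = ["shocking", "breaking", "report"]
SOCIAL_KEYS = ["shares", "likes"]

vector = hybrid_features(
    "Shocking report shocking claims",
    {"shares": 120, "likes": 45},
    VOCABULARY,
    SOCIAL_KEYS,
)
print(vector)
```

In a setup like this, the content part of the vector depends only on the article text, while the social part depends only on how the article was received, so either half can be replaced by a richer representation without changing the other.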

Cite This Paper

Shubham Bauskar, Vijay Badole, Prajal Jain, Meenu Chawla, "Natural Language Processing based Hybrid Model for Detecting Fake News Using Content-Based Features and Social Features", International Journal of Information Engineering and Electronic Business (IJIEEB), Vol. 11, No. 4, pp. 1-10, 2019. DOI: 10.5815/ijieeb.2019.04.01
