Rahim Khan

Work place: School of Software, Xinjiang University, Urumqi 830008, China

E-mail: rahim0333@yahoo.com

Research Interests: Software Creation and Management, Software Engineering, Computational Learning Theory, Natural Language Processing, Computer Architecture and Organization

Biography

Rahim Khan is a student in the Master's degree program in software engineering at Xinjiang University, China. He received his B.Sc. in Computer Science from the University of Swat, Pakistan. His core areas of interest are software engineering, machine learning, and natural language processing.

Author Articles
Extractive based Text Summarization Using K-Means and TF-IDF

By Rahim Khan, Yurong Qian, Sajid Naeem

DOI: https://doi.org/10.5815/ijieeb.2019.03.05, Pub. Date: 8 May 2019

The quantity of information on the internet is increasing massively, and enormous volumes of data in many formats are becoming openly accessible online. It is now challenging for a user to extract information efficiently and smoothly. As one method of tackling this challenge, text summarization removes redundant information and retrieves the useful and relevant information from a text document to form a shorter, compressed version that is easy to understand and time-saving while still reflecting the main idea of the topic discussed in the document. Approaches to automatic text summarization have attracted keen interest within the Text Mining and NLP (Natural Language Processing) communities because manually summarizing a text document is laborious. There are two main types of text summarization: extractive and abstractive. This paper focuses on extractive summarization using K-Means clustering with TF-IDF (Term Frequency-Inverse Document Frequency). The paper also addresses the idea of the true K and uses that value of K to divide the sentences of the input document into clusters and present the final summary. Furthermore, we combine K-Means and TF-IDF with the choice of the K value to produce the resulting system summary, which shows comparatively better results.
