Survey on Word Sense Disambiguation: An Initiative towards an Indo-Aryan Language

Jumi Sarmah 1, Shikhar Kumar Sarma 1

1. Department of Information Technology, Gauhati University, Guwahati, Assam, 781014, India

Received: 14 Jan. 2016 / Revised: 1 Mar. 2016 / Accepted: 8 Apr. 2016 / Published: 8 May 2016

Index Terms

Assamese, Lexical Ambiguity, Natural Language Processing, Word Sense Disambiguation


Resolution of lexical ambiguity, commonly known as Word Sense Disambiguation (WSD), is the task of automatically selecting the correct sense of an ambiguous term from its set of candidate senses, depending on the context in which the term occurs. It plays a vital role as an intermediate phase in many Natural Language Processing (NLP) applications such as Machine Translation, Information Retrieval, Speech Processing, Hypertext Navigation, and Parts-of-Speech tagging. The existing literature describes various approaches to lexical ambiguity resolution, including knowledge-based and corpus-based approaches. In recent years, many WSD systems have been developed for Indian languages such as Hindi, Malayalam, Manipuri, Nepali, and Kannada, but no such automated system has yet emerged for the Indo-Aryan language Assamese. Our future work aims to develop a model for the WSD problem that is fast, optimal, and efficient in terms of accuracy and scalability. This paper presents a survey of this research topic, discussing the WSD problem and the various approaches along with their algorithms. It also lists the NLP applications that would become more effective when a disambiguation system is integrated into them. The evaluation measures used to determine WSD performance are also discussed.
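To make the WSD task concrete, the following is a minimal sketch of one well-known knowledge-based approach, a simplified Lesk-style gloss overlap, and is not a method proposed in this paper. The sense inventory `SENSES`, the stopword list, and the function names are hypothetical and chosen only for illustration; a real system would draw its glosses from a lexical resource such as a WordNet.

```python
import re

# Hypothetical toy sense inventory (assumption for illustration only).
# Each ambiguous word maps sense labels to short dictionary-style glosses.
SENSES = {
    "bank": {
        "financial_institution": (
            "an institution that accepts deposits of money "
            "and lends money to customers"
        ),
        "river_side": "the sloping land beside a river or other body of water",
    }
}

# Small illustrative stopword list (assumption); not a standard resource.
STOPWORDS = {
    "a", "an", "the", "of", "to", "and", "or", "that", "in", "on",
    "at", "by", "is", "was", "he", "she", "it", "his", "her",
}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def simplified_lesk(word, context, inventory=SENSES):
    """Return the sense label whose gloss shares the most words with the context.

    Ties are broken in favour of the sense listed first in the inventory.
    """
    context_words = tokenize(context) - {word}
    best_sense, best_overlap = None, -1
    for sense, gloss in inventory[word].items():
        overlap = len(context_words & tokenize(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


if __name__ == "__main__":
    print(simplified_lesk("bank", "He sat on the bank of the river and watched the water"))
    print(simplified_lesk("bank", "She went to the bank to deposit money"))
```

In this sketch the context "river ... water" overlaps only with the second gloss, while "money" overlaps only with the first, so the two example sentences receive different senses. The sketch does no stemming, so "deposit" and "deposits" are not matched; a practical system would normalize word forms first.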

Jumi Sarmah, Shikhar Kumar Sarma, "Survey on Word Sense Disambiguation: An Initiative towards an Indo-Aryan Language", International Journal of Engineering and Manufacturing (IJEM), Vol.6, No.3, pp.37-52, 2016. DOI: 10.5815/ijem.2016.03.04


