An Enhanced Rough Set based Feature Grouping Approach for Supervised Feature Selection

Full Text (PDF, 565KB), PP.71-82

Author(s)

Rubul Kumar Bania 1,*

1. Department of Computer Applications, North-Eastern Hill University, Tura Campus, Meghalaya, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijmsc.2018.01.05

Received: 8 Aug. 2017 / Revised: 14 Sep. 2017 / Accepted: 16 Oct. 2017 / Published: 8 Jan. 2018

Index Terms

Feature selection, lower approximation, fuzzy set, rough set

Abstract

Selecting useful information from a large data collection is an important and challenging problem. Feature selection is the problem of choosing a subset of relevant features from a given dataset that retains the predictive power of the original feature set. Rough set theory (RST) and its extensions are among the most successful mathematical tools for feature selection. This paper starts with an outline of the fundamental concepts behind rough set and fuzzy-rough set based feature grouping techniques related to supervised feature selection. The supervised Quickreduct (QR) and fuzzy-rough feature grouping Quickreduct (FQR) algorithms are highlighted. An enhanced version of the FQR method is then proposed; it is based on the rough set dependency criterion combined with a feature significance measure, and it selects a minimal subset of features. The termination condition of the base method is also modified. Experimental studies of the algorithms are carried out on five public-domain benchmark datasets available in the UCI machine learning repository. The JRip and J48 classifiers are used to measure classification accuracy. The performance of the proposed method is found to be satisfactory in comparison with the other methods.
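
The abstract refers to the rough set dependency criterion and to the supervised Quickreduct (QR) algorithm. As a rough illustration only, the Python sketch below shows the classical greedy Quickreduct as it is commonly described in the rough set literature. It is not the enhanced FQR method proposed in the paper. The function names, the toy decision table, and the tie-handling rule (always add the best remaining feature) are assumptions made for this example.

```python
from collections import defaultdict


def dependency(rows, features, decision):
    """Rough set dependency degree: |positive region| / |universe|."""
    if not rows:
        return 0.0
    # Partition the objects into equivalence classes by their feature values.
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[f] for f in features)
        classes[key].append(row[decision])
    # A class lies in the positive region if all its objects share one decision.
    positive = sum(
        len(labels) for labels in classes.values() if len(set(labels)) == 1
    )
    return positive / len(rows)


def quickreduct(rows, conditional, decision):
    """Greedy forward selection driven by the dependency degree."""
    target = dependency(rows, conditional, decision)
    reduct = []
    best = dependency(rows, reduct, decision)
    # Stop once the subset is as informative as the full feature set.
    while best < target:
        candidates = [f for f in conditional if f not in reduct]
        gains = {f: dependency(rows, reduct + [f], decision) for f in candidates}
        # Assumption: on ties, take the first best feature so the loop always
        # makes progress and terminates.
        chosen = max(candidates, key=lambda f: gains[f])
        reduct.append(chosen)
        best = gains[chosen]
    return reduct


# Hypothetical toy table: the decision d equals (a OR b); c is irrelevant.
rows = [
    {"a": 0, "b": 0, "c": 0, "d": 0},
    {"a": 0, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 0, "c": 1, "d": 1},
    {"a": 1, "b": 1, "c": 0, "d": 1},
    {"a": 0, "b": 0, "c": 1, "d": 0},
    {"a": 1, "b": 0, "c": 0, "d": 1},
]

print(quickreduct(rows, ["a", "b", "c"], "d"))  # prints ['a', 'b']
```

On this toy table, feature `a` alone gives a dependency of 0.5, and adding `b` raises it to 1.0, which matches the dependency of the full feature set, so the search stops without selecting `c`.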

Cite This Paper

Rubul Kumar Bania, "An Enhanced Rough Set based Feature Grouping Approach for Supervised Feature Selection", International Journal of Mathematical Sciences and Computing (IJMSC), Vol.4, No.1, pp.71-82, 2018. DOI: 10.5815/ijmsc.2018.01.05
