Active Selection Constraints for Semi-supervised Clustering Algorithms

PP. 23-30

Author(s)

Walid Atwa 1,*, Abdulwahab Ali Almazroi 2

1. University of Jeddah, College of Computing and Information Technology at Khulais, Department of Information Technology, Jeddah, Saudi Arabia

2. Computer Science Department, Faculty of Computers and Information, Menoufia University, 32511, Egypt

* Corresponding author.

DOI: https://doi.org/10.5815/ijitcs.2020.06.03

Received: 6 Apr. 2020 / Revised: 28 May 2020 / Accepted: 25 Jun. 2020 / Published: 8 Dec. 2020

Index Terms

Active learning, semi-supervised clustering, pairwise constraints

Abstract

Semi-supervised clustering algorithms aim to improve clustering performance by using pairwise constraints. However, selecting these constraints randomly or improperly can degrade clustering performance in some situations and applications. In this paper, we select the most informative constraints to improve semi-supervised clustering algorithms. We present an active constraint selection approach that covers active must-link (AML) and active cannot-link (ACL) constraints. Based on the radial basis function, we compute lower and upper bounds between data points to select the constraints that improve performance. We test the proposed algorithm against baseline methods and show that our proposed active pairwise constraints outperform other algorithms.
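
To make the terms in the abstract concrete, here is a minimal Python sketch of how pairwise constraint candidates might be derived from a radial basis function (RBF) similarity. It is an illustration only, not the authors' algorithm: the abstract does not state the selection rule, so the function names `rbf_similarity` and `candidate_constraints`, the thresholds `lower` and `upper`, and the `gamma` parameter are all assumptions introduced for this example.

```python
import math
from itertools import combinations


def rbf_similarity(x, y, gamma=1.0):
    """RBF (Gaussian) kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def candidate_constraints(points, lower=0.1, upper=0.9, gamma=1.0):
    """Split point pairs by RBF similarity.

    `lower` and `upper` are hypothetical thresholds (an assumption of this
    sketch): pairs at or above `upper` become must-link candidates, pairs at
    or below `lower` become cannot-link candidates, and the rest are left
    unconstrained.
    """
    must_link, cannot_link = [], []
    for i, j in combinations(range(len(points)), 2):
        s = rbf_similarity(points[i], points[j], gamma)
        if s >= upper:
            must_link.append((i, j))
        elif s <= lower:
            cannot_link.append((i, j))
    return must_link, cannot_link


# Toy data: two tight groups that lie far apart.
points = [(0.0, 0.0), (0.1, 0.0), (3.0, 3.0), (3.1, 3.0)]
ml, cl = candidate_constraints(points)
print("must-link candidates:", ml)
print("cannot-link candidates:", cl)
```

On this toy data, the two nearby pairs come out as must-link candidates and every cross-group pair as a cannot-link candidate. In an active setting, candidates like these would typically be confirmed by querying a user or oracle rather than taken at face value.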

Cite This Paper

Walid Atwa, Abdulwahab Ali Almazroi, "Active Selection Constraints for Semi-supervised Clustering Algorithms", International Journal of Information Technology and Computer Science (IJITCS), Vol.12, No.6, pp.23-30, 2020. DOI: 10.5815/ijitcs.2020.06.03
