An Optimized K-means with Density and Distance-Based Clustering Algorithm for Multidimensional Spatial Databases

Full Text (PDF, 598KB), PP.70-82

Author(s)

K Laskhmaiah 1,*, S Murali Krishna 2, B. Eswara Reddy 3

1. Department of Computer Science and Engineering, JNTUH, Hyderabad, Telangana, INDIA

2. Department of Information Technology, SVCE, Tirupati, Andhra Pradesh, INDIA

3. Department of Computer Science and Engineering, JNTUA, Ananthapuram, Andhra Pradesh, INDIA

* Corresponding author.

DOI: https://doi.org/10.5815/ijcnis.2021.06.06

Received: 25 Mar. 2021 / Revised: 12 May 2021 / Accepted: 28 Jun. 2021 / Published: 8 Dec. 2021

Index Terms

Spatial Data Mining, Overlapping, Weighted Convolutional Neural Networks (Weighted CNN), Clustering

Abstract

Spatial data mining extracts useful information and knowledge from massive and complex spatial databases. Efficient clustering algorithms for spatial databases are used in this area of research to handle that complexity, and in many applications clustering methods are used to discover the geographic areas that contain spatial points. Many approaches to the spatial clustering problem with spatial attributes have been proposed, but they do not consider non-overlapping constraints, and most existing data mining algorithms perform poorly in high dimensions. To solve this problem, the proposed system designs a multidimensional optimization clustering method with a non-overlapping constraint, named Non Overlapping Constraint based Optimized K-Means with Density and Distance-based Clustering (NOC-OKMDDC), which quickly finds clusters with diverse shapes and densities in spatial databases. The proposed method consists of three main phases. In the first phase, attributes of the multidimensional dataset are reduced using Weighted Convolutional Neural Networks (Weighted CNN). In the second phase, Optimized K-Means with Density and Distance-based Clustering (OKMDD) uses a partition-based algorithm (K-means) to cluster the dataset into several relatively small spherical or ball-shaped sub clusters. The optimal sub cluster count is determined with the Adaptive Adjustment Factor based Glowworm Swarm Optimization algorithm (AAFGSO). The proposed system then uses an Enhanced Penalized Spatial Distance (EPSD) measure to satisfy the non-overlapping condition; to obtain the EPSD, the spatial distance between two points is adjusted according to their spatial attribute values. In the third phase, the proposed system merges the sub clusters using density-based clustering with a relative distance scheme.
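The second and third phases follow a general "over-segment, then merge" pattern that can be sketched in a few lines. The code below is only an illustration of that pattern under simplifying assumptions, not the authors' NOC-OKMDDC: it uses plain Euclidean distance instead of the EPSD measure, a fixed sub cluster count instead of the AAFGSO search, and a centroid-distance threshold as a stand-in for the density-based relative distance merge. The function names and the `merge_radius` parameter are hypothetical.

```python
from math import dist


def farthest_first_centers(points, k):
    """Deterministic seeding (an assumption): start at points[0], then
    repeatedly add the point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iter=50):
    """Plain K-means producing k small, roughly spherical sub clusters."""
    centers = farthest_first_centers(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centers


def merge_subclusters(labels, centers, merge_radius):
    """Merge sub clusters whose centroids lie within merge_radius of each
    other (hypothetical stand-in for the density-based merge phase)."""
    parent = list(range(len(centers)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            if dist(centers[i], centers[j]) <= merge_radius:
                parent[find(j)] = find(i)
    return [find(lab) for lab in labels]


# Two well-separated groups of four points each, over-segmented into 4 sub clusters.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
sub_labels, centers = kmeans(points, k=4)
final_labels = merge_subclusters(sub_labels, centers, merge_radius=3.0)
print(final_labels)
```

In this toy run each group is first split into two sub clusters, and the merge step joins them back so that every point in a group ends up with the same final label. Because merging works on whole sub clusters, the final clusters need not be spherical even though each sub cluster is.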
Experimental results show that the proposed system achieves better performance than the existing system in terms of the adjusted Rand index, Rand index, Mirkin's index, and Hubert's index.
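The four evaluation indices named above are all pair-counting measures: they compare two labelings by asking, for every pair of points, whether each labeling puts the pair in the same cluster. The sketch below uses the common textbook definitions; the abstract does not state which normalization the paper applies (the Mirkin index in particular is sometimes divided by n squared), so the exact formulas here are assumptions, and the function names are illustrative.

```python
from collections import Counter
from itertools import combinations
from math import comb


def pair_counts(labels_a, labels_b):
    """Return (n11, n10, n01, n00): pairs together in both labelings, only in
    the first, only in the second, and in neither."""
    if len(labels_a) != len(labels_b) or len(labels_a) < 2:
        raise ValueError("need two labelings of the same length (at least 2 points)")
    n11 = n10 = n01 = n00 = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            n11 += 1
        elif same_a:
            n10 += 1
        elif same_b:
            n01 += 1
        else:
            n00 += 1
    return n11, n10, n01, n00


def rand_index(labels_a, labels_b):
    """Fraction of pairs on which the two labelings agree (higher is better)."""
    n11, n10, n01, n00 = pair_counts(labels_a, labels_b)
    return (n11 + n00) / (n11 + n10 + n01 + n00)


def mirkin_index(labels_a, labels_b):
    """Unnormalized Mirkin distance: twice the disagreeing pairs (lower is better)."""
    _, n10, n01, _ = pair_counts(labels_a, labels_b)
    return 2 * (n10 + n01)


def hubert_index(labels_a, labels_b):
    """Agreeing minus disagreeing pairs over all pairs; equals 2 * Rand - 1."""
    n11, n10, n01, n00 = pair_counts(labels_a, labels_b)
    return ((n11 + n00) - (n10 + n01)) / (n11 + n10 + n01 + n00)


def adjusted_rand_index(labels_a, labels_b):
    """Rand index corrected for chance agreement (Hubert and Arabie form)."""
    total_pairs = comb(len(labels_a), 2)
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / total_pairs
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings are a single cluster
    return (together_both - expected) / (maximum - expected)


truth = [0, 0, 0, 1, 1, 1]
predicted = [0, 0, 1, 1, 2, 2]
print(rand_index(truth, predicted))           # 10 agreeing pairs out of 15
print(mirkin_index(truth, predicted))         # 2 * 5 disagreeing pairs = 10
print(hubert_index(truth, predicted))         # (10 - 5) / 15
print(adjusted_rand_index(truth, predicted))  # (2 - 1.2) / (4.5 - 1.2)
```

All four values depend only on which points are grouped together, not on the label names, so relabeling the clusters of either partition leaves them unchanged.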

Cite This Paper

K Laskhmaiah, S Murali Krishna, B Eswara Reddy, "An Optimized K-means with Density and Distance-Based Clustering Algorithm for Multidimensional Spatial Databases", International Journal of Computer Network and Information Security (IJCNIS), Vol.13, No.6, pp.70-82, 2021. DOI: 10.5815/ijcnis.2021.06.06
