Web Search Customization Approach Using Redundant Web Usage Data Association and Clustering

Full Text (PDF, 680KB), PP.35-42


Author(s)

N. Krishnaiah 1,*, G. Narsimha 2

1. B V C Engineering College, Odalarevu-533210, A.P, India

2. JNTUH College of Engineering, Jagityal, Telangana, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijieeb.2016.04.05

Received: 9 Apr. 2016 / Revised: 2 May 2016 / Accepted: 10 Jun. 2016 / Published: 8 Jul. 2016

Index Terms

Clustering, Association, Web Search, Web Usage logs, Customization, Redundant data

Abstract

The rapid growth of the web has produced a large amount of redundant information on any given topic. As a result, a search returns many duplicate results, forcing users to navigate numerous sites to find the information they need. Users often lose track of the pages they are looking for while browsing the large and complex navigation of the web. Web customization based on web logs can exploit knowledge of the content and structure of the web to support users. Information search can be improved with the help of the implicit information that web servers record in logs about the web documents users visit. This paper proposes a web search customization approach (WSCA) using redundant web usage data association and hierarchical clustering. Association generates multilevel associations for redundant data across web navigation sites, and clustering generates clusters of frequent access patterns. The approach is intended to improve both real-time customization and the cost of generating customized resources. The experimental evaluation shows an improvement in precision for different queries compared with an existing clustering approach.
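The two ingredients named in the abstract, association over usage data and hierarchical clustering of access patterns, can be illustrated with a minimal sketch. This is not the authors' WSCA algorithm: the session data, the function names, the use of simple pair support in place of multilevel association, and the choice of single-linkage clustering with Jaccard distance are all assumptions made here for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sessions: the set of pages each user visited (not data from the paper).
sessions = [
    {"/home", "/courses", "/admissions"},
    {"/home", "/courses"},
    {"/home", "/courses", "/faculty"},
    {"/sports", "/events"},
    {"/sports", "/events", "/home"},
]


def frequent_pairs(sessions, min_support):
    """Return page pairs whose co-occurrence fraction is at least min_support."""
    counts = Counter()
    for pages in sessions:
        for pair in combinations(sorted(pages), 2):
            counts[pair] += 1
    total = len(sessions)
    return {pair: c / total for pair, c in counts.items() if c / total >= min_support}


def jaccard_distance(a, b):
    """Distance between two page sets: 1 minus shared pages over all pages."""
    return 1.0 - len(a & b) / len(a | b)


def agglomerative(sessions, max_distance):
    """Single-linkage agglomerative clustering of sessions (assumed linkage).

    Starts with one cluster per session and repeatedly merges the closest
    pair of clusters until the closest pair is farther than max_distance.
    Returns clusters as lists of session indices.
    """
    clusters = [[i] for i in range(len(sessions))]

    def linkage(c1, c2):
        return min(jaccard_distance(sessions[i], sessions[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        dist, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a, b in combinations(range(len(clusters)), 2)
        )
        if dist > max_distance:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


if __name__ == "__main__":
    print(frequent_pairs(sessions, min_support=0.4))
    print(agglomerative(sessions, max_distance=0.5))
```

In this toy setup, frequently co-visited page pairs stand in for association rules, and the resulting clusters group sessions with similar access patterns; a real system would derive sessions from server logs and would likely need a more scalable clustering routine.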

Cite This Paper

N. Krishnaiah, G. Narsimha, "Web Search Customization Approach Using Redundant Web Usage Data Association and Clustering", International Journal of Information Engineering and Electronic Business (IJIEEB), Vol.8, No.4, pp.35-42, 2016. DOI: 10.5815/ijieeb.2016.04.05
