An Efficient Algorithm in Mining Frequent Itemsets with Weights over Data Stream Using Tree Data Structure

Full Text (PDF, 675KB), PP.23-31

Author(s)

Hung Long Nguyen 1,*, Thuy Nguyen Thi Thu 1, Giap Cu Nguyen 1

1. Informatics Department, Vietnam University of Commerce, Hanoi, Vietnam

* Corresponding author.

DOI: https://doi.org/10.5815/ijisa.2015.12.02

Received: 20 Mar. 2015 / Revised: 14 Jul. 2015 / Accepted: 5 Sep. 2015 / Published: 8 Nov. 2015

Index Terms

Data mining, frequent itemsets, data stream, weighted sliding window, weighted supports, tree data structure

Abstract

In recent years, mining over data streams has become a prominent research topic because it can be applied in many real-world areas. In [20], a framework for mining frequent itemsets over a data stream was proposed using the weighted sliding window model. Two algorithms based on this model, the single-pass WSW and its improved version WSW-Imp, were proposed there to solve the data stream problem. The disadvantage of these algorithms is that they must scan the whole data stream many times and generate a large set of candidates. In this paper, we propose a process for mining frequent itemsets with weights over a data stream. Based on the downward closure property and the FP-Growth method [8,9], an alternative algorithm called WSWFP-stream is proposed. This algorithm is shown to be more efficient in terms of computing time and memory.
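
As a rough illustration of the ideas named in the abstract, the sketch below computes weighted supports over a window split into sub-windows and prunes candidates with the downward closure property. It is not the WSWFP-stream algorithm, whose details are not given here: a simple level-wise search stands in for FP-tree mining, and the function names, the per-window weights, and the threshold definition (minimum support times the weighted transaction count) are all assumptions made for this example.

```python
from itertools import combinations


def weighted_support(itemset, windows, weights):
    """Sum over sub-windows of weight * number of transactions containing itemset."""
    target = frozenset(itemset)
    return sum(
        w * sum(1 for t in window if target <= t)
        for window, w in zip(windows, weights)
    )


def weighted_threshold(windows, weights, min_support):
    """Assumed threshold: min_support times the weighted number of transactions."""
    return min_support * sum(w * len(window) for window, w in zip(windows, weights))


def frequent_itemsets(windows, weights, min_support):
    """Level-wise search (a stand-in for FP-tree mining) with downward closure pruning."""
    threshold = weighted_threshold(windows, weights, min_support)
    items = sorted({i for window in windows for t in window for i in t})
    current = [frozenset([i]) for i in items]
    result = {}
    while current:
        # Keep the candidates of this size whose weighted support reaches the threshold.
        survivors = []
        for cand in current:
            ws = weighted_support(cand, windows, weights)
            if ws >= threshold:
                result[cand] = ws
                survivors.append(cand)
        # Join survivors into candidates one item larger.
        k = len(current[0]) + 1
        joined = {a | b for a, b in combinations(survivors, 2) if len(a | b) == k}
        # Downward closure: every (k-1)-subset of a candidate must already be frequent.
        current = [
            c for c in joined
            if all(frozenset(s) in result for s in combinations(c, k - 1))
        ]
    return result
```

Each sub-window is a list of transactions (sets of items), and `weights` holds one non-negative weight per sub-window, typically larger for more recent ones. With non-negative weights, the weighted support of an itemset never exceeds that of any of its subsets, which is what makes the pruning step safe.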

Cite This Paper

Long Nguyen Hung, Thuy Nguyen Thi Thu, Giap Cu Nguyen, "An Efficient Algorithm in Mining Frequent Itemsets with Weights over Data Stream Using Tree Data Structure", International Journal of Intelligent Systems and Applications(IJISA), vol.7, no.12, pp.23-31, 2015. DOI:10.5815/ijisa.2015.12.02

References

[1]Aggarwal C. (Ed.), Data Streams: Models and algorithms. Springer, (2007).
[2]Agrawal R., Srikant R., Fast Algorithms for Mining Association Rules. In: 20th Int. Conf. on Very Large Data Bases (VLDB), pp. 487–499, (1994).
[3]Aneri P., Chaudhari M. B., Frequent pattern mining of continuous data over data streams, Int. Jour. for Technology Research Engineering, Vol. 1, Issue 9, pp. 935-940, (2014).
[4]Chang J.H., Lee W.S., estWin: Online data stream mining of recent frequent itemsets by sliding window method. Journal of Information Sciences, Vol. 3, No. 2, pp. 76-90, (2005).
[5]Chi Y., Wang H., Yu P. S., Muntz R. R., Catch the moment: Maintaining closed frequent itemsets over a data stream sliding window. Knowledge and Information Systems, Vol. 10, No. 3, pp. 265–294, (2006).
[6]Fan W., Huang Y., Wang H., Yu P. S., Active mining of data streams. In: Proceedings of the Fourth SIAM Int. Conf. on Data Mining, pp. 457-461, (2004).
[7]Giannella C., Han J., Pei J., Yan X., Yu P. S., Mining frequent patterns in data streams at multiple time granularities. In: H. Kargupta, A. Joshi, K. Sivakumar, & Y. Yesha (Eds.), Next generation data mining, pp. 191–210, (2003).
[8]Han J., Kamber M., Data Mining: Concepts and Techniques, Morgan Kaufmann, (2000).
[9]Han J., Pei J., Yin Y., Mao R., Mining frequent patterns without candidate generation: a frequent-pattern tree approach, Data Mining and Knowledge Discovery 8, pp. 53–87, (2004).
[10]Jothimani K., Thanamani A. S., An overview of mining frequent itemsets over data streams using sliding window model, Int. Jour. of Emerging Trend & Technology in Computer Science (IJETTCS), Vol. 1, Issue 1, pp. 86-89, (2012).
[11]Keming T., Caiyan D., Ling C., A novel strategy for mining frequent closed itemsets in data streams, Journal of Computer, Vol. 7, No. 7, pp. 1564-1573, (2012).
[12]Kuen-Fang J., Chao-Wei L., A sliding-window based adaptive approximating method to discover recent frequent itemsets from data streams, Proc. of the Int. Multiconference of Engineering and Computer Scientists (IMECS 2010), Vol. I, March 17-19, Hong Kong, (2010).
[13]Li Su, Hong-yan Liu, A new classification algorithm for data stream, Int. Jour. Modern Education and Computer Science, Vol. 3, No. 4, pp. 32-39, (2011).
[14]Lin C.H., Chiu D.Y., Wu Y.H., Chen A.L.P., Mining frequent itemsets from data stream with a time-sensitive sliding window. In: 5th SIAM Int. Conf. on Data Mining, pp. 68-79, (2005).
[15]Mahmood D., Mohammad H. S., An efficient algorithm for mining frequent itemsets within large windows over data streams, Int. Jour. of Data Engineering, Vol. 2, Issue 3, pp. 119-125, (2011).
[16]Mahmood D., Mohammad H. S., Mehran T., An efficient sliding window based algorithm for adaptive frequent itemset mining over data streams, Journal of Information Science and Engineering 29, pp. 1001-1020, (2013).
[17]Manku G., Motwani R., Approximate frequency counts over data streams. In: Proceedings of the VLDB conference, pp. 346–357, (2002).
[18]Reshma Yusuf B., Chenna Reddy B., Mining data stream using option trees, Int. Jour. Network and Information Security, Vol. 4, No. 8, pp. 49-54, (2012).
[19]Shaik H., Murthy J. V. R., Anuradha Y., Chandra M., Mining frequent patterns from data streams using dynamic DP-tree, Int. Jour. of Computer Applications, Vol. 52, No. 19, pp. 23-27, (2012).
[20]Tsai P. S. M., Mining frequent itemsets in data streams using the weighted sliding window model. Expert Systems with Applications, pp. 11617-11625, (2009).
[21]Vijayarani S., Sathya P., A survey on frequent pattern mining over data streams, Int. Jour. of Computer Science and Information Tech. & Sec. (IJCSITS), Vol. 2, No. 5, pp. 1046-1050, (2012).
[22]Vikas K., Sangita, A review on algorithm for mining frequent itemset over data stream, Int. Jour. of Data Advanced Research in Comp. Sci. and Software Engineering, Vol. 3, Issue 4, pp. 917-919, (2013).
[23]Wang J., Zeng Y., SWFP-Miner: An efficient algorithm for mining weight frequent pattern over data streams, High Technology Letters, Vol. 3, No. 3, pp. 289-294, (2012).
[24]Yong C., Rong F. B., Chuan X., A new approach for maximal frequent sequential patterns mining over data streams, Int. Jour. of Digital Content Technology and its Applications, Vol. 5, No. 6, pp. 104-112, (2011).
[25]Younghee K., Wonyoung K., Ungmo K., Mining frequent itemsets with normalized weight in continuous data streams, Journal of Information Processing Systems, Vol. 6, No. 1, pp. 79-90, (2010).