Mining Frequent Itemsets with Weights over Data Stream Using Inverted Matrix

Full Text (PDF, 582KB), PP.63-71


Author(s)

Long Nguyen Hung 1,* Thuy Nguyen Thi Thu 1

1. Informatics Department, Vietnam University of Commerce, Hanoi, Vietnam

* Corresponding author.

DOI: https://doi.org/10.5815/ijitcs.2016.10.08

Received: 3 Nov. 2015 / Revised: 10 Mar. 2016 / Accepted: 15 Jun. 2016 / Published: 8 Oct. 2016

Index Terms

Data mining, frequent itemset with weight, data stream, sliding window, inverted matrix

Abstract

In recent years, research on mining data streams has become prominent because it can be applied in many real-world areas. In this paper, we propose an algorithm called MFIWDSIM for mining frequent itemsets with weights over a data stream using an Inverted Matrix [10]. The main idea is to move the data stream into an inverted matrix stored on disk, so that the algorithm can mine it many times with different support thresholds and different minimum weights. Moreover, this inverted matrix can be accessed at different times to mine according to user requirements without recalculation. Our analysis and evaluation indicate that MFIWDSIM performs better than WSWFP-stream [9] for mining frequent itemsets with weights over a data stream.
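The idea of building the structure once and querying it repeatedly with different thresholds can be illustrated with a small sketch. This is not the MFIWDSIM algorithm or the Inverted Matrix layout of [10]; it is a simplified in-memory inverted index over a sliding window. The class name `InvertedIndexWindow`, the brute-force candidate enumeration, and the rule that an itemset's weight is the average of its item weights are all assumptions made for illustration only.

```python
from collections import deque
from itertools import combinations


class InvertedIndexWindow:
    """Toy inverted index over a sliding window of transactions (hypothetical helper)."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()   # (tid, items) pairs currently inside the window
        self.tids = {}          # item -> set of ids of window transactions containing it
        self.next_tid = 0

    def add(self, transaction):
        """Append one stream transaction, evicting the oldest when the window is full."""
        if len(self.window) == self.window_size:
            old_tid, old_items = self.window.popleft()
            for item in old_items:
                self.tids[item].discard(old_tid)
        items = frozenset(transaction)
        self.window.append((self.next_tid, items))
        for item in items:
            self.tids.setdefault(item, set()).add(self.next_tid)
        self.next_tid += 1

    def support(self, itemset):
        """Fraction of window transactions that contain every item of `itemset`."""
        common = set.intersection(*(self.tids.get(i, set()) for i in itemset))
        return len(common) / len(self.window)

    def mine(self, weights, min_support, min_weight, max_size=2):
        """Brute-force query; the index itself is not rebuilt between calls.

        Assumption: an itemset's weight is the mean of its item weights.
        """
        items = sorted(i for i, t in self.tids.items() if t)
        result = {}
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                avg_weight = sum(weights[i] for i in itemset) / k
                sup = self.support(itemset)
                if sup >= min_support and avg_weight >= min_weight:
                    result[itemset] = sup
        return result


index = InvertedIndexWindow(window_size=4)
for t in [["a", "b"], ["a", "c"], ["a", "b", "c"], ["b", "c"], ["a", "b"]]:
    index.add(t)

weights = {"a": 0.9, "b": 0.6, "c": 0.3}  # made-up item weights

# The same index answers two queries with different thresholds.
loose = index.mine(weights, min_support=0.5, min_weight=0.4)
strict = index.mine(weights, min_support=0.75, min_weight=0.7)
```

With these made-up inputs, the first transaction has already left the window when the queries run, and `strict` keeps only the single item `a`. A disk-resident structure, as the abstract describes, would replace the in-memory dictionary used here.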

Cite This Paper

Long Nguyen Hung, Thuy Nguyen Thi Thu, "Mining Frequent Itemsets with Weights over Data Stream Using Inverted Matrix", International Journal of Information Technology and Computer Science(IJITCS), Vol.8, No.10, pp.63-71, 2016. DOI:10.5815/ijitcs.2016.10.08

References

[1]Aggarwal C. In C. Aggarwal (Ed.), Data Streams: Models and algorithms. Springer, (2007).

[2]Agrawal R., Srikant, R., Fast Algorithms for Mining Association Rules. In: 20th Int’l. Conf. on Very Large Data Bases (VLDB), pp. 487–499, (1994).

[3]Aneri P., Chaudhari M. B., Frequent pattern mining of continuous data over data streams, Int’l. Jour. for Technology Research Engineering, Vol. 1, Issue 9, pp. 935-940, (2014).

[4]Cai, C.H., Fu, A.W.-C., Cheng, C.H., Kwong, W.W., Mining Association Rules with Weighted Items. In Proceedings of Int’l. Database Engineering and Applications Symposium (IDEAS 1998), Cardiff, Wales, UK, July 1998, pp. 68–77, (1998).

[5]Giannella C., Han, J., Pei, J., Yan, X., & Yu, P. S., Mining frequent patterns in data streams at multiple time granularities. In H. Kargupta, A. Joshi, K.Sivakumar, & Y. Yesha (Eds.), Next generation data mining, pp. 191–210, (2003).

[6]Han J., and Kamber M., Data Mining: Concepts and Techniques, Morgan Kaufmann, (2000).

[7]Han J., Pei, J., Yin, Y., Mao, R., Mining frequent patterns without candidate generation: a frequent-pattern tree approach, Data Mining and Knowledge Discovery 8, pp. 53–87, (2004).

[8]Hung Long Nguyen, An Efficient Algorithm for Mining Weighted Frequent Itemsets Using Adaptive Weights, Int’l Jour. of Intelligent Systems and Applications, Vol. 7, No. 11, pp. 41-48, (2015).

[9]Long Nguyen Hung, Thuy Nguyen Thi Thu, Giap Cu Nguyen, An Efficient Algorithm in Mining Frequent Itemsets with Weights over Data Stream Using Tree Data Structure, Int’l Jour. of Intelligent Systems and Applications, Vol. 7, No. 12, pp. 23-31, (2015).

[10]Mohammad El-Hajj, Osmar R. Zaïane, Inverted Matrix: Efficient Discovery of Frequent Items in Large Datasets in the Context of Interactive Mining, In: Proc. 2003 Int’l. Conf. on Data Mining and Knowledge Discovery (ACM SIGMOD), pp. 109-118, August 24-27, 2003, (2003).

[11]Manku G., Motwani R. Approximate frequency counts over data streams. In: Proceedings of the VLDB conference, pp. 346–357, (2002).

[12]Tsai P. S. M., Mining frequent itemsets in data streams using the weighted sliding window model. Expert Systems with Applications, pp. 11617-11625, (2009).

[13]Vijayarani S., Sathya P., A survey on frequent pattern mining over data streams, Int’l. Jour. of Computer Science and Information Tech. & Sec. (IJCSITS), Vol. 2., No. 5, pp. 1046-1050, (2012).

[14]Vikas K., Sangita., A review on algorithm for mining frequent itemset over data stream, Int’l. Jour. of Data Advanced Research in Comp. Sci. and Software Engineering, Vol 3., Issue 4, pp. 917-919, (2013). 

[15]Wang J., Zeng Y., SWFP-Miner: An efficient algorithm for mining weight frequent pattern over data streams, High Technology Letters, Vol. 3, No. 3, pp. 289-294, (2012).

[16]Younghee K., Wonyoung K., Ungmo K., Mining frequent itemsets with normalized weight in continuous data streams, Journal of Information Processing Systems, Vol. 6, No. 1, pp. 79-90, (2010).