Improved Parallel Apriori Algorithm for Multi-cores

Author(s)

Swati Rustogi 1,*, Manisha Sharma 1, Sudha Morwal 1

1. Banasthali Vidyapith, Rajasthan, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijitcs.2017.04.03

Received: 17 Mar. 2016 / Revised: 20 Aug. 2016 / Accepted: 19 Nov. 2016 / Published: 8 Apr. 2017

Index Terms

Multi-core, data mining, parallelism, Apriori

Abstract

The Apriori algorithm is one of the most popular data mining techniques and is used to mine hidden relationships in large data sets. With parallelism, a large data set can be mined in less time. As an alternative to costly distributed systems, a single computer with a multi-core processor can be used to apply parallelism. This paper proposes an improved Apriori algorithm for multi-core environments.
The main contributions of this paper are:
• An efficient Apriori algorithm that applies data parallelism in a multi-core environment, reducing the time taken to count the frequency of candidate itemsets.
• An evaluation of the proposed algorithm's performance on multiple cores, measured by speedup.
• A comparison of the proposed algorithm with another such parallel algorithm, showing an improvement of more than 15% in preliminary experiments.
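
The data-parallel counting step named in the first contribution can be illustrated with a minimal Python sketch. This is not the authors' implementation, which the abstract does not describe in detail. It only shows the general idea of splitting the transactions across worker processes, counting candidate itemsets in each partition, and summing the partial counts. The function names (count_partition, split, parallel_support), the equal-size partitioning, and the toy data are assumptions made for illustration.

```python
from collections import Counter
from itertools import repeat
from multiprocessing import Pool


def count_partition(partition, candidates):
    """Count how many transactions in one partition contain each candidate."""
    counts = Counter()
    for transaction in partition:
        items = set(transaction)
        for candidate in candidates:
            if candidate <= items:  # candidate is a subset of the transaction
                counts[candidate] += 1
    return counts


def split(transactions, parts):
    """Split the transaction list into `parts` nearly equal contiguous chunks."""
    size, extra = divmod(len(transactions), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(transactions[start:end])
        start = end
    return chunks


def parallel_support(transactions, candidates, workers=2):
    """Sum per-partition counts, using one partition per worker process."""
    candidates = [frozenset(c) for c in candidates]
    chunks = split(transactions, workers)
    if workers == 1:
        # Serial fallback: no process pool is created.
        partial = [count_partition(chunks[0], candidates)]
    else:
        with Pool(processes=workers) as pool:
            partial = pool.starmap(count_partition, zip(chunks, repeat(candidates)))
    total = Counter()
    for counts in partial:
        total.update(counts)  # adds the partial counts together
    return total


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the paper's experiments.
    transactions = [
        ["bread", "milk"],
        ["bread", "butter", "milk"],
        ["butter", "milk"],
        ["bread", "butter"],
    ]
    candidates = [("bread", "milk"), ("butter", "milk"), ("bread", "butter")]
    support = parallel_support(transactions, candidates, workers=2)
    for itemset in candidates:
        print(sorted(itemset), support[frozenset(itemset)])
```

Because each partition is counted independently, the only shared step is the final summation. Speedup in the second contribution is conventionally the single-core running time divided by the running time on p cores; the abstract does not state the exact measurement procedure used.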

Cite This Paper

Swati Rustogi, Manisha Sharma, Sudha Morwal, "Improved Parallel Apriori Algorithm for Multi-cores", International Journal of Information Technology and Computer Science (IJITCS), Vol.9, No.4, pp.18-23, 2017. DOI: 10.5815/ijitcs.2017.04.03
