The Empirical Comparison of the Supervised Classifiers Performances in Implementing a Recommender System using Various Computational Platforms

Full Text (PDF, 482KB), pp. 11-20


Author(s)

Ali Mohammad Mohammadi 1,*, Mahmood Fathy 1

1. School of Computer Science, Institute for Research in Fundamental Sciences (IPM), P.O. Box 19395-5746, Tehran, Iran

* Corresponding author.

DOI: https://doi.org/10.5815/ijisa.2020.02.02

Received: 2 Aug. 2019 / Revised: 29 Aug. 2019 / Accepted: 12 Sep. 2019 / Published: 8 Apr. 2020

Index Terms

Distributed Machine Learning, Supervised Classifier Comparison, Recommender System, Apache Spark, Deep Multilayer Perceptron

Abstract

Recommender Systems (RS) help users make appropriate decisions. Much RS research has focused on improving the performance of existing methods, but most of it has not considered the potential of the employed methods to reach the ultimate solution. In our view, the supervised Machine Learning approach, as one of the existing techniques for building an RS, can achieve a higher degree of success in this field. We therefore implemented a Collaborative Filtering recommender system using various supervised classifiers in order to study their performance. These classifiers were implemented not only on a traditional platform but also on the Apache Spark platform. The Caret package was used to implement the algorithms on the classical computational platform, while H2O and Sparklyr were used to run them on the Spark machine. We then compared the performance of our algorithms with one another and with algorithms from the recent literature. Our experiments indicate that the Caret-based algorithms are significantly slower than the Sparklyr- and H2O-based algorithms. Furthermore, on the Spark platform, the runtime of the Sparklyr-based algorithms decreases as the cluster size increases, whereas the H2O-based algorithms run more slowly as the cluster grows. Comparing the results of our implemented algorithms with one another and with algorithms from the recent literature shows that the Bayesian network is the fastest of our implemented classifiers, and the Gradient Boosting Model is the most accurate algorithm in our study. We therefore conclude that the supervised approach is preferable to the other methods for building a collaborative filtering recommender system.
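The core idea of the abstract, treating collaborative filtering as a supervised classification problem in which each known rating is a labeled training example, can be sketched as follows. This is a hypothetical illustration, not the paper's own R/Caret/Spark code: it uses Python with scikit-learn's Gradient Boosting classifier, synthetic ratings in place of the MovieLens data, and raw user/item identifiers as stand-ins for real feature vectors.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic stand-in for a ratings dataset: each row pairs a user id with
# an item id, and the label is the rating class (1..5). In the supervised
# CF framing, predicting a rating becomes a multi-class classification task.
rng = np.random.default_rng(0)
X = rng.integers(0, 50, size=(500, 2))   # columns: user_id, item_id
y = rng.integers(1, 6, size=500)         # rating labels 1..5

# A Gradient Boosting Model analogous to the most accurate classifier
# reported in the abstract (hyperparameters here are arbitrary).
clf = GradientBoostingClassifier(n_estimators=50, random_state=0).fit(X, y)

# Predicted rating class for a hypothetical (user, item) pair; items whose
# predicted rating is high would then be recommended to that user.
pred = int(clf.predict([[3, 7]])[0])
print(pred)
```

Any of the other classifiers compared in the paper (Bayesian network, random forest, SVM, GLM, deep multilayer perceptron) could be swapped in for the model object without changing the surrounding pipeline, which is what makes a runtime/accuracy comparison across classifiers and platforms straightforward.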

Cite This Paper

Ali Mohammad Mohammadi, Mahmood Fathy, "The Empirical Comparison of the Supervised Classifiers Performances in Implementing a Recommender System using Various Computational Platforms", International Journal of Intelligent Systems and Applications(IJISA), Vol.12, No.2, pp.11-20, 2020. DOI:10.5815/ijisa.2020.02.02
