Selection of Clusters based on Internal Indices in Multi-Clustering Collaborative Filtering Recommender System
Abstract
Applying a multi-clustering based neighborhood approach to recommender systems has increased recommendation accuracy and eliminated the divergence caused by differences among the clustering methods traditionally used. The Multi-Clustering Collaborative Filtering algorithm, described in the author’s previous papers, was developed for this purpose. However, using multiple clustering schemes poses challenges for memory consumption and scalability. Not all partitionings are equally advantageous, so it is crucial to select the clustering schemes given as input to the recommender system without compromising recommendation accuracy. This article presents a solution for selecting clustering schemes based on the evaluation of internal indices. The method can be used to prepare input data for collaborative filtering recommender systems. The study’s results confirm the positive impact of scheme selection on overall recommendation performance, which typically improves after the selection process. Furthermore, using fewer clustering schemes as input to the recommender system enhances scalability and reduces memory consumption. The findings are compared with the outcomes of baseline recommenders to validate the effectiveness of the proposed approach.
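The abstract does not state which internal indices or which selection rule the method uses, so the following is only a minimal sketch of the general idea: score each candidate clustering scheme with an internal index and keep the best-scoring ones as recommender input. It assumes the silhouette width as the index, a fixed number of schemes to keep, and toy two-dimensional data. The names `silhouette`, `select_schemes`, `points`, and `schemes` are hypothetical and do not come from the article.

```python
import math
from statistics import mean


def silhouette(points, labels):
    """Mean silhouette width of one partition (higher means better separated)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        return -1.0  # undefined for a single cluster; treated as worst (assumption)

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = mean(math.dist(points[i], points[j]) for j in own)
        # b: mean distance to the members of the nearest other cluster
        b = min(
            mean(math.dist(points[i], points[j]) for j in members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


def select_schemes(points, schemes, keep):
    """Return the names of the `keep` schemes with the highest index value."""
    ranked = sorted(
        schemes,
        key=lambda name: silhouette(points, schemes[name]),
        reverse=True,
    )
    return ranked[:keep]


# Toy data: two well-separated groups of three points each (illustrative only).
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
]
# Candidate clustering schemes, each given as one cluster label per point.
schemes = {
    "good": [0, 0, 0, 1, 1, 1],
    "mixed": [0, 1, 0, 1, 0, 1],
    "over_split": [0, 0, 1, 2, 2, 3],
}

if __name__ == "__main__":
    for name, labels in schemes.items():
        print(name, round(silhouette(points, labels), 3))
    print("selected:", select_schemes(points, schemes, keep=2))
```

Keeping only the top-ranked schemes shrinks the set of partitionings the recommender has to hold in memory, which is the scalability benefit the abstract describes. Other internal indices could replace the silhouette by changing the sort key, with the ordering reversed for indices where lower values are better.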
Copyright (c) 2024 International Journal of Electronics and Telecommunications
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.