Selection of Clusters based on Internal Indices in Multi-Clustering Collaborative Filtering Recommender System


Urszula Kużelewska, Bialystok University of Technology


Applying a multi-clustering neighborhood approach to recommender systems has increased recommendation accuracy and eliminated the divergence caused by differences among the clustering methods traditionally used. The Multi-Clustering Collaborative Filtering algorithm, described in the author’s previous papers, was developed for this purpose. However, using multiple clusters raises challenges of memory consumption and scalability. Not all partitionings are equally advantageous, so it is crucial to select clusters for the recommender system’s input without compromising recommendation accuracy. This article presents a method for selecting clustering schemes based on the evaluation of internal indices. The method can be used to prepare input data for collaborative filtering recommender systems. The study’s results confirm the positive impact of scheme selection on overall recommendation performance, which typically improves after the selection process. Furthermore, using fewer clustering schemes as input to the recommender system improves scalability and reduces memory consumption. The findings are compared with the outcomes of baseline recommenders to validate the effectiveness of the proposed approach.
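The idea of ranking clustering schemes by an internal index and keeping only the best ones can be illustrated with a minimal sketch. The following is not the author's implementation: the choice of the Davies-Bouldin index as the only criterion, the names `davies_bouldin` and `select_schemes`, the `keep` parameter, and the toy data are all assumptions made for illustration.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def davies_bouldin(points, labels):
    """Davies-Bouldin index of one partitioning (lower is better).

    Assumes at least two clusters with distinct centroids.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    cents = {lab: centroid(ps) for lab, ps in clusters.items()}
    # Scatter: mean distance of a cluster's members to its centroid.
    scatter = {
        lab: sum(math.dist(p, cents[lab]) for p in ps) / len(ps)
        for lab, ps in clusters.items()
    }
    labs = list(clusters)
    total = 0.0
    for i in labs:
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in labs
            if j != i
        )
    return total / len(labs)


def select_schemes(points, schemes, keep):
    """Return the names of the `keep` schemes with the lowest index value."""
    ranked = sorted(schemes, key=lambda name: davies_bouldin(points, schemes[name]))
    return ranked[:keep]


# Toy data (hypothetical): two well-separated groups of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
schemes = {
    "good": [0, 0, 1, 1],  # follows the two groups
    "bad": [0, 1, 0, 1],   # mixes the two groups
}
selected = select_schemes(points, schemes, keep=1)
```

Only the selected schemes would then be passed on as input to the recommender, which is where the reduction in memory consumption would come from. Other internal indices could be substituted for the ranking key, with the sort order reversed for indices where higher values are better.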

