We propose a finite mixture model that simultaneously clusters listeners' habits and music genres. Following the Underlying Response Variable (URV) approach, we treat the music genre variables as discretized versions of latent continuous variables, which are distributed according to a mixture of Gaussians. To introduce a partition of the music genres within each mixture component, we use a factorial representation of the data, in which a binary row-stochastic matrix represents music genre membership. This allows us to associate each mixture component with its own clustering of music genres, thereby defining biclusters of listeners' habits and music genres. Given the numerical complexity of the likelihood function, we estimate the model parameters using a composite likelihood (CL) approach, which leads to a computationally efficient EM-like algorithm. The results illustrate the effectiveness of the proposed model in discovering significant patterns within the data. The model identifies clusters of listeners who share similar preferences for clusters of music genres, revealing both the listener groups with common tastes and the relationships between different music genres within these groups. Additionally, by allowing the number of music genre clusters to vary across listener clusters, the model captures the inherent variability in listener preferences, which demonstrates its flexibility and accuracy in representing the data and in uncovering interesting patterns in listener behavior.
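As a rough illustration of two ingredients named in the abstract, the sketch below builds a binary row-stochastic membership matrix for genres and discretizes latent Gaussian responses in the spirit of the URV approach, for a single mixture component. It is not the authors' implementation: the sizes, the genre-to-cluster assignment `labels`, the noise scale, and the cut points `thresholds` are all hypothetical values chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 6 listeners, 4 genres, 2 genre clusters.
n_listeners, n_genres, n_genre_clusters = 6, 4, 2

# Binary row-stochastic membership matrix V (genres x genre clusters):
# each row contains exactly one 1, so every genre belongs to one cluster.
labels = np.array([0, 0, 1, 1])  # assumed genre-to-cluster assignment
V = np.eye(n_genre_clusters, dtype=int)[labels]

# Factorial representation (assumed form): latent continuous responses are
# cluster-level scores mapped to genres through V, plus Gaussian noise.
scores = rng.normal(size=(n_listeners, n_genre_clusters))
latent = scores @ V.T + 0.3 * rng.normal(size=(n_listeners, n_genres))

# URV step: cut each latent variable at thresholds to obtain the observed
# categories (two hypothetical cut points give three categories: 0, 1, 2).
thresholds = np.array([-0.5, 0.5])
observed = np.digitize(latent, thresholds)
```

In the model described above, a separate membership matrix would be associated with each mixture component, which is what lets the genre partition differ across listener clusters.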
Biclustering listeners and music genres using a composite likelihood-based approach / Martella, Francesca; Ranalli, Monia. - (2024). (Paper presented at the International Joint Conference CFE-CMStatistics, held in London).
Biclustering listeners and music genres using a composite likelihood-based approach
Francesca Martella
Monia Ranalli
2024