In this work, we modify finite mixtures of factor analysers to provide a method for simultaneous clustering of subjects and multivariate discrete outcomes. The joint clustering is performed through a suitable reparameterization of the outcome (column)-specific parameters. We develop an expectation–maximization-type algorithm for maximum likelihood parameter estimation where the maximization step is divided into orthogonal sub-blocks that refer to row and column-specific parameters, respectively. Model performance is evaluated via a simulation study with varying sample size, number of outcomes and row/column-specific clustering (partitions). We compare the performance of our model with the performance of standard model-based biclustering approaches. The proposed method is also demonstrated on a benchmark data set where a multivariate binary response is considered.
A finite mixture approach to joint clustering of individuals and multivariate discrete outcomes / Martella, Francesca; Alfo', Marco. - In: JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION. - ISSN 0094-9655. - 87:11(2017), pp. 2186-2206. [10.1080/00949655.2017.1322593]
A finite mixture approach to joint clustering of individuals and multivariate discrete outcomes
MARTELLA, Francesca
;ALFO', Marco
2017
Abstract
In this work, we modify finite mixtures of factor analysers to provide a method for simultaneous clustering of subjects and multivariate discrete outcomes. The joint clustering is performed through a suitable reparameterization of the outcome (column)-specific parameters. We develop an expectation–maximization-type algorithm for maximum likelihood parameter estimation where the maximization step is divided into orthogonal sub-blocks that refer to row and column-specific parameters, respectively. Model performance is evaluated via a simulation study with varying sample size, number of outcomes and row/column-specific clustering (partitions). We compare the performance of our model with the performance of standard model-based biclustering approaches. The proposed method is also demonstrated on a benchmark data set where a multivariate binary response is considered.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.