In the last few years, model-based clustering techniques have become widely used in the context of microarray data analysis. In this empirical context, a potential purpose for statistical approaches is the identification of clusters of genes that are co-expressed under subsets of experimental conditions. We discuss a hierarchical mixture model to combine advantages of allowing for dependence within gene clusters and for simultaneous clustering of genes and experimental conditions. Thanks to the adopted hierarchical structure, we may distinguish gene clusters from mixture components, where the latter may represent intra-cluster gene-specific extra-Gaussian departures. To cluster experimental conditions, instead, we suggest a suitable parameterization of component-specific means by using a binary row stochastic matrix representing condition membership. The performance of the proposed approach is discussed on both simulated and real datasets. © Statistical Modeling Society 2011.
Hierarchical mixture models for biclustering in microarray data / Martella, Francesca; Alfo', Marco; Vichi, Maurizio. - In: STATISTICAL MODELLING. - ISSN 1471-082X. - 11:6(2011), pp. 489-505. [10.1177/1471082x1001100602]
Hierarchical mixture models for biclustering in microarray data
MARTELLA, Francesca; ALFO', Marco; VICHI, Maurizio
2011
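The abstract describes parameterizing component-specific means through a binary row stochastic matrix that encodes condition membership. The following is a minimal sketch of one plausible reading of that idea, not the authors' implementation: each experimental condition belongs to exactly one condition cluster, and all conditions in a cluster share a mean. The names `membership_matrix`, `V`, and `eta`, and the numeric values, are assumptions made for illustration; the abstract does not give the exact form used in the paper.

```python
import numpy as np


def membership_matrix(labels, n_clusters):
    """Build a binary row-stochastic matrix from condition-cluster labels.

    Row j has a single 1 in the column of the cluster that condition j
    belongs to, so every row sums to 1.
    """
    labels = np.asarray(labels)
    V = np.zeros((labels.size, n_clusters), dtype=int)
    V[np.arange(labels.size), labels] = 1
    return V


# Hypothetical example: 5 experimental conditions in 2 condition clusters.
labels = [0, 0, 1, 1, 0]
V = membership_matrix(labels, 2)

# Hypothetical means, one per condition cluster, for a single mixture component.
eta = np.array([1.5, -0.5])

# Mean vector over all 5 conditions: conditions in the same cluster share a mean.
mu = V @ eta
```

Under this assumed form, a component needs only as many mean parameters as there are condition clusters rather than one per condition, which is what lets the same model cluster conditions as well as genes.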