Mixtures of Generalized Latent Trait Analyzers for biclustering multivariate data / Failli, Dalila; Marino, Maria Francesca; Martella, Francesca. - (2024). (Contribution presented at the StaTalk conference, held in Florence).

Mixtures of Generalized Latent Trait Analyzers for biclustering multivariate data

Dalila Failli; Maria Francesca Marino; Francesca Martella

2024

Abstract

Biclustering is the simultaneous partitioning of units and variables into homogeneous blocks of rows and columns of a data matrix. It is often used to analyze large data matrices in which the relationships between rows and columns can be considered symmetrical. A common application area is genetics, where biclustering can identify groups of genes that are co-expressed under subsets of experimental conditions. We introduce a novel model-based biclustering approach for multivariate data that exploits a finite mixture of generalized latent trait models. The proposed model clusters units into subsets, called components, via a finite mixture specification. Within each component, subsets of variables, called segments, are identified through a flexible and parsimonious specification of the linear predictor in terms of a row-stochastic vector. The model is designed to handle both qualitative and quantitative variables whose (conditional) distribution belongs to the exponential family. Including a multidimensional, continuous latent trait in the linear predictor accounts for the residual dependence between multivariate outcomes from the same unit. In addition, covariates can be included in the latent layer of the model to assess their impact on component formation. Model parameters are estimated by maximum likelihood with an EM-type algorithm, using Gauss-Hermite quadrature to approximate the multidimensional integrals that have no closed-form solution.
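As a rough illustration of the quadrature step mentioned in the abstract, the sketch below approximates an integral over a latent trait with Gauss-Hermite nodes and weights. It is a minimal one-dimensional example, not the authors' implementation: the standard normal latent trait, the logit link, the Bernoulli outcome, and the function names `gauss_hermite_expectation` and `marginal_bernoulli_prob` are all assumptions made here for illustration. The multidimensional case described in the abstract would need, for instance, a tensor-product grid of such nodes.

```python
import numpy as np


def gauss_hermite_expectation(f, n_nodes=21):
    """Approximate E[f(U)] for U ~ N(0, 1) by Gauss-Hermite quadrature."""
    # hermgauss returns nodes and weights for the weight function exp(-x^2).
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    # The substitution u = sqrt(2) * x turns exp(-x^2) into the N(0, 1) kernel;
    # dividing by sqrt(pi) normalises the weights so that they sum to one.
    return np.sum(weights * f(np.sqrt(2.0) * nodes)) / np.sqrt(np.pi)


def marginal_bernoulli_prob(y, intercept, loading, n_nodes=21):
    """Marginal P(Y = y) under an assumed model (hypothetical, for illustration):
    logit P(Y = 1 | u) = intercept + loading * u, with u ~ N(0, 1)."""

    def conditional(u):
        p = 1.0 / (1.0 + np.exp(-(intercept + loading * u)))
        return p if y == 1 else 1.0 - p

    return gauss_hermite_expectation(conditional, n_nodes)


if __name__ == "__main__":
    # Marginal probability of a positive response for arbitrary parameter values.
    print(marginal_bernoulli_prob(1, intercept=-0.5, loading=1.2))
```

Inside an EM-type algorithm, sums of this kind would stand in for the intractable integrals in the E-step. The number of nodes trades accuracy for cost, which grows quickly with the dimension of the latent trait.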

Year of publication

2024

Type:

04f Poster

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1717036
