Biclustering for multivariate longitudinal data / Alfo, Marco; Maria Marino, Francesca; Martella, Francesca. - (2019), pp. 168-169. (Paper presented at The European Meeting of Statisticians (EMS) 2019, held in Palermo).

Biclustering for multivariate longitudinal data

Marco Alfo; Francesca Martella
2019

Abstract

Applications in various domains often produce high-dimensional data, which pose the challenge of interpreting a huge mass of data, often consisting of millions of measurements. A first step towards addressing this challenge is the use of data reduction techniques, which are essential in the data mining process to reveal natural structures and to identify interesting patterns in the analyzed data. One of the most important approaches to synthesizing the two modes of a high-dimensional data matrix is the joint clustering of rows and columns. This approach, named biclustering, is a data mining technique that allows for simultaneous clustering of rows and columns, aiming at partitioning a data matrix into homogeneous biclusters. Over the past decades, biclustering approaches have been proposed by various authors in several scientific fields. For example, in genetics, biclustering may be used to identify a group of genes and an associated group of conditions over which the genes are co-expressed. In this work, we focus on a proposal of model-based biclustering for multivariate discrete longitudinal data. These come in the form of three-way data: the first dimension identifies individuals, the second identifies variables, and the third identifies time occasions. While a huge literature is devoted to clustering three-way data, the literature on biclustering three-way data is limited. Motivated by the particular case of multivariate discrete longitudinal responses, we propose a model-based biclustering approach aiming at identifying clusters of units sharing common longitudinal trajectories for subsets of variables. Specifically, a finite mixture of generalized linear models is considered to cluster units; within each mixture component, a flexible and parsimonious parameterization of the corresponding canonical parameter is adopted to identify clusters of variables evolving in a similar manner across time by making use of adequate time functions.
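The structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: a Poisson response with log link (one possible discrete GLM), polynomial time functions, conditional independence of the observations given the component, and all function and variable names (`linear_predictor`, `posterior_memberships`, and so on). Each mixture component carries its own partition of the variables, and variables in the same cluster share the coefficients of the time function that drives the canonical parameter.

```python
import math

def linear_predictor(t, coeffs):
    """Polynomial time function: coeffs[0] + coeffs[1]*t + coeffs[2]*t**2 + ..."""
    return sum(c * t ** k for k, c in enumerate(coeffs))

def poisson_logpmf(y, eta):
    """Log-probability of a Poisson count y with canonical (log-mean) parameter eta."""
    return y * eta - math.exp(eta) - math.lgamma(y + 1)

def unit_loglik_given_component(y_unit, var_cluster, cluster_coeffs):
    """Log-likelihood of one unit's (variables x times) array within one component.

    var_cluster[j] is the variable-cluster label of variable j in this component;
    cluster_coeffs[q] holds the polynomial coefficients shared by cluster q.
    Observations are assumed conditionally independent (illustrative assumption).
    """
    total = 0.0
    for j, series in enumerate(y_unit):
        coeffs = cluster_coeffs[var_cluster[j]]
        for t, y in enumerate(series):
            total += poisson_logpmf(y, linear_predictor(t, coeffs))
    return total

def posterior_memberships(y_unit, weights, var_clusters, coeffs_by_component):
    """Posterior probability that the unit belongs to each mixture component."""
    logs = [
        math.log(w) + unit_loglik_given_component(y_unit, vc, cc)
        for w, vc, cc in zip(weights, var_clusters, coeffs_by_component)
    ]
    m = max(logs)  # subtract the maximum before exponentiating (log-sum-exp)
    probs = [math.exp(v - m) for v in logs]
    s = sum(probs)
    return [p / s for p in probs]

# Hypothetical toy data: one unit, 3 count variables observed at 4 occasions.
y_unit = [[1, 2, 4, 7], [1, 3, 4, 8], [5, 5, 4, 6]]
weights = [0.5, 0.5]                   # mixture weights of the 2 components
var_clusters = [[0, 0, 1], [0, 1, 1]]  # variable partition within each component
coeffs_by_component = [
    [[0.0, 0.7], [1.6, 0.0]],  # component 0: growing cluster, flat cluster
    [[1.0, 0.0], [0.0, 0.1]],  # component 1: flat cluster, slowly growing cluster
]
post = posterior_memberships(y_unit, weights, var_clusters, coeffs_by_component)
```

In a full estimation procedure, posterior probabilities of this kind would typically feed an EM-type algorithm that also updates the weights, the coefficients, and the variable partitions; the abstract does not specify the estimation method, so that step is left out here.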
The European Meeting of Statisticians (EMS) 2019
04 Publication in conference proceedings::04d Abstract in conference proceedings
Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1352020