A model-based biclustering method for multivariate discrete longitudinal data is proposed. We consider a finite mixture of generalized linear models to cluster units and, within each mixture component, we adopt a flexible and parsimonious parameterization of the component-specific canonical parameter to define subsets of variables (segments) sharing common dynamics over time. We develop an Expectation-Maximization-type algorithm for maximum likelihood estimation of model parameters. The performance of the proposed model is evaluated in a large-scale simulation study, where we consider different choices for the sample size, the number of measurement occasions, and the numbers of components and segments. The proposal is applied to Italian crime data (source: ISTAT) with the aim of detecting areas sharing common longitudinal trajectories for specific subsets of crime types. The identification of such biclusters may help policymakers make decisions on safety.
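To make the estimation idea concrete, the following is a minimal sketch of an EM algorithm for a plain finite mixture of Poisson distributions on count data observed over time. It is not the authors' biclustering model: it clusters units only, uses no segments of variables and no covariates, and assumes time-constant, component-specific rates. The function names, the toy data, and the deterministic initialisation are all assumptions introduced for illustration.

```python
import math


def poisson_logpmf(y, lam):
    """Log-probability of count y under a Poisson with mean lam."""
    return y * math.log(lam) - lam - math.lgamma(y + 1)


def em_poisson_mixture(data, n_components, n_iter=50):
    """EM for a mixture of independent Poissons (illustrative sketch).

    data[i][j] is the list of counts over time for unit i and variable j;
    all units are assumed to share the same number of occasions, and the
    number of units must be at least n_components.
    """
    n, p, n_times = len(data), len(data[0]), len(data[0][0])
    # Per-unit, per-variable time averages are sufficient for the rates.
    means = [[sum(data[i][j]) / n_times for j in range(p)] for i in range(n)]
    # Deterministic start: units at evenly spaced ranks of total count.
    order = sorted(range(n), key=lambda i: sum(means[i]))
    starts = [order[int((g + 0.5) * n / n_components)] for g in range(n_components)]
    rates = [[max(means[i][j], 0.1) for j in range(p)] for i in starts]
    weights = [1.0 / n_components] * n_components
    post = []
    for _ in range(n_iter):
        # E-step: posterior membership probabilities (log-sum-exp for stability).
        post = []
        for i in range(n):
            logs = [
                math.log(weights[g])
                + sum(poisson_logpmf(y, rates[g][j]) for j in range(p) for y in data[i][j])
                for g in range(n_components)
            ]
            top = max(logs)
            unnorm = [math.exp(v - top) for v in logs]
            total = sum(unnorm)
            post.append([u / total for u in unnorm])
        # M-step: weighted proportions and weighted mean rates.
        for g in range(n_components):
            n_g = max(sum(post[i][g] for i in range(n)), 1e-12)
            weights[g] = n_g / n
            for j in range(p):
                rates[g][j] = max(sum(post[i][g] * means[i][j] for i in range(n)) / n_g, 1e-6)
    return weights, rates, post


# Hypothetical toy data: 6 units, 2 count variables, 3 occasions each.
toy = [
    [[1, 2, 1], [0, 1, 2]],
    [[2, 1, 0], [1, 1, 1]],
    [[1, 0, 2], [2, 0, 1]],
    [[20, 22, 19], [18, 21, 20]],
    [[21, 19, 23], [22, 20, 19]],
    [[19, 20, 21], [20, 19, 22]],
]
weights, rates, post = em_poisson_mixture(toy, n_components=2)
```

Each row of `post` holds one unit's posterior membership probabilities, so assigning a unit to its most probable component gives the unit clustering. Grouping variables into segments within each component, as the abstract describes, would require the additional parameterization developed in the paper.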

Biclustering multivariate discrete longitudinal data / Alfo', Marco; Marino, Maria Francesca; Martella, Francesca. - In: STATISTICS AND COMPUTING. - ISSN 1573-1375. - 42(2024), pp. 1-21. [10.1007/s11222-023-10292-6]

Biclustering multivariate discrete longitudinal data

Marco Alfo'; Maria Francesca Marino; Francesca Martella
2024

Finite mixtures; model-based clustering; three-way data; generalized linear models; EM algorithm
01 Journal publication::01a Journal article
Files attached to this record
Alfo_Biclustering_2024.pdf

Open access

Type: Publisher's version (published version with the publisher's layout)
License: Creative Commons
Size: 23.93 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1688819