The standard mixture modelling framework has been widely used to study heterogeneous populations by modelling them as being composed of a finite number of homogeneous sub-populations. However, the standard mixture model assumes that each data point belongs to one and only one mixture component, or cluster; this assumption is unrealistic when data points have fractional membership in multiple clusters. Representing an observation as partly belonging to multiple groups is in fact conceptually very different from representing it as belonging to one group with uncertainty. For this purpose, various soft clustering approaches, or individual-level mixture models, have been developed. In this context, (6) formulated the Bayesian partial membership model (BPM) as an alternative structure for individual-level mixtures, which also captures partial membership in the form of attribute-specific mixtures, but does not assume a factorization over attributes. Our work proposes using the BPM for soft clustering of count data. Learning and inference are carried out using Markov chain Monte Carlo methods. The method is applied to Capital Bikeshare data from Washington, DC, covering 15 June to 15 July 2022.
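The difference between partial membership and ordinary mixture membership can be illustrated with a minimal sketch. The abstract does not state which count distribution is used, so the code below assumes independent Poisson attributes, and it assumes the usual BPM construction in which an observation's density is proportional to the product of the cluster densities raised to its membership weights. Under those assumptions the observation is again Poisson, with a log-rate equal to the membership-weighted average of the cluster log-rates. The function name `bpm_poisson_rate` and all numeric values are hypothetical and are not taken from the paper.

```python
import numpy as np


def bpm_poisson_rate(memberships, rates):
    """Observation-specific Poisson rates under partial membership.

    memberships: (n, K) array, each row non-negative and summing to 1.
    rates:       (K, J) array of positive cluster-level Poisson rates.
    Returns an (n, J) array where each rate is the weighted geometric
    mean of the cluster rates (convex combination of the log-rates).
    """
    memberships = np.asarray(memberships, dtype=float)
    rates = np.asarray(rates, dtype=float)
    return np.exp(memberships @ np.log(rates))


# Hypothetical example: K = 2 clusters, J = 2 count attributes.
rates = np.array([[2.0, 20.0],
                  [18.0, 3.0]])
memberships = np.array([[1.0, 0.0],   # fully in cluster 1
                        [0.5, 0.5],   # half in each cluster
                        [0.0, 1.0]])  # fully in cluster 2

blended = bpm_poisson_rate(memberships, rates)

# Simulate counts from the blended rates (assumed Poisson likelihood).
rng = np.random.default_rng(0)
counts = rng.poisson(blended)
```

In this sketch the half-and-half observation has a rate of 6 on the first attribute, the geometric mean of 2 and 18. A standard mixture with weights 0.5 and 0.5 would instead draw that observation from one of the two clusters, with rate either 2 or 18, which is the distinction the abstract draws between partial membership and membership with uncertainty.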

Partial membership models for soft clustering of multivariate count data / Seri, Emiliano; Brendan Murphy, Thomas; Rocci, Roberto. - (2023), pp. 623-628. (Paper presented at the conference SIS 2023 - Statistical Learning, Sustainability and Impact Evaluation, held in Ancona, Italy).

Partial membership models for soft clustering of multivariate count data

Emiliano Seri; Roberto Rocci
2023

SIS 2023 - Statistical Learning, Sustainability and Impact Evaluation
partial membership models; model-based clustering; mixture models
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this record
File: Seri_Partial-membership-models_2023.pdf
Access: open access
Type: Post-print document (version following peer review and accepted for publication)
License: All rights reserved
Size: 948.73 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1727467