Partial membership models for soft clustering of multivariate count data / Seri, Emiliano; Brendan Murphy, Thomas; Rocci, Roberto. - (2023), pp. 623-628. (Paper presented at the conference SIS 2023 - Statistical Learning, Sustainability and Impact Evaluation, held in Ancona, Italy).
Partial membership models for soft clustering of multivariate count data
Emiliano Seri; Roberto Rocci
2023
Abstract
The standard mixture modelling framework has been widely used to study heterogeneous populations by modelling them as a finite number of homogeneous sub-populations. However, the standard mixture model assumes that each data point belongs to one and only one mixture component, or cluster; this assumption is unrealistic when data points have fractional membership in multiple clusters. Representing an observation as partly belonging to multiple groups is conceptually very different from representing it as belonging to one group with uncertainty. For this purpose, various soft clustering approaches, or individual-level mixture models, have been developed. In this context, (6) formulated the Bayesian partial membership model (BPM) as an alternative structure for individual-level mixtures, which also captures partial membership in the form of attribute-specific mixtures but does not assume a factorization over attributes. Our work proposes using the BPM for soft clustering of count data. Learning and inference are carried out using Markov chain Monte Carlo methods. The method is applied to Capital Bikeshare data from Washington, DC, covering 15 June to 15 July 2022.

File | Size | Format
---|---|---
Seri_Partial-membership-models_2023.pdf (open access; type: post-print, version following peer review and accepted for publication; license: all rights reserved) | 948.73 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.