Due to its many applications, the clustering ensemble problem has been subject of intense algorithmic study over the last two decades. The input to this problem is a set of clusterings; its goal is to output a clustering that minimizes the average distance to the input clusterings. In this paper, we propose, to the best of our knowledge, the first generative model for this problem. Our Gibbs-like model is parameterized by a center clustering, and by a scale; the probability of a particular clustering decays exponentially with its scaled Rand distance to the center clustering.For our new model, we give polynomial-time algorithms forcenter dot sampling, when the center clustering has a constant number of clusters andcenter dot reconstruction, when the scale parameter is small.En route, we establish several interesting properties of our model. Our work shows that the combinatorial structure of a Gibbs-like model for clusterings is more intricate and challenging than the corresponding and well-studied (Mallows) model for permutations.

The Gibbs-Rand Model / Chierichetti, F; Kumar, R; Lattanzi, S. - (2022), pp. 151-163. (Intervento presentato al convegno PODS tenutosi a Philadelphia; USA) [10.1145/3517804.3526227].

The Gibbs-Rand Model

Chierichetti, F;
2022

Abstract

Due to its many applications, the clustering ensemble problem has been subject of intense algorithmic study over the last two decades. The input to this problem is a set of clusterings; its goal is to output a clustering that minimizes the average distance to the input clusterings. In this paper, we propose, to the best of our knowledge, the first generative model for this problem. Our Gibbs-like model is parameterized by a center clustering, and by a scale; the probability of a particular clustering decays exponentially with its scaled Rand distance to the center clustering.For our new model, we give polynomial-time algorithms forcenter dot sampling, when the center clustering has a constant number of clusters andcenter dot reconstruction, when the scale parameter is small.En route, we establish several interesting properties of our model. Our work shows that the combinatorial structure of a Gibbs-like model for clusterings is more intricate and challenging than the corresponding and well-studied (Mallows) model for permutations.
2022
PODS
Clustering; Gibbs model; Rand distance
04 Pubblicazione in atti di convegno::04b Atto di convegno in volume
The Gibbs-Rand Model / Chierichetti, F; Kumar, R; Lattanzi, S. - (2022), pp. 151-163. (Intervento presentato al convegno PODS tenutosi a Philadelphia; USA) [10.1145/3517804.3526227].
File allegati a questo prodotto
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/1657787
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? 0
social impact