Due to its many applications, the clustering ensemble problem has been subject of intense algorithmic study over the last two decades. The input to this problem is a set of clusterings; its goal is to output a clustering that minimizes the average distance to the input clusterings. In this paper, we propose, to the best of our knowledge, the first generative model for this problem. Our Gibbs-like model is parameterized by a center clustering, and by a scale; the probability of a particular clustering decays exponentially with its scaled Rand distance to the center clustering.For our new model, we give polynomial-time algorithms forcenter dot sampling, when the center clustering has a constant number of clusters andcenter dot reconstruction, when the scale parameter is small.En route, we establish several interesting properties of our model. Our work shows that the combinatorial structure of a Gibbs-like model for clusterings is more intricate and challenging than the corresponding and well-studied (Mallows) model for permutations.
The Gibbs-Rand Model / Chierichetti, F; Kumar, R; Lattanzi, S. - (2022), pp. 151-163. (Intervento presentato al convegno PODS tenutosi a Philadelphia; USA) [10.1145/3517804.3526227].
The Gibbs-Rand Model
Chierichetti, F;
2022
Abstract
Due to its many applications, the clustering ensemble problem has been subject of intense algorithmic study over the last two decades. The input to this problem is a set of clusterings; its goal is to output a clustering that minimizes the average distance to the input clusterings. In this paper, we propose, to the best of our knowledge, the first generative model for this problem. Our Gibbs-like model is parameterized by a center clustering, and by a scale; the probability of a particular clustering decays exponentially with its scaled Rand distance to the center clustering.For our new model, we give polynomial-time algorithms forcenter dot sampling, when the center clustering has a constant number of clusters andcenter dot reconstruction, when the scale parameter is small.En route, we establish several interesting properties of our model. Our work shows that the combinatorial structure of a Gibbs-like model for clusterings is more intricate and challenging than the corresponding and well-studied (Mallows) model for permutations.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.