One of the most popular and widely used methods for data clustering is hierarchical clustering. This clustering technique has proved useful to reveal interesting structure in the data in several applications ranging from computational biology to computer vision. Robustness is an important feature of a clustering technique if we require the clustering to be stable against small perturbations in the input data. In most applications, getting a clustering output that is robust against adversarial outliers or stochastic noise is a necessary condition for the applicability and effectiveness of the clustering technique. This is even more critical in hierarchical clustering where a small change at the bottom of the hierarchy may propagate all the way through to the top. Despite all the previous work [2, 3, 6, 8], our theoretical understanding of robust hierarchical clustering is still limited and several hierarchical clustering algorithms are not known to satisfy such robustness properties. In this paper, we study the limits of robust hierarchical k-center clustering by introducing the concept of universal hierarchical clustering and provide (almost) tight lower and upper bounds for the robust hierarchical k-center clustering problem with outliers and variants of the stochastic clustering problem. Most importantly we present a constant-factor approximation for optimal hierarchical k-center with at most z outliers using a universal set of at most O(z2) set of outliers and show that this result is tight. Moreover we show the necessity of using a universal set of outliers in order to compute an approximately optimal hierarchical k-center with a diffierent set of outliers for each k.
Robust hierarchical k-center clustering / Lattanzi, Silvio; Leonardi, Stefano; Mirrokni, Vahab; Razenshteyny, Ilya. - STAMPA. - (2015), pp. 211-218. (Intervento presentato al convegno 6th Conference on Innovations in Theoretical Computer Science, ITCS 2015 tenutosi a Rehovot; Israel nel 11-13 January 2015) [10.1145/2688073.2688104].
Robust hierarchical k-center clustering
LATTANZI, SILVIO;LEONARDI, Stefano
;
2015
Abstract
One of the most popular and widely used methods for data clustering is hierarchical clustering. This clustering technique has proved useful to reveal interesting structure in the data in several applications ranging from computational biology to computer vision. Robustness is an important feature of a clustering technique if we require the clustering to be stable against small perturbations in the input data. In most applications, getting a clustering output that is robust against adversarial outliers or stochastic noise is a necessary condition for the applicability and effectiveness of the clustering technique. This is even more critical in hierarchical clustering where a small change at the bottom of the hierarchy may propagate all the way through to the top. Despite all the previous work [2, 3, 6, 8], our theoretical understanding of robust hierarchical clustering is still limited and several hierarchical clustering algorithms are not known to satisfy such robustness properties. In this paper, we study the limits of robust hierarchical k-center clustering by introducing the concept of universal hierarchical clustering and provide (almost) tight lower and upper bounds for the robust hierarchical k-center clustering problem with outliers and variants of the stochastic clustering problem. Most importantly we present a constant-factor approximation for optimal hierarchical k-center with at most z outliers using a universal set of at most O(z2) set of outliers and show that this result is tight. Moreover we show the necessity of using a universal set of outliers in order to compute an approximately optimal hierarchical k-center with a diffierent set of outliers for each k.File | Dimensione | Formato | |
---|---|---|---|
Lattanzi_Postprint_Robust Hierarchical_2015.pdf
accesso aperto
Note: https://dl.acm.org/citation.cfm?doid=2688073.2688104
Tipologia:
Documento in Post-print (versione successiva alla peer review e accettata per la pubblicazione)
Licenza:
Tutti i diritti riservati (All rights reserved)
Dimensione
299.83 kB
Formato
Adobe PDF
|
299.83 kB | Adobe PDF | |
Lattanzi_Robust Hierarchical_2015.pdf
solo gestori archivio
Tipologia:
Versione editoriale (versione pubblicata con il layout dell'editore)
Licenza:
Tutti i diritti riservati (All rights reserved)
Dimensione
584.77 kB
Formato
Adobe PDF
|
584.77 kB | Adobe PDF | Contatta l'autore |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.