In correlation clustering, we are given n objects together with a binary similarity score between each pair of them. The goal is to partition the objects into clusters so to minimise the disagreements with the scores. In this work we investigate correlation clustering as an active learning problem: each similarity score can be learned by making a query, and the goal is to minimise both the disagreements and the total number of queries. On the one hand, we describe simple active learning algorithms, which provably achieve an almost optimal trade-off while giving cluster recovery guarantees, and we test them on different datasets. On the other hand, we prove information-theoretical bounds on the number of queries necessary to guarantee a prescribed disagreement bound. These results give a rich characterization of the trade-off between queries and clustering error

Correlation Clustering with Adaptive Similarity Queries / Bressan, Marco; Nicolò, Cesa-Bianchi; Paudice, Andrea; Vitale, Fabio. - 32:(2019). (Intervento presentato al convegno 33rd Conference on Neural Information Processing Systems (NeurIPS 2019) tenutosi a Vancouver; Canada).

Correlation Clustering with Adaptive Similarity Queries

Marco Bressan
;
Nicolò Cesa-Bianchi
;
Fabio Vitale
2019

Abstract

In correlation clustering, we are given n objects together with a binary similarity score between each pair of them. The goal is to partition the objects into clusters so to minimise the disagreements with the scores. In this work we investigate correlation clustering as an active learning problem: each similarity score can be learned by making a query, and the goal is to minimise both the disagreements and the total number of queries. On the one hand, we describe simple active learning algorithms, which provably achieve an almost optimal trade-off while giving cluster recovery guarantees, and we test them on different datasets. On the other hand, we prove information-theoretical bounds on the number of queries necessary to guarantee a prescribed disagreement bound. These results give a rich characterization of the trade-off between queries and clustering error
2019
33rd Conference on Neural Information Processing Systems (NeurIPS 2019)
correlation clustering; active learning; oracles
04 Pubblicazione in atti di convegno::04b Atto di convegno in volume
Correlation Clustering with Adaptive Similarity Queries / Bressan, Marco; Nicolò, Cesa-Bianchi; Paudice, Andrea; Vitale, Fabio. - 32:(2019). (Intervento presentato al convegno 33rd Conference on Neural Information Processing Systems (NeurIPS 2019) tenutosi a Vancouver; Canada).
File allegati a questo prodotto
File Dimensione Formato  
Bressan_Correlation_2019.pdf

solo gestori archivio

Tipologia: Versione editoriale (versione pubblicata con il layout dell'editore)
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 412.92 kB
Formato Adobe PDF
412.92 kB Adobe PDF   Contatta l'autore
Bressan_preprint_Correlation_2019.pdf

accesso aperto

Tipologia: Documento in Pre-print (manoscritto inviato all'editore, precedente alla peer review)
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 659.24 kB
Formato Adobe PDF
659.24 kB Adobe PDF

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/1479395
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 14
  • ???jsp.display-item.citation.isi??? 1
social impact