We initiate the study of finding the Jaccard center of a given collection N of sets. For two sets X,Y, the Jaccard index is defined as |X\cap Y|/|X\cup Y| and the corresponding distance is 1-|X\cap Y|/|X\cup Y|. The Jaccard center is a set C minimizing the maximum distance to any set of N. We show that the problem is NP-hard to solve exactly, and that it admits a PTAS, while no FPTAS can exist unless P = NP. Furthermore, we show that the problem is fixed-parameter tractable in the maximum Hamming norm between the Jaccard center and any input set. Our algorithms are based on a compression technique similar in spirit to coresets for the Euclidean 1-center problem. In addition, we show that, in contrast to the median problem previously studied by Chierichetti et al. (SODA 2010), the continuous version of the Jaccard center problem admits a simple polynomial-time algorithm.
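To make the definitions in the abstract concrete, the following is a minimal Python sketch of the Jaccard distance and of an exhaustive search for a Jaccard center. It is not the paper's algorithm: the function names `jaccard_distance` and `brute_force_center` are hypothetical, the search takes exponential time, and it assumes that two empty sets are at distance 0. Candidates are restricted to subsets of the union of the input sets, since an element that occurs in no input set can only enlarge the union term in each distance.

```python
from itertools import combinations


def jaccard_distance(x, y):
    """Return 1 - |x ∩ y| / |x ∪ y|.

    Assumption: two empty sets are treated as being at distance 0.
    """
    x, y = frozenset(x), frozenset(y)
    union = x | y
    if not union:
        return 0.0
    return 1.0 - len(x & y) / len(union)


def brute_force_center(sets):
    """Exhaustively search for a set C minimizing the maximum Jaccard
    distance to the input sets.

    Illustration only: every subset of the union of the inputs is tried,
    so the running time is exponential in the size of that union.
    """
    sets = [frozenset(s) for s in sets]
    universe = sorted(frozenset().union(*sets))
    best, best_radius = None, float("inf")
    for k in range(len(universe) + 1):
        for candidate in combinations(universe, k):
            c = frozenset(candidate)
            # Radius of c: its largest distance to any input set.
            radius = max(jaccard_distance(c, s) for s in sets)
            if radius < best_radius:
                best, best_radius = c, radius
    return best, best_radius


if __name__ == "__main__":
    center, radius = brute_force_center([{1, 2, 3}, {2, 3, 4}, {3, 4, 5}])
    print(sorted(center), radius)
```

For the three input sets in the usage example, the search returns the full union {1, 2, 3, 4, 5}, whose distance to each input set is 1 - 3/5 = 0.4.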

On finding the Jaccard center / Bury, Marc; Schwiegelshohn, CHRIS RENE. - 80:(2017). (Paper presented at the 44th International Colloquium on Automata, Languages, and Programming, ICALP 2017, held in Warsaw, Poland, in 2017) [10.4230/LIPIcs.ICALP.2017.23].

On finding the Jaccard center

SCHWIEGELSHOHN, CHRIS RENE
2017

44th International Colloquium on Automata, Languages, and Programming, ICALP 2017
1-Center; Clustering; Jaccard
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this item
File: Bury_On-finding_2017.pdf
Access: open access
Type: Publisher's version (published version with the publisher's layout)
License: Creative Commons
Size: 497.92 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1085837