Reducing hidden bias in the data and ensuring fairness in algorithmic data analysis has recently received significant attention. In this paper, we address the problem of identifying a densest subgraph, while ensuring that none of one binary protected attribute is disparately impacted. Unfortunately, the underlying algorithmic problem is NP-hard, even in its approximation version: approximating the densest fair subgraph with a polynomial-time algorithm is at least as hard as the densest subgraph problem of at most k vertices, for which no constant approximation algorithms are known. Despite such negative premises, we are able to provide approximation results in two important cases. In particular, we are able to prove that a suitable spectral embedding allows recovery of an almost optimal, fair, dense subgraph hidden in the input data, whenever one is present, a result that is further supported by experimental evidence. We also show a polynomial-time, $2$-approximation algorithm, whenever the underlying graph is itself fair. We finally prove that, under the small set expansion hypothesis, this result is tight for fair graphs. The above theoretical findings drive the design of heuristics, which we experimentally evaluate on a scenario based on real data, in which our aim is to strike a good balance between diversity and highly correlated items from Amazon co-purchasing graphs.

Spectral Relaxations and Fair Densest Subgraphs / Anagnostopoulos, A.; Becchetti, L.; Fazzone, A.; Menghini, C.; Schwiegelshohn, C.. - (2020), pp. 35-44. (Intervento presentato al convegno 29th ACM International Conference on Information and Knowledge Management, CIKM 2020 tenutosi a Virtual Event, Ireland) [10.1145/3340531.3412036].

Spectral Relaxations and Fair Densest Subgraphs

Anagnostopoulos A.
;
Becchetti L.
;
Fazzone A.
;
Menghini C.
;
Schwiegelshohn C.
2020

Abstract

Reducing hidden bias in the data and ensuring fairness in algorithmic data analysis has recently received significant attention. In this paper, we address the problem of identifying a densest subgraph, while ensuring that none of one binary protected attribute is disparately impacted. Unfortunately, the underlying algorithmic problem is NP-hard, even in its approximation version: approximating the densest fair subgraph with a polynomial-time algorithm is at least as hard as the densest subgraph problem of at most k vertices, for which no constant approximation algorithms are known. Despite such negative premises, we are able to provide approximation results in two important cases. In particular, we are able to prove that a suitable spectral embedding allows recovery of an almost optimal, fair, dense subgraph hidden in the input data, whenever one is present, a result that is further supported by experimental evidence. We also show a polynomial-time, $2$-approximation algorithm, whenever the underlying graph is itself fair. We finally prove that, under the small set expansion hypothesis, this result is tight for fair graphs. The above theoretical findings drive the design of heuristics, which we experimentally evaluate on a scenario based on real data, in which our aim is to strike a good balance between diversity and highly correlated items from Amazon co-purchasing graphs.
2020
29th ACM International Conference on Information and Knowledge Management, CIKM 2020
densest subgraph; fairness; spectral graph analysis
04 Pubblicazione in atti di convegno::04b Atto di convegno in volume
Spectral Relaxations and Fair Densest Subgraphs / Anagnostopoulos, A.; Becchetti, L.; Fazzone, A.; Menghini, C.; Schwiegelshohn, C.. - (2020), pp. 35-44. (Intervento presentato al convegno 29th ACM International Conference on Information and Knowledge Management, CIKM 2020 tenutosi a Virtual Event, Ireland) [10.1145/3340531.3412036].
File allegati a questo prodotto
File Dimensione Formato  
Anagnostopoulos_postprint_Spectral_2020.pdf

accesso aperto

Note: https://doi.org/10.1145/3340531.3412036
Tipologia: Documento in Post-print (versione successiva alla peer review e accettata per la pubblicazione)
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 777.31 kB
Formato Adobe PDF
777.31 kB Adobe PDF
Anagnostopoulos_Spectral_2020.pdf

solo gestori archivio

Tipologia: Versione editoriale (versione pubblicata con il layout dell'editore)
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 2.59 MB
Formato Adobe PDF
2.59 MB Adobe PDF   Contatta l'autore

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/1463354
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 10
  • ???jsp.display-item.citation.isi??? 7
social impact