Privacy-preserving data mining for distributed medical scenarios / Scardapane, Simone; Altilio, Rosa; Ciccarelli, V.; Uncini, Aurelio; Panella, Massimo. - PRINT. - 69(2018), pp. 119-128. - SMART INNOVATION, SYSTEMS AND TECHNOLOGIES. [10.1007/978-3-319-56904-8_12].

Privacy-preserving data mining for distributed medical scenarios

Scardapane, Simone; Altilio, Rosa; Uncini, Aurelio; Panella, Massimo
2018

Abstract

In this paper, we consider the application of data mining methods in medical contexts in which the data to be analysed (e.g., records from different patients) are distributed among multiple clinical parties. Although inference procedures could provide meaningful medical information (such as an optimal clustering of the subjects), each party is forbidden from disclosing its local dataset to a centralized location because of privacy concerns over sensitive portions of the dataset. To this end, we propose a general framework that enables the parties involved to perform, in a decentralized fashion, any data mining procedure that relies solely on the Euclidean distances among patterns, including kernel methods and spectral clustering. Specifically, the problem is recast as a decentralized matrix completion problem whose proposed solution does not require a centralized coordinator. Full privacy of the original data can be ensured through different strategies, including random multiplicative updates for the secure computation of distances. Experimental results support our proposal as an efficient tool for clustering and classification in distributed medical contexts. As an example, on the well-known Pima Indians Diabetes dataset we obtain a Rand index for clustering of 0.52, against 0.54 for the (infeasible) centralized solution, while on the Parkinson speech database we increase it from 0.45 to 0.50.
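To make two notions from the abstract concrete, the following minimal Python sketch shows (i) how a Gaussian kernel matrix can be built from a matrix of squared Euclidean distances alone, which is why a procedure of this kind needs nothing beyond the distances, and (ii) how a Rand index compares two clusterings. It is an illustration only and not the authors' protocol: it does not implement the decentralized matrix completion or the privacy-preserving distance computation, and the function names and the `gamma` parameter are hypothetical choices made here.

```python
import math
from itertools import combinations


def squared_distance_matrix(points):
    """Pairwise squared Euclidean distances (computed centrally, for illustration only)."""
    n = len(points)
    return [
        [sum((a - b) ** 2 for a, b in zip(points[i], points[j])) for j in range(n)]
        for i in range(n)
    ]


def gaussian_kernel_from_distances(sq_dists, gamma=1.0):
    """Gaussian kernel K[i][j] = exp(-gamma * d_ij^2), using only the distance matrix.

    `gamma` is an assumed bandwidth parameter, not a value taken from the paper.
    """
    return [[math.exp(-gamma * d) for d in row] for row in sq_dists]


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two clusterings agree.

    A pair counts as an agreement when both clusterings put the two points
    in the same cluster, or both put them in different clusters.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


if __name__ == "__main__":
    toy_points = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0)]  # made-up toy data
    distances = squared_distance_matrix(toy_points)
    kernel = gaussian_kernel_from_distances(distances, gamma=0.04)
    print(kernel[0][1])
    print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

In the setting the abstract describes, the distance matrix would be recovered cooperatively by the parties rather than computed from pooled data as in this sketch; only the steps after the distances are available are illustrated here.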
Book title: Multidisciplinary Approaches to Neural Computing
ISBN: 978-3-319-56903-1
Keywords: Distributed learning; biomedicine; kernel methods; spectral clustering; privacy
Type: 02 Publication in volume::02a Chapter or article

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/869673