XGDAG: explainable gene–disease associations via graph neural networks

Mastropietro, Andrea; De Carlo, Gianluca; Anagnostopoulos, Aris

doi:10.1093/bioinformatics/btad482

Motivation: Disease gene prioritization consists of identifying genes that are likely to be involved in the mechanisms of a given disease and ranking them. Recently, the research community has used computational methods to uncover unknown gene–disease associations; these methods range from combinatorial to machine-learning-based approaches. In recent years, approaches based on deep learning have outperformed more traditional ones. Yet their inherent black-box structure prevents interpretability.

Results: We propose a new methodology for disease gene discovery that leverages graph-structured data using graph neural networks (GNNs), along with an explainability phase for determining the ranking of candidate genes and understanding the model’s output. Our approach is based on a positive–unlabeled learning strategy and outperforms existing gene discovery methods by exploiting GNNs in a non-black-box fashion. Our methodology is effective even in scenarios where a large number of associated genes need to be retrieved, in which gene prioritization methods often lose their reliability.

XGDAG: explainable gene–disease associations via graph neural networks / Mastropietro, Andrea; De Carlo, Gianluca; Anagnostopoulos, Aris. - In: BIOINFORMATICS. - ISSN 1367-4811. - 39:8(2023). [10.1093/bioinformatics/btad482]

Year of publication: 2023

Keywords: bioinformatics; disease gene discovery; gene disease association; graph neural networks; explainable artificial intelligence

Type: 01 Journal publication::01a Journal article

Files attached to this item

File	Size	Format
Mastropietro_XGDAG_2023.pdf (open access; note: full text; type: publisher's version, i.e. the published version with the publisher's layout; license: Creative Commons)	3.35 MB	Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1687346
