Learning characteristics of graph neural networks predicting protein–ligand affinities

Mastropietro, Andrea; Pasculli, Giuseppe; Bajorath, Jürgen

doi:10.1038/s42256-023-00756-9

In drug design, compound potency prediction is a popular machine learning application. Graph neural networks (GNNs) predict ligand affinity from graph representations of protein–ligand interactions typically extracted from X-ray structures. Despite some promising findings leading to claims that GNNs can learn details of protein–ligand interactions, such predictions are also controversially viewed. For example, evidence has been presented that GNNs might not learn protein–ligand interactions but memorize ligand and protein training data instead. We have carried out affinity predictions with six GNN architectures on community-standard datasets and rationalized the predictions using explainable artificial intelligence. The results confirm a strong influence of ligand—but not protein—memorization during GNN learning and also show that some GNN architectures increasingly prioritize interaction information for predicting high affinities. Thus, while GNNs do not comprehensively account for protein–ligand interactions and physical reality, depending on the model, they balance ligand memorization with learning of interaction patterns.

Learning characteristics of graph neural networks predicting protein–ligand affinities / Mastropietro, Andrea; Pasculli, Giuseppe; Bajorath, Jürgen. - In: NATURE MACHINE INTELLIGENCE. - ISSN 2522-5839. - 5:12(2023), pp. 1427-1436. [10.1038/s42256-023-00756-9]

Year of publication: 2023

Keywords: chemoinformatics; graph neural networks; explainable artificial intelligence

Type: 01 Journal publication::01a Journal article

Files attached to this item

File	Size	Format
Mastropietro_Learning-characteristics_2023.pdf (access: archive managers only; type: publisher's version, i.e. the published version with the publisher's layout; license: All rights reserved)	1.43 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1691854
