3D Infomax improves GNNs for Molecular Property Prediction / Stark, H.; Beaini, D.; Corso, G.; Tossou, P.; Dallago, C.; Gunnemann, S.; Lio, P. - 162:(2022), pp. 20479-20502. (Paper presented at the International Conference on Machine Learning, held in Baltimore, USA).

3D Infomax improves GNNs for Molecular Property Prediction

Lio P.
2022

Abstract

Molecular property prediction is one of the fastest-growing applications of deep learning with critical real-world impacts. However, although the 3D molecular graph structure is necessary for models to achieve strong performance on many tasks, it is infeasible to obtain 3D structures at the scale required by many real-world applications. To tackle this issue, we propose to use existing 3D molecular datasets to pre-train a model to reason about the geometry of molecules given only their 2D molecular graphs. Our method, called 3D Infomax, maximizes the mutual information between learned 3D summary vectors and the representations of a graph neural network (GNN). During fine-tuning on molecules with unknown geometry, the GNN is still able to produce implicit 3D information and uses it for downstream tasks. We show that 3D Infomax provides significant improvements for a wide range of properties, including a 22% average MAE reduction on QM9 quantum mechanical properties. Moreover, the learned representations can be effectively transferred between datasets in different molecular spaces.
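The abstract does not say which objective is used to maximize mutual information, so the following is only a minimal sketch under an assumption: that a contrastive, InfoNCE-style loss is used, which is one common way to maximize a lower bound on mutual information between paired embeddings. The names `info_nce`, `mi_lower_bound`, `z2d`, and `z3d` are hypothetical, and the random vectors merely stand in for GNN representations and 3D summary vectors; no real encoder or molecular data is involved.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(z2d, z3d, temperature=0.1):
    """InfoNCE-style loss over a batch of paired embeddings (assumed objective).

    z2d[i] stands in for the GNN representation of molecule i (2D graph input),
    z3d[i] for the 3D summary vector of the same molecule.
    """
    n = len(z2d)
    total = 0.0
    for i in range(n):
        logits = [cosine(z2d[i], z3d[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all 3D vectors in the batch.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the matching pair (i, i).
        total += log_denom - logits[i]
    return total / n


def mi_lower_bound(loss, batch_size):
    """InfoNCE bound on mutual information: I >= log(batch_size) - loss."""
    return math.log(batch_size) - loss


random.seed(0)
dim, batch = 8, 16
z3d = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(batch)]
# Aligned case: 2D embeddings are noisy copies of the matching 3D vectors.
z2d_aligned = [[x + 0.05 * random.gauss(0, 1) for x in v] for v in z3d]
# Unaligned case: 2D embeddings are unrelated random vectors.
z2d_random = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(batch)]

loss_aligned = info_nce(z2d_aligned, z3d)
loss_random = info_nce(z2d_random, z3d)
print("aligned loss:", loss_aligned, "bound:", mi_lower_bound(loss_aligned, batch))
print("random loss:", loss_random, "bound:", mi_lower_bound(loss_random, batch))
```

In this toy setting the aligned embeddings give a lower loss, and hence a larger bound, than the unrelated ones. Under the stated assumption, pre-training would push a 2D-only GNN in that direction, so that its representations carry information about the 3D summary vectors.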
International Conference on Machine Learning
Deep learning; Molecules; Quantum theory; Three dimensional computer graphics
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this item
File: Stark_3D-Infomax_2022.pdf
Access: open access
Note: https://proceedings.mlr.press/v162/stark22a.html
Type: Publisher's version (published version with the publisher's layout)
License: Creative Commons
Size: 1.71 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1721198