3D Infomax improves GNNs for Molecular Property Prediction / Stark, H.; Beaini, D.; Corso, G.; Tossou, P.; Dallago, C.; Gunnemann, S.; Lio, P. - 162:(2022), pp. 20479-20502. (Paper presented at the International Conference on Machine Learning, held in Baltimore, USA).
3D Infomax improves GNNs for Molecular Property Prediction
Lio P.
2022
Abstract
Molecular property prediction is one of the fastest-growing applications of deep learning with critical real-world impacts. However, although the 3D molecular graph structure is necessary for models to achieve strong performance on many tasks, it is infeasible to obtain 3D structures at the scale required by many real-world applications. To tackle this issue, we propose to use existing 3D molecular datasets to pre-train a model to reason about the geometry of molecules given only their 2D molecular graphs. Our method, called 3D Infomax, maximizes the mutual information between learned 3D summary vectors and the representations of a graph neural network (GNN). During fine-tuning on molecules with unknown geometry, the GNN is still able to produce implicit 3D information and uses it for downstream tasks. We show that 3D Infomax provides significant improvements for a wide range of properties, including a 22% average MAE reduction on QM9 quantum mechanical properties. Moreover, the learned representations can be effectively transferred between datasets in different molecular spaces.

File | Access | Type | License | Size | Format
---|---|---|---|---|---
Stark_3D-Infomax_2022.pdf | Open access | Publisher's version (published version with the publisher's layout) | Creative Commons | 1.71 MB | Adobe PDF

Note: https://proceedings.mlr.press/v162/stark22a.html
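The abstract describes maximizing the mutual information between 3D summary vectors and the representations of a 2D GNN, but does not state which estimator is used. As an illustration only, the sketch below uses a generic InfoNCE-style contrastive loss, a commonly used lower-bound objective for mutual information; it is not taken from the paper. The names `info_nce_loss`, `z_2d`, `z_3d`, and `temperature` are hypothetical, and random vectors stand in for the outputs of the two networks.

```python
import math
import random


def _normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def info_nce_loss(z_2d, z_3d, temperature=0.1):
    """InfoNCE-style loss for a batch of paired embeddings (illustrative assumption).

    z_2d[i] and z_3d[i] are treated as embeddings of the same molecule (positive
    pair); every z_3d[j] with j != i serves as a negative for z_2d[i].
    """
    a = [_normalize(v) for v in z_2d]
    b = [_normalize(v) for v in z_3d]
    total = 0.0
    for i, a_i in enumerate(a):
        # Temperature-scaled cosine similarity of molecule i against all 3D embeddings.
        logits = [sum(x * y for x, y in zip(a_i, b_j)) / temperature for b_j in b]
        # Numerically stable log-sum-exp over the row.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of picking the matching 3D embedding.
        total += log_denominator - logits[i]
    return total / len(a)


# Toy data: random vectors stand in for network outputs.
random.seed(0)
dim, batch = 16, 8
z_3d = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(batch)]
z_aligned = [[x + 0.01 * random.gauss(0, 1) for x in row] for row in z_3d]
z_random = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(batch)]

loss_aligned = info_nce_loss(z_aligned, z_3d)
loss_random = info_nce_loss(z_random, z_3d)
print(loss_aligned, loss_random)
```

Under these assumptions, embeddings that agree with their 3D counterparts give a lower loss than unrelated ones, so minimizing the loss pushes the 2D representation of a molecule toward information that identifies its 3D summary vector.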
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.