State-of-the-art models for 3D molecular generation are based on significant inductive biases: SE(3) equivariance, permutation invariance, and graph message-passing networks to capture local chemistry. Yet the generated molecules struggle with physical plausibility. We introduce TABASCO, which relaxes these assumptions: the model has a standard non-equivariant transformer architecture, treats the atoms in a molecule as a sequence, and does not explicitly model bonds. The absence of equivariant layers and message passing allows us to simplify the model architecture and scale data throughput. On the GEOM-Drugs and QM9 benchmarks, TABASCO achieves state-of-the-art PoseBusters validity and delivers inference roughly 10× faster than the strongest baseline, while exhibiting emergent rotational equivariance without hard-coded symmetry. Our work offers a blueprint for training minimalist, high-throughput, unconditional generative models, and the resulting architecture is readily extensible to future conditional tasks. Our implementation is available at https://github.com/carlosinator/tabasco.

TABASCO: A Fast, Simplified Model for Molecular Generation with Improved Physical Quality / Vonessen, C.; Harris, C.; Cretu, M.; Lio, P.. - In: TRANSACTIONS ON MACHINE LEARNING RESEARCH. - ISSN 2835-8856. - 2026:(2026).

TABASCO: A Fast, Simplified Model for Molecular Generation with Improved Physical Quality

Lio P.
2026

TABASCO; 3D molecular generation; Non-equivariant transformer
01 Journal publication::01a Journal article
Files attached to this item

File: Vonessen_TABASCO_2026.pdf
Access: archive managers only
Note: https://openreview.net/forum?id=Kg6CSrbXl4
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 9.92 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1768837