A deep learning framework for comprehensive prediction of human RNA G-quadruplex-binding proteins / Rosignoli, Serena; Taraglio, Sophie; Di Luzio, Francesco; Lustrino, Elisa; Marzella, Dario; Elofsson, Arne; Panella, Massimo; Paiardini, Alessandro. - In: BIOINFORMATICS. - ISSN 1367-4803. - 42:3(2026), pp. 1-10. [10.1093/bioinformatics/btag088]
A deep learning framework for comprehensive prediction of human RNA G-quadruplex-binding proteins
Rosignoli, Serena (co-first author); Taraglio, Sophie (co-first author); Di Luzio, Francesco; Lustrino, Elisa; Panella, Massimo; Paiardini, Alessandro
2026
Abstract
Motivation: G-quadruplex-binding proteins (G4BPs) play key roles in RNA metabolism and stress response, yet their identification remains experimentally challenging. Here, we present a deep learning (DL) framework for the prediction of RNA G4BPs (RG4BPs), integrating diverse encoding strategies and neural architectures. Our best-performing model, which combines ESM-2 protein language model embeddings with an LSTM architecture, achieved 86% accuracy in distinguishing RG4BPs from non-binding proteins. Applying this model to the human proteome uncovered 2160 high-confidence RG4BP candidates, many of which display intrinsically disordered regions (IDRs) and enrichment in stress granules. These findings reveal a potential link between G-quadruplex recognition and cellular stress responses. To enable easy and broad access to the framework, we developed G4REP, a web server for RG4BP prediction and analysis. Overall, the framework provides an effective approach to explore the RG4BP landscape and uncover novel players in RNA regulation.

| File | Access | Type | License | Size | Format |
|---|---|---|---|---|---|
| Rosignoli_A-deep-learning_2026.pdf | Open access | Publisher's version (published version with the publisher's layout) | Creative Commons | 2.61 MB | Adobe PDF |


