Catalogo dei prodotti della ricerca

The discovery of long non-coding RNAs (lncRNA) has revealed additional layers of gene-expression control. Specific interactions of lncRNAs with DNA, RNAs, and RNA-binding proteins enable regulation in both cytoplasmic and nuclear compartments; e.g. a conserved triple-helix motif is essential for MALAT1 stability and oncogenic activity. Here, we present a secondary-structure-based framework to annotate and detect RNA triple helices. First, we extend the dot-bracket formalism with a third annotation line that encodes Hoogsteen contacts. Second, we introduce TripleMatcher, which searches for a triple-helix pattern, filters candidates by C1'-C1' distance thresholds, and merges overlaps into region-level zones. Using telomerase RNAs and RNA-stability elements with experimentally established triple helices (8 RNAs), TripleMatcher localized all annotated regions (structure-wise detection 8/8); geometric filtering removed most spurious candidates and improved precision (positive predictive value from 0.42 to 0.81) and overall accuracy (F$_{1}$ from 0.42 to 0.62) while maintaining sensitivity. Benchmarking eight predictors showed that pseudoknot-aware methods most reliably reproduce the local architecture required for detection, aligning secondary-structure quality with downstream triple-helix recovery. Applied prospectively, the framework identified candidate regions directly from predicted secondary structures and scaled to a screen of 4160 RNAs, where distance filtering reduced 150 990 (median per molecule: 108 [20-270]) raw candidates to 97 geometrically feasible regions across seven molecules, including human telomerase complexes. Together, the notation and TripleMatcher provide a concise route from secondary structure to a small, interpretable set of triple-helix candidates suitable for targeted experimental validation.

Decoding RNA triple helices: identification from sequence and secondary structure / Matarrese, Margherita A. G.; Quadrini, Michela; Luchetti, Nicole; Di Petta, Federico; Durante, Daniele; Ballarino, Monica; Chiodo, Letizia; Tesei, Luca. - In: BRIEFINGS IN BIOINFORMATICS. - ISSN 1467-5463. - 27:1(2026). [10.1093/bib/bbag009]

Decoding RNA triple helices: identification from sequence and secondary structure

Michela Quadrini^Methodology;Nicole Luchetti^Methodology;Federico Di Petta^Methodology;Daniele Durante^Methodology;Monica Ballarino^{Funding Acquisition};Letizia Chiodo^{Data Curation};

2026

Abstract

The discovery of long non-coding RNAs (lncRNA) has revealed additional layers of gene-expression control. Specific interactions of lncRNAs with DNA, RNAs, and RNA-binding proteins enable regulation in both cytoplasmic and nuclear compartments; e.g. a conserved triple-helix motif is essential for MALAT1 stability and oncogenic activity. Here, we present a secondary-structure-based framework to annotate and detect RNA triple helices. First, we extend the dot-bracket formalism with a third annotation line that encodes Hoogsteen contacts. Second, we introduce TripleMatcher, which searches for a triple-helix pattern, filters candidates by C1'-C1' distance thresholds, and merges overlaps into region-level zones. Using telomerase RNAs and RNA-stability elements with experimentally established triple helices (8 RNAs), TripleMatcher localized all annotated regions (structure-wise detection 8/8); geometric filtering removed most spurious candidates and improved precision (positive predictive value from 0.42 to 0.81) and overall accuracy (F$_{1}$ from 0.42 to 0.62) while maintaining sensitivity. Benchmarking eight predictors showed that pseudoknot-aware methods most reliably reproduce the local architecture required for detection, aligning secondary-structure quality with downstream triple-helix recovery. Applied prospectively, the framework identified candidate regions directly from predicted secondary structures and scaled to a screen of 4160 RNAs, where distance filtering reduced 150 990 (median per molecule: 108 [20-270]) raw candidates to 97 geometrically feasible regions across seven molecules, including human telomerase complexes. Together, the notation and TripleMatcher provide a concise route from secondary structure to a small, interpretable set of triple-helix candidates suitable for targeted experimental validation.

Scheda breve

Scheda completa

	Anno di pubblicazione
	
				2026
			
	Parole chiave
	
				non-coding RNA; long non-coding RNA; RNA pattern search; RNA secondary structure; RNA structure prediction
			
	Tipologia
	
				01 Pubblicazione su rivista::01a Articolo in rivista
			
	Citazione
	
				Decoding RNA triple helices: identification from sequence and secondary structure / Matarrese, Margherita A. G.; Quadrini, Michela; Luchetti, Nicole; Di Petta, Federico; Durante, Daniele; Ballarino, Monica; Chiodo, Letizia; Tesei, Luca. - In: BRIEFINGS IN BIOINFORMATICS. - ISSN 1467-5463. - 27:1(2026). [10.1093/bib/bbag009]
			
	Appartiene alla tipologia:
	
				01a Articolo in rivista

File allegati a questo prodotto

File	Dimensione	Formato
Matarrese_Decoding_2026.pdf accesso aperto Tipologia: Versione editoriale (versione pubblicata con il layout dell'editore) Licenza: Creative commons Dimensione 2.02 MB Formato Adobe PDF	2.02 MB	Adobe PDF

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/1758791

Citazioni

ND

0

0

social impact