Aligning biological sequences by exploiting residue conservation and coevolution / Muntoni, Anna Paola; Pagnani, Andrea; Weigt, Martin; Zamponi, Francesco. - In: PHYSICAL REVIEW. E. - ISSN 2470-0045. - 102:(2020), pp. 1-30. [10.1103/PhysRevE.102.062409]

Aligning biological sequences by exploiting residue conservation and coevolution

Anna Paola Muntoni; Andrea Pagnani; Martin Weigt; Francesco Zamponi

2020

Abstract

Sequences of nucleotides (for DNA and RNA) or amino acids (for proteins) are central objects in biology. Among the most important computational problems is that of sequence alignment, i.e., arranging sequences from different organisms in such a way as to identify similar regions, to detect evolutionary relationships between sequences, and to predict biomolecular structure and function. This is typically addressed through profile models, which capture position-specific features such as conservation in sequences, but assume that different positions evolve independently. Over recent years, it has been well established that coevolution of different amino-acid positions is essential for maintaining three-dimensional structure and function. Modeling approaches based on inverse statistical physics can capture the coevolution signal in sequence ensembles, and they are now widely used in predicting protein structure, protein-protein interactions, and mutational landscapes. Here, we present DCAlign, an efficient alignment algorithm based on an approximate message-passing strategy, which is able to overcome the limitations of profile models, to include coevolution among positions in a general way, and therefore to be universally applicable to protein- and RNA-sequence alignment without the need for complementary structural information. The potential of DCAlign is carefully explored using well-controlled simulated data, as well as real protein and RNA sequences.

Publication year	2020
Keywords	Sequence alignment; direct coupling analysis; bioinformatics
Type	01 Journal publication::01a Journal article
Citation	Aligning biological sequences by exploiting residue conservation and coevolution / Muntoni, Anna Paola; Pagnani, Andrea; Weigt, Martin; Zamponi, Francesco. - In: PHYSICAL REVIEW. E. - ISSN 2470-0045. - 102:(2020), pp. 1-30. [10.1103/PhysRevE.102.062409]
Belongs to type	01a Journal article

Files attached to this product

File	Access	Note	Type	License	Size	Format
Muntoni_Alligning-biological_2020.pdf	open access	Journal article	Publisher's version (published version with the publisher's layout)	All rights reserved	3.07 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1693867
