The biological function of multiple repetitions of single amino acids, or homo-repeats, is largely unknown, but their occurrence in proteins has been associated with more than 20 hereditary diseases. Analysing 122 bacterial and eukaryotic genomes, we observed that the number of proteins containing homo-repeats is significantly larger than expected from theoretical estimates. Analysis of statistical significance indicates that the minimal size of homo-repeats varies with amino acid type and proteome. In an attempt to characterize proteins harbouring long homo-repeats, we found that those containing polar or small amino acids S, P, H, E, D, K, Q and N are enriched in structural disorder as well as protein- and RNA-interactions. We observed that E, S, Q, G, L, P, D, A and H homo-repeats are strongly linked with occurrence in human diseases. Moreover, S, E, P, A, Q, D and T homo-repeats are significantly enriched in neuronal proteins associated with autism and other disorders. We release a webserver for further exploration of homo-repeats occurrence in human pathology at http://bioinfo.protres.ru/hradis/.
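As a rough illustration of the comparison the abstract describes (observed homo-repeats versus a theoretical expectation), the sketch below locates single-residue runs in a protein sequence and computes how likely such a run is under a simple random model. It is not the authors' method: the function names `find_homo_repeats` and `prob_run_at_least`, the default threshold of 5 residues, and the assumption that residues are independent with a fixed frequency are all hypothetical choices made for this example.

```python
import re


def find_homo_repeats(sequence, min_len=5):
    """Return (residue, start, length) for each run of a single amino acid
    that is at least min_len residues long. The threshold is illustrative."""
    # (.) captures one residue; \1{n,} requires at least n further copies of it.
    pattern = re.compile(r"(.)\1{%d,}" % (min_len - 1))
    return [(m.group(1), m.start(), len(m.group(0)))
            for m in pattern.finditer(sequence)]


def prob_run_at_least(seq_len, min_len, freq):
    """Probability that a random sequence of seq_len residues contains a run of
    at least min_len copies of a residue with frequency freq.
    Assumes independent, identically distributed residues (a simplification)."""
    # state[j]: probability that no qualifying run has occurred yet and the
    # sequence currently ends with exactly j consecutive copies of the residue.
    state = [1.0] + [0.0] * (min_len - 1)
    for _ in range(seq_len):
        new = [0.0] * min_len
        new[0] = (1.0 - freq) * sum(state)   # a different residue resets the run
        for j in range(1, min_len):
            new[j] = freq * state[j - 1]     # the same residue extends the run
        state = new                          # runs reaching min_len drop out
    return 1.0 - sum(state)


if __name__ == "__main__":
    toy = "MAQQQQQQGSSSK"  # made-up sequence, not taken from the paper
    print(find_homo_repeats(toy, min_len=5))
    print(prob_run_at_least(seq_len=400, min_len=5, freq=0.05))
```

Multiplying the per-protein probability by the number of proteins of a given length would give an expected count to set against the observed one. A realistic analysis would use proteome-specific amino acid frequencies and the actual length distribution.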

Non-random distribution of homo-repeats: links with biological functions and human diseases / Lobanov, Michail Yu.; Klus, Petr; Sokolovsky, Igor V.; Tartaglia, Gian Gaetano; Galzitskaya, Oxana V.. - In: SCIENTIFIC REPORTS. - ISSN 2045-2322. - 6:1(2016), pp. 1-11. [10.1038/srep26941]

Non-random distribution of homo-repeats: links with biological functions and human diseases

Tartaglia, Gian Gaetano; 2016

Bacterial Proteins; Databases, Genetic; Gene Ontology; Humans; Proteome; Repetitive Sequences, Amino Acid; Sequence Analysis, Protein; Genetic Predisposition to Disease; Multidisciplinary
01 Journal publication::01a Journal article
Files attached to this record
File: Lobanov_Non-random__2016.pdf
Access: open access
Type: Publisher's version (published version with the publisher's layout)
License: Creative Commons
Size: 1.66 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1279548
Citations
  • PMC 12
  • Scopus 27
  • Web of Science (ISI) 27