
Toward a multilevel representation of protein molecules. Comparative approaches to the aggregation/folding propensity problem / Livi, Lorenzo; Giuliani, Alessandro; Rizzi, Antonello. - In: INFORMATION SCIENCES. - ISSN 0020-0255. - PRINT. - 326:(2016), pp. 134-145. [10.1016/j.ins.2015.07.043]

Toward a multilevel representation of protein molecules. Comparative approaches to the aggregation/folding propensity problem

Livi, Lorenzo; Giuliani, Alessandro; Rizzi, Antonello

2016

Abstract

Computational Intelligence methods are typically designed under the assumption that the input space is essentially a vector space. Departing from vector-based pattern representations raises many theoretical and practical problems, mostly due to the absence of an intuitive geometric interpretation of the data. However, since such representations can offer additional insights in real-world applications of data-driven inference systems, exploiting them is also a practical and convenient choice. Here we apply several state-of-the-art classification methods for non-geometric data in order to compare different representations of the proteins gathered from Niwa et al. (2009) [35]. These representations include sequences of objects and labeled (contact) graphs enriched with chemico-physical attributes. The experiment performed by Niwa et al. provides a unique opportunity to analyze the relative aggregation/folding propensity of the elements of the entire Escherichia coli (E. coli) proteome in a cell-free, standardized microenvironment. Through this comparison, we are also able to identify some interesting general properties of proteins. Notably, (i) we suggest a threshold of around 250 residues that discriminates “easily foldable” from “hardly foldable” molecules, consistent with other independent experiments, and (ii) we highlight the relevance of contact graph spectra for discriminating folding behavior and characterizing the E. coli solubility data. The soundness of the experimental results presented in this paper is supported by the statistically relevant relationships discovered between the chemico-physical description of proteins and the substitution cost matrix that we developed and used in the various discrimination systems.

Keywords

Classification of structured data; protein aggregation; protein folding; sequence-structure relation; artificial intelligence; software; control and systems engineering; theoretical computer science; computer science applications; computer vision and pattern recognition
			
Type

01 Journal publication::01a Journal article
			
			

Files attached to this record

File	Access	Type	License	Size	Format
Livi_Toward-multileve_2016.pdf	Authorized users only	Publisher's version (published version with the publisher's layout)	All rights reserved	362.57 kB	Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11573/847734
