Catalogo dei prodotti della ricerca

We evaluate a version of the recently-proposed classification system named Optimized Dissimilarity Space Embedding (ODSE) that operates in the input space of sequences of generic objects. The ODSE system has been originally presented as a classification system for patterns represented as labeled graphs. However, since ODSE is founded on the dissimilarity space representation of the input data, the classifier can be easily adapted to any input domain where it is possible to define a meaningful dissimilarity measure. Here we demonstrate the effectiveness of the ODSE classifier for sequences by considering an application dealing with the recognition of the solubility degree of the Escherichia coli proteome. Solubility, or analogously aggregation propensity, is an important property of protein molecules, which is intimately related to the mechanisms underlying the chemico-physical process of folding. Each protein of our dataset is initially associated with a solubility degree and it is represented as a sequence of symbols, denoting the 20 amino acid residues. The herein obtained computational results, which we stress that have been achieved with no context-dependent tuning of the ODSE system, confirm the validity and generality of the ODSE-based approach for structured data classification.

We evaluate a version of the recently-proposed classification system named Optimized Dissimilarity Space Embedding (ODSE) that operates in the input space of sequences of generic objects. The ODSE system has been originally presented as a classification system for patterns represented as labeled graphs. However, since ODSE is founded on the dissimilarity space representation of the input data, the classifier can be easily adapted to any input domain where it is possible to define a meaningful dissimilarity measure. Here we demonstrate the effectiveness of the ODSE classifier for sequences by considering an application dealing with the recognition of the solubility degree of the Escherichia coli proteome. Solubility, or analogously aggregation propensity, is an important property of protein molecules, which is intimately related to the mechanisms underlying the chemico-physical process of folding. Each protein of our dataset is initially associated with a solubility degree and it is represented as a sequence of symbols, denoting the 20 amino acid residues. The herein obtained computational results, which we stress that have been achieved with no context-dependent tuning of the ODSE system, confirm the validity and generality of the ODSE-based approach for structured data classification.

Classifying sequences by the optimized dissimilarity space embedding approach: A case study on the solubility analysis of the E. coli proteome / Livi, Lorenzo; Rizzi, Antonello; Sadeghian, Alireza. - In: JOURNAL OF INTELLIGENT & FUZZY SYSTEMS. - ISSN 1064-1246. - STAMPA. - 28:6(2015), pp. 2725-2733. [10.3233/IFS-151550]

Classifying sequences by the optimized dissimilarity space embedding approach: A case study on the solubility analysis of the E. coli proteome

LIVI, LORENZO;RIZZI, Antonello;Sadeghian, Alireza

2015

Abstract

We evaluate a version of the recently-proposed classification system named Optimized Dissimilarity Space Embedding (ODSE) that operates in the input space of sequences of generic objects. The ODSE system has been originally presented as a classification system for patterns represented as labeled graphs. However, since ODSE is founded on the dissimilarity space representation of the input data, the classifier can be easily adapted to any input domain where it is possible to define a meaningful dissimilarity measure. Here we demonstrate the effectiveness of the ODSE classifier for sequences by considering an application dealing with the recognition of the solubility degree of the Escherichia coli proteome. Solubility, or analogously aggregation propensity, is an important property of protein molecules, which is intimately related to the mechanisms underlying the chemico-physical process of folding. Each protein of our dataset is initially associated with a solubility degree and it is represented as a sequence of symbols, denoting the 20 amino acid residues. The herein obtained computational results, which we stress that have been achieved with no context-dependent tuning of the ODSE system, confirm the validity and generality of the ODSE-based approach for structured data classification.

Scheda breve

Scheda completa

	Anno di pubblicazione
	
				2015
			
	Abstract
	
				We evaluate a version of the recently-proposed classification system named Optimized Dissimilarity Space Embedding (ODSE) that operates in the input space of sequences of generic objects. The ODSE system has been originally presented as a classification system for patterns represented as labeled graphs. However, since ODSE is founded on the dissimilarity space representation of the input data, the classifier can be easily adapted to any input domain where it is possible to define a meaningful dissimilarity measure. Here we demonstrate the effectiveness of the ODSE classifier for sequences by considering an application dealing with the recognition of the solubility degree of the Escherichia coli proteome. Solubility, or analogously aggregation propensity, is an important property of protein molecules, which is intimately related to the mechanisms underlying the chemico-physical process of folding. Each protein of our dataset is initially associated with a solubility degree and it is represented as a sequence of symbols, denoting the 20 amino acid residues. The herein obtained computational results, which we stress that have been achieved with no context-dependent tuning of the ODSE system, confirm the validity and generality of the ODSE-based approach for structured data classification.
			
	Parole chiave
	
				Dissimilarity representation; E. coli proteome analysis; entropy-based data characterization; sequence matching and classification; Artificial Intelligence; Engineering (all); Statistics and Probability
			
	Tipologia
	
				01 Pubblicazione su rivista::01a Articolo in rivista
			
	Citazione
	
				Classifying sequences by the optimized dissimilarity space embedding approach: A case study on the solubility analysis of the E. coli proteome / Livi, Lorenzo; Rizzi, Antonello; Sadeghian, Alireza. - In: JOURNAL OF INTELLIGENT & FUZZY SYSTEMS. - ISSN 1064-1246. - STAMPA. - 28:6(2015), pp. 2725-2733. [10.3233/IFS-151550]
			
	Appartiene alla tipologia:
	
				01a Articolo in rivista

File allegati a questo prodotto

File	Dimensione	Formato
Livi_Classifying-sequences_2015 .pdf solo utenti autorizzati Note: Classifying sequences by the optimized dissimilarity space embedding approach: A case study on the solubility analysis of the E. coli proteome Tipologia: Versione editoriale (versione pubblicata con il layout dell'editore) Licenza: Tutti i diritti riservati (All rights reserved) Dimensione 222.65 kB Formato Adobe PDF Contatta l'autore	222.65 kB	Adobe PDF	Contatta l'autore

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/793956

Citazioni

ND

1

1

social impact