A dissimilarity-based classifier for generalized sequences by a granular computing approach / Rizzi, Antonello; Possemato, Francesca; Livi, Lorenzo; Azzurra, Sebastiani; Alessandro, Giuliani; FRATTALE MASCIOLI, Fabio Massimo. - (2013), pp. 1-8. (Paper presented at the 2013 International Joint Conference on Neural Networks, IJCNN 2013, held in Dallas, United States, from 4 August 2013 through 9 August 2013) [10.1109/ijcnn.2013.6707041].

A dissimilarity-based classifier for generalized sequences by a granular computing approach

RIZZI, Antonello;POSSEMATO, FRANCESCA;LIVI, LORENZO;FRATTALE MASCIOLI, Fabio Massimo
2013

Abstract

In this paper we propose a classifier for generalized sequences conceived within the granular computing framework. The classification system processes input sequences of objects through a suitable interplay between dissimilarity-based and clustering-based techniques. The core data mining engine retrieves information granules that are used to represent the input sequences as feature vectors. This representation allows the original sequence classification problem to be handled with standard pattern recognition tools. We have evaluated the generalization capability of the system in a case study concerning the protein folding problem. In the considered dataset, the entire E. coli proteome was screened for the prediction of relative protein solubility on the basis of the amino acid sequence alone. We report the analysis of the dataset under different settings, showing interesting classification accuracy results on the test set. The developed system also makes it possible to extract knowl
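The pipeline outlined in the abstract (dissimilarity plus clustering to find information granules, granules used to embed sequences as feature vectors, then a standard classifier) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the edit distance as dissimilarity, fixed-length n-grams as candidate subsequences, a greedy threshold clustering, a 1-nearest-neighbour classifier, the parameter names `n` and `theta`, and the toy strings and labels.

```python
from itertools import chain


def edit_distance(a, b):
    """Levenshtein distance between two strings (assumed dissimilarity)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def ngrams(seq, n):
    """All contiguous subsequences of length n."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


def find_granules(sequences, n=3, theta=1):
    """Greedy clustering of n-grams: an n-gram within theta of an existing
    representative is absorbed; otherwise it founds a new cluster.
    The representatives play the role of information granules."""
    reps = []
    for g in chain.from_iterable(ngrams(s, n) for s in sequences):
        if not any(edit_distance(g, r) <= theta for r in reps):
            reps.append(g)
    return reps


def embed(seq, granules, n=3, theta=1):
    """Feature vector: for each granule, count the n-grams of seq that
    match it within tolerance theta."""
    grams = ngrams(seq, n)
    return [sum(edit_distance(g, r) <= theta for g in grams) for r in granules]


def classify_1nn(x, train_vectors, train_labels):
    """Standard 1-nearest-neighbour rule on squared Euclidean distance."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, v)) for v in train_vectors]
    return train_labels[dists.index(min(dists))]


# Toy data (invented for illustration only).
train = ["AAAAAA", "AAAAAT", "GGGGGG", "GGGGGC"]
labels = ["a", "a", "g", "g"]

granules = find_granules(train)
vectors = [embed(s, granules) for s in train]
print(granules)
print(classify_1nn(embed("AATAAA", granules), vectors, labels))
```

Once the sequences are embedded, any vector-based classifier could replace the nearest-neighbour rule; the sketch only shows how the embedding turns a sequence problem into a conventional one.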
2013 International Joint Conference on Neural Networks, IJCNN 2013
sequence representation and classification; granular computing and modeling; protein folding prediction
04 Publication in conference proceedings::04b Conference paper in volume

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/526112