The datasets used for data mining are increasingly large and physically distributed. Since the distributed knowledge discovery process is both data- and computation-intensive, the Grid is a natural platform for deploying a high-performance data mining service. This paper focuses on the core services of such a Grid infrastructure. In particular, we concentrate on the design and implementation of a specialized broker that is aware of data source locations and of the resource needs of data mining tasks. Allocation and scheduling decisions are made on the basis of performance cost metrics and models that exploit knowledge about previous executions and use sampling to acquire estimates of execution behavior.
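As a rough illustration of the idea in the abstract, the sketch below shows how a broker might combine a sampling-based runtime estimate with a simple transfer-plus-compute cost model to pick an execution site. This record does not give the paper's actual cost models, so everything here is an assumption made for illustration: the `Site` class, the `estimate_full_runtime` and `choose_site` functions, the linear scaling of runtime with dataset size, and all numeric values.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical Grid site description (not taken from the paper)."""
    name: str
    speed: float  # relative compute speed; 1.0 is the reference machine
    # Assumed bandwidth in MB/s from each remote data location to this site.
    bandwidth_mb_s: dict = field(default_factory=dict)


def estimate_full_runtime(sample_seconds: float, sample_fraction: float) -> float:
    """Extrapolate the runtime on the full dataset from a run on a sample.

    Assumes runtime grows linearly with dataset size, which is a
    simplification; real data mining algorithms may scale differently.
    """
    return sample_seconds / sample_fraction


def site_cost(site: Site, data_location: str, data_mb: float,
              ref_runtime_s: float) -> float:
    """Estimated completion time: data transfer time plus compute time."""
    if site.name == data_location:
        transfer = 0.0  # data is already local, nothing to move
    else:
        transfer = data_mb / site.bandwidth_mb_s[data_location]
    compute = ref_runtime_s / site.speed
    return transfer + compute


def choose_site(sites, data_location: str, data_mb: float,
                ref_runtime_s: float) -> Site:
    """Pick the site with the lowest estimated completion time."""
    return min(sites, key=lambda s: site_cost(s, data_location, data_mb,
                                              ref_runtime_s))


# Illustrative numbers only: a 25% sample took 30 s on the reference machine.
full_runtime = estimate_full_runtime(sample_seconds=30.0, sample_fraction=0.25)

site_a = Site("A", speed=1.0)                              # holds the data
site_b = Site("B", speed=4.0, bandwidth_mb_s={"A": 10.0})  # faster, but remote

small_job = choose_site([site_a, site_b], "A", data_mb=500.0,
                        ref_runtime_s=full_runtime)
large_job = choose_site([site_a, site_b], "A", data_mb=2000.0,
                        ref_runtime_s=full_runtime)
```

With these made-up numbers, the smaller dataset is worth moving to the faster site, while the larger one is cheaper to process where it already resides. That is the kind of trade-off a data-location-aware broker has to weigh.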

Scheduling high performance data mining tasks on a data grid environment / Orlando, S.; Palmerini, P.; Perego, R.; Silvestri, F. - 2400:(2002), pp. 375-384. (Paper presented at the 8th International Euro-Par Conference on Parallel Processing, Euro-Par 2002, held in deu) [10.1007/3-540-45706-2_49].

Scheduling high performance data mining tasks on a data grid environment

Orlando S.; Silvestri F.
2002

8th International Euro-Par Conference on Parallel Processing, Euro-Par 2002
caching
04 Publication in conference proceedings::04b Conference paper in volume

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1572773