Novel"manycore" architectures, such as graphics processors, are high-parallel and high-performance shared-memory ar- chitectures [7] born to solve specific problems such as the graphical ones. Those architectures can be exploited to solve a wider range of problems by designing the related algorithm for such architectures. We present a fast sorting algorithm implementing an efficient bitonic sorting network. This algorithm is highly suitable for information retrieval applications. Sorting is a fundamental and universal prob- lem in computer science. Even if sort has been extensively addressed by many research works, it still remains an inter- esting challenge to make it faster by exploiting novel tech- nologies. In this light, this paper shows how to use graph- ics processors as coprocessors to speed up sorting while al- lowing CPU to perform other tasks. Our new algorithm exploits a memory-efficient data access pattern maintain- ing the minimum number of accesses to the memory out of the chip. We introduce an efficient instruction dispatch mechanism to improve the overall sorting performance. We also present a cache-based computational model for graph- ics processors. Experimental results highlight remarkable improvements over prior CPU-based sorting methods, and a significant improvement over previous GPU-based sorting algorithms.

Sorting using bitonic network with CUDA / Baraglia, R.; Capannini, G.; Nardini, F. M.; Silvestri, F.. - 480:(2009), pp. 33-40. ((Intervento presentato al convegno 7th Workshop on Large-Scale Distributed Systems for Information Retrieval, LSDS-IR 2009 - Co-located with ACM SIGIR 2009 tenutosi a Boston, MA, usa.

Sorting using bitonic network with CUDA

Silvestri F.
2009

Abstract

Novel"manycore" architectures, such as graphics processors, are high-parallel and high-performance shared-memory ar- chitectures [7] born to solve specific problems such as the graphical ones. Those architectures can be exploited to solve a wider range of problems by designing the related algorithm for such architectures. We present a fast sorting algorithm implementing an efficient bitonic sorting network. This algorithm is highly suitable for information retrieval applications. Sorting is a fundamental and universal prob- lem in computer science. Even if sort has been extensively addressed by many research works, it still remains an inter- esting challenge to make it faster by exploiting novel tech- nologies. In this light, this paper shows how to use graph- ics processors as coprocessors to speed up sorting while al- lowing CPU to perform other tasks. Our new algorithm exploits a memory-efficient data access pattern maintain- ing the minimum number of accesses to the memory out of the chip. We introduce an efficient instruction dispatch mechanism to improve the overall sorting performance. We also present a cache-based computational model for graph- ics processors. Experimental results highlight remarkable improvements over prior CPU-based sorting methods, and a significant improvement over previous GPU-based sorting algorithms.
File allegati a questo prodotto
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/1571312
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 9
  • ???jsp.display-item.citation.isi??? ND
social impact