Novel"manycore" architectures, such as graphics processors, are high-parallel and high-performance shared-memory ar- chitectures [7] born to solve specific problems such as the graphical ones. Those architectures can be exploited to solve a wider range of problems by designing the related algorithm for such architectures. We present a fast sorting algorithm implementing an efficient bitonic sorting network. This algorithm is highly suitable for information retrieval applications. Sorting is a fundamental and universal prob- lem in computer science. Even if sort has been extensively addressed by many research works, it still remains an inter- esting challenge to make it faster by exploiting novel tech- nologies. In this light, this paper shows how to use graph- ics processors as coprocessors to speed up sorting while al- lowing CPU to perform other tasks. Our new algorithm exploits a memory-efficient data access pattern maintain- ing the minimum number of accesses to the memory out of the chip. We introduce an efficient instruction dispatch mechanism to improve the overall sorting performance. We also present a cache-based computational model for graph- ics processors. Experimental results highlight remarkable improvements over prior CPU-based sorting methods, and a significant improvement over previous GPU-based sorting algorithms.
Sorting using bitonic network with CUDA / Baraglia, R.; Capannini, G.; Nardini, F. M.; Silvestri, F.. - 480:(2009), pp. 33-40. (Intervento presentato al convegno 7th Workshop on Large-Scale Distributed Systems for Information Retrieval, LSDS-IR 2009 - Co-located with ACM SIGIR 2009 tenutosi a Boston, MA, usa).
Sorting using bitonic network with CUDA
Silvestri F.
2009
Abstract
Novel"manycore" architectures, such as graphics processors, are high-parallel and high-performance shared-memory ar- chitectures [7] born to solve specific problems such as the graphical ones. Those architectures can be exploited to solve a wider range of problems by designing the related algorithm for such architectures. We present a fast sorting algorithm implementing an efficient bitonic sorting network. This algorithm is highly suitable for information retrieval applications. Sorting is a fundamental and universal prob- lem in computer science. Even if sort has been extensively addressed by many research works, it still remains an inter- esting challenge to make it faster by exploiting novel tech- nologies. In this light, this paper shows how to use graph- ics processors as coprocessors to speed up sorting while al- lowing CPU to perform other tasks. Our new algorithm exploits a memory-efficient data access pattern maintain- ing the minimum number of accesses to the memory out of the chip. We introduce an efficient instruction dispatch mechanism to improve the overall sorting performance. We also present a cache-based computational model for graph- ics processors. Experimental results highlight remarkable improvements over prior CPU-based sorting methods, and a significant improvement over previous GPU-based sorting algorithms.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.