If you are interested in using the GPU for general-purpose computation, i.e. GPGPU, then Thrust is a library that can save you a huge amount of effort.
Thrust is an open-source template library for CUDA applications, and the latest version, 1.3, has just been released.
Modeled after the C++ Standard Template Library (STL), Thrust brings a familiar abstraction layer to GPU computing.
Version 1.3 adds several new features, including:
- a state-of-the-art sorting implementation
- performance improvements to stream compaction and reduction
- robust error reporting and failure detection
- support for CUDA 3.2 and GF104-based GPUs
- search algorithms
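As a rough illustration of three of the operations named in the list (reduction, stream compaction and search), the sketch below uses their C++ standard library counterparts on the CPU. It is not Thrust code; Thrust provides similarly named algorithms (thrust::reduce, thrust::copy_if, thrust::binary_search) that follow the same iterator conventions. The helper names sum_all, keep_even and contains_sorted are hypothetical, chosen only for this example.

```cpp
#include <algorithm>
#include <iterator>
#include <numeric>
#include <vector>

// Reduction: collapse a range to a single value (here, a sum).
int sum_all(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0);
}

// Stream compaction: keep only the elements that satisfy a predicate.
std::vector<int> keep_even(const std::vector<int>& v) {
    std::vector<int> out;
    std::copy_if(v.begin(), v.end(), std::back_inserter(out),
                 [](int x) { return x % 2 == 0; });
    return out;
}

// Search: binary search requires the input range to be sorted.
bool contains_sorted(const std::vector<int>& sorted, int key) {
    return std::binary_search(sorted.begin(), sorted.end(), key);
}
```

For the input {1, 2, 3, 4, 5, 6}, sum_all returns 21 and keep_even returns {2, 4, 6}.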
To give you some idea of how easy Thrust is to use, consider the following short program, which generates random numbers on the host (the CPU) and then sorts them on the device (the GPU):
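// generate 16M random numbers on the host
thrust::host_vector<int> h_vec(1 << 24);
thrust::generate(h_vec.begin(), h_vec.end(), rand);
// transfer data to the device
thrust::device_vector<int> d_vec = h_vec;
// sort data on the device
// (846M keys per second on GeForce GTX 480)
thrust::sort(d_vec.begin(), d_vec.end());
// transfer data back to host
thrust::copy(d_vec.begin(), d_vec.end(), h_vec.begin());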
Notice that the same Thrust containers and algorithms can be used on both the host and the device, and that the whole messy business of parallelising the sort is hidden from the programmer.
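For comparison, here is a host-only sketch of the same generate-then-sort pattern written against the C++ standard library, which Thrust is modeled on. It runs entirely on the CPU and involves no transfers. The function name random_sorted is hypothetical, and the element count is a parameter so that small inputs can be tried first.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Host-only analogue of the Thrust program: fill a vector with
// pseudo-random integers, then sort it in ascending order.
std::vector<int> random_sorted(std::size_t n) {
    std::vector<int> v(n);
    std::generate(v.begin(), v.end(), std::rand);  // same call shape as thrust::generate
    std::sort(v.begin(), v.end());                 // same call shape as thrust::sort
    return v;
}
```

Calling random_sorted(1 << 24) mirrors the size used in the Thrust program. The device version differs mainly in the container type and in the explicit copies between host and device.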
To get started, first download Thrust v1.3 and then follow the online quick-start guide. Refer to the online documentation for a complete list of features. Many examples and a set of introductory slides are also available.
Thrust is open-source software distributed under the Apache License v2.0.