CUDA Toolkit 3.2
Wednesday, 24 November 2010

The new CUDA Toolkit makes matrix operations, FFT and random number generation significantly faster.


 

The new release of the CUDA Toolkit from NVIDIA is worth knowing about. It brings significant speed increases on Fermi GPUs (GeForce 400/500): matrix manipulation is up to 300% faster, the Fast Fourier Transform runs 2x to 10x faster, and random number generation is faster too. The H.264 encode/decode library is now included with the Toolkit, and debugging support in gdb and Parallel Nsight has been extended to multi-GPU setups.

 


There are also some new SDK code samples:

  • Several code samples demonstrating how to use the new CURAND library, including MonteCarloCURAND, EstimatePiInlineP, EstimatePiInlineQ, EstimatePiP, EstimatePiQ, SingleAsianOptionP, and randomFog
  • Conjugate Gradient Solver, demonstrating the use of CUBLAS and CUSPARSE in the same application
  • Function Pointers, a sample that shows how to use function pointers to implement the Sobel Edge Detection filter for 8-bit monochrome images
  • Interval Computing, demonstrating the use of interval arithmetic operators using C++ templates and recursion
  • Simple Printf, demonstrating best practices for using both printf and cuPrintf in compute kernels
  • Bilateral Filter, an edge-preserving non-linear smoothing filter for image recovery and denoising implemented in CUDA C with OpenGL rendering
  • SLI with Direct3D Texture, a simple example demonstrating the use of SLI and Direct3D interoperability with CUDA C
  • cudaEncode, showing how to use the NVIDIA H.264 Encoding Library using YUV frames as input
  • Vflocking Direct3D/CUDA, which simulates and visualizes the flocking behavior of birds in flight
  • simpleSurfaceWrite, demonstrating how CUDA kernels can write to 2D surfaces on Fermi GPUs
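
To give a feel for what the EstimatePi samples in the list above are about, here is a minimal CPU-only sketch of a Monte Carlo estimate of pi in plain C++. It is not the SDK code and does not use CURAND: the function name `estimate_pi` and its parameters `samples` and `seed` are made up for illustration, and the standard library's `std::mt19937_64` stands in for a GPU generator. Judging by their names, the SDK samples apply the same idea with the random numbers generated on the GPU.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Estimate pi by sampling points uniformly in the unit square and counting
// the fraction that lands inside the quarter circle of radius 1.
// NOTE: estimate_pi, samples and seed are illustrative names, not SDK APIs.
double estimate_pi(std::uint64_t samples, std::uint64_t seed) {
    assert(samples > 0);  // avoid dividing by zero below

    std::mt19937_64 rng(seed);  // CPU generator standing in for CURAND
    std::uniform_real_distribution<double> unit(0.0, 1.0);

    std::uint64_t inside = 0;
    for (std::uint64_t i = 0; i < samples; ++i) {
        const double x = unit(rng);
        const double y = unit(rng);
        if (x * x + y * y <= 1.0) {
            ++inside;  // point lies within the quarter circle
        }
    }

    // Area of quarter circle / area of unit square = pi / 4.
    return 4.0 * static_cast<double>(inside) / static_cast<double>(samples);
}
```

With a million samples the result should land close to 3.14, with the error shrinking roughly as one over the square root of the sample count. Each sample is independent of the others, which is why this kind of calculation maps well onto a GPU.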

The CUDA Toolkit 3.2 is available to download for Windows, Mac OS X and Linux.

Related items

Thrust for CUDA

CUDA by Example

Parallel Nsight - another shot in the GPU war

 


Last Updated ( Wednesday, 24 November 2010 )