NVIDIA vs AMD: GPGPU performance


I'd like to hear from people with experience of coding for both. Myself, I only have experience with NVIDIA.

NVIDIA CUDA seems to be a lot more popular than the competition. (Just counting question tags on this forum, 'cuda' outperforms 'opencl' 3:1, and 'nvidia' outperforms 'ati' 15:1, and there's no tag for 'ati-stream' at all).

On the other hand, according to Wikipedia, ATI/AMD cards should have a lot more potential, especially per dollar. The fastest NVIDIA card on the market as of today, GeForce 580 ($500), is rated at 1.6 single-precision TFlops. AMD Radeon 6970 can be had for $370 and it is rated at 2.7 TFlops. The 580 has 512 execution units at 772 MHz. The 6970 has 1536 execution units at 880 MHz.

How realistic is that paper advantage of AMD over NVIDIA, and is it likely to be realized in most GPGPU tasks? What happens with integer tasks?

Best Solution

Metaphorically speaking ati has a good engine compared to nvidia. But nvidia has a better car :D

This is mostly because nvidia has invested good amount of its resources (in money and people) to develop important libraries required for scientific computing (BLAS, FFT), and then a good job again in promoting it. This may be the reason CUDA dominates the tags over here compared to ati (or OpenCL)

As for the advantage being realized in GPGPU tasks in general, it would end up depending on other issues (depending on the application) such as, memory transfer bandwidth, a good compiler and probably even the driver. nvidia having a more mature compiler, a more stable driver on linux (linux because, its use is widespread in scientific computing), tilt the balance in favor of CUDA (at least for now).

EDIT Jan 12, 2013

It's been two years since I made this post and it still seems to attract views sometimes. So I have decided to clarify a few things

  • AMD has stepped up their game. They now have both BLAS and FFT libraries. Numerous third party libraries are also cropping up around OpenCL.
  • Intel has introduced Xeon Phi into the wild supporting both OpenMP and OpenCL. It also has the ability use existing x86 code. as noted in the comments, limited x86 without SSE for now
  • NVIDIA and CUDA still have the edge in the range of libraries available. However they may not be focusing on OpenCL as much as they did before.

In short OpenCL has closed the gap in the past two years. There are new players in the field. But CUDA is still a bit ahead of the pack.

Related Question