I'd like to hear from people who have experience coding for both. I only have experience with NVIDIA myself.
NVIDIA CUDA seems to be far more popular than the competition. Just counting question tags on this forum, 'cuda' outnumbers 'opencl' 3:1, 'nvidia' outnumbers 'ati' 15:1, and there is no 'ati-stream' tag at all.
On the other hand, according to Wikipedia, ATI/AMD cards should have much more potential, especially per dollar. The fastest NVIDIA card on the market today, the GeForce GTX 580 ($500), is rated at 1.6 single-precision TFLOPS. The AMD Radeon HD 6970 can be had for $370 and is rated at 2.7 TFLOPS. The 580 has 512 execution units at a 772 MHz core clock (its shaders run at twice that, 1544 MHz). The 6970 has 1536 execution units at 880 MHz.
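For reference, here is a rough sketch of how I believe those paper ratings are derived. It assumes each execution unit retires one multiply-add (2 FLOPs) per clock, and that the 580's shaders run at 1544 MHz, twice the 772 MHz core clock. The function and variable names are my own, for illustration only.

```python
def peak_gflops(units, clock_mhz, flops_per_cycle=2):
    """Theoretical peak single-precision GFLOPS.

    Assumption: every unit retires one multiply-add (2 FLOPs) per cycle.
    """
    return units * clock_mhz * flops_per_cycle / 1000.0


# Unit counts, clocks and prices as quoted in the question.
# Assumption: the GTX 580 shader clock is 1544 MHz (2x the 772 MHz core clock).
cards = {
    "GeForce GTX 580": {"units": 512, "clock_mhz": 1544, "price_usd": 500},
    "Radeon HD 6970": {"units": 1536, "clock_mhz": 880, "price_usd": 370},
}


def summarize(cards):
    """Return {name: (peak GFLOPS, peak GFLOPS per dollar)}."""
    result = {}
    for name, card in cards.items():
        gflops = peak_gflops(card["units"], card["clock_mhz"])
        result[name] = (gflops, gflops / card["price_usd"])
    return result


if __name__ == "__main__":
    for name, (gflops, per_dollar) in summarize(cards).items():
        print(f"{name}: {gflops:.0f} GFLOPS peak, {per_dollar:.2f} GFLOPS per dollar")
```

Under those assumptions this gives about 1581 GFLOPS for the 580 and about 2703 GFLOPS for the 6970, which matches the 1.6 and 2.7 TFLOPS ratings. These are peak figures that assume every unit is busy on every cycle, so they say nothing yet about what real code achieves.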
How realistic is that paper advantage of AMD over NVIDIA, and is it likely to be realized in most GPGPU tasks? What happens with integer tasks?