The Tech Report: "Have you ever stayed awake at night wondering about the intricacies of Nvidia's general-purpose GPU computing implementation? Well, maybe not. But the few who have are in luck: David Kanter at Real World Technologies has strung together a surprisingly detailed article about how Nvidia's GT200 graphics processor does all its GPGPU magic.
Kanter starts off with a discussion of CUDA, the programming interface that lets developers write general-purpose apps for Nvidia GPUs. He covers the way the API lays out tasks for the GPU, how it handles memory, and how everything looks from the coder's perspective. Then, Kanter dives head-first into the GT200's hardware architecture, from the nooks and crannies of the stream processors to the memory pipeline.
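For readers who haven't touched CUDA, the programming model Kanter walks through can be sketched in a few lines. This is a hypothetical, era-style example (not code from the article): the host allocates and copies device memory explicitly, and a kernel is launched over a grid of thread blocks, which is how the API "lays out tasks" across the GPU's stream processors.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

// Each thread computes one output element; its position in the
// grid/block hierarchy gives it a unique global index.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main(void) {
    const int n = 1024;
    size_t bytes = n * sizeof(float);

    // Host-side buffers.
    float *ha = (float *)malloc(bytes);
    float *hb = (float *)malloc(bytes);
    float *hc = (float *)malloc(bytes);
    for (int i = 0; i < n; i++) { ha[i] = (float)i; hb[i] = 2.0f * i; }

    // Device-side buffers: GT200-era CUDA used explicit
    // cudaMalloc/cudaMemcpy to manage GPU memory from the host.
    float *da, *db, *dc;
    cudaMalloc((void **)&da, bytes);
    cudaMalloc((void **)&db, bytes);
    cudaMalloc((void **)&dc, bytes);
    cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

    // Launch: enough 256-thread blocks to cover n elements.
    vecAdd<<<(n + 255) / 256, 256>>>(da, db, dc, n);
    cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);

    printf("c[10] = %f\n", hc[10]);  // expect 30.0

    cudaFree(da); cudaFree(db); cudaFree(dc);
    free(ha); free(hb); free(hc);
    return 0;
}
```

The split between host code and the `__global__` kernel is the "coder's perspective" the article covers: the CPU orchestrates memory and launches, while the GPU runs thousands of lightweight threads over the data.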
Along the way, Kanter provides some interesting insight into how current GPGPU implementations compare, how they relate to more conventional parallel CPUs (like IBM's Cell and Sun's Niagara II), and what they can't do so well (namely, workloads that require "complex data structures" such as trees, linked lists, and so on)."
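The "complex data structures" caveat is easy to see in code. In this hypothetical kernel (an illustration, not from the article), each thread walks its own linked list: every load depends on the previous pointer, and neighboring threads hit scattered addresses, so the GPU can neither coalesce the memory accesses nor hide their latency the way it can with flat arrays.

```cuda
// Hypothetical pointer-chasing kernel illustrating why linked
// structures map poorly onto GPUs.
struct Node { float val; struct Node *next; };

__global__ void sumLists(struct Node **heads, float *sums, int nLists) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nLists) return;
    float s = 0.0f;
    // Each iteration stalls on a dependent load; adjacent threads
    // read unrelated addresses, so nothing coalesces.
    for (struct Node *p = heads[i]; p != NULL; p = p->next)
        s += p->val;
    sums[i] = s;
}
```

Contrast this with the vector-add case, where thread i reads element i and adjacent threads touch adjacent memory; that regularity is exactly what the GT200's memory pipeline is built to exploit.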