This is a pretty common news topic, and I'd like to explain it in detail for anyone who doesn't understand these numbers, or who "disbelieves" them because they don't understand how they're measured.
I'm going to use the PS3 as an example, since it tends to show up more often than other hardware, with this kind of news topic.
I'll start with the ugly stuff.
When developers state they are using N% of the PS3, they are typically referring to the percentage of cycles, each game frame, during which the SPU processors are actively working. This is directly measurable via performance analysis tools, so it's true; the developers don't just make this stuff up. Almost without exception, the PPU core of the Cell CPU is busy for the entire frame (although some developers do not take advantage of the "leftover" time on the interleaved second PPU hardware thread, which the OS is not using very often, but that's another discussion), and each SPU, cycle for cycle, is actually faster in practice than the PPU, so the PPU is often left out of the "N% used" formula. The RSX GPU is also typically busy for a majority of the frame, and any downtime for it, or the PPU, is often unusable for anything but non-critical "side" tasks which cannot delay the main rendering loop.
Additionally, in some cases the PPU may idle while waiting for SPU tasks to complete, and this is difficult to factor into a "% used" calculation.
With that technical stuff out of the way, why is it so uncommon to use 100% of the SPUs for the entire frame?
Game engines basically operate as on-the-fly motion picture studios -- they work for an entire "frame" moving the actors around, making bullets fly, creating explosions, etc., and then do the work to "render" an image of that scenery for the user. The trouble with using all of the available horsepower at all times boils down to some simple facts about how certain problems in game engines must be solved.
Game engines can do many things in parallel, but some things they *must* do in an ordered sequence. I'll give you a very common example:
(Step 1) Query the gamepad for input
(Step 2) Translate the input as physical impulses for movement
(Step 3) Move the player
(Step 4) Figure out where the camera should be for the frame, now that the player is in position.
(Step 5) Move other stuff that needs to move in the scene
(Step 6) Cull (remove) all the things in the scene that can't be seen by the camera, so we don't have to do lots of work to render them.
(Step 7+) lots more stuff
If you ponder the above ordering, you should see that it's basically impossible to do (Step 6) without having done (Step 4 and Step 5), that it's impossible to do (Step 4) without doing (Step 3) for many kinds of games, and that it's impossible to do (Step 3) without first doing (Step 2)... so on and so forth. On top of the ordering dependency, if you don't do (Step 6), your game will run at a snail's pace, because you will be asking the computer to do more work than it can handle at interactive framerates, unless your scene is ludicrously simple.
Later in the frame, there are all sorts of things you can do on many parallel cores -- beginning with (Step 5) and (Step 6), in the above example. It's at that point that the power of a highly parallel system comes into play, and luckily, there's a lot of work there to be done.
Still... there's a portion of the frame where the parallel cores will end up idle (not doing work), because there's just nothing for them to do... or is that true?
Developers are inventive with "spare" horsepower -- and there are non-critical (meaning the work doesn't have to finish each frame, by a certain time) "fun" tasks that unused cores can take up, when they're not otherwise busy. Rendering a background scene? Streaming non-critical data? Making trees sway in the wind? MLAA? Recognizing a new face from a camera image? Etc, etc.
50%-70% usage numbers are commonplace for PS3 games, because that's roughly how much of the frame (it varies widely, mostly with the genre of the game and developer skill/experience) lends itself well to parallel tasks, both rendering and non-rendering. The remaining 30-50% of the SPU time *is* usable, but reaching it often entails a great deal more work than the first chunk does. Parallelizing game logic, in particular, is arduous, and often undesirable in a cross-console game, since it can leave less-parallel consoles struggling to keep up, or in many cases can complicate development too much to be worth the investment and the remaining speed gains, unless the title is exclusive.
No matter how much parallelism a title uses, game engines always benefit from changing bad code (excuse the ridiculous example) like:
int Mult(int x, int y)
{
    int returnvalue = 0;
    for (int i = 0; i < y; ++i)
        returnvalue += x;
    return returnvalue;
}

z = Mult(x, y);
to something that utilizes the hardware better like:
z = x * y;
...so throughout a game engine's evolution on a particular console, it will always continue to get better and faster, via better parallelism or otherwise!