12/18/2006

Specialization, Centralization, and the Future of Chip Integration

Filed under: Debate, Economics, Science, Technology — Tim @ 4:21 pm

If you have any experience building computers, you are undoubtedly familiar with processor specialization. Network cards handle network traffic, audio cards (such as those from Creative) synthesize sound, and the GPU processes a slew of pixel-bending instructions to output photorealistic video.

In the past, I have discussed technological concepts such as system-on-a-chip, in which several computational processors (e.g. audio, network, video) are fused onto a single piece of silicon.

This growing trend in integration is not entirely new. Years ago, many personal computer manufacturers soldered math coprocessors onto circuit boards to aid central processors in calculation-heavy tasks.

The Acronym War (TAW)

And over the past decade, technologies such as SIMD (single instruction, multiple data), OOOE (out-of-order execution), and SMT (simultaneous multithreading) have become mainstays of a central processor’s plumbing. Seemingly, as quickly as newer acceleration methods can be conjured up, they become integrated into a traditional processor.

A contemporary case-in-point is the Physics Processing Unit (PPU), which became a hot topic of discussion in mid-2005, with the commercial release of the PhysX chip and corresponding SDK. To counter this specialization, the two main incumbent providers of graphics cards, ATI and NVidia, have rolled out several stop-gap solutions.

ATI has developed software that, assuming you purchase a couple of its newer cards, can turn your current, older video card into a piece of hardware specializing in physics calculations.

And NVidia is working with a popular physics middleware provider (called Havok) to do something similar; here is a good FAQ explaining the whole situation.

However, the biggest twist is coming in the next couple of years, as the GPU and CPU fields become increasingly blended.

Out of the blue

This past summer, CPU manufacturer AMD purchased the GPU market leader, ATI. Several months later, they unveiled their long-term plans for what they dubbed Fusion. In a nutshell, it involves integrating ATI’s GPUs onto the same silicon as the CPU.

In fact, according to their financial analyst meeting last week, AMD is actually taking this silicon fusion concept a step further through their Accelerated Processing Units (APU) strategy.

In contrast to Intel’s strategy of including potentially hundreds of CPUs on a single piece of silicon, AMD’s APU strategy involves integrating a mix and match of specialized accelerators (such as GPUs, PPUs and even AI processors) onto one piece of silicon based upon market demand (be sure to look at the diagrams here and here).

So they might mass-manufacture a silicon product that includes 4 CPUs, 2 GPUs and 2 PPUs, targeted at real-time, render-intensive jobs (e.g. CAD, 3D art). Or one with 3 CPUs and 5 GPUs, targeted at scientists trying to sequence biological data (e.g. genomes and DNA).

To top this off, a month ago ATI released a new (r)evolutionary product called “Stream Computing.”

Basically, it lets programmers take advantage of the powerful parallel nature of a GPU by writing general-purpose computer programs that run on it.

This concept of Stream Processing or Stream Computing is not in itself new, as it falls within the category called General-Purpose Computing on GPUs (GPGPU). Perhaps the most well-known case involves scientists using GPUs to analyze and calculate protein folding (see the Folding@home project at Stanford).

And NVidia has also followed suit by releasing a set of tools to aid developers in writing programs that take advantage of the highly multi-threaded aspects of its products.
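To make the stream idea a bit more concrete, here is a minimal sketch in plain Python. It is not ATI's or NVidia's actual API; the names `saxpy_kernel` and `launch` are made up for illustration, and the thread pool merely stands in for a GPU's many stream processors. The point is the programming model: you write one small "kernel" that computes a single output element, and the hardware runs it once per element, with no element depending on any other.

```python
from concurrent.futures import ThreadPoolExecutor


def saxpy_kernel(i, a, x, y):
    """Hypothetical kernel: computes one output element from index i only."""
    return a * x[i] + y[i]


def launch(kernel, n, *args, workers=4):
    """Hypothetical launcher: runs the kernel once for every index 0..n-1.

    A real GPU would spread these calls across its stream processors;
    the thread pool here only imitates that structure.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in index order, like writing to out[i]
        return list(pool.map(lambda i: kernel(i, *args), range(n)))


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
result = launch(saxpy_kernel, len(x), 2.0, x, y)
print(result)  # [12.0, 24.0, 36.0, 48.0]
```

Because every call to the kernel is independent, the same code scales from four elements to millions without changing shape, which is exactly the kind of workload a GPU is built for.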

Going to the basement

In fact, this is yet another war zone in which the two competing graphics firms are using different strategies for conquest. ATI has essentially released many of the underlying blueprints to their processors through the “Close to Metal” initiative. This plan takes an organic, bottom-up approach in which hobbyists and researchers can create their own software customized to their specific needs.

In contrast, NVidia recently released a proprietary set of tools - called CUDA - that supposedly interacts with this lower abstraction level.

Currently, ATI’s first Stream Processing solution clocks in at ~375 gigaflops with 1 GB of RAM and can be purchased for a hefty sum of $2000, whereas the new NVidia G80, with 768 MB of RAM, is currently around $600 on Pricewatch.

Sum of all numbers

Academics such as Hans Moravec, who study the brain’s processing power, suggest that you need 20 petaflops or so to “brute force” an emulation of the link between the human eye and brain.

And based upon where this GPGPU movement is going, Ray Kurzweil’s charts might actually be too conservative in predicting the speed of processor developments.

Granted, the software needed to take advantage of the hardware is nowhere near this level of complexity. But the fact remains that engineering advances in the GPU arena are moving much faster than those in the CPU arena, which is something that Kurzweil did not foresee at publication time.

As far as actual predictions of what the market will look like 10 years from now:

While you will either be able to purchase or rent 10 teraflops of raw calculation speed for a relatively low price of $500, the price per watt and power consumption will become an increasingly important monetary factor to consider as well. For instance, see Infoworld on datacenter power consumption.
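For a rough sense of scale, the figures quoted in this post can be run through some back-of-the-envelope arithmetic. The sketch below uses only numbers from above (ATI's ~375 gigaflops for $2000, the predicted 10 teraflops for $500, and the ~20 petaflops attributed to Moravec). It assumes peak figures simply add up across units, which ignores interconnect, memory and power overhead, so treat the results as an optimistic floor rather than a forecast.

```python
# Figures taken from the post; everything is expressed in gigaflops and US dollars.
ATI_STREAM_PRICE_USD = 2000
ATI_STREAM_GFLOPS = 375
PREDICTED_PRICE_USD = 500
PREDICTED_GFLOPS = 10_000        # 10 teraflops
TARGET_GFLOPS = 20_000_000       # 20 petaflops

# Dollars per gigaflop today versus the ten-year prediction
cost_per_gflop_now = ATI_STREAM_PRICE_USD / ATI_STREAM_GFLOPS
cost_per_gflop_predicted = PREDICTED_PRICE_USD / PREDICTED_GFLOPS
improvement = cost_per_gflop_now / cost_per_gflop_predicted

# Units needed to reach the target, rounding up (assumes perfect scaling)
units_needed = -(-TARGET_GFLOPS // PREDICTED_GFLOPS)
total_cost_usd = units_needed * PREDICTED_PRICE_USD

print(f"Today:      ${cost_per_gflop_now:.2f} per gigaflop")
print(f"Predicted:  ${cost_per_gflop_predicted:.2f} per gigaflop")
print(f"Improvement: about {improvement:.0f}x")
print(f"Units for 20 petaflops: {units_needed} (${total_cost_usd:,})")
```

In other words, the prediction amounts to raw compute getting roughly a hundred times cheaper, from about $5.33 to $0.05 per gigaflop, at which point 2,000 such units (about $1 million of hardware) would reach the 20 petaflop mark on paper. The electricity bill is a separate matter.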

Note: none of the next-gen consoles have a dedicated PPU or AI processor. The PS3 only utilizes the PhysX SDK and not the chip.

See also: Grid Computing, Supercomputing’s Next Revolution, The Information Factories, and Intel Has a Small Urethra