Tommy Minyard is a power planner—literally. As the assistant director at Texas Advanced Computing Center in Austin, Tex., he's helping to assemble what will be one of the most powerful supercomputers in the world. The behemoth will include 3,936 servers, each capable of running its own network of machines.
One of the biggest challenges facing planners like Minyard is figuring out how much power these data centers need (when fully operational, the center will consume about two megawatts) and how much energy and spending are required to keep them from overheating. Budget too little for power and cooling, and a company may end up keeping servers idle, reducing the amount of computing that can be done. Budget too much, and a company risks overspending on such resources as air conditioning.
A 2005 survey by Strategy Group, an independent research firm, found that 70% of organizations with data centers were concerned about power supply and cooling, and that 85% of companies surveyed had had to deal with problems related to ensuring ample power supply or cooling. They, in turn, have been demanding more power-efficient products (BusinessWeek, 5/14/07), not only from chipmakers like Advanced Micro Devices (AMD) and Intel (INTC), but also from computer companies like Dell (DELL), Hewlett-Packard (HPQ), IBM (IBM), and Sun Microsystems (JAVA) that make the servers. "Power consumption is so important that it's driving chip designs and computer designs now," says Jim McGregor, an analyst at In-Stat/MDR.
To help Minyard and other engineers plan better, AMD, the maker of the chips being used by Texas Advanced Computing Center, is encouraging the adoption of a new method for measuring computer power consumption. With the release of its new line of server chips, code-named Barcelona, AMD is emphasizing the average power use of each chip rather than assuming that all chips will be running flat out all the time. The new metric, referred to as Average CPU Power, is designed to help customers get a more realistic idea of how much power and cooling they'll need in real-world situations. "Over-budgeting can be a big problem," says Brent Kerby, a product manager at AMD. "If you build more cooling than you really need, you're stuck with it. But if you don't need it, why spend the money?"
That's a switch from the traditional method, known as Thermal Design Power, or TDP, which measures the outer theoretical limit of the amount of power a chip can draw, sort of like the top speed of a car. Just because a car can go 180 miles per hour doesn't mean it's always going to be driven at that speed. The same goes for chips in servers, AMD argues. Why build a data center assuming that it will always be running at maximum load when you can save money by assuming otherwise?