A Talk with the Brain behind Blue Gene


Nearly two years ago, IBM declared that it was going to tackle the Mount Everest of computer projects: A five-year, $100 million research effort to build the world's fastest supercomputer to solve one of the thorniest problems in biology -- and, in the process, some of the toughest tasks in computing. The supercomputer, dubbed Blue Gene, is being designed to operate a hundred times faster than today's speediest machines. Its mission: to simulate how proteins fold themselves into their unique patterns. A routine occurrence in nature, protein folding is enormously complex. Solving that mystery could have profound implications for understanding diseases and designing more effective drugs.

With Blue Gene, IBM is trying to set a new supercomputer speed limit -- a petaflop, or a thousand trillion floating-point calculations per second. To reach that goal, Big Blue must invent a new computer architecture. The payoff for IBM is the trickle-down effect of the research required to hit its mark -- from creating new chips to developing unique software.

The person managing the Blue Gene expedition is William R. Pulleyblank, a Canadian who joined IBM in the late 1960s before going off to academia in 1974. He was a computer-science professor at the University of Waterloo until rejoining IBM in 1990, this time in IBM's vaunted research division. Pulleyblank, director of IBM's Deep Computing Institute, which coordinates research in the field of high-performance computing, was given the Blue Gene project in January, 2000, and named director of exploratory server systems.

PAINFUL DISTRACTION. With a staff of 100 researchers, this is Pulleyblank's first hardware-development project. An unabashed software guy, he has overseen the mathematical sciences department at IBM Research, the math wonks who develop the software algorithms that make high-power computing possible. And lately, he has had another distraction -- spending his late nights watching the New York Yankees play baseball. "I forgot what it's like when your team doesn't win the World Series," he said the day after the Yankees lost the seventh game to the Arizona Diamondbacks.

That diversion is over. Soon, Pulleyblank is going to be managing an even bigger supercomputer project. On Nov. 9, IBM will disclose a partnership with Lawrence Livermore National Labs to work on a wide range of scientific applications for Blue Gene. The computer giant also is looking for a commercial research partner to strike a similar deal. The moves underscore IBM's desire to create a family of supercomputers for a broad range of tasks, from nanotechnology research to simulating climate conditions to Web hosting and financial modeling.

The Livermore machine, called Blue Gene/L, is expected to operate at about 200 teraflops, or 200 trillion operations per second, which is greater than the total power of the top 500 supercomputers in operation today. Work is continuing on Blue Gene/L's big brother, but the machine developed with Livermore is expected to be completed in 2005 and will be used for simulations in such fields as aging, explosions, and fire research.

Even with his expanded project, Pulleyblank isn't giving up one of his favorite pastimes: playing blues guitar. He started lessons 10 years ago and manages to find time to play most days. Still, he confesses, "I hope I'm better at building machines than I am at playing blues guitar." In a conversation with Ira Sager, managing editor of BusinessWeek's e.biz supplement, Pulleyblank discusses the art of building supercomputers and what keeps him awake at night -- other than Yankee games. Edited excerpts follow:

Q: It's now almost two years into the Blue Gene project. How have IBM's goals changed?

A: We wanted to attack protein folding and protein science. We knew we wanted to look at what it would take to go into the petaflop range, and we didn't know a whole lot more than that when we began the project. When we got into it, we found that the range of problems that could be attacked by this kind of [supercomputer] architecture was broader than we had realized. In fact, we could change what had originally been the notion of building an enormously large supercomputer into a scalable family of computers. We began to see some really exciting applications we could do immediately.

Q: IBM has begun to talk about "supercomputing for the masses." What does that mean?

A: As the price/performance [of building a supercomputer] improves dramatically, as well as the power and space requirements, it now becomes feasible to take, in effect, a supercomputer and provide it to a person who needs it in a very local environment. What was a few years ago the largest supercomputer in the world, we're saying we can actually produce a machine with that kind of capability for an individual scientist or researcher.

Q: How does IBM benefit from this work?

A: These aggressive research projects feed products to be developed. The idea to build a scalable family of servers, which can go from small to very large servers in a continuous fashion, enables us to explore some of the major issues we face scaling these kinds of servers.

For example, one of the key issues is software. How do we create an application development paradigm so that people can take an application and get it to run efficiently on these very large machines? This is one of the biggest problems we face in our industry, how to reduce the time required for application development. If we don't solve this challenge, Blue Gene is not going to have the impact on IBM's future products and on the future of information technology we want it to have.

Q: How can Blue Gene influence developments in the IT industry?

A: One important area is designing objects, whether it's cars or aircraft or pharmaceuticals. The original application we had in mind when we started the project is protein structure, which is a key component of drug design. As we try to design custom pharmaceuticals, this capability will be crucial. Much of traditional supercomputing works very well in what I would call a planning model.

For example, we can take the entire set of flight requirements for an airline over a month, and we can give them an optimized schedule making efficient use of the aircraft and the personnel. However, if there's a disruption in the schedule, how quickly can you respond to that disruption and keep everything operating? This is one of the areas that we see supercomputers moving into, and to do that [they have] to be able to give almost instantaneous response to these disruptions. That will require enormous computational capability.

Q: How will this capability be used for the development of online businesses?

A: At present, we're used to the idea of data being distributed on the Internet and then being able to collect it and bring it together and find the information we're looking for. Now, we can think of it much more as an application that may exist on the Internet.

For example, we can do local forecasts of the weather, a capability a lot of companies would not want to do themselves. Airlines planning their flights or airports scheduling takeoffs, if they have an accurate weather forecast, it's hugely valuable to them. These kinds of applications will be on the Internet, and you'll be able to access them and apply your own data.

Q: Will this spur a new era of Internet applications?

A: It will accelerate an era that's beginning to take off. Think of grid computing as a collection of computer nodes pulled together to attack a big problem or to deal with distributed-data issues. Now, if one of these nodes is a machine like Blue Gene/L, it becomes very exciting, because then as people have times when they need this very fast response to a problem, they could use a node such as that. And it can be a shared resource for many users.

Q: IBM has a lot riding on Blue Gene, so what keeps you awake at night?

A: We have so many balls in the air at the same time. My nightmare is summed up in this cartoon which I envision and may get someone to draw for me. We're standing there on the opening day of the Blue Gene supercomputer center, and as the director of research finishes cutting the ribbon, my hardware guy is looking at my software guy saying "Cable? I thought you ordered the cable."

