On Oracle Corp.'s (ORCL) gleaming campus in the heart of Silicon Valley, Lawrence J. Ellison heads an information empire that pulled in about $2.3 billion in revenues last quarter. The company is worth about $85.6 billion, of which Ellison's personal share is 24.2%. The world's second-richest man after William H. Gates III, he relishes his hands-on involvement in Oracle. But ask Ellison if he has any regrets or has ever considered another vocation, and he isn't stumped for an answer: ``If I were 21 years old,'' he says, ``I would go into biotechnology or genetic engineering.'' And, in fact, Oracle has become the leading supplier of database tools in the life-sciences sector.
Exactly 2,426 miles away in Rockville, Md., J. Craig Venter, president and chief scientific officer of Celera Genomics Corp. (CRA), guides visitors through a climate-controlled room crammed with Compaq computers, all processing genetic data in parallel. Venter proudly calls this system the world's most powerful nongovernment supercomputing facility. And it deserves a big chunk of the credit for Celera's ability to match a worldwide consortium of scientists in a race to decode the human genome. Venter himself was one of the first to recognize that the gene business is a data business. Indeed, two-thirds of the people he employs at Celera are software engineers, not biologists. ``It will take massive computing power to solve the biological problems of the future,'' he says.
It's no coincidence that Ellison and Venter share a passion for both biology and information sciences. These once-separate domains are flowing together in a union that could change the whole geography of high technology--and lead to a $43 billion market by 2004. On one side, the tsunami of information generated by the Human Genome Project is forcing drug companies to retool themselves as information brokers. Their survival will depend on finding new ways to spin gene data into blockbuster drugs.
On the other side, beleaguered computer and software companies are searching for the next big growth market. For them, the life-sciences industry is a rapacious animal, howling for innovative technology. The computer and software companies are itching to supply next-generation networks, database tools, and other goodies the drug companies will need in their quest for new cancer therapies, a cure for AIDS, or treatments for Alzheimer's. The IT gurus are also eager to test their new products on the huge computational problems--the structure of proteins, the interaction of chemical compounds--that are so common in biology. ``It is symbolic that the code of the human genome has been broken in the year 2000,'' says Sun Microsystems Inc. Chief Scientist William N. Joy. ``The 21st century is going to be the real Information Age, and I don't mean the Internet.''
CONVERGENCE. Many companies on this year's BW 50 list already grasp that IT plus biology equals growth. Server and software giant Sun Microsystems (SUNW) has a major research effort in ``bioinformatics,'' a buzzword that captures the convergence of biology and IT. EMC Corp. (EMC), a world leader in computer storage, factors biology straight into its long-term market projections. Chief Technology Officer James B. Rothnie reckons individuals or insurers will soon require vast space on the Net to store billions of bytes of personal medical and genetic data. And in the pharmaceutical camp, Merck & Co. (MRK) scientists have become regular computer jocks, customizing their Oracle databases to organize mountains of information on drugs moving toward clinical trials. Down the road, the ability to succeed in bioinformatics could be a key criterion for landing on the BW 50 list--not just in life sciences and information technology, but for chemicals, agribusiness, and some other areas of manufacturing as well.
The first companies to cash in on bioinformatics will be the pharmaceutical giants. And the trend couldn't come at a better moment. On average, it takes $500 million and 14 years to go from discovery to government approval on a new drug. The process is highly inefficient: Nine out of ten compounds fail in human tests. On top of all that, the flood of information from the Human Genome Project ramps up the competitive pressure to test more compounds and move more quickly.
Information technology provides the only means for drug companies to automate and truncate the development process. And the players are willing to spend big money, says Pradip Banerjee, a senior partner with Accenture Consulting. He conservatively estimates that, as a group, drugmakers are shelling out more than $4 billion a year on information technology--and that doesn't include money spent on hardware such as servers. That number could quickly grow to the tens of billions as clinics and doctors begin using new genetic technologies to treat patients.
Companies use information technology in radically different ways. At Merck, a longtime star on the BW 50, it is all about sharing knowledge. Richard A. Blevins, Merck's bioinformatics chief, has spent the last eight years making sure that information contained in the company's 150 databases can be accessed by each of the company's 7,000 scientists. This organizational feat assures that the millions of pieces of data generated by each experiment can be quickly scrutinized and correlated with billions of other data bits streaming from Merck's other labs. Blevins and his staff of 30 rely on a system of interconnected Oracle databases to manage the flow--and it works. Looking at the data in real time, he says, allows company scientists to make fast calls about which compounds to push into development.
Bristol-Myers Squibb Co. (BMY), meanwhile, is using IT to slash redundancy. ``We can't afford to repeat experiments anymore,'' says Shawn E. Ramer, vice-president for informatics, so scientists use software tools to track all research. BMS has also created a centralized list of procedures, or protocols, to be followed in an experiment. ``This way scientists don't have to spend precious time reinventing an assay that we already know works,'' says Ramer.
Most drug execs contend that it is impossible to quantify how much bioinformatics contributes to the bottom line. But anecdotes suggest its value is substantial. Biotech pioneer Human Genome Sciences (HGSI) in Rockville, Md., for example, credits its computer system for the rapid development of repifermin, a much-heralded protein that helps wounds heal and is now in clinical trials. William A. Haseltine, HGS's chief executive, says the IT system has shortened the 14-year drug-development process by four or five years, allowing HGS to get drugs into human clinical trials for one-tenth of the costs shouldered by large pharmaceutical companies.
Increased efficiency, however, is only part of the game. To hit the big time, many biotech execs believe they must lock up significant amounts of intellectual property. That means getting results that can be patented and filing applications aggressively. So, built into Human Genome's IT system is a program that automatically mines the company's database for new genetic information. Each patent would ordinarily take a team of scientists and lawyers weeks to prepare and file. But at HGS, the software can produce up to 400 new applications a month. To date, the company has applied for 7,000 patents on human genes, 165 of which have been issued. Until HGS tries to collect fees from other companies that use its ideas or materials, nobody will know what these patents are actually worth. But the fruit of intensive bioinformatics isn't just intellectual property--it's knowledge.
Celera showed how fast and far the quest for genomic knowledge had progressed in mid-February. That's when Craig Venter announced that his company had successfully assembled 3 billion units of the human genome--pairs of nucleotides that form DNA's double helix--into an organized unit that can be combed for biological insights. Eugene W. Myers, Celera's vice-president for informatics research, compares the challenge to assembling ``a 40 million-piece jigsaw puzzle where all the pieces look like sky.'' Celera made this scientific feat look effortless. One essential ingredient: row upon row of top-end Compaq computers linked together on a network.
Having roughly delineated the body's 30,000 genes, Venter's group has already moved on to the next, and much larger, hurdle: identifying and characterizing each of the body's 1 million-odd proteins. To accomplish this, Celera has joined forces again with Compaq Computer Corp. (CPQ), as well as with the Energy Dept., to build a supercomputer that links together thousands of processors capable of running 100 trillion operations per second, 500,000 times faster than the typical desktop.
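The two figures quoted for the planned supercomputer imply a speed for the "typical desktop" that the article never states directly. A minimal Python sketch of that arithmetic follows; the function and constant names are illustrative assumptions, and only the two input numbers come from the text.

```python
# Figures quoted in the article for the planned Celera/Compaq/Energy Dept. machine.
SUPERCOMPUTER_OPS_PER_SEC = 100 * 10**12  # "100 trillion operations per second"
SPEEDUP_OVER_DESKTOP = 500_000            # "500,000 times faster than the typical desktop"


def implied_desktop_ops_per_sec(super_ops: int, speedup: int) -> float:
    """Return the desktop speed implied by a supercomputer speed and a speedup ratio.

    Hypothetical helper for illustration; it simply divides one quoted
    figure by the other.
    """
    return super_ops / speedup


if __name__ == "__main__":
    desktop = implied_desktop_ops_per_sec(SUPERCOMPUTER_OPS_PER_SEC, SPEEDUP_OVER_DESKTOP)
    # Prints the implied desktop speed in operations per second.
    print(f"Implied desktop speed: {desktop:,.0f} operations per second")
```

By this reckoning the comparison assumes a desktop running roughly 200 million operations per second.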
Compaq earned bragging rights by helping Celera power the first phase of its genomic initiative. Now other IT giants are coming on strong. Motorola Inc. (MOT) and Agilent Technologies Inc. (A) (a Hewlett-Packard Co. spin-off) are expanding from semiconductor chips to DNA chips. IBM (IBM), Sun, and Oracle, meanwhile, are developing new types of network and database software to fit the needs of drug companies.
The reason for this sudden feeding frenzy? Sales growth. Although analysts estimate that bioinformatics will grow into a $2 billion industry in the next five years, most IT companies believe the payoffs will be much higher. An internal study commissioned by IBM, for instance, predicts that when the markets for high-performance computing, storage, and e-commerce combine with that of data management, the worldwide market for IT products and services in the life-sciences sector will swell to $43 billion by 2004. Looking at these kinds of numbers, ``now is not the time to think small,'' says Caroline A. Kovac, vice-president of IBM's Global Life Sciences Business Unit.
PETABYTES. The market, clearly, is tempting to IT companies. But so is the chance to test-drive their latest gear on some of science's knottiest problems. The Human Genome Project was an eye-opener, says Sia Zadeh, director of Sun's Life Science Initiative. But computationally, it was nothing compared with understanding what happens to individual molecules within a cell. Analyzing phenomena such as the interaction between a hormone and a protein in a muscle cell, for instance, will require databases that are at least an order of magnitude bigger than anything used during the genome project. ``We just moved from gigabyte to terabyte,'' says Zadeh--referring to databases holding 1 billion and 1 trillion bytes, respectively. ``But these new technologies call for petabytes [quadrillions of bytes],'' he says. He notes that this is greater capacity than any government research program or other industry--including energy, retail, or finance--requires.
Not only are life-science companies generating lots of data but they are doing so at exponential growth rates. Companies like Merck and Celera will eventually need to push petabytes of data around their computer networks in just a few seconds. Designing networks stable enough to handle these vast data flows means rethinking both basic network design and the software required to run them, says Steven A. MacKay, Sun's chief systems architect. ``We're running out of brute-force methods.''
And this is where biology makes, perhaps, its greatest contribution to IT. It supplies new metaphors for future software and network architectures. IBM, for instance, took its cue from biology when designing its next-generation supercomputer, Blue Gene. Twelve to fifteen times more powerful than today's top supercomputer, Blue Gene houses a million processors, each capable of performing a billion operations per second. With that many processors and complex software, glitches are inevitable. So IBM is designing the system to diagnose and heal itself. According to Ajay K. Royyuru, a molecular biologist at IBM's computational biology center, once Blue Gene detects an error, the failed component will be isolated, and its software will automatically reconfigure the system to work without it.
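The self-healing idea described above can be illustrated with a toy Python sketch. It is not IBM's design: the `Cluster` class, its method names, and the round-robin scheduling are all assumptions made for illustration. Only the processor count and per-processor speed are taken from the article; multiplied together, they come to a quadrillion operations per second.

```python
# Figures quoted in the article for Blue Gene.
PROCESSORS = 1_000_000             # "a million processors"
OPS_PER_PROCESSOR = 1_000_000_000  # "a billion operations per second" each


def aggregate_ops(healthy_processors: int, ops_each: int = OPS_PER_PROCESSOR) -> int:
    """Total operations per second across all healthy processors."""
    return healthy_processors * ops_each


class Cluster:
    """Hypothetical toy model: isolate failed nodes and reassign work to the rest."""

    def __init__(self, n_nodes: int) -> None:
        self.healthy = set(range(n_nodes))

    def report_failure(self, node_id: int) -> None:
        # Isolate the failed component; it receives no further work.
        self.healthy.discard(node_id)

    def assign(self, tasks: list) -> dict:
        # Reconfigure: spread tasks round-robin over whatever nodes remain.
        nodes = sorted(self.healthy)
        if not nodes:
            raise RuntimeError("no healthy nodes left")
        plan = {node: [] for node in nodes}
        for i, task in enumerate(tasks):
            plan[nodes[i % len(nodes)]].append(task)
        return plan


if __name__ == "__main__":
    print(f"Peak throughput: {aggregate_ops(PROCESSORS):,} operations per second")
    cluster = Cluster(4)
    cluster.report_failure(2)             # node 2 develops a fault
    print(cluster.assign(list(range(8))))  # all 8 tasks land on nodes 0, 1 and 3
```

In this sketch a failure costs only the failed node's share of capacity: the remaining nodes pick up its tasks and the job keeps running.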
Blue Gene's first assignment will be to tackle one of biology's toughest computational problems: Predicting the structure of a protein from its building blocks--complicated strings of amino acids that contain thousands of atoms. When these molecules are formed in a cell, they fold themselves into exactly the right configuration in a matter of seconds. But with large proteins, no existing computer is powerful enough to predict the exact pattern of folds.
The applications for self-healing systems like Blue Gene extend far beyond protein folding to astrophysics, finance, and weather prediction. Says IBM's Kovac: ``This system's unique architecture will move the industry one step closer to non-stop computing.''
As the worlds of biology and IT converge, there will be growing pains. One of the trickiest issues: patents. Some of the smartest minds in the fields of science and law are locked in combat over issues of intellectual property, openness of scientific research, and the public good. In the last five years the U.S. patent office has received nearly 90,000 applications for exclusive rights to genes, parts of genes, and all manner of molecular compounds. And while such land-grabbing provokes a visceral negative reaction from the public, drug-company execs insist that patents are essential for the advancement of medicine. Biologists are accustomed to these debates; now, by hitching their wagons to life-science companies, IT players may be dragged into the thick of it as well.
Despite the challenges ahead, the union of high tech and biotech is sure to unleash new tools and products that cannot, even now, be imagined. As life-science companies struggle to understand and use the billions of bits of information they are generating, IT companies are eager to meet the demands of this new market. After all, future computing solutions will require huge scale. Tackling biology's most difficult computational dilemmas today is one way IT companies can prepare themselves for tomorrow.
By Ellen Licking in New York, with John Carey in Washington and Jim Kerstetter in San Mateo, Calif.