Businessweek Archives

A Trillion Byte Weapon


Information Processing: INFORMATION MANAGEMENT

Every day, Bank of America's phone reps field about 100,000 calls from customers wanting to check a balance, dispute a charge-card bill, or ask about loan rates. Once they're on the line, reckons Luke S. Helms, vice-chairman in charge of retail banking, "why not sell them something?" Something tailored to each customer's needs: If they have been inadvertently bouncing checks, say, maybe they'll go for overdraft protection. If they have more than 20% of their savings in a passbook account, a higher-interest product might do the trick.
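The two pitches Helms describes amount to simple rules applied to a customer's record. The sketch below is only an illustration of that idea, not the bank's actual system: the `Customer` fields, the `suggest_products` function, and the product names are all hypothetical, and the only figure taken from the article is the 20% passbook threshold.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Customer:
    # All fields are hypothetical; the article does not describe the bank's schema.
    name: str
    bounced_checks: int       # checks bounced in some recent period
    passbook_balance: float   # dollars held in a passbook account
    total_savings: float      # dollars held across all savings products


def suggest_products(customer: Customer) -> List[str]:
    """Return products a phone rep might pitch, using the article's two example rules."""
    suggestions = []
    # Rule 1: customers who have been bouncing checks may want overdraft protection.
    if customer.bounced_checks > 0:
        suggestions.append("overdraft protection")
    # Rule 2: more than 20% of savings in a passbook account suggests
    # a higher-interest product.
    if (customer.total_savings > 0
            and customer.passbook_balance / customer.total_savings > 0.20):
        suggestions.append("higher-interest savings product")
    return suggestions


caller = Customer("example caller", bounced_checks=2,
                  passbook_balance=3000.0, total_savings=10000.0)
print(suggest_products(caller))
```

In practice the hard part is the one the article goes on to describe: getting checking, savings, and loan records into one place so that rules like these can see the whole customer.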

How can America's largest retail bank know exactly what products to pitch to whom? Easy. It just reaches into its "corporate data store," or data warehouse--a bunch of superpowerful computers crammed with details about its customers' banking activities over the past five years. Based on gear from AT&T, the system consolidates 35 million records collected by separate computers that handle daily checking, savings, and other transactions. The resulting database--containing 800 billion characters of data--yields so much insight into customers' behavior, Helms says, that "it's like you're cheating."

CROSS REFERENCE. But data warehousing is hardly cheating. Instead, it's the biggest trend in information management today. This is the technology that may finally deliver on a dream pursued by management theorists since the 1960s: First, collect in one database immense volumes of information detailing every aspect of a company's operations--from daily cash register and ATM transactions to orders, factory outputs, inventories, shipments, and financial accounting records. Then cross-index all that data, put lots of computing power behind it, and the resulting warehouse can give managers, from the CEO on down, "any view of the data they can conceive of," says Tammy Lowe, assistant director of management information systems at Burlington Coat Factory Warehouse Corp., a discount retail clothing chain.

For several years now, Burlington has lived and died by this scheme. Its business is built on a 1.5 trillion-byte data warehouse, running on a cluster of eight superminicomputers from Sequent Computer Systems. Every day, managers all over the company tap into the system to identify top-selling styles and brands, balance regional inventories, measure the performance of one store manager against others--essentially, to learn more about practically anything that interests them that day. By comparing reams of demographic data, historical buying patterns, and sales trends in existing stores, for instance, Burlington can even determine just where to open its next store and what to stock it with. Data warehousing "is how we're growing the business. We can operate more efficiently because we're aware of exactly what's happening," says Lowe.

Thanks to that kind of lavish testimonial, data warehousing has become one of the killer applications driving demand for large-scale parallel computers. These machines, which gang together dozens or even hundreds of high-performance microprocessors, are the only computers fast enough and cheap enough to run complex analyses against billions of bytes of data. They're also able to store and process information at a tenth of the cost or less of a huge mainframe.

Take John Alden Life Insurance Co.'s new data warehouse. It's being loaded with four years of detailed medical claims--close to 150 billion bytes of information cross-indexed every which way. If all this info were analyzed on a mainframe, then comparing how hospital networks in New Jersey and Illinois, say, treat hip replacements--listing all surgical procedures, tests, and prescriptions--would tie up the machine all night. But a 24-processor IBM SP2 computer will do the job in "tens of minutes," says Sullivan B. McConnell, information systems manager. That kind of speed is critical for John Alden, which is constantly trying to measure and compare quality of care, profitability, and other factors in the various regions where it operates. "Health care is getting more complex," says McConnell. "We have to make quicker business decisions."

The vast power of parallel processing is useful in two ways. It can help scan a large database for items that meet some set of criteria--customers who have taken out a loan in the past three years, earn more than $50,000, and live in Arizona, say. In other cases, the computer can "mine" masses of data in search of unanticipated patterns and relationships. Supermarket chains regularly analyze reams of cash register data to discover what items customers are typically buying at the same time--shopping patterns that can be exploited by altering floor and shelf layouts.
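Both uses can be sketched in a few lines. The example below is a toy illustration under stated assumptions, not any vendor's product: the record layout, the function names `matching_customers` and `frequent_pairs`, and the sample data are all hypothetical. The first function scans for customers who meet fixed criteria; the second counts which items turn up together in the same shopping basket, a simple form of the "mining" supermarkets do. A real warehouse would run the same logic over billions of bytes, with the work split across many processors.

```python
from collections import Counter
from itertools import combinations


def matching_customers(customers, state, min_income, max_years_since_loan):
    """Criteria scan: names of customers in `state` who earn more than
    `min_income` and took out a loan within `max_years_since_loan` years."""
    return [
        c["name"]
        for c in customers
        if c["state"] == state
        and c["income"] > min_income
        and c["years_since_last_loan"] is not None
        and c["years_since_last_loan"] <= max_years_since_loan
    ]


def frequent_pairs(baskets, min_count):
    """Pattern mining: item pairs bought together in at least `min_count` baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so each pair is counted once per basket,
        # always in the same order.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Hypothetical sample records and register receipts.
customers = [
    {"name": "A", "state": "AZ", "income": 62000, "years_since_last_loan": 2},
    {"name": "B", "state": "AZ", "income": 48000, "years_since_last_loan": 1},
    {"name": "C", "state": "NM", "income": 90000, "years_since_last_loan": None},
]
baskets = [["ale", "crisps", "bread"], ["ale", "crisps"], ["bread", "milk"]]

print(matching_customers(customers, "AZ", 50000, 3))
print(frequent_pairs(baskets, 2))
```

The first query needs the criteria spelled out in advance; the second does not, which is why it can surface relationships nobody thought to ask about.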

To help, computer makers such as IBM, Hewlett-Packard, and Digital Equipment are rushing into data warehousing with parallel computers that are priced at $100,000 and up. Companies such as Red Brick Systems, Arbor, Prodea, Oracle, and Prism Solutions are supplying warehousing software that extracts data from mainframes, cross-indexes it for rapid searching, and finally analyzes and presents the results. With 90% of large companies now building or planning to build data warehouses, according to Price Waterhouse, the total market will likely top $20 billion by 2000.

Companies scrambling to exploit the concept include American Airlines, Citibank, and Mervyn's department stores. Bass Taverns, a division of Bass PLC, is testing a new parallel computer from Unisys Corp., called OPUS, for monitoring sales and profit trends at some 4,500 British pubs. In the past, Bass's ad campaigns were often completed before its managers could see whether or not they had paid off. Now, by analyzing sales data gathered daily from hundreds of pubs' cash registers, managers can monitor a campaign's effectiveness while it's in progress--and keep shipments of bitter ale and vinegar crisps moving at just the right pace. Says Brian R. Wilson, information technology director: "The more information, the merrier."

McKesson Corp., a wholesale distributor to drugstores, is using a data warehouse to cope with what had become a computing nightmare. Each day, the company receives orders that break down to more than 1 million line items, each of which refers to one of the 100,000 stock units it keeps on hand in 30-plus warehouses. To make matters worse, the price for a particular item is often determined by a unique deal the retailer has struck directly with the product's maker. McKesson had struggled to keep track of this morass using a handful of old mainframes. But because the data was split among disparate machines, it was difficult to view as a whole.

PERSONAL CALLING PLANS? So McKesson has turned to a parallel computer from Pyramid Technology Corp., a unit of Siemens of Germany. It consolidates data from each of the mainframes and lets managers throughout the company analyze any or all of it as they choose. Sales representatives can run a query and tell Revlon Inc., for instance, just how its cosmetics are moving in a certain region or set of retail outlets--a good way for McKesson to differentiate itself from the competition.

Perhaps the biggest data warehousers are telephone companies. MCI Communications Corp., the long-distance carrier, now sifts through 1 trillion bytes of customer phoning data to fine-tune its marketing campaigns and formulate new discount-calling plans. Ben C. Barnes, an IBM vice-president, says he's talking with an unidentified phone company that's thinking of using a 420-processor IBM SP2 computer to sift through more than 4 trillion bytes of such records. The goal: devise discount-calling plans for individual customers, not just for aggregate groups.

With competition mounting in every industry, data warehousing is the latest must-have marketing weapon--a way to learn more about customers' needs and how to hang on to them. "Everybody wants into the finance business--General Motors, General Electric, Microsoft, Charles Schwab," says Bank of America's Helms. To keep its head start, the bank is bulking up its marketing muscle with ultrafast parallel computers from Cray Research Inc., the king of supercomputing. Working in combination with the AT&T computers already in place, the Crays will grind through more data even faster--and, the bank hopes, identify more profitable sales opportunities before the competition does. And so begins yet another round of digital one-upmanship, the force that has been driving computer sales since Day One.

BUILDING DATA WAREHOUSES

Many companies are rushing to supply the specialized hardware and software needed to store masses of data

DATA PREPARATION SOFTWARE: Prism Solutions, Innovative Systems

PARALLEL COMPUTERS: IBM, Unisys, Cray Research, Siemens/Pyramid, AT&T, Sequent Computer

DATABASE SOFTWARE: Sybase, Informix, Oracle, Red Brick, AT&T/Teradata, IBM

ANALYTIC SOFTWARE: Prodea, Brio Technology, Arbor Software

By John W. Verity in New York, with Russell Mitchell in San Francisco

