The E-Business Software Weekly is a series profiling trends and developments in software and applications that support e-business, the Internet, and other electronic communication channels. Look for a new story each week in this space.
Demystifying XML
When the Internet first emerged into the corporate consciousness several years ago, it was initially seen as a novelty, a technological treasure chest, a desktop library that the research-minded might find useful but that was not expected to have any more effect on corporate strategy than a pocket calculator.
That philosophy lasted for, oh, about three seconds.
Before long, the Internet was everything. Corporations whose executives a few months before had hardly heard of the Internet suddenly had to have a Web site. Marketing departments still struggling to spell "HTML" were trotting out massive online brochures, while their colleagues in sales were retrofitting their spreadsheets to account for the billions of dollars--no, trillions of dollars--that would soon fall like electronic pennies from heaven into their waiting laps.
Well, they weren't really wrong. Companies like Amazon, eBay, and Travelocity truly transformed the way that business is conducted, not just in the American economy, but worldwide. Other corporations, from Dell to Cisco to Ford, ultimately have made or saved billions of dollars on the Internet. The hypemeisters were right: the Internet was big business, and it is going to be bigger still in the years to come.
But the Internet wasn't everything. It wasn't easy money, a worry-free, build-it-and-they-will-come business engine of its own. The Internet was a channel, and little more. The companies that succeeded on the Internet took this understanding to heart: the Internet didn't rewrite the rules of business, it only recast them into a new dialect that made it possible to speak more efficiently to billions of people around the world. Quaint corporate niceties like a desirable product, a robust business model, and a reliable fulfillment process, it turned out, mattered just as much on the Internet as they did at the local shopping mall.
The Promise of XML
Now, as the economy continues to suffer from the implosion of Internet dreams, comes talk of the new corporate panacea: XML. You can hear the shouts of the early Internet era all over again. A random selection of corporate publications in recent months has referred to XML as "the Morse Code of data," "the universal integrator," "the lingua franca of the Internet," and "the great enabler of electronic commerce." Michael Molinski, writing in BusinessWeek, explained this phenomenon simply: XML "is fast becoming the new language of the Internet--allowing for easy exchange of data, as HTML allows for the fixed display of info on the Web."
Coco Jaenicke, a software development manager and one of the Web's most articulate advocates of XML, puts the business case enthusiastically in The XML Journal. Because of XML, she writes, "integration can happen anywhere: between systems, applications, and even businesses. Integrating two (or more) entities does more than improve the abilities of each of them. It allows businesses to achieve higher-level goals. Or, put mathematically, the sum of the whole is greater than the sum of the parts. For example, instead of focusing on a single inventory system, you can strategize for a more efficient supply chain. Instead of developing a good app, such as a customer retail site or help center, you can manage the entire customer experience."
Listening to this promise, any good, profit-minded business executive can scarcely be blamed for asking, "Where do I sign up?"
In fact, many have. Blue-chip companies from Wells Fargo to First Union, from IBM to Nokia--and hundreds of others--have major XML projects underway. For instance, in a joint project with SABRE Group (the folks behind Travelocity) and IBM, Nokia is creating a cell-phone based travel information system that should ease the lives of corporate travelers everywhere. As one write-up describes the new system, "Whether a traveler has had a meeting rescheduled or is running late for a flight, this first-of-its-kind technology gives the ability to make new travel arrangements while on the road. Travelers can use the phone's graphical display to request flight details, change a flight, or search for alternatives--from an office or hotel, or even on the way to the airport."
It's all made possible by XML.
The Nokia experience is hardly an anomaly. XML--like the browser-based Internet before it--is in fact truly powerful stuff. But, just like the Internet, XML must be done right if it is going to deliver on its promise of efficiency, e-business, and universal integration. And, in the rush to "XMLify" everything that connects to the Internet, one must wonder: is that really taking place?
The Genius of XML
Coco Jaenicke explains a key benefit of XML--data integration--in this way. "Before XML," she writes, "if two companies wanted to connect electronically, they had to get a one-to-one T1 line, develop a proprietary protocol, and hard-wire a system that was impervious to change. Needless to say, this 'EDI generation' was limited to companies that could afford to make such a large investment. Or ones that could count on the strategic importance of an unchanging relationship for many years in the future." Because of the way in which XML formats data, however, "an XML-based solution supports dynamic change. Organizations can move away from having a few key connections set in stone to having an entire trading network that can fluidly change based on business criteria."
Such a description makes XML sound more complicated than it really is. In fact, the principle behind XML is quite simple.
Prior to the invention of HTML (HyperText Markup Language) by Tim Berners-Lee and Anders Berglund in the late 1980s, Internet-based communications were largely limited to straight Courier text. For the first time, HTML introduced document formatting. For example, <h1>This is a heading</h1> would be displayed within a Web browser in the default heading style (e.g., large and bold) as opposed to standard body text. Other formatting "tags" would specify position, graphics, and spatial relationships among units of text. In other words, HTML did for Web-based documents what desktop publishing programs like PageMaker and QuarkXPress did for ordinary word-processing documents.
But transformative as it was, HTML suffered (and continues to suffer) from a number of significant limitations. Aside from its graphical shortcomings (HTML, for instance, natively supports only about half a dozen different typefaces), it is restricted to one universal set of document tags; Web site creators cannot invent new document tags, like a tag representing a new typeface, because the standard Web browsers will not recognize the new tags or be able to act upon them. In addition, HTML supports only one view of data--specifically, the view embodied in computer-based Web browsers. As a result, HTML-formatted documents will not display correctly on other Web devices, like Web-enabled cell phones or television sets.
But the most important deficiency of HTML is that it is "content unaware." In the same way that your dog can't tell the difference between a $1 million check and an old piece of newspaper, HTML can't tell the difference between, say, a customer name in a database file (e.g., John Smith) and a heading in a news article (e.g., John Smith wins the lottery). Consequently, there is no way for a computer program to automatically "extract" the relevant data from an HTML document, and hence no way to exchange such data among applications or databases.
The genius of XML (or eXtensible Markup Language), in the most basic terms, is that it introduces new, user-definable tags that describe, not the formatting of a document, but its content. And so a document tag called <price> might be created that could be used to designate certain groups of numbers as product prices: <price>6.97</price>. A computer script thus could be written that would enable a quote-generation program running on one computer system to automatically extract the pricing data from a price sheet resident on another computer system, and hence automate a necessary but time-consuming business process, instantly enhancing a company's efficiency. This simple step would have been impossible, however, with ordinary HTML.
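To make the idea concrete, here is a minimal sketch of such a script in Python, using the standard library's XML parser. Only the <price> tag comes from the example above; the <pricesheet>, <item>, and <name> tags, the sample products, and the extract_prices function are assumptions invented for illustration, not part of any real price-sheet format.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical price sheet. Only <price> appears in the article;
# every other tag and value here is an illustrative assumption.
PRICE_SHEET = """
<pricesheet>
  <item>
    <name>Widget</name>
    <price>6.97</price>
  </item>
  <item>
    <name>Gadget</name>
    <price>12.50</price>
  </item>
</pricesheet>
"""


def extract_prices(xml_text):
    """Return {item name: price} for every <item> in the sheet."""
    root = ET.fromstring(xml_text.strip())
    prices = {}
    for item in root.iter("item"):
        name = item.findtext("name")
        # The tag, not the position or the typeface, says this is a price.
        prices[name] = Decimal(item.findtext("price"))
    return prices


if __name__ == "__main__":
    for name, price in extract_prices(PRICE_SHEET).items():
        print(name, price)
```

A quote-generation program could call extract_prices on a sheet fetched from another system and multiply the result by an order quantity. It never needs to know how the sheet looks on screen, which is precisely what an HTML page cannot offer.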
Intelligent Implementation
That's the good news. And it is good news indeed. But it's not without some serious complications. As Robert Worden writes on XML.com, "XML is now the world standard platform for e-business transactions. However, in any business application, XML itself is not the answer. It is only a standard foundation on which answers can be built. In that lack of prescription lie both the power, and the danger, of XML in e-business."
One key problem is that XML is a little like the Chinese language: there isn't one XML, but a number of different, and competing, dialects--a multiplicity of linguistic riches that has the potential to impede rather than simplify data integration and communication. Over the long term, this XML standards competition probably will not be a major difficulty. As Scott Hebner, program director for Java and e-business technology marketing in the IBM Software Group, tells SoftwareMag.com, "The standard of XML is not going to be left in the hands of technologists--it's driven by business people creating business relationships. Therefore, it's much less likely that you will have fragmentation of the XML standard."
Still, in the near term, using XML in any data- or application-integration project requires intelligent implementation--a careful assessment not just of the project's current goals, but of its future extensions and applications as well.
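The dialect problem is easy to see in miniature. In the sketch below, two imaginary suppliers describe the same product with different tags, and a receiving program needs one hand-written mapping per dialect before it can treat the two documents as equivalent. Both vocabularies, the DIALECTS table, and the normalize function are hypothetical; no real XML standard is being quoted.

```python
import xml.etree.ElementTree as ET

# Two hypothetical vocabularies stating the same fact.
SUPPLIER_A = "<item><name>Widget</name><price>6.97</price></item>"
SUPPLIER_B = "<product><title>Widget</title><unitCost>6.97</unitCost></product>"

# One mapping per dialect, keyed by root tag:
# common field name -> the tag that dialect uses for it.
DIALECTS = {
    "item": {"name": "name", "price": "price"},
    "product": {"name": "title", "price": "unitCost"},
}


def normalize(xml_text):
    """Translate a document in any known dialect into one common dict."""
    root = ET.fromstring(xml_text)
    mapping = DIALECTS.get(root.tag)
    if mapping is None:
        # A dialect nobody has mapped yet cannot be read at all.
        raise ValueError(f"unknown dialect: <{root.tag}>")
    return {field: root.findtext(tag) for field, tag in mapping.items()}


if __name__ == "__main__":
    print(normalize(SUPPLIER_A))
    print(normalize(SUPPLIER_B))
```

Every new dialect means another entry in the table, written and maintained by a person who understands both vocabularies. That is the cost a shared standard is meant to remove.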
A second complicating factor is what is commonly termed the "N-way problem." Robert Worden points out that "XML today is in a position similar to that of relational databases twenty years ago. The relational data model does not pre-define how you will store your data; it gives you a standard foundation, leaving you to choose how to use it to store data. You must define the tables and columns. What relational databases do for information storage," he goes on, "XML does for transmission of information over the Net."
All well and good. But here's the rub. Shortly after relational databases appeared in the early 1980s, companies became entranced with their power, and such databases proliferated. The process, recalls Worden, went like this: "As companies built tens and hundreds of new databases, the same data were stored in many different databases with redundancies, overlaps, and inconsistencies. The result today inside many big companies is information chaos--a corporate spaghetti of system-to-system links for data exchange, application integration costs as high as 40% of the total IT budget, and, above all, delays in building vital new applications."
The problem with such system-to-system interchanges between relational databases, Worden explains, is that, "if you have N databases, the number of possible data interchange links can grow as N squared. With even 30 different databases--and most companies have more than that--there are nearly 1,000 possible links. If even a small fraction of these interfaces have to be built, maintained, and understood, the resulting complexity will be unmanageable." Indeed, "twenty years after the start of the relational era, we still have not solved the complexity problems it created." The threat to XML is that, because it too leaves every adopter free to define its own structures, its implementation could be imperiled by the same complexity, not just within companies, but across all companies linked by an XML interchange.
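The arithmetic behind Worden's figure is worth a quick check. The sketch below counts point-to-point links under two assumptions that are ours, not his: that each database may need a separate one-way interface to every other database, or, more conservatively, that a single two-way interface per pair is enough. The function names are illustrative.

```python
def one_way_links(n):
    """Interfaces needed if every database sends to every other one."""
    return n * (n - 1)


def two_way_links(n):
    """Interfaces needed if one shared link per pair is enough."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    for n in (5, 10, 30, 60):
        print(n, one_way_links(n), two_way_links(n))
```

For 30 databases the counts are 870 one-way links or 435 two-way links, in line with the "nearly 1,000" of the quotation under the one-way reading. Doubling the number of databases to 60 roughly quadruples both figures, which is the real point: the burden grows with the square of N, not with N.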
A third challenge goes to the heart of the XML value proposition. XML is, as advertised, a simplification tool, but what it simplifies is the exchange of information, not the work of setting up the interchange. As Clay Shirky, a professor of new media at Hunter College, writes, "Designing a good format using XML still requires human intelligence." In fact, "good XML takes more work [than traditional database integration] because it requires a rigorous description of the problem to be solved, and its . . . extensibility only works if the basic framework is sound." The result: "XML does not mean less [development] pain. It does not remove the pain of having to describe your data; it simply front-loads the pain where it's easier to see and deal with." All this offers up one more reason for intelligently implementing XML. Its payoff, explains Professor Shirky, "only comes if XML is rolled out carefully enough at the start to lessen day-to-day difficulties once the system is up and running."
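What "a rigorous description of the problem" means in practice can be sketched in a few lines. The example below writes down, before any data is exchanged, which fields an item must carry and what kind of value each must hold, and then checks incoming documents against that description. The REQUIRED table, the tag names, and the problems function are assumptions made for illustration; a production system would more likely rely on a formal schema language than on hand-written checks like these.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal, InvalidOperation

# The "front-loaded" description: required tags and a converter that
# must accept each value. Hypothetical fields, chosen for illustration.
REQUIRED = {"name": str, "price": Decimal}


def problems(xml_text):
    """Return a list of problems found; an empty list means the item passes."""
    root = ET.fromstring(xml_text)
    found = []
    for tag, convert in REQUIRED.items():
        text = root.findtext(tag)
        if text is None or not text.strip():
            found.append(f"missing <{tag}>")
            continue
        try:
            convert(text.strip())
        except (InvalidOperation, ValueError):
            found.append(f"<{tag}> is not a valid {convert.__name__}: {text!r}")
    return found


if __name__ == "__main__":
    print(problems("<item><name>Widget</name><price>6.97</price></item>"))
    print(problems("<item><name>Widget</name><price>cheap</price></item>"))
```

The work of deciding what belongs in REQUIRED is the pain Shirky describes. The payoff is that a malformed document is rejected at the door rather than discovered weeks later inside a quote or an invoice.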
A Superior Solution
Listening to the promises of XML--its ability to automate data-exchange processes that previously required countless hours of point-solution customization--it's hard not to get excited by what lies ahead. And with good reason: like the next-generation Internet itself, XML ultimately will triumph because it is a vastly superior solution to what has come before. But companies wishing to secure the benefits of XML in the near term need to realize that, as with the Internet, XML is not easy money. Its payoff can be huge. But getting to that point, as in so much of business, will require planning, intelligence, and a lot of hard work.