The X-Files

The truth is out there: Extensible markup language delivers on its promise to simplify searching and integrating Web-based information.

Extensible markup language (XML) is touted as a more versatile version of hypertext markup language.

In this article learn

Why XML offers improved functionality

How early adopters use XML

What barriers may prevent widescale XML adoption

X Marks the Spot

Tony Hill thought he could take a slice of the growing e-commerce pie by serving as a middleman between small retailers with little or no e-commerce experience and big product suppliers. An e-commerce consultant, Hill figured that small companies would otherwise have to spend inordinate amounts of time and money extracting product information from vast supplier databases, creating Web pages to showcase products and setting up computer systems to track prices and availability. Yet Hill had to wrestle with a few problems: How could he avoid the pain and expense of revamping information delivery to retailers every time a supplier upgraded to a new computer system? And was there any way that small merchants could quickly cull specific information from supplier databases? Hill believes that extensible markup language (XML) will make his life much easier than would the traditional means of formatting Web documents, hypertext markup language (HTML). XML is an emerging markup language that promises to release developers from many of the strictures - such as incompatible browsers and the messiness of integrating applications - that can inhibit the easy exchange of data on the Web.

XML is HTML's nimble cousin, but where developers use HTML to determine how information and images look on a Web page, they use XML to describe the content of that page. In other words, XML helps developers define what the content of a page means, not just how a browser displays it. Think of XML in terms of adding a universal vocabulary and structure to the vast array of documents on the Web, and its potential becomes clear. As Hill envisions it, XML will enable the owner of a small mountainside ski shop to easily access a distributor's database with a browser and pull out information about the latest downhill racing poles. The browser (provided it supports XML) displays only the relevant information because XML can make a distinction between downhill poles, say, and cross-country poles.
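The ski-shop scenario can be sketched in a few lines of Python using the standard library's XML parser. This is a minimal illustration, not iVendor's system: the catalogue, its element names (pole, name, price) and the type attribute are all assumptions invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical distributor catalogue. Element and attribute names are
# illustrative assumptions, not a real supplier schema.
CATALOGUE = """
<catalogue>
  <pole type="downhill"><name>Racer GS</name><price currency="USD">89.00</price></pole>
  <pole type="cross-country"><name>Trail Lite</name><price currency="USD">45.00</price></pole>
  <pole type="downhill"><name>Slalom Pro</name><price currency="USD">120.00</price></pole>
</catalogue>
"""

def poles_of_type(xml_text, pole_type):
    """Return the names of poles whose type attribute equals pole_type."""
    root = ET.fromstring(xml_text)
    return [p.findtext("name") for p in root.iter("pole") if p.get("type") == pole_type]

# The shop owner asks only for downhill poles; cross-country poles are skipped.
print(poles_of_type(CATALOGUE, "downhill"))
```

Because the markup says what each item is, the filter needs no knowledge of how the page looks.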

The underlying concept of XML is extremely simple. XML-based documents and systems contain content labelled with an ID tag characterising the meaning of that content. HTML, on the other hand, tags content with codes that recommend only how the content should be viewed (see "Tag, You're It"). To compensate for HTML's deficiencies, Web developers have cooked up their own labelling schemes.

But those home-grown approaches often don't work beyond a Web site's own pages.

Kevin Werbach, managing editor of the newsletter Release 1.0 in New York City, envisions two classes of XML use. "On the client-side, XML is essentially HTML on steroids," he says, referring to the ability to accurately search content in XML-based documents. It's on the server-side where the language really flexes its muscle. "What XML can do is serve as a bridge among many different kinds of data repositories, making it easier to share information among different applications and Web sites," he says. As long as developers use the same XML tags to define data, information can be swapped between a document on the Web and a database on a mainframe. For example, in an accounts-receivable system, all companies would have to use the tag customer instead of client, consumer, patron or end user to describe those who buy products or services. Once standard definitions are agreed upon, XML can transform the Web into a way to integrate documents rather than simply access them.
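The cost of mismatched vocabularies, and one way around it, can be shown with a short Python sketch. The synonym table and the invoice documents below are hypothetical; the point is only that two partners' records become interchangeable once both use the agreed tag.

```python
import xml.etree.ElementTree as ET

# Hypothetical synonyms that different partners might use for one concept.
SYNONYMS = {"client": "customer", "consumer": "customer", "patron": "customer"}

def normalise_tags(xml_text):
    """Rename synonym tags to the agreed tag and return the XML as a string."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        element.tag = SYNONYMS.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")

# Two illustrative invoices that describe the same buyer with different tags.
partner_a = "<invoice><client>Acme Ski Shop</client><amount>250</amount></invoice>"
partner_b = "<invoice><patron>Acme Ski Shop</patron><amount>250</amount></invoice>"

print(normalise_tags(partner_a))
print(normalise_tags(partner_b))
```

A translation table like this is a workaround; the article's argument is that agreeing on the tags up front makes it unnecessary.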

Hoping to take advantage of XML's strengths, Hill hooked up with database expert Dr. Kee Ong in fall 1998 to launch Infoprise Inc. (now known as iVendor Inc.), a Redwood City, Calif.-based firm designed to help small retailers quickly set up shops online. iVendor's strategy is relatively simple: Gather vast quantities of product information from wholesale suppliers and make it readily accessible to e-commerce retailers. But the startup couldn't get off the ground without a way to quickly organise and manipulate content - from different companies and systems - on the Web. Enter XML.

XML holds the promise of allowing iVendor to do three key things with data more easily than before: sort it, share it and update it. "We realised we could really make this [business] happen," Hill recalls. "XML offers the flexibility to allow merchants to get access to the supplier's database very quickly, very easily, on a real-time basis." iVendor is currently testing its concept with a few pilot merchants and suppliers via a password-protected Web site. Yet as an early fan of XML, Hill is ushering in what many proponents say will be the greatest thing to hit e-commerce transactions since online credit card processing. With XML, a book distributor can create a database that breaks down product information into categories. Things like price, ISBN number or author are identified or tagged as separate data elements within the database, allowing the distributor to offer retailers customised views of information. A developer of a Web site about dogs can link into the book distributor's database and quickly extract and post a list of dog books instead of painstakingly sifting through the entire inventory.

Unlike HTML, which has a finite number of tags, XML allows developers to add new levels of granularity to existing information (hence the "extensible" part of its name). The book distributor can add a data element identifying wholesale prices and make that information available to retailers but not to shoppers. In effect, the book supplier can create one repository of information and present different views of data to different customers.
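The idea of one repository with several views might look like the following in Python. The book records, the wholesale_price tag and the audience labels are assumptions made for illustration; a real distributor would define its own tags and access rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical book repository; titles, prices and tag names are made up.
BOOKS = """
<books>
  <book>
    <title>Training Your Dog</title>
    <subject>dogs</subject>
    <price>19.95</price>
    <wholesale_price>11.50</wholesale_price>
  </book>
  <book>
    <title>Sailing Basics</title>
    <subject>boats</subject>
    <price>24.00</price>
    <wholesale_price>14.00</wholesale_price>
  </book>
</books>
"""

def view_for(xml_text, audience, subject=None):
    """Build a list of book records suited to the audience.

    Only the "retailer" audience sees wholesale prices. If subject is given,
    books on other subjects are left out.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for book in root.iter("book"):
        if subject is not None and book.findtext("subject") != subject:
            continue
        row = {"title": book.findtext("title"), "price": book.findtext("price")}
        if audience == "retailer":
            row["wholesale_price"] = book.findtext("wholesale_price")
        rows.append(row)
    return rows

# A dog Web site pulls only dog books, without wholesale prices.
print(view_for(BOOKS, "shopper", subject="dogs"))
# A retailer sees the full list, wholesale prices included.
print(view_for(BOOKS, "retailer"))
```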

XML has a way to go before achieving its potential as a universal tool, but its use is growing. In the third quarter of 1998, more than 16 percent of corporate users surveyed by Zona Research Inc. of Redwood City, Calif., had XML in their Web pages or applications, up from only 1 percent in the second quarter.

Vendors also have to get behind XML by offering software that supports it.

One vendor making a high-profile commitment to XML is Ariba Technologies Inc. of Sunnyvale, Calif. The software company recently announced its support for an XML-based e-commerce initiative aimed specifically at exchanging catalogue content and conducting transactions on the Web. Other companies involved include vendors Vignette Corp. and Poet Software as well as online retailers Staples and

New Opportunities

Boston-based investment bank Adams, Harkness & Hill Inc. (AH&H) was attracted to XML in large part for the ability to port information to a variety of display devices, such as handheld computers and laptops. Many of AH&H's 125 institutional clients rely on a variety of ways to get reports including fax and mailed hard copies. Increasingly, AH&H expects that customers will demand fresh research as well as access to older reports. "We wanted to be able to go down a path that was highly customisable," says Steven Frankel, AH&H's managing director. Yet Frankel didn't want to overload the three-person IS staff with additional tasks. So AH&H looked for a way to get information from a Lotus Notes system used to distribute research internally onto a Web-based extranet for clients.

AH&H chose an XML-based Web content management and publishing system from Inso Corp. of Boston that allows clients to request documents via a password-protected Web site. (The system converts XML documents into HTML appropriate for different browsers and devices.) The project, which was completed in six months at a cost of about US$ 50,000, gives the investment firm a way to make a painless transition to instant Web publishing, according to Frankel.
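Converting one XML document into HTML suited to different devices can be sketched as follows. This is not Inso's product or AH&H's configuration; the report structure, the device names and the render function are hypothetical, chosen only to show a single source feeding two presentations.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical research report; tag names and content are illustrative.
REPORT = (
    "<report><title>Q3 Outlook</title><analyst>J. Doe</analyst>"
    "<body>Demand remains steady.</body></report>"
)

def render(xml_text, device):
    """Return HTML for the report: a compact fragment for handhelds,
    a full page with the analyst's name for everything else."""
    report = ET.fromstring(xml_text)
    title, analyst, body = (
        escape(report.findtext(tag)) for tag in ("title", "analyst", "body")
    )
    if device == "handheld":
        return f"<h1>{title}</h1><p>{body}</p>"
    return (
        f"<html><head><title>{title}</title></head><body>"
        f"<h1>{title}</h1><p><em>{analyst}</em></p><p>{body}</p></body></html>"
    )

print(render(REPORT, "handheld"))
print(render(REPORT, "desktop"))
```

The content is written once; only the last step, the rendering, varies by device.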

Chet Ensign, director of electronic and editorial information technology at legal publisher Matthew Bender and Co. Inc. in New York City, is helping the company build a cross-reference system with XML. The system will allow customers to click on a highlighted citation in an electronic document to view such informational goodies as legal opinions, similar cases and case histories regardless of their source. In this capacity, XML can help Bender strengthen its relationship with new corporate siblings Lexis-Nexis and Shepard's Co.

Already, XML has come in handy relieving an editorial headache. Each week, freelancers write summaries of bankruptcy case decisions, highlighting three or four key legal points and sending them to Bender editors. The editors compile the summaries in an Adobe Acrobat file for posting on Bender's Web site and for e-mailing to their customers. On Fridays the editors used to have a problem. In addition to the weekly bulletins, they wanted to issue a monthly Web and print report for customers, recompiling the data by legal points. Creating a prototype that involved cutting and pasting reports together consumed 60 hours of labour. "The time was just for the mechanics of producing the final publication, not for enhancing editorial content," says Ensign.

With an XML template that includes tags for such elements as judge and jurisdiction, Bender automated the process. Writers use the template to enter their summaries into text files and e-mail them to editors who store them in a folder. On Friday morning, editors run a script program that automatically organises the summaries into a single document based on the XML tags.
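A script of the kind described might resemble the Python sketch below. It is a guess at the general approach, not Bender's actual program: the summary template, the tag names other than judge and jurisdiction, and the placeholder entries are all assumptions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Placeholder summaries standing in for the freelancers' files.
# The point and text tags are assumed; the source names only judge and jurisdiction.
SUMMARIES = [
    "<summary><judge>Judge A</judge><jurisdiction>District 1</jurisdiction>"
    "<point>automatic stay</point><text>Summary one.</text></summary>",
    "<summary><judge>Judge B</judge><jurisdiction>District 2</jurisdiction>"
    "<point>preference claims</point><text>Summary two.</text></summary>",
    "<summary><judge>Judge A</judge><jurisdiction>District 1</jurisdiction>"
    "<point>preference claims</point><text>Summary three.</text></summary>",
]

def compile_by(summaries, tag):
    """Group summary texts under the value of the given tag."""
    grouped = defaultdict(list)
    for xml_text in summaries:
        summary = ET.fromstring(xml_text)
        grouped[summary.findtext(tag)].append(summary.findtext("text"))
    return dict(grouped)

# The same files can be recompiled by legal point or by judge.
by_point = compile_by(SUMMARIES, "point")
by_judge = compile_by(SUMMARIES, "judge")

for heading, texts in by_point.items():
    print(heading)
    for text in texts:
        print("  " + text)
```

Regrouping by a different criterion means changing one argument rather than cutting and pasting by hand.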

Building the automation system took about a month and cost less than US$ 15,000, Ensign says. With it, Bender launched weekly Web-based bulletins and monthly print reports using previously unwieldy information. Ultimately, Bender will take advantage of the real power of XML by allowing the creation of new products such as reports that mix and match summary criteria from existing information.

Kinks and Standards

Although Microsoft and Netscape offer browsers that support XML, there will be an installed base of older browsers that can't take advantage of XML's power, says Release 1.0's Werbach. While there are tools that essentially translate XML into HTML for browser viewing - iVendor tapped RivCom Inc., a publishing services company based in New York City and Swindon, United Kingdom, to provide such a tool - Werbach believes client-based applications will lag server-based applications for the next year or so.

In a similar vein, implementing XML is difficult when dealing with legacy systems. iVendor assigned three engineers the task of translating data from one supplier's database into XML, moving the data into iVendor's database and creating a Web front end.

Then of course there are issues concerning standards. For XML to work, organisations need to establish common ground rules governing precisely how to tag content and present the data in Web browsers. Any barriers now may slow widescale adoption of XML in the short term but won't stop it in the long term.

"Time to market in the Internet space is a killer," says iVendor's Hill. "We just couldn't compete if we had to go out and individually install or download software for each merchant we work with or integrate our system with every supplier."

Tag, You're It

XML's power starts with its tags

The World Wide Web Consortium (W3C) in Cambridge, Mass., formally approved a standard definition for extensible markup language, or XML, in February 1998.

For a basic understanding of XML, first consider how hypertext markup language (HTML) - the current lingua franca of the Web - works. With HTML, a Web designer marks text, images and other content on a Web page with a set of tags that say nothing about the meaning of the content; they just suggest how to display the content through a Web browser.

Imagine you have a Web site peddling shoes. You are holding a sale on girls' running shoes, pricing them at US$ 49. You want to highlight the price by using boldface type. In HTML, the tag accomplishing this is <b>US$ 49</b>. With HTML tags, the number has no context. The page listing the price for girls' running shoes might turn up if a shopper searched for information on the San Francisco Gold Rush.

With XML, a developer can signify that US$ 49 is a price by labelling it with tags like <price>US$ 49</price>. Now a search engine looking for a price can find it more readily. For more precise searching, a developer can even sandwich the number between two tags.
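The difference between searching text and searching tags can be demonstrated in Python. The page below is invented for the example, and the tag names (product, price, note) and the currency attribute are assumptions rather than any standard vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical shoe-store page. The number 49 appears twice, but only once as a price.
PAGE = """
<page>
  <product>
    <name>Girls' running shoes</name>
    <price currency="USD">49</price>
  </product>
  <note>Free shipping on orders placed within 49 hours</note>
</page>
"""

def text_hits(xml_text, needle):
    """Return the tags of all elements whose text contains needle,
    mimicking a search that ignores what the content means."""
    root = ET.fromstring(xml_text)
    return [el.tag for el in root.iter() if el.text and needle in el.text]

def prices(xml_text):
    """Return (currency, amount) pairs for elements tagged as prices."""
    root = ET.fromstring(xml_text)
    return [(p.get("currency"), p.text) for p in root.iter("price")]

print(text_hits(PAGE, "49"))  # matches the price and the unrelated note
print(prices(PAGE))           # matches only the price
```

A plain text search cannot tell the two occurrences apart; a search on the tag can.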

Unfortunately, precise tags raise one of XML's potential problems. Should a developer call an item a running shoe? How about a sneaker or a tennis shoe? The computer systems of business partners need to use the same tags in order to swap information with ease. Certain industries have already created their own dialect of XML that contains industry-specific identifying tags. The chemical industry, for example, has developed chemical markup language, or CML.

Since XML allows for many possibilities, more standards related to its use need to be introduced or the Web will turn into a Tower of Babel. Luckily, consortiums such as W3C and CommerceNet in Palo Alto, Calif., are working on them.

Susan E. Fisher is a freelance writer in Chicago.
