Data integration is critical to all large systems projects, yet it is almost always grossly underestimated, even by the most experienced IT and business professionals.

Air New Zealand’s deputy chief information officer Andrew Care agrees. Despite having best-practice documentation procedures in place for its IT systems, the airline was surprised by gaps in its understanding of source systems and data during its enterprise data warehouse (EDW) implementation. Clearly, even the best-prepared and best-managed companies endure the traumas of data integration; application integration is almost always the easy bit.

Managing director of data management consultancy MIP, Steve Hitchman, says all large systems projects require a substantial investment in data integration. Customer relationship management (CRM), enterprise resource planning (ERP) and other major business systems implementations require data to be migrated and integrated for the new system to function properly.
Similarly, systems integration projects – including supply chain management, portal deployment and EDW/business intelligence – require a major focus on data management. Each and every one of these implementations is critically dependent upon the quality and reliability of the underlying data for success.
“Some of the consequences are catastrophic and go well beyond the cost overrun,” says Hitchman. “The potential to make large numbers of customers deeply unhappy and cause long-term harm to the business is very real.”
Gartner Group vice president Jeff Comport emphasises the point in his research note Metadata: Key to application definition and management.
“Intrinsic to most new enterprise application projects is the need to manipulate large amounts of information from legacy and new sources. For example, in CRM projects, even though the information is diverse, it must be comprehended and used in a consistent fashion if the overarching CRM objective of unified, customer-centric processes and data is to be achieved. The failure of a package or project to have well-organised and malleable metadata leaves the project swamped in data detail and prone to trial-and-error and re-work.” He says the technical connectivity is readily achievable but the data integration is not.
In Hitchman’s experience, companies typically invest 15 per cent of project budgets on software acquisition. “The rest will be primarily an investment in the data integration process, the bit that is usually grossly underestimated or indeed totally overlooked,” he says. “They don’t realise the enormity of the data problem until they start to bring data together from the various source systems.”
Why then is this seemingly obvious element of systems projects neglected? Hitchman says the size of the problem is underestimated because it’s hidden. Typically, legacy operational systems are stable and support a narrow set of end-user functional requirements. This ‘stability’ often masks a plethora of potential data problems lurking deep in the bowels of the database.
“Most CIOs won’t admit to having a data quality problem, yet when asked about the contingency they allocate to data migration issues, it’s often more than 40 per cent of the total project budget,” says Hitchman. They don’t know that they have a problem, but their experience tells them to allow for one. Often the problem is not acknowledged, “until you get millions of dollars of overrun and you suddenly realise ‘my systems don’t talk’”.
Air New Zealand’s Care has been involved in a number of data integration activities specifically supporting the airline’s EDW. These have been driven by the need to better understand measures including revenue, customer loyalty, market share and bookings, with data being sourced from its internal transaction processing systems, ASP services and external sources.
Care says the increased visibility of data through the EDW project has identified data quality and completeness issues within Air New Zealand’s source data structures. These included:
Data not aligning with system specifications.
Limited detailed knowledge of the application systems.
Outdated application system documentation.
The need for business process changes to improve the quality and completeness of data.
Hitchman says these are typical findings. “Everyone’s documentation is out of date – absolutely guaranteed – and when you make your first pass at data integration, the result is what we call ‘code, load and explode’. You need to analyse all the data in each of the systems before you start loading it. The most common approach is to buy an ETL [extract, transform, load] tool. The first time you run it, it falls over. You then correct the faults you find and run it again. This can be a time-consuming and expensive process as you iterate through the data.”
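The analysis Hitchman recommends before any load amounts to profiling the source data against its expected shape. A minimal sketch, in Python, of that idea – the column names, expected types and sample rows here are illustrative assumptions, not any airline’s real schema:

```python
# A minimal sketch (not any vendor's ETL tool) of profiling source data
# before loading, so faults surface before the "code, load and explode" run.
from collections import Counter

def profile(rows, expected_types):
    """Count nulls and type mismatches per column before any load."""
    issues = Counter()
    for row in rows:
        for col, expected in expected_types.items():
            value = row.get(col)
            if value is None or value == "":
                issues[f"{col}: missing"] += 1
            elif not isinstance(value, expected):
                issues[f"{col}: expected {expected.__name__}"] += 1
    return dict(issues)

# Hypothetical booking records showing typical legacy-data faults.
rows = [
    {"booking_id": 1, "fare": 250.0, "origin": "AKL"},
    {"booking_id": 2, "fare": "n/a", "origin": ""},  # bad type, missing value
    {"booking_id": 3, "fare": 180.5},                # column absent entirely
]
print(profile(rows, {"booking_id": int, "fare": float, "origin": str}))
```

Running the profile over every source system first turns each ETL iteration from a crash into a report, which is the substance of Hitchman’s advice.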
“Going into the project, we felt we had a sound understanding of our internal data sources. Certainly, this knowledge was sufficient to use and maintain the original system,” says Care. “However, in hindsight, it was clearly insufficient for disassembling the actual processing activities.” And this is despite the fact Air New Zealand has worked hard to maintain information about its data. Technical metadata about lineage, documentation of transformations and information enhancement has always been maintained.
Gartner Group VP and research director Michael Blechar defines metadata as “... an abstracted level of information regarding characteristics of an artifact, such as its name, location, perceived importance, quality or value to the enterprise and its relationships to other artifacts”. Put simply, metadata is data about data.
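Blechar’s definition translates directly into a record structure. A sketch of one such record in Python – the field names follow his definition in the text, while the example values are hypothetical, not a real product schema:

```python
# A sketch of Blechar's definition as a record: data about a data artifact.
# Field names mirror his definition; the values below are illustrative only.
from dataclasses import dataclass, field

@dataclass
class ArtifactMetadata:
    name: str
    location: str            # where the artifact lives
    importance: str          # perceived importance to the enterprise
    quality: str             # assessed quality or value
    related_to: list = field(default_factory=list)  # links to other artifacts

fare_column = ArtifactMetadata(
    name="fare",
    location="bookings_db.tickets.fare",
    importance="high",
    quality="unverified",
    related_to=["revenue_fact.fare_amount"],
)
print(fare_column.name, "->", fare_column.related_to)
```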
Blechar says metadata management is a critical success factor in enabling and supporting service-oriented development of applications (SODA). “One would not dream of running a large manufacturing plant without knowing a number of key pieces of metadata, such as which components are needed to construct a given product, or where these parts are inventoried. However, many organisations have little similar information regarding their computer applications.”
For example, if an EDW database is to be truly shared, there must be understanding and agreement about its contents. End-users of the data warehouse will want to know the source of the data element and the transformations made to it by an ETL tool.
Blechar says of the metadata maintenance process, “If a data element needed to be changed, we would need to know which components and services might be affected. Moreover, the basis for a reuse program is contingent on metadata understanding and management. How else are re-useable services and components located and changes coordinated across the enterprise and the customer and supply chain partnerships?”
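The impact analysis Blechar describes is a graph walk over relationship metadata. A hedged sketch, assuming the dependency map itself has already been captured (the artifact names below are invented for illustration):

```python
# A sketch of change-impact analysis over relationship metadata: given which
# artifacts feed which, find everything a change to one element touches.
# The dependency map is illustrative, not drawn from any real system.
from collections import deque

def affected_by(change, depends_on):
    """Breadth-first walk of the downstream dependants of a changed element."""
    hit, queue = set(), deque([change])
    while queue:
        current = queue.popleft()
        for artifact, sources in depends_on.items():
            if current in sources and artifact not in hit:
                hit.add(artifact)
                queue.append(artifact)
    return hit

depends_on = {
    "revenue_fact": ["tickets.fare"],
    "loyalty_report": ["revenue_fact", "members.tier"],
    "exec_dashboard": ["loyalty_report"],
}
print(sorted(affected_by("tickets.fare", depends_on)))
```

Without the `depends_on` map – that is, without managed metadata – the same question can only be answered by trial-and-error and re-work, which is exactly the failure mode Comport warns of.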
At Air New Zealand, business metadata has increased in importance as the EDW has gained critical mass. Care says business metadata has always been captured as part of the development lifecycle – in training and user documentation. The airline is now examining ways to distribute this better to the user community, and is developing a friendlier metadata repository.
“We’re currently preparing for the next wave of activity with the data warehouse,” says Care. “We’re looking to improve our practices in deploying business metadata and making it more effective in decision support activities; and embedding improved data quality processes so that quality issues are identified and resolved before we progress significantly into the development process.”
Edwin Bruce, delivery manager for the e-government unit of the State Services Commission in Wellington, New Zealand, has run what he calls the “Government services portal project” for the past 12 months. Government services metadata has been critical to the success of this portal project. The metadata standard, developed by the government, is called the NZ government locator standard.
“The metaphor we use is that we not only describe government services, but we also describe the associated documentation, forms, websites and advisory services,” says Bruce. “It’s a standard for describing things for discovery purposes.”
The project is about enabling real people to easily find government services. That requires a standardised way of collecting information from government agencies for delivery through the portal. Consistency and clarity of information are therefore critical.
The process for metadata collection is unusual. Sara Barham, the portal project information manager, asks the 90 government agencies involved in the project to collect and maintain the metadata using a creation tool supplied by the project team. The data is stored in a central database with a centralised quality assurance program such that no metadata is released to the portal without it being reviewed by the team. “We chose this model because our focus is on maintaining a very high level of quality from a citizen’s perspective, yet we wanted the agencies to take ownership of the metadata.”
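The review gate Barham describes – agencies author the metadata, but nothing reaches the portal until the central team passes it – can be sketched as a simple validation step. The field names and rules below are illustrative assumptions, not the actual NZ government locator standard:

```python
# A minimal sketch of a centralised QA gate: agency-submitted metadata is
# held back until it passes review. Fields and rules here are hypothetical,
# not the real NZ government locator standard.

REQUIRED = ("title", "description", "agency")

def qa_review(record):
    """Return a list of problems; an empty list means the record can be released."""
    problems = [f"missing {f}" for f in REQUIRED if not record.get(f)]
    if len(record.get("description", "")) > 300:
        problems.append("description too long for citizen-facing display")
    return problems

submission = {"title": "Passport renewal", "description": "How to renew...", "agency": ""}
print(qa_review(submission))
```

Keeping the rules in one place is what lets the team hold “a very high level of quality from a citizen’s perspective” while agencies retain ownership of the content itself.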
There have been a number of welcome but unplanned benefits from this process. Bruce says one result of having a centralised, highly structured repository of metadata has been the ability to use it to create a ‘cookbook’ for building portlets that are re-useable by others. Instead of having to build their own infrastructure, users are able to use these portlets to deliver a range of online services.
“Metadata is transferred to the portal through a variety of mechanisms and basically dished up to a new audience in a different way with wrappers around it. So we can roll out new portals in this country quickly and cheaply,” he says.
“We can now very easily look at government from a service, a function or a subject-based perspective, without having to understand the structure,” says Bruce. Citizens can thus locate government services without understanding the structure of government and its agencies. He says anyone wanting to analyse government services from a service or process improvement point of view now has the tools and the data to do so with ease.
Barham says the major challenge has been cultural change, with agency staff having to think about and describe their services very differently. It has also forced agencies to understand and analyse their services, prompting many to improve their service delivery model.
Barham says the overview of government developed through the project is unique, and has been an opportunity to bring a large number of decentralised government agencies together to share experiences and knowledge.
Bruce says the next step is application integration. “Now that we have highly structured services and an integrated dataset, we can start looking at opportunities to improve how the management and delivery of those services should be hooked up and start working with agencies to build applications that integrate some of those services.”