Around the table
* John Ansley, informatics head, Asia-Pacific, Roche Pharmaceuticals
* Michael Bark, IT manager, Noni B
* Chris Clark, group general manager IT, Brookfield Multiplex
* Richard Deck, CIO, Luxottica
* Barry Dinham, IT manager, Liverpool City Council
* Colin de Kantzow, chief technology officer, Baptist Community Services
* Angus Jones, manager, enterprise products and services marketing, HP
* Adam Levin, IT manager, Stryker Australia
* Peter O'Donovan, IT operations manager, Asia-Pacific, Travelex
* Chris Jenkins, online editor, MISAustralia.com
* Paul Smith, editor, MIS Australia - moderator
OK, who's responsible?
Colin de Kantzow: I steer clear of the expression "business continuity" because that's a business question. From the IT point of view, we deal with service restoration. There is a lot of support at an executive level to ensure we can do that.
We also work with the business to provide continuity, but that's about us helping them travel the journey of continuing their business even when they don't have services, rather than IT being responsible for making sure the business continues - because I don't see that as our role.
Peter O'Donovan: We have recently - over the past 12 months - done a fairly major piece of work around our disaster recovery (DR) and crisis management processes. I think we've probably made some fairly good inroads into institutionalising some best practices around that as well.
We've done a lot of work with the business guys to take ownership of their crisis management plans. Getting them to understand their critical functions and then getting them to turn that understanding into something that led to a risk profile that let us drive our disaster recovery [plan] from an IT perspective was important.
Adam Levin: I'm clear about differentiating that there's business continuity planning (BCP) and then there's disaster recovery.
IT disaster recovery is our responsibility and BCP is the business's responsibility. We've initiated putting a project team together with the business, and they are responsible for coming up with the BCP, which is then consolidated into a manual and stored off site - so that if anybody has a look at it ... they can follow the procedure as an individual.
John Ansley: Where you don't have those clear delineations and the business thinks IT owns business processes because our systems support them, I think that's where it gets messy.
What I've encouraged the company locally and regionally to do is to have business process owners who form an improvement group, and those are the guys and girls around the region who own this as a continuity plan.
Governance, other drivers
Adam Levin: Business continuity planning is a big issue for us because we're listed on the New York Stock Exchange, so there's a Sarbanes-Oxley [regulatory] component. That in itself adds a lot of complexity to the plans that you have in place.
From an IT perspective, disaster recovery is kept separately from the BCP but an auditor coming onto the site would want to see both plans. That's added a bit of complexity, because from what I'm hearing, it's [about] getting the business to take ownership and responsibility for making sure their unit is able to recover in a disaster. It's not our responsibility as an IT division to do that; it's up to them.
Chris Clark: BCP and disaster recovery had died down as a high-profile issue from its post-September 11 levels, and in a way, I'm kind of lucky because, with the recent Brookfield purchase, Sarbanes-Oxley is something that we've got to go through now. So the need to be compliant has brought these things back onto the agenda.
Barry Dinham: From the auditing point of view, PricewaterhouseCoopers will send in their junior staff to go through their little checklist and it is a driving force for us to pass these things. We are a community and public service organisation, so perception is as [important] as reality sometimes. You don't want bad press, so we bend over backwards for the external auditors, because it's important to have all the documentation.
John Ansley: I think there's a real danger, too, in the "fear sale", where you talk about what would happen if offices flooded, or if there were a terrorist strike. As an industry, we've been selling fear for a long time.
"You can't do without us" is our first fear sale, and then we've had the Y2K stuff, we've used September 11, we've used the dotcom stuff. Every chief executive in the world now thinks they need to be on Web 2.0 but they have no idea what it means. We're scaring them all the time and I think they're over it. They're getting quite cranky, so you have to discuss these things practically, rather than going in hard.
Peter O'Donovan: Once we began to work with the business, we could ask, "How quickly do you need these systems to become available?" We started to develop critical times for a return to operation and we got a fairly good response to that.
It was a six-month project. We worked closely with the risk guys to develop that. We've got things that need to be operational within two hours, four hours, six hours, 24 hours and up to even five days later.
So we now know and have a good profile of the systems and critical functions and their time to recover.
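The recovery-time profile described above can be pictured as a small lookup from each critical function to a recovery tier. The following is a minimal sketch only: the tier values come from the discussion (two, four, six and 24 hours, and five days), while the function names, the tolerated-outage figures and the helper names `assign_tier` and `recovery_order` are hypothetical and not drawn from any participant's actual plan.

```python
from dataclasses import dataclass

# Recovery tiers in hours, as mentioned in the discussion:
# 2, 4, 6 and 24 hours, and five days (120 hours).
TIERS_HOURS = (2, 4, 6, 24, 120)


@dataclass(frozen=True)
class CriticalFunction:
    name: str                 # hypothetical system or business function
    max_outage_hours: float   # longest outage the business owner can tolerate


def assign_tier(max_outage_hours: float) -> int:
    """Return the slowest tier that still fits within the tolerated outage."""
    eligible = [t for t in TIERS_HOURS if t <= max_outage_hours]
    if not eligible:
        raise ValueError("tolerated outage is shorter than the fastest tier")
    return max(eligible)


def recovery_order(functions):
    """Sort functions so the tightest recovery tiers are restored first."""
    return sorted(functions, key=lambda f: (assign_tier(f.max_outage_hours), f.name))


# Illustrative entries only; names and tolerances are assumptions.
PROFILE = [
    CriticalFunction("email", 4),
    CriticalFunction("call centre", 2),
    CriticalFunction("payroll", 72),
    CriticalFunction("archive reporting", 200),
]

if __name__ == "__main__":
    for f in recovery_order(PROFILE):
        print(f"{assign_tier(f.max_outage_hours):>4}h  {f.name}")
```

In this sketch the business owner supplies the tolerated outage and IT derives the tier from it, which mirrors the split of responsibility the participants describe.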
BCP in practice
Peter O'Donovan: We ran what's called a displacement test, where you assume building access is not available. Everything is operational in the building but the bums on seats go out to the disaster recovery site.
We did that last year and it was a huge success. Everyone had to work as though they were in the office. We did a lot of planning to get it going quickly but it worked.
The second test we did towards the end of last year was to cut the network links to the head office and its office production data centre. That meant switching phone lines over, fax lines - the lot.
Richard Deck: BCP is not so much about keeping SAP and those big ERP [enterprise resource planning] engines running, it's about your call centre systems and your email systems and what will happen if you lose those.
Businesses rely more on call centre and email systems than on ERP systems these days, and some DR and BCP plans don't even talk about email and call centre systems because they're add-ons to the ERP.
John Ansley: We go through BCP scenarios annually. We have an external facilitator come in and drive us through those.
As an emergency management team, we sit down and he just throws whatever scenarios he wants to at us, and we've got to work through how we would respond and what we would do. That's then reflected back at the document, to see if the document was strong enough in the first place.
Richard Deck: Infrastructure does play a part. Today, our systems don't tend to fall over that often because there's so much redundancy within the boxes themselves inside the data centre.
Today, if you walk in and your email isn't available and SAP isn't available then people get quite cross because it's assumed that everything will work and it's not front of mind for them. So if you do have fires and floods and electricity cuts as an issue then at least it is front of mind.
Angus Jones: It's good to know that the hardware is reliable, and we've been talking about fire and floods and all these other things - but is human error an issue?
Colin de Kantzow: Human error causes all our significant outages. Either a communications contractor makes a mistake or somebody connects the wrong item or rolls in an application without enough testing.
Richard Deck: There are exceptions to this, but human error is typically one of the easiest things to recover from - they've blighted an application with a bug or an interface is no longer working - all those types of things. What's more important for me - and I've presented this to my organisation - is that we share our building with another company, [so] what happens if that building is declared a crime scene? The police have the right to walk in and declare the whole building a crime scene and lock you out.
Richard Deck: I think what's happening is that in today's world our focus for DR and BCP is changing and it's more to the desktop collaborative platforms - because the big infrastructure hardware vendors and SAPs of the world and Oracles of the world have solved a lot of our problems for us.
They've made them so robust that they do stay up and they are easily recoverable, whereas the other desktop collaborative platforms are not so recoverable.
Chris Clark: BCP is also about reducing the risk of needing to implement DR, so we took to saying "Why not invest that million dollars or whatever into virtualisation?"
That bought us the high availability, bought us a reduction of the systems going down due to anomalous issues, apart from the big major components, like where you can't get into the building, and floods.
Virtualisation and BCP
Michael Bark: How far down the virtualisation track have you gone?
Chris Clark: We're 95 per cent virtualised - we've even virtualised Oracle. The high availability, the ability to respond to the business, all those are great, but when you bring it to DR, it starts to make you think about the way you can do DR differently.
You're not now dealing with 1000 files that make up an operating system. You're dealing with a file. It's like a Word file. I can copy that file to Brisbane, I can play it on another VMware server and I can do the same with the data. I can wrap it up into one file.
Adam Levin: We've got a 30 per cent footprint in terms of virtualisation. For me, there's still a question mark about SQL databases that operate in a virtual environment. Our call-reporting structure is on an SQL platform and that's a question mark for me. But, generally, if you look at Exchange and the file and print services and non-critical applications, absolutely, from a DR perspective, flick a switch and it's there.
John Ansley: We're only just starting. Any drug manufacturer has to have systems that are strongly structured and checked, so we've got a large engineering group that sits in our shared-services centre in Madrid and they've taken a long time to look at it and we've just purchased the infrastructure now.
So we're a long way behind, but we also have a plan to consolidate about 200 servers that have grown up around the affiliates in the region back to the global data centre here in Sydney. They're the ones we're going to target.
They're not what I would call business-critical applications, so we'll target those ones first and then we'll think about the rest later. But we do have a SAN-to-SAN with EMC and we've got a warm site off-site.
Angus Jones: We've certainly virtualised. We're probably, I'd say, 80 per cent there in virtualising our entire server [environment], but we're finding a lot of benefits in doing that.
Key points
* A clear distinction needs to be drawn between areas of responsibility for the technology team and areas of wider business liability.
* Business continuity planning is not always top of mind for non-IT executives, so frequently it needs to be driven through fresh developments, such as new governance requirements.
* Business continuity plans and disaster recovery plans must be tested regularly. Independent assessment can often be beneficial.
* Virtualisation has the potential to have a major impact on business continuity and the ability of organisations to mirror their systems off site with ease.
Hewlett-Packard sponsored this roundtable on business continuity moderated by Paul Smith, editor of MIS, a sister publication of CIO New Zealand magazine.