Two Bomb Blasts and a Major Fire in Ten Years – What Odds?

Businesses can be reluctant to go public about their trials and tribulations, particularly when it comes to disaster recovery. This stems from the view that publicizing the details of a major disaster could harm the organization’s share price and investor confidence.

With that in mind, here are some lessons from an incident I was involved in. It was my third major incident in less than 10 years. The first two were terrorist bombs in the City of London, but this one, a fire at a studio and production facility for a major youth broadcaster, had the greatest potential impact on the business. In an ultra-competitive market with only one or two competitors, a channel outage would mean viewers switching to the competition, with little hope of tempting them back.

Fifteen years ago, the world of technology was much simpler for most businesses. Networks were in their infancy. For many, a LAN was already a stretch too far, and wide-area networking was merely a twinkle in many a Chief Information Officer’s eye. The internet was little more than a means of providing electronic brochures to clients; web commerce didn’t really exist at that time.

Application environments were heavily vendor focused. At the music television channel that I was working for, upgrades to core applications came infrequently and were often all-encompassing. A single annual upgrade would contain all the fresh functionality you were going to get for that year. There were no automatic upgrades to manage. Your new desktop came once every two or three years, when you wrestled with a new desktop rollout. Live systems were mainly applications from large business vendors that ran either on dedicated workstations or through software that turned PCs into dumb terminals running sessions on larger mini and mainframe computers.

This environment (IBM minicomputers, VAX servers and newly installed Unix servers) heavily influenced the approach to business continuity, which was based on a simple premise: back it all up, store the backup off-site, and then throw as much technology and money at the problem as was needed, because the insurance company would be paying. There was no central control or detailed record of hardware configuration, particularly at the workstation level, as this would have been regarded as too much information.

Against this background, the fire took place early on a Thursday night. Uploads to satellites were interrupted for only an hour in the immediate aftermath, while power to the building was cut to make firefighting safe.

Remarkably, within four days every user had every service restored. Because so many systems were highly centralized, restoring services was relatively easy. All that was needed was a backup, a new set of application software and the simplest of workstations. The biggest dilemma we faced was unacceptably slow performance from the newly restored systems, and that problem was resolved by simply installing the next largest Unix server on the vendor’s product list.

Today’s businesses face a tough challenge when it comes to invoking disaster recovery after a similar incident.

Today’s IT environment is far more complex. Workstations are rarely standard; they can be running a wide variety of operating systems or hardware platforms within the same organization. Today’s rapid application development techniques mean you need a far more detailed understanding of which applications, and which versions, are deployed where. An appropriate response to a similar disaster really requires the disciplines of an IT operation that is rooted in ITIL®. Vendors today offer integrated business recovery plans that are designed to bring the services you lose back online as quickly as possible. In a well-run business, there will always be the insurance policy to take care of the money side of things, but who provides the information about the environment to be created and the priorities?

Today’s networks are complex and span local, wide-area and internet environments. They support applications that have become the lifeblood of a business’s cash flow, and these applications are, in turn, delivered through rapid, regular and small incremental upgrades to live systems. Combine these factors, and the ability to recover depends on much more than an insurance company with deep pockets and the goodwill of staff to get stuck in and see the restoration of service through to the bitter end.

Deep within the Service Design phase of the ITIL model sits Continuity Management. It is easily overlooked when deploying the pragmatism that ITIL advocates, and that we have discussed before, yet the Continuity Management processes deserve very careful consideration. In any environment where budgets are under pressure, the temptation is always to gloss over the process on the grounds that it can be done tomorrow, or that there’ll never be a disaster here. However, consider this.

In the days when I was responding to bomb blasts and major fires, how welcome would the question “Yes, but which ones do I restore first?” have been? Your ITIL processes will have created all of your Configuration Databases, your Release Management Databases and hopefully your Knowledge Management Databases to help make sense of everything else, but without a blueprint for action, there is little chance of a smooth recovery to normal operations in an interim environment.
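To make the “which ones do I restore first?” question concrete, here is a minimal sketch in Python of how a restore sequence might be derived from configuration records. It is an illustration under stated assumptions, not a prescribed ITIL procedure: the service names, the priority numbers (lower means more urgent, as a Business Impact Analysis might assign them) and the `restore_order` function are all hypothetical.

```python
import heapq

# Hypothetical records: name -> (business priority, services it depends on).
# Lower priority number = more urgent. All values are illustrative assumptions.
EXAMPLE_SERVICES = {
    "network": (1, []),
    "playout": (1, ["network"]),
    "scheduling": (2, ["network"]),
    "email": (3, ["network"]),
    "finance": (4, ["network", "email"]),
}


def restore_order(services):
    """Return names in an order that restores dependencies first and,
    among services that are ready, the most urgent one first."""
    # Reject records that point at a service nobody has documented.
    for name, (_, deps) in services.items():
        for dep in deps:
            if dep not in services:
                raise ValueError(f"{name} depends on unknown service {dep}")

    # Outstanding dependencies per service, plus the reverse lookup.
    remaining = {name: set(deps) for name, (_, deps) in services.items()}
    dependents = {name: [] for name in services}
    for name, deps in remaining.items():
        for dep in deps:
            dependents[dep].append(name)

    # Min-heap of services whose dependencies are all restored.
    ready = [(services[n][0], n) for n, deps in remaining.items() if not deps]
    heapq.heapify(ready)

    order = []
    while ready:
        _, name = heapq.heappop(ready)
        order.append(name)
        for child in dependents[name]:
            remaining[child].discard(name)
            if not remaining[child]:
                heapq.heappush(ready, (services[child][0], child))

    # Anything left over is stuck behind a circular dependency.
    if len(order) != len(services):
        raise ValueError("circular dependency in service records")
    return order


if __name__ == "__main__":
    print(restore_order(EXAMPLE_SERVICES))
```

The point of the sketch is that the answer depends on two inputs a Continuity Management process has to supply in advance: what the business says matters most, and what each service needs underneath it. Neither can be reconstructed reliably in the middle of an incident.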

It’s all too easy to think that Business Impact Analysis and Business Continuity Management, the tools that lead to an appropriate Business Continuity Plan (BCP), can be left until another day when there’s less pressure. A review of what’s delivered today compared with the cost of delivering it can miss the point that a BCP delivers the majority of its value at some unknown point in the future, if at all. It should always be stressed that processes of this nature can only deliver true value if there is effort up front on scope definition, policy drafting and continuity strategy definition. Once that work is done, there can be confidence that the organization’s planning is sufficient.

A final note: the studio and production facilities were restored, and a very detailed set of Business Continuity Plans was developed and maintained. Interestingly, when a major studio overhaul was later undertaken, one that required relocating satellite upload facilities and demolishing and rebuilding studios, the BCP was dusted off and used as a valuable input to the new project.

About this author:

Jon Francum

Jon is the Director of Training at Ashford Global IT.