1. Review Service Level Agreement(s) and Service Catalogue(s) - Working with the Business Relationship Manager and Business representatives, determine the mission-critical services and components. In addition, consider the implications of security breaches and software virus attacks.
2. Identify IT Service Continuity Management Interfaces and Involvement - Determine who on the IT Service Continuity Management (ITSCM) team needs to be contacted when a Major Incident occurs, and at what point. Also capture the triggers or circumstances that would invoke the ITSCM plan.
3. Define Incident Severities - It is pivotal to establish a simple, clearly defined Incident Severity hierarchy covering low severity through to high or critical severity incidents (Major Incidents). The Incident Severities should be reflected in the generic "IT Service Support Model" if one exists. It is imperative to ensure there is no confusion regarding severities, especially regarding what constitutes a "Major Incident", and that the severities can be applied across the IT department, its third party suppliers and the organization's Business community.
4. Define Incident Escalations - A Major Incident has the potential to have a significant impact upon an organization, for example from a reputational, legal, trading and, in some cases, life and death perspective. Speed is of the essence and any delays can be very costly. By establishing an Escalation Hierarchy within the organization and its third party suppliers, appropriate authorization, focus and resource can be committed to the Major Incident in a timely manner, to resolve it and re-establish the service(s) in question.
5. Define Major Incident Process - Defining and documenting the Major Incident process, including a high-level flow diagram, is invaluable. The process documentation will then assist with defining the associated procedures to be used by all parties.
6. Define Roles and Responsibilities - Clearly define in generic terms the roles and responsibilities of each party, both internal and external to the organization, engaged in the Major Incident process.
7. Review Underpinning Contract(s) and Operating Level Agreement(s) - Examine Contracts and Operating Level Agreements (OLAs) with existing third party suppliers to determine whether they align with the Major Incident process.
8. Negotiate - Having examined existing third party contracts, it may be necessary to renegotiate them so that all suppliers align to the Incident Severities and the Major Incident process. In exceptional cases, where a third party is unwilling or unable to align to the Major Incident process, it may be necessary to terminate the contract and replace the supplier with one who is prepared to align. Such decisions are not taken lightly, and the risk to the organization should be weighed against the cost of termination.
It is possible that some internal support teams will be required to have staff available "out of hours" to assist with any Major Incidents that occur. Some form of compensation may need to be agreed with staff. The organization's Human Resources department would usually handle this.
For those organizations embarking on outsourcing aspects of their IT Services, it is worthwhile defining a Service Support Model that includes Incident Severities and associated definitions. The model can then be referenced during the contract tender process, and third parties who are prepared to align to the model should be considered ahead of those who are unwilling or unable to align.
9. Agree and Sign - Obtain agreement and signatures from all the relevant internal teams and third party suppliers regarding their support for and commitment to the Major Incident process.
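The Incident Severity hierarchy described in step 3 could be captured in a small, shared definition so that every party applies the same test for a "Major Incident". The following is a minimal sketch only: the level names, the number of levels and the choice to treat only the top level as a Major Incident are assumptions for illustration and would need to be agreed with the Business and third party suppliers.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Illustrative severity levels, ordered from lowest to highest.

    The names and the number of levels are assumptions, not a standard.
    """
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Assumption: only the top level counts as a Major Incident.
# An organization may agree a different threshold (for example HIGH).
MAJOR_INCIDENT_THRESHOLD = Severity.CRITICAL


def is_major_incident(severity: Severity) -> bool:
    """Return True when the severity meets the agreed Major Incident threshold."""
    return severity >= MAJOR_INCIDENT_THRESHOLD
```

Keeping the threshold in one place means a change to the agreed definition is made once and applies to everyone who references it.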
On-going Major Incident Process
10. Contact List - Capture the names, job titles, telephone numbers (both landline and mobile) and preferred methods of communication of the individual team members and third party suppliers involved in the Major Incident process.
11. Communication Plan(s) - It is important to communicate with the Business community and other relevant staff (e.g. Business Relationship Manager(s)) in a timely manner: first when a Major Incident occurs, then with progress update(s), and finally with notification of the restoration of service. From experience, it is more effective and efficient to target the communications at those who are affected by the Major Incident. Identify in advance who is to be contacted, and the method and frequency of the communication. The Business Relationship Manager(s) should be of significant value in identifying and agreeing the contacts. Email is often used, and setting up and maintaining distribution lists makes such communications easier. Consider setting up Communication templates for (i) Major Incident notification, (ii) Progress Updates and (iii) Service Restoration.
12. Escalation Plan - Establish the hierarchy of names, job titles, telephone numbers (both landline and mobile), and the time after the start of the Major Incident at which each individual will be contacted should the incident remain unresolved. The further up the hierarchy, the more influential the individual is expected to be within the organization, with the capability to expedite resources and secure the availability of key individuals. The Escalation Plan covers both internal teams and third party suppliers.
13. Checklist(s) - Checklists save time, reduce stress and ensure all aspects of a Major Incident are considered. Establish checklists for Meeting Agendas, Communications, Escalations, Staff Rotation (shifts) and Staff Facilities.
14. Command Centre - Where possible, identify a dedicated location, including a meeting room equipped with conference call facilities, whiteboards, flipcharts and pens. Ensure out of hours facilities such as security, parking, heating, toilets, food and water are available and maintained. The meeting room may well be used outside of Major Incidents, but on the understanding that if a Major Incident occurs the room will be commandeered and existing occupants will be expected to leave immediately.
15. Post Major Incident Review - This is a very important aspect of the Major Incident process. All relevant parties involved in the Major Incident attend the review. Supporting documentation such as the Incident Record is shared in advance if possible. The review walks through the incident together with the actions taken. Attendees are asked what went well and what did not, and what actions will be taken to prevent a recurrence and/or assist with the resolution should it happen again. In addition, the Major Incident process as a whole is reviewed and improvements are identified. Meeting minutes should be produced detailing the attendees, the actions identified, who has been assigned each action and the expected completion date. It is the responsibility of the Major Incident Manager to ensure the actions are completed accordingly.
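The time-based Escalation Plan in step 12 can be pictured as a list of levels, each with the elapsed time at which that level is contacted if the incident remains unresolved. The sketch below is illustrative only: apart from the Major Incident Manager, the job titles are hypothetical, and the timings are placeholders rather than recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationLevel:
    job_title: str          # role to contact at this level
    contact_after_min: int  # minutes since the Major Incident began


# Hypothetical plan: titles and timings are placeholders to be agreed
# within the organization and with its third party suppliers.
ESCALATION_PLAN = [
    EscalationLevel("Major Incident Manager", 0),
    EscalationLevel("Head of IT Operations", 60),
    EscalationLevel("IT Director", 120),
]


def levels_due(elapsed_min, plan=ESCALATION_PLAN):
    """Return the levels that should have been contacted by now, lowest first."""
    due = [lvl for lvl in plan if lvl.contact_after_min <= elapsed_min]
    return sorted(due, key=lambda lvl: lvl.contact_after_min)
```

For example, `levels_due(90)` returns the first two levels of this hypothetical plan, because the third is not contacted until 120 minutes have passed.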
Maintenance / Improvement
16. Process Improvement - As part of the "Post Major Incident Review", the Major Incident process is reviewed. In addition, the process should be periodically reviewed with the stakeholders. Any improvements would be raised as a Request for Change and follow the Change process.
17. Data Maintenance - As defined in the roles and responsibilities supporting the Major Incident process, everyone is required to provide updates to details such as contact names, job titles, telephone numbers, email addresses and methods of communication to the Major Incident Process Owner. The Process Owner will update the relevant documents and communicate the changes accordingly.
18. Scenarios - As previously mentioned, time is of the essence when dealing with a Major Incident, so those new to the process benefit from receiving education and training in advance. Understanding the process and procedures is important, but also consider using recent Major Incidents as training scenarios, including the lessons learnt from the Post Major Incident Review.
19. Education / Awareness Plan - All organizations experience staff "churn" for one reason or another. Establish an Education and Awareness Plan incorporating scheduled sessions for both key internal staff and those of third party suppliers.
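The Data Maintenance step (17) depends on contact records staying complete. As a rough sketch, a periodic check could flag records with gaps before they are needed in a Major Incident. The field names below are hypothetical labels for the details listed in steps 10 and 17, and the dictionary layout is an assumption rather than a prescribed format.

```python
# Hypothetical field names for the contact details the process asks for.
REQUIRED_FIELDS = (
    "name",
    "job_title",
    "landline",
    "mobile",
    "email",
    "preferred_method",
)


def missing_fields(contact):
    """Return the required fields that are absent or blank in a contact record."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(contact.get(field) or "").strip()
    ]
```

A Process Owner could run such a check over the Contact List and Escalation Plan and chase any record for which the result is not empty.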