While major incident management isn’t an area of focus here, it’s worth calling out its existence. ITIL 4 defines a major incident as: “An incident with significant business impact, requiring an immediate coordinated resolution.” It also offers a model for major incident management that starts with setting clear criteria for distinguishing major incidents from disasters and other incidents. It’s important to remember that what defines a major incident will likely vary between organizations based on a variety of factors from how they’re structured to their portfolio of critical business services.
Major incidents, as business-impacting incidents, are often handled as “all hands to the pumps” emergencies where significant IT resources are involved to help ensure both speedy remediation and resumption of business operations.
ITIL v3/2011 recommended that incidents be managed through a process that includes a number of formal steps or activities:
With continuous ownership, monitoring, tracking, and communication involved throughout.
ITIL 4 updated this, albeit only slightly, to be an incident handling and resolution process that forms part of the incident management practice (which also includes the periodic incident review process):
A formal incident management capability offers an organization many benefits, including:
In addition to the generic ITIL 4 changes (ITIL being about service management rather than ITSM, practices rather than processes, and the service value system), ITIL 4 brought with it some incident-management-specific changes. The key one is that there’s no longer a prescriptive incident management process. Instead, organizations are encouraged to create their own value chain for incident management, taking a customer-centric, rather than IT-centric, view.
There’s also the concept of swarming - where incident handling is a collaboration-based effort. There are no tiered support groups and no escalation between them. Instead, an issue is owned by an individual through to its resolution, with them bringing the right people in to assist as needed.
Incident management is one of the easiest ITSM processes/practices to justify despite it sometimes being viewed as “a cost of quality.” This justification is a great starting point for the introduction of incident management because it makes it more than “a good thing to do.” It also allows the scope of incident management coverage and its ambitions to be fleshed out as the costs and potential benefits are formulated, from staff numbers and operating times, through service-level targets, to the investment in ITSM tool enablement.
As mentioned earlier, there’s also the need to appreciate that your organization is probably already doing some form of incident management. It’s therefore important to thoroughly assess the status quo to see what can continue to be used rather than potentially lose something valuable in the mad rush to introduce a new way of working. For example, there might already be highly mature practices for remote support tool use.
In setting out - or improving - the scope of incident management, and in line with ITIL 4’s new focus, it’s important to understand how it will create business value. Ideally, this means moving what’s traditionally been an issue-fixing IT support capability to one that’s focused on enabling end users and improving their productivity.
Tap into the wealth of incident management good practice that’s available in ITIL and other resources, including ITSM tools. But ultimately, you need to create an incident management capability that’s best suited to your organization (rather than one that’s lifted from a completely unrelated organization). Take priority levels, for example: some benchmarks might fit, but others will need to be appraised and adjusted to suit what your organization needs.
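As an illustration of adjusting priority levels to suit your organization, the following is a minimal sketch of a priority lookup built on the commonly used impact-and-urgency approach. The matrix values, the five priority levels, and the resolution targets are all hypothetical placeholders rather than ITIL-mandated figures. The point is that they live in data your organization can tune, not in fixed logic.

```python
from enum import IntEnum


class Level(IntEnum):
    """Three-point scale used for both impact and urgency (an assumption)."""
    HIGH = 1
    MEDIUM = 2
    LOW = 3


# Hypothetical impact x urgency matrix; 1 is the highest priority.
# Adjust these values to reflect your own organization's needs.
PRIORITY_MATRIX = {
    (Level.HIGH, Level.HIGH): 1,
    (Level.HIGH, Level.MEDIUM): 2,
    (Level.HIGH, Level.LOW): 3,
    (Level.MEDIUM, Level.HIGH): 2,
    (Level.MEDIUM, Level.MEDIUM): 3,
    (Level.MEDIUM, Level.LOW): 4,
    (Level.LOW, Level.HIGH): 3,
    (Level.LOW, Level.MEDIUM): 4,
    (Level.LOW, Level.LOW): 5,
}

# Hypothetical resolution targets, in hours, for each priority level.
RESOLUTION_TARGET_HOURS = {1: 1, 2: 8, 3: 24, 4: 48, 5: 120}


def prioritize(impact: Level, urgency: Level) -> tuple[int, int]:
    """Return (priority, resolution target in hours) for an incident."""
    priority = PRIORITY_MATRIX[(impact, urgency)]
    return priority, RESOLUTION_TARGET_HOURS[priority]


# A high-impact, medium-urgency incident maps to priority 2.
example = prioritize(Level.HIGH, Level.MEDIUM)
print(example)  # (2, 8)
```

Keeping the matrix and targets as plain dictionaries means a benchmark that doesn’t fit can be changed without touching the lookup function.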
Most ITSM tool vendors have created incident management templates, workflows, and reporting capabilities based on such best practices, which might or might not fit your organization’s needs. It’s therefore important to change the ITSM tool’s “out-of-the-box” practices where they go against your organization’s incident management needs, while retaining the best practice that fits.
The setting of incident management priority levels, service-level targets, and performance metrics is another key activity that is best done in collaboration with key business stakeholders, albeit while recognizing that a balance needs to be found between meeting expectations and spending what will likely be limited funding wisely. This includes starting to appreciate the business impact of what IT support does and doesn’t do. For example, whether the delay in resolving an issue costs your organization more than if additional IT resources are used to ensure a quicker resolution.
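The trade-off between the cost of delay and the cost of additional IT resources can be sketched as simple arithmetic. The example below is illustrative only: the function names, the cost model (lost productivity per affected user per hour plus support staff time), and every figure are assumptions, not benchmarks.

```python
def delay_cost(affected_users: int, hours_to_resolve: float,
               cost_per_user_hour: float) -> float:
    """Business cost of lost productivity while the incident is open."""
    return affected_users * hours_to_resolve * cost_per_user_hour


def total_cost(affected_users: int, hours_to_resolve: float,
               cost_per_user_hour: float, support_staff: int,
               staff_cost_per_hour: float) -> float:
    """Business cost of the delay plus the cost of the IT staff involved."""
    business = delay_cost(affected_users, hours_to_resolve, cost_per_user_hour)
    support = support_staff * hours_to_resolve * staff_cost_per_hour
    return business + support


# Hypothetical scenario: 50 affected users at 40 per user-hour,
# support staff at 60 per hour.
# Option A: one engineer resolves the incident in 4 hours.
baseline = total_cost(50, 4.0, 40.0, support_staff=1, staff_cost_per_hour=60.0)
# Option B: three engineers resolve it in 1.5 hours.
extra_resources = total_cost(50, 1.5, 40.0, support_staff=3, staff_cost_per_hour=60.0)

print(baseline)         # 8240.0
print(extra_resources)  # 3270.0
```

In this made-up scenario, the additional staff cost is small compared with the productivity saved. With few affected users or a low cost per user-hour, the balance could easily tip the other way, which is why the figures need to come from business stakeholders.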
Look to leverage all three of knowledge management, self-service, and automation to make incident management all three of “better, faster, cheaper.” Knowledge management will help with quicker and better resolutions (that are thus more cost-effective). Self-service, if done right, will provide end users with quicker access to - potentially automated - solutions that relieve the pressure on incident management staff and reduce labor costs. Automation is then the proverbial “icing on the cake,” making for quicker work and resolutions and extending the capabilities of less-skilled people, including end users via self-help.
Also, plan beyond the initial incident management capabilities using continual improvement to identify issues and to systematically find ways to improve the practice’s operations and outcomes. The aforementioned problem management should also be adopted to reduce the number of repeat incidents, even if only started in a small way to prove its value.
One could argue that incident management is a “backbone need” for ITSM tools. This is not only because it’s the most highly adopted/used ITSM process or practice but also because the evolution of IT help desk tools through to ITSM tools started with ticketing for the management of IT issues, i.e. incident management. It’s not surprising then that ITSM tools offer a high degree of enablement for incident management that goes above and beyond the core of workflow enablement, knowledge management, self-service, and reporting and analytics.
Examples include native or third-party monitoring tools with event correlation capabilities for proactive issue detection; access to performance and device status data - along with configuration management database (CMDB) and asset data - to facilitate incident diagnosis; and the use of native or third-party capabilities for orchestration or remote administration in incident resolution.
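To give a feel for what event correlation means in practice, here is a deliberately simplified sketch that groups monitoring events on the same configuration item (CI) when they arrive within a short time window, so that a burst of related alerts could be raised as one incident rather than many. The `Event` structure, the five-minute window, and the grouping rule are all assumptions made for illustration. Real monitoring tools use far richer correlation logic.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Event:
    ci: str              # configuration item the event relates to
    timestamp: datetime
    message: str


def correlate(events: list[Event],
              window: timedelta = timedelta(minutes=5)) -> list[list[Event]]:
    """Group events on the same CI that occur within `window` of the
    previous event in that CI's current group."""
    groups: list[list[Event]] = []
    open_groups: dict[str, list[Event]] = {}  # CI -> most recent group
    for event in sorted(events, key=lambda e: e.timestamp):
        group = open_groups.get(event.ci)
        if group and event.timestamp - group[-1].timestamp <= window:
            group.append(event)
        else:
            # Start a new group, i.e. a new candidate incident.
            group = [event]
            groups.append(group)
            open_groups[event.ci] = group
    return groups


# Hypothetical events from a monitoring tool.
sample_events = [
    Event("db01", datetime(2024, 1, 1, 10, 0), "High CPU"),
    Event("db01", datetime(2024, 1, 1, 10, 3), "Slow queries"),
    Event("web01", datetime(2024, 1, 1, 10, 4), "HTTP 500 errors"),
    Event("db01", datetime(2024, 1, 1, 10, 30), "High CPU"),
]
candidate_incidents = correlate(sample_events)
print(len(candidate_incidents))  # 3
```

Here the first two `db01` events fall within the window and form a single candidate incident, while the later `db01` event and the `web01` event each form their own.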
As well as the many traditional incident management enablement capabilities of ITSM tools, there are increasing opportunities for AI-enabled capabilities to help across all three of “better, faster, cheaper” incident resolution. These include:
While all of the above has been focused on incident management through an ITSM lens, enterprise service management - “the use of IT service management (ITSM) principles, practices, and capabilities by other business functions to improve their operations, services, experiences, and outcomes” - provides an extra dimension to the use of incident management. In fact, research by AXELOS and ITSM.tools found that incident management is the most commonly shared ITSM capability across organizations - at 78% of the organizations that already have an enterprise service management strategy in flight.
This makes it all the more important that your organization’s incident management practices are optimized. After all, sharing a sub-optimal ITSM practice in an attempt to improve the operations and outcomes of other business functions is a flawed approach.