What is incident management?

The ITIL body of service management best practice guidance offers a clear purpose statement for incident management:

“…to minimize the negative impact of incidents by restoring normal service operation as quickly as possible.”

Source: AXELOS, Incident Management ITIL 4 Practice Guide (2020)

Where an incident is defined as “An unplanned interruption to a service or reduction in the quality of a service.”

In this guide to ITIL incident management, discover what it entails, best practices, how to implement it, and the reasons to adopt enterprise incident management.

Incident management as a capability has evolved since the first version of ITIL was introduced in 1989. While the process itself has stayed relatively similar, the ways in which incident management capabilities are made available to end users have changed considerably. As covered later, IT help desks, then IT service desks, and then ITSM tools have provided enabling capabilities for IT support staff to effectively manage incidents via a variety of contact channels, increasingly in ways that improve both the end-user experience and IT support personnel performance.

The addition of end-user self-help and self-service capabilities, and the introduction of “shift-left” strategies, then changed how incident management capabilities are employed, such that some of the incidents that were handled by:

  • The IT service desk and service desk analysts (Level 1) are “deflected” to self-service (Level 0)
  • The Level 2 support staff are “shifted left” to Level 1 support

Finally, artificial intelligence (AI) - and machine learning in particular - has brought additional ways in which incidents are identified, reported, and remediated. Plus, Agile ways of working have offered new ways for managing incidents that are included in ITIL 4 as “swarming” - there’s more on both of these later.

No matter how incident management is invoked or enabled, the important thing to remember is that it helps keep IT and business services available and employees productive. Plus, if incident management is new to you and your organization, you might find that you already do some form of it even if you call it something else (such as ticket management or issue handling).



While major incident management isn’t an area of focus here, it’s worth calling out its existence. ITIL 4 defines a major incident as: “An incident with significant business impact, requiring an immediate coordinated resolution.” It also offers a model for major incident management that starts with setting clear criteria for distinguishing major incidents from disasters and other incidents. It’s important to remember that what defines a major incident will likely vary between organizations based on a variety of factors from how they’re structured to their portfolio of critical business services.

Major incidents, as business-impacting incidents, are often handled as “all hands to the pumps” emergencies where significant IT resources are involved to help ensure both speedy remediation and resumption of business operations.

What incident management isn’t


As well as knowing what something is, it’s sometimes important to know what it isn’t. In the case of incident management, there are two similar but different ITSM capabilities that need to be seen, and treated, as distinct from it:

  • Service request management
  • Problem management

The first of these capabilities is best differentiated by focusing on what is being handled, i.e. service requests rather than incidents. ITIL 4 defines service requests as “A request from a user or a user’s authorized representative that initiates a service action which has been agreed as a normal part of service delivery.” It’s a complicated definition that’s best understood through example types:

  • A request that initiates a service action, e.g. a request for new hardware, software, or a service
  • A request for information, e.g. how do I access Microsoft Teams?
  • An access request to a resource
  • The submission of feedback, compliments, and complaints

The various versions of ITIL best practice have long called out that it’s important to treat incidents and service requests separately due to their relative urgency.

The second of these capabilities is also best differentiated by starting with what’s being handled, i.e. problems rather than incidents. ITIL 4 defines a problem as “A cause, or potential cause, of one or more incidents.” Problem management is therefore focused on removing the things that cause repeat incidents and on reducing their impact.

As with service requests, a key differentiator between incident management and problem management is urgency. While speed is important to both, the time needed to undertake problem management activities - including the identification of root causes - means that it operates at a far slower pace than incident management. This can be thought of as fire prevention versus fire-fighting.

To find out more about problem management, please read the InvGate Definitive Guide to Problem Management. Plus, ITIL offers best practice guidance on all three of incident management, service request management, and problem management.

What incident management entails

ITIL v3/2011 recommended that incidents are managed through a process that includes a number of formal steps or activities:

  • Incident identification and logging
  • Initial categorization and prioritization
  • Escalation to the major incident management process if needed or invocation of the service request management process if not an incident
  • Initial diagnosis and escalation if needed
  • Investigation and diagnosis
  • Resolution and recovery
  • Incident closure

Continuous ownership, monitoring, tracking, and communication are involved throughout.

ITIL 4 updated this, albeit only slightly, to be an incident handling and resolution process that forms part of the incident management practice (which also includes the periodic incident review process):

  • Incident detection
  • Incident registration
  • Incident classification
  • Incident diagnosis
  • Incident resolution
  • Incident closure
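As a rough illustration only, the six ITIL 4 steps listed above can be modeled as a simple ordered lifecycle. This is a minimal sketch, not an ITIL-defined data model: the `Step`, `next_step`, and `remaining_steps` names are hypothetical, and the strictly linear order is a simplifying assumption (in practice an incident may be reclassified or rediagnosed).

```python
from enum import Enum
from typing import Optional


class Step(Enum):
    """The six ITIL 4 incident handling and resolution steps, in order."""
    DETECTION = 1
    REGISTRATION = 2
    CLASSIFICATION = 3
    DIAGNOSIS = 4
    RESOLUTION = 5
    CLOSURE = 6


def next_step(current: Step) -> Optional[Step]:
    """Return the step that follows `current`, or None once the incident is closed."""
    if current is Step.CLOSURE:
        return None
    return Step(current.value + 1)


def remaining_steps(current: Step) -> list:
    """List the names of the steps still to be completed after `current`."""
    return [s.name for s in Step if s.value > current.value]
```

For example, `remaining_steps(Step.DIAGNOSIS)` lists the resolution and closure steps that are still outstanding for a ticket being diagnosed.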

Explore the differences between ITIL v3/2011 and ITIL 4 here.


The need for, and benefits of, incident management

There are many benefits for an organization with a formal incident management capability. These include:

  • Decreasing the impact of IT incidents to maximize employee and potentially business productivity and avoiding lost revenue, lost reputation, or even lost customers
  • Reducing the costs of IT support through best practice processes and tool enablement
  • Better using potentially scarce IT resources - making IT support personnel more efficient and effective
  • Leveraging shared knowledge to speed up incident resolution
  • Resolving adverse issues before they significantly impact business operations
  • Improving collaboration between IT teams and minimizing “handover” issues
  • Improving employee experience and the business’s perceptions of IT

Incident management in ITIL 4 vs. incident management in ITIL v3/2011

The ITIL 4 Service Value System

In addition to the generic ITIL 4 changes related to elements such as ITIL being service management not ITSM, practices rather than processes, and the service value system, ITIL 4 brought with it some incident-management-specific changes. The key one is that there’s no longer a prescriptive incident management process, with organizations encouraged to create their own value chain for incident management that takes a customer-centric, rather than IT-centric, view.

There’s also the concept of swarming - where incident handling is a collaboration-based effort. There are no tiered support groups and no escalation between them. Instead, an issue is owned by an individual through to its resolution, with them bringing the right people in to assist as needed.

How to start with incident management

Incident management is one of the easiest ITSM processes/practices to justify despite it sometimes being viewed as “a cost of quality.” This justification is a great starting point for introducing incident management because it makes it more than “a good thing to do” and allows the scope of incident management coverage and its ambitions to be fleshed out as the costs and potential benefits are formulated, from staff numbers and operating times, through service-level targets, to the investment in ITSM tool enablement.

As mentioned earlier, there’s also the need to appreciate that your organization is probably already doing some form of incident management. It’s therefore important to thoroughly assess the status quo to see what can continue to be used rather than potentially lose something valuable in the mad rush to introduce a new way of working. For example, there might already be highly mature practices for remote support tool use.

In setting out - or improving - the scope of incident management, and in line with ITIL 4’s new focus, it’s important to understand how it will create business value. Ideally, this means moving what’s traditionally been an issue-fixing IT support capability to one that’s focused on enabling end users and improving their productivity.

Tap into the wealth of incident management practice that’s available in ITIL and other resources, including ITSM tools. But ultimately, you need to create an incident management capability that’s best suited to your organization (rather than one that’s lifted from a completely unrelated organization). Take priority levels, for example - some benchmarks might fit, but others will need to be appraised and adjusted to suit what your organization needs.
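To make the point about priority levels concrete, here is a minimal sketch of one common approach: deriving a priority from an incident’s impact and urgency. Everything in it is an illustrative assumption rather than an ITIL rule or a benchmark from this guide: the three-level scales, the `LEVELS` table, the `priority` function, and the P1 to P5 labels would all need adjusting to your organization’s own definitions.

```python
# Hypothetical three-level scales; adjust to your organization's own definitions.
LEVELS = {"high": 1, "medium": 2, "low": 3}


def priority(impact: str, urgency: str) -> str:
    """Map impact and urgency to a priority label from P1 (most urgent) to P5."""
    score = LEVELS[impact] + LEVELS[urgency]  # ranges from 2 to 6
    return f"P{score - 1}"
```

With these assumed scales, a high-impact, high-urgency incident maps to P1, while a low-impact, low-urgency one maps to P5. Changing the table is all it takes to reflect a different organization’s thresholds.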

Most ITSM tool vendors have created incident management templates, workflows, and reporting capabilities based on such best practices, which might or might not fit your organization’s needs. It’s therefore important to change the ITSM tool’s “out-of-the-box” practices where they go against your organization’s incident management needs, while retaining the best practice that fits.

The setting of incident management priority levels, service-level targets, and performance metrics is another key activity that is best done in collaboration with key business stakeholders, albeit while recognizing that a balance needs to be found between meeting expectations and spending what will likely be limited funding wisely. This includes starting to appreciate the business impact of what IT support does and doesn’t do. For example, whether the delay in resolving an issue costs your organization more than if additional IT resources are used to ensure a quicker resolution.
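The trade-off in the last example can be sketched as simple arithmetic. This is a hedged illustration, not a costing model: the `cheaper_to_add_resources` function and its parameters are hypothetical, and it assumes you can estimate an hourly cost of downtime and the hours that extra resources would save.

```python
def cheaper_to_add_resources(
    downtime_cost_per_hour: float,
    hours_saved: float,
    extra_resource_cost: float,
) -> bool:
    """Return True when the avoided downtime cost exceeds the extra IT spend."""
    avoided_cost = downtime_cost_per_hour * hours_saved
    return avoided_cost > extra_resource_cost
```

For instance, if an outage costs 2,000 per hour and extra resources costing 1,500 would shorten it by three hours, the avoided cost (6,000) outweighs the spend, so the quicker resolution is the cheaper option overall.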

Look to leverage all three of knowledge management, self-service, and automation to make incident management all three of “better, faster, cheaper.” Knowledge management will help with quicker and better resolutions (that are thus more cost-effective). Self-service, if done right, will provide end users with quicker access to - potentially automated - solutions that relieve the pressure on incident management staff and reduce labor costs. Automation is then the proverbial “icing on the cake,” making for quicker work and resolutions and extending the capabilities of less-skilled people, including end users via self-help.

Also, plan beyond the initial incident management capabilities using continual improvement to identify issues and to systematically find ways to improve the practice’s operations and outcomes. The aforementioned problem management should also be adopted to reduce the number of repeat incidents, even if only started in a small way to prove its value.


How ITSM tools help with incident management

One could argue that incident management is a “backbone need” for ITSM tools. Not only because it’s the most highly adopted/used ITSM process or practice but because the evolution of IT help desk tools through to ITSM tools started with ticketing for the management of IT issues, i.e. incident management. It’s not surprising then that ITSM tools offer a high degree of enablement for incident management that goes above and beyond the core of workflow enablement, knowledge management, self-service, and reporting and analytics.

Examples include native or third-party monitoring tools with event correlation capabilities for proactive issue detection; access to performance and device status data - along with configuration management database (CMDB) and asset data - to facilitate incident diagnosis; and the use of native or third-party capabilities for orchestration or remote administration in incident resolution.

As well as the many traditional incident management enablement capabilities of ITSM tools, there are increasingly opportunities for AI-enabled capabilities to help across all three of “better, faster, cheaper” incident resolution. These include:

  • Virtual agents (including chatbots) for end-user self-help. These can act as the first point of contact for end-user engagements where, if human involvement isn’t required, the resolution can be provided too. For example, automated password resets, knowledge provision, incident ticket logging, incident ticket status checking, or scheduling a virtual or in-person engineer session
  • Virtual agents for IT support staff knowledge augmentation. For example, automatically providing known solutions or similar issues in real-time. Plus, recommending automated actions to provide one-click resolutions when appropriate
  • Intelligent automation. This is repetitive-task automation using machine learning to reduce the level of high-volume, low-value issues received by IT support staff. For example, the use of intelligent automation to assess, categorize, prioritize, and route incoming incident (and service request) tickets
  • Predictive issue identification. This can be the automated identification of problems or major incidents, or the linking of an incoming ticket with a known major incident or problem
  • Improved analytics capabilities. This includes both the proactive identification of problems and increased visibility of IT support performance patterns and trends
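As a deliberately simplified stand-in for the intelligent ticket routing described in the list above, the sketch below routes tickets with fixed keyword rules rather than machine learning. The `ROUTING_RULES` table, the group names, and the `route_ticket` function are all hypothetical. An ITSM tool with AI capabilities would typically learn such associations from historical ticket data instead.

```python
# Hypothetical keyword-to-group rules; a real tool would learn these from ticket history.
ROUTING_RULES = {
    "password": "Service Desk",
    "vpn": "Network Team",
    "laptop": "Desktop Support",
}
DEFAULT_GROUP = "Service Desk"


def route_ticket(summary: str) -> str:
    """Return the assignment group for the first rule keyword found in the summary."""
    text = summary.lower()
    for keyword, group in ROUTING_RULES.items():
        if keyword in text:
            return group
    return DEFAULT_GROUP
```

Here a ticket summarized as “Cannot connect to VPN from home” would go to the assumed network group, and anything unmatched falls back to the service desk for manual triage.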

Enterprise service management adds an extra dimension to incident management

While all of the above has been focused on incident management through an ITSM lens, enterprise service management - “the use of IT service management (ITSM) principles, practices, and capabilities by other business functions to improve their operations, services, experiences, and outcomes” - provides an extra dimension to the use of incident management. In fact, research by AXELOS and ITSM.tools found that incident management is the most commonly shared ITSM capability across organizations - at 78% of the organizations that already have an enterprise service management strategy in flight.


This makes it all the more important that your organization’s incident management practices are optimized. After all, sharing a sub-optimal ITSM practice in an attempt to improve the operations and outcomes of other business functions is a flawed approach.

Frequently Asked Questions

The incident management process refers to a set of procedures and actions taken to respond to and resolve critical incidents: how incidents are detected and communicated, who is responsible, which tools are used, and the steps taken to resolve them.
Incident management seeks to restore normal service operation as quickly as possible while minimizing the impact on business operations and ensuring that quality is maintained.
Incidents can be defined as unplanned interruptions in the delivery of IT services. Meanwhile, service requests are additional requests made by users that are often pre-approved by the organization.
Triaging an incident involves two major activities: classifying the incident into the right assignment group, and involving the right group of people in order to resolve the incident as quickly as possible. Identifying the assignment group or person best suited to the incident is the main purpose of triage in incident management.
