Clause 8.6–8.7: Resolution Practices and Service Assurance

Overview: The Operational Core of the SMS

Clauses 8.6 and 8.7 contain the service management practices that are most visible to customers, most frequently executed day to day, and most heavily tested in ISO 20000 certification audits. They are the operational core of the SMS. Clause 8.6 addresses resolution practices (how the organization responds when incidents or problems occur) and control practices (how the organization maintains control of the IT environment). Clause 8.7 addresses assurance practices (how the organization maintains agreed service levels). Understanding and effectively implementing Clauses 8.6 and 8.7 is essential for ISO 20000 compliance.

 

Clause 8.6: Resolution and Control Practices

Clause 8.6.1: Incident Management

Purpose and Scope

An incident is an unplanned interruption to a service or a reduction in service quality. Incident management is the process for responding to incidents: classifying them, determining their priority, escalating them, working to resolve them, restoring service, and closing the incident once service is restored.

Incident Classification and Prioritization

Not all incidents are equal. A critical application being down creates more urgency than a single user unable to print. Incident management requires classification (what type of incident is this) and prioritization (how urgently must it be addressed).

A common classification scheme includes:

• Hardware failures: network outages, server failures, storage issues
• Software or application failures: crashes, hung processes, data corruption
• Access or permission issues: users unable to access systems or data they need
• Performance issues: systems running slowly due to resource exhaustion
• Data or configuration issues: incorrect data or misconfiguration causing service degradation
• Security incidents: intrusions, malware, unauthorized access

Prioritization typically combines impact (how many users or services are affected and how badly the business is hurt) and urgency (how quickly the business needs the service restored) into priority levels:

• P1/Critical: widespread impact, multiple services down, significant business impact
• P2/High: significant impact, important service affected, business process interrupted
• P3/Medium: moderate impact, workaround available, single user or small group affected
• P4/Low: minimal impact, cosmetic issues, non-critical function affected

SLA targets (resolution time) are typically set by priority level. P1 incidents might have 1-hour resolution targets; P2 might be 4 hours; P3 might be 24 hours. These targets drive escalation: if a P1 incident is not resolved within its SLA window, it is escalated to senior management.
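
The prioritization and escalation logic above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed scheme: the impact/urgency matrix, the function names, and the 72-hour P4 target are assumptions made for the example, while the P1 to P3 targets are the sample values from the text.

```python
from datetime import datetime, timedelta

# Hypothetical impact/urgency matrix; each organization defines its own.
PRIORITY_MATRIX = {
    ("high", "high"): "P1",
    ("high", "medium"): "P2",
    ("medium", "high"): "P2",
    ("high", "low"): "P3",
    ("medium", "medium"): "P3",
    ("low", "high"): "P3",
    ("medium", "low"): "P4",
    ("low", "medium"): "P4",
    ("low", "low"): "P4",
}

# P1-P3 use the sample targets from the text; the P4 value is an assumption.
SLA_TARGETS = {
    "P1": timedelta(hours=1),
    "P2": timedelta(hours=4),
    "P3": timedelta(hours=24),
    "P4": timedelta(hours=72),
}


def assign_priority(impact: str, urgency: str) -> str:
    """Look up the priority level for an impact/urgency pair."""
    return PRIORITY_MATRIX[(impact, urgency)]


def sla_deadline(opened_at: datetime, priority: str) -> datetime:
    """Return the time by which the incident should be resolved."""
    return opened_at + SLA_TARGETS[priority]


def needs_escalation(opened_at, priority, now, resolved_at=None) -> bool:
    """True when the incident is still unresolved past its SLA deadline."""
    return resolved_at is None and now > sla_deadline(opened_at, priority)
```

For example, an incident opened at 09:00 with high impact and high urgency becomes a P1 with a 10:00 deadline, and `needs_escalation` returns True if it is still open at 10:30.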

Major Incident Management

When a P1 incident occurs or when an incident has very high business impact, separate major incident procedures often apply. Major incident management includes:

• Crisis team activation: senior leadership, service owners, and subject matter experts convened immediately
• Frequent communication: crisis team updates to leadership every 15-30 minutes
• Rapid diagnosis: using all available resources to identify the root cause quickly
• Alternative solutions: if normal resolution is slow, restoring the service through a workaround or alternative configuration
• Post-incident review: once service is restored, a formal post-incident review (sometimes called a postmortem) is conducted

Incident Records and Evidence

ISO 20000 requires documented evidence of incidents and their resolution. Incident records should include:

• Incident ID and date/time opened
• Description of the incident and affected services
• Classification and priority
• Assigned team and individual owner
• Activities and diagnostics performed to resolve
• Root cause (identified during resolution or problem management)
• Solution implemented and date/time resolved
• Date/time closed (after verification that service is restored)
• SLA target and whether the incident was resolved within SLA

Auditors often examine incident records to verify that incidents are being handled according to procedures. Common findings include missing classifications, no SLA tracking, or incidents marked resolved without verification that service was actually restored to the customer.
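
A simple self-check can flag these gaps before an auditor does. The sketch below assumes a hypothetical `IncidentRecord` structure with a subset of the fields listed above; the field names and the `audit_findings` helper are illustrative and do not correspond to any particular ITSM tool.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class IncidentRecord:
    # Hypothetical record layout covering a subset of the fields listed above.
    incident_id: str
    opened_at: datetime
    description: str
    classification: Optional[str] = None
    priority: Optional[str] = None
    sla_target_hours: Optional[float] = None
    resolved_at: Optional[datetime] = None
    restoration_verified: bool = False
    closed_at: Optional[datetime] = None


def audit_findings(rec: IncidentRecord) -> list[str]:
    """Return the common audit gaps present in a single incident record."""
    findings = []
    if not rec.classification:
        findings.append("missing classification")
    if rec.sla_target_hours is None:
        findings.append("no SLA target recorded")
    if rec.closed_at is not None and not rec.restoration_verified:
        findings.append("closed without verification of restored service")
    return findings
```

Running `audit_findings` over all records closed in a period gives a quick list of records to correct before the next internal audit.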

Clause 8.6.2: Service Request Management

Service requests are distinct from incidents. An incident is an unplanned disruption; a service request is a standard, expected, pre-approved activity. Examples include:

• Password resets
• Granting or revoking access to systems or data
• Provisioning new hardware or software licenses
• Requesting IT support or consultation
• Requesting changes to existing configurations

Service request management requires:

• A service request catalogue: the list of standard requests that are available
• Fulfillment procedures: how each request type is handled
• SLA targets for request types: e.g., password resets fulfilled within 1 hour, access provisioning within 1 business day
• Records: documentation that requests were received, fulfilled, and closed

Many organizations integrate service requests with a self-service portal (allowing users to request standard items directly) or with email and ticketing systems. The key requirement is that service requests be documented and tracked to closure.

Clause 8.6.3: Problem Management

Reactive and Proactive Problem Management

Problem management has two modes:

• Reactive: investigating the root cause of incidents that have already occurred
• Proactive: identifying and eliminating potential causes before they result in incidents

Root Cause Analysis

When an incident occurs, reactive problem management investigates the root cause. Root cause analysis techniques include:

• 5 Whys: repeatedly asking "why" to trace the issue back to its fundamental cause
• Fishbone (Ishikawa) diagram: visually mapping causes and sub-causes
• Fault tree analysis: mapping how multiple factors combine to produce failure
• Correlation analysis: examining logs and data to identify what events preceded the incident

The goal is not just to fix the immediate problem but to identify and eliminate the underlying cause, preventing recurrence.

Known Error Management

As problems are investigated and resolved, the organization builds a known error database made up of known error records. A known error record documents:

• The problem and symptoms
• Root cause
• Workaround (if the problem cannot be immediately fixed)
• Permanent fix (when available)
• Related incidents: which incidents have the same root cause

When a new incident occurs, the help desk or support team can search the known error database to see if the incident matches a known problem. If it does, the known workaround can be applied immediately, reducing resolution time while the permanent fix is developed.
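
The lookup described above can be sketched as a keyword overlap between the incident description and the symptoms stored on each known error record. This is a deliberately naive illustration: the `KnownError` fields, the `match_known_errors` function, and the overlap threshold are assumptions for the example, and a real service desk tool would typically use its own search.

```python
from dataclasses import dataclass, field


@dataclass
class KnownError:
    # Hypothetical known error record; symptoms are stored as lowercase keywords.
    error_id: str
    symptoms: set[str]
    root_cause: str
    workaround: str
    related_incidents: list[str] = field(default_factory=list)


def match_known_errors(
    incident_text: str, kedb: list[KnownError], min_overlap: int = 2
) -> list[KnownError]:
    """Return known errors sharing at least min_overlap keywords, best match first."""
    # Naive tokenization: lowercase and split on whitespace (punctuation is not stripped).
    words = set(incident_text.lower().split())
    scored = [(len(words & ke.symptoms), ke) for ke in kedb]
    hits = [(score, ke) for score, ke in scored if score >= min_overlap]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [ke for _, ke in hits]
```

When a match is found, the support analyst can apply the record's `workaround` and append the new incident ID to `related_incidents`, which keeps the link between incidents and the underlying problem.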

Problem Records and Evidence

ISO 20000 requires documented evidence of problems and their resolution. Problem records should include:

• Problem ID and date opened
• Description and symptoms
• Related incidents: which incidents have the same root cause
• Root cause analysis documentation
• Known error record (if applicable)
• Permanent fix plan and progress
• Status: whether the problem is active, in progress, or resolved
• Review and closure documentation

A common audit finding is problem records opened but never progressed to closure. The organization may create a problem ticket but never complete the root cause investigation or track the problem to a permanent fix.
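
One way to catch stalled problem records early is a periodic report of open problems with no recent activity. The sketch below assumes problem records are available as dictionaries with hypothetical `id`, `status`, and `last_updated` keys; the 30-day idle limit is an arbitrary example value.

```python
from datetime import date


def stale_problems(problems: list[dict], today: date, max_idle_days: int = 30) -> list[str]:
    """Return IDs of unresolved problems not updated within max_idle_days."""
    return [
        p["id"]
        for p in problems
        if p["status"] != "resolved"
        and (today - p["last_updated"]).days > max_idle_days
    ]
```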

Clause 8.6.4: Configuration Management

Configuration management maintains an accurate record of all IT components (configuration items or CIs) that make up the services the organization manages. A CI might be a server, network switch, application, database, virtual machine, software license, or any other component that is tracked.

CMDB: The Configuration Management Database

The Configuration Management Database (CMDB) is the repository of CI data. The CMDB should include:

• CI attributes: name, type, version, owner, location, procurement date
• CI relationships: which CIs depend on or connect to other CIs
• CI lifecycle status: active, retired, in development
• Associated documentation: for each CI, links to technical specifications, support contacts, etc.

A well-maintained CMDB enables other service management practices: incident management (tracing the impact of a CI failure through related CIs), change management (understanding the blast radius of a proposed change), and problem management (identifying which CIs are involved in a problem).
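
Tracing impact through CI relationships is a graph traversal. The following sketch assumes the CMDB relationships can be exported as a mapping from each CI to the CIs that depend on it directly; the CI names and the `impacted_cis` function are hypothetical.

```python
from collections import deque

# Hypothetical dependency map: each CI maps to the CIs that depend on it directly.
DEPENDENTS = {
    "db-server-01": ["orders-db"],
    "orders-db": ["orders-api"],
    "orders-api": ["web-shop", "mobile-app"],
    "web-shop": [],
    "mobile-app": [],
}


def impacted_cis(failed_ci: str, dependents: dict[str, list[str]]) -> set[str]:
    """Breadth-first walk returning every CI downstream of failed_ci."""
    seen: set[str] = set()
    queue = deque([failed_ci])
    while queue:
        ci = queue.popleft()
        for dep in dependents.get(ci, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    # Exclude the starting CI itself in case the relationships contain a cycle.
    seen.discard(failed_ci)
    return seen
```

Calling `impacted_cis("orders-db", DEPENDENTS)` returns the API and both front ends, which is the same information a change assessor needs when estimating the blast radius of a change to that database.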

CMDB Verification and Accuracy

Many organizations populate a CMDB initially but fail to maintain it. CIs are added to the environment but not added to the CMDB. Configurations change but CMDB records are not updated. The result is a CMDB that diverges from reality.

ISO 20000 requires that the organization verify the accuracy of the CMDB at defined intervals. This means:

• Regular audits: comparing the CMDB to the actual environment to identify discrepancies
• Reconciliation: updating CMDB records to match actual configuration
• Documented evidence: audit reports showing verification activities and reconciliation actions
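
A verification audit of this kind boils down to comparing two inventories. The sketch below assumes both the CMDB export and the discovery scan are available as dictionaries keyed by CI identifier, with attribute dictionaries as values; the `reconcile` function and its output keys are illustrative names only.

```python
def reconcile(cmdb: dict[str, dict], discovered: dict[str, dict]) -> dict[str, list]:
    """Compare CMDB records with a discovery scan and list the discrepancies."""
    cmdb_ids, found_ids = set(cmdb), set(discovered)
    return {
        # Present in the environment but never registered.
        "missing_from_cmdb": sorted(found_ids - cmdb_ids),
        # Registered but not found, e.g. retired without a CMDB update.
        "not_found_in_environment": sorted(cmdb_ids - found_ids),
        # Registered and found, but attributes differ.
        "attribute_mismatch": sorted(
            ci for ci in cmdb_ids & found_ids if cmdb[ci] != discovered[ci]
        ),
    }
```

Saving the returned dictionary with a date and the name of the reviewer is one straightforward way to produce the documented evidence mentioned above.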

Clause 8.6.5: Change Management

Change Types and Processes

Not all changes are handled the same way. ISO 20000 recognizes three change types:

• Standard changes: pre-approved, low-risk changes that can be implemented quickly using defined procedures (e.g., password resets, adding users to standard groups)
• Normal changes: changes that require assessment and formal approval before implementation
• Emergency changes: urgent changes required to resolve critical incidents; approval is expedited, but the change is reviewed post-implementation

Standard changes have minimal bureaucracy; they use established, tested procedures. Normal changes go through a change review process (often including a Change Advisory Board or CAB) where the change is assessed for impact and risk before approval. Emergency changes use an expedited approval path instead of the normal review cycle and undergo post-implementation review to confirm the emergency was justified.
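
The routing of the three change types can be expressed as a simple lookup. The step names below are illustrative labels, not wording from the standard, and the exact steps for each type are an assumption based on the descriptions above.

```python
from enum import Enum


class ChangeType(Enum):
    STANDARD = "standard"
    NORMAL = "normal"
    EMERGENCY = "emergency"


# Illustrative workflow steps per change type (assumed for this sketch).
WORKFLOWS = {
    ChangeType.STANDARD: [
        "follow pre-approved procedure",
        "implement",
        "record",
    ],
    ChangeType.NORMAL: [
        "assess impact and risk",
        "CAB approval",
        "implement",
        "post-implementation review",
    ],
    ChangeType.EMERGENCY: [
        "expedited approval",
        "implement",
        "post-implementation review",
    ],
}


def workflow_for(change_type: ChangeType) -> list[str]:
    """Return the ordered workflow steps for a change type."""
    return WORKFLOWS[change_type]
```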

Change Records and Required Information

The organization must maintain documented change records. Each change record should include:

• Change description: what is being changed and why
• Change type: standard, normal, or emergency
• Reason for change: business justification
• Risk assessment: what could go wrong if the change is implemented
• Impact analysis: which services, CIs, and customers are affected
• Implementation plan: step-by-step plan for implementing the change
• Rollback plan: how to revert if the change causes problems
• Approval: sign-off from appropriate stakeholders (CAB, service owner, etc.)
• Implementation evidence: records showing that the change was executed as planned
• Post-implementation review: verification that the change achieved its objective and caused no unexpected issues
• Change success: whether the change achieved its intended purpose without creating new problems

The Change Advisory Board (CAB)

The Change Advisory Board is a forum for reviewing and approving normal changes. CAB membership typically includes:

• Service owners: representatives for each service that might be affected by changes
• Infrastructure and operations: technical experts who understand system dependencies
• Risk and compliance: ensuring changes do not violate policies or regulatory requirements
• Change management: facilitating the process

The CAB meets regularly (weekly, bi-weekly, etc.) to review proposed changes, assess risks, and make approval decisions. Decisions and reasoning must be documented in meeting minutes.

Clause 8.6.6: Release and Deployment Management

Release and deployment management addresses how software, patches, and updates are packaged and deployed to the production environment. This includes:

• Release planning: determining what components will be included in a release and when the release will be deployed
• Release composition: gathering components and conducting release-level testing
• Deployment: rolling out the release to production environments
• Documentation and approval: release notes, deployment verification, sign-off

Most releases require a change request (Clause 8.6.5) and must go through the change management approval process.

KEY CONCEPT: The incident/problem/change/configuration quadrant shows how these four practices are deeply interdependent. The CMDB feeds change impact assessment; incidents feed problem analysis; problems drive changes; changes must be reflected in the CMDB. Failure in any one area weakens all four.

 

Clause 8.7: Service Assurance Practices

While Clause 8.6 practices address how the organization responds to events (incidents, problems) and controls changes, Clause 8.7 practices address how the organization maintains and improves agreed service levels. There are four main service assurance practices.

Clause 8.7.1: Availability Management

Availability is a measure of whether the service is accessible and functioning when users need it. Availability management includes:

• Availability targets: defining what availability the service should provide (e.g., 99.5% availability)
• Availability planning: designing and architecting services to meet availability targets
• Availability monitoring: continuous measurement of actual availability
• Availability reporting: regular reports comparing actual availability to targets
• Availability improvement: identifying causes of unavailability and taking action to improve
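
To make an availability target concrete, it helps to translate the percentage into permitted downtime. The sketch below uses the common definition of availability as agreed service time minus downtime, divided by agreed service time; the function names and the 30-day measurement period are assumptions for the example.

```python
def availability_pct(total_minutes: float, downtime_minutes: float) -> float:
    """Measured availability over a period, as a percentage."""
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def allowed_downtime_minutes(target_pct: float, total_minutes: float) -> float:
    """Downtime budget implied by an availability target over a period."""
    return total_minutes * (1 - target_pct / 100.0)


# A 30-day month of 24x7 service is 30 * 24 * 60 = 43,200 minutes, so a
# 99.5% target leaves a budget of about 216 minutes of downtime.
MONTH_MINUTES = 30 * 24 * 60
budget = allowed_downtime_minutes(99.5, MONTH_MINUTES)
```

Availability reporting then becomes a comparison of `availability_pct` for the period against the target, with any shortfall feeding the improvement activity.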

Clause 8.7.2: Service Continuity Management

Service continuity management addresses how the organization prepares for and responds to disasters or major disruptions. It includes:

• Continuity planning: developing plans to maintain critical services during or after a disaster
• Recovery objectives: defining the Recovery Time Objective (RTO, how quickly service must be restored) and the Recovery Point Objective (RPO, how much data loss is acceptable)
• Backup and recovery: ensuring that data is backed up and can be recovered
• Testing: regularly testing continuity plans to ensure they work
• Training: ensuring staff know their roles in a disaster recovery scenario
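
The results of a continuity test can be checked against the recovery objectives with two comparisons. In this sketch the function names and the timestamps are hypothetical; data loss is approximated as the time between the last good backup and the failure.

```python
from datetime import datetime, timedelta


def rpo_met(last_backup: datetime, failure_time: datetime, rpo: timedelta) -> bool:
    """Data written since the last backup is lost; that window must not exceed the RPO."""
    return failure_time - last_backup <= rpo


def rto_met(failure_time: datetime, restored_time: datetime, rto: timedelta) -> bool:
    """Time from failure to restored service must not exceed the RTO."""
    return restored_time - failure_time <= rto
```

For example, with a 4-hour RTO and a 1-hour RPO, a test in which the last backup ran 45 minutes before the simulated failure and service returned 3 hours later meets both objectives.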

For organizations certified to ISO 22301 (business continuity management), Clause 8.7.2 aligns with and supports ISO 22301 requirements.

Clause 8.7.3: Capacity Management

Capacity management ensures that services have sufficient resources (computing, storage, network) to meet user demand and performance targets. It includes:

• Capacity monitoring: continuously measuring resource utilization (CPU, memory, disk, network)
• Capacity planning: forecasting future demand based on growth trends
• Capacity incidents: managing situations where resources become exhausted and service performance degrades
• Optimization: rightsizing resources to balance cost and performance
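
Forecasting from growth trends can be as simple as fitting a straight line to recent utilization samples. The sketch below uses an ordinary least-squares fit over daily readings and assumes linear growth, which is a strong simplification; the `days_until_threshold` function and the sample values are hypothetical.

```python
from typing import Optional


def days_until_threshold(daily_usage_pct: list[float], threshold_pct: float) -> Optional[float]:
    """Estimate days from the last sample until usage reaches the threshold.

    Fits a least-squares line through the samples (one per day). Returns None
    when there are too few samples or usage is flat or falling.
    """
    n = len(daily_usage_pct)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(daily_usage_pct) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_usage_pct))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    slope = numerator / denominator
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    day_at_threshold = (threshold_pct - intercept) / slope
    # Days are counted from the most recent sample; never report a negative value.
    return max(0.0, day_at_threshold - (n - 1))
```

With disk usage samples of 50, 52, 54, 56, and 58 percent and an 80 percent alert threshold, the fitted growth is 2 points per day and the estimate is 11 days, which gives time to plan a capacity addition before exhaustion.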

Clause 8.7.4: Information Security Management

Information security management ensures that services and the data they process are protected from unauthorized access, modification, or loss. ISO 20000 requires that the organization implement information security controls relevant to service management. For organizations certified to ISO 27001 (information security management), Clause 8.7.4 aligns with ISO 27001 requirements.

Information security management in the SMS context includes:

• Access controls: ensuring that users can access only the data and systems they are authorized to use
• Data protection: protecting data at rest and in transit
• Incident response: procedures for responding to security incidents (unauthorized access, data breaches, malware)
• Security monitoring: continuous monitoring for security threats and anomalies

IMPORTANT: Clause 8.6 generates more audit nonconformities than any other section of ISO 20000. Most commonly, the issue is not the absence of processes but the absence or incompleteness of records. Processes may exist, but evidence that the processes have been executed (incident records, change approvals, problem investigations) is missing or incomplete.

BITLION INSIGHT: Bitlion GRC's integrated ITSM and GRC capabilities enable comprehensive management of Clause 8.6 practices: incident and problem tracking, change management workflows, configuration item registers, and integrated performance monitoring, all within a unified platform.

 

Resolution Practice Requirements Summary

Practice | Key ISO 20000 Requirements | Required Records | Common Audit Finding
Incident Management | Classify incidents; set priorities; escalate; resolve within SLA; close with verification | Incident records with classification, priority, resolution, SLA status | Incidents logged but not classified consistently; no escalation evidence; SLA breaches not tracked
Service Request Management | Maintain service request catalogue; define fulfillment procedures; SLA targets by request type | Service request records from receipt to fulfillment | No service request catalogue; requests tracked informally without documented closure
Problem Management | Investigate incident root causes; maintain known error database; progress problems to resolution | Problem records with RCA documentation; known error records; problem closure evidence | Problem records opened but never progressed; no documented RCA; known error database not maintained or used
Configuration Management | Identify all CIs; maintain CMDB with CI relationships; verify CMDB accuracy periodically | CI records; relationship maps; CMDB verification audit reports | CMDB populated initially but not maintained; no verification audits; significant divergence from actual environment

 

Service Assurance Practice Summary

Practice | Planning Requirement | Operational Requirement | Evidence for Audit
Availability Management | Define availability targets (uptime %); establish availability metrics | Continuously monitor actual availability; compare to targets; investigate and remediate failures | Availability reports; incident records; improvement action tracking
Service Continuity Management | Document recovery objectives (RTO, RPO); develop continuity plans; identify recovery procedures | Test continuity plans at defined intervals; maintain backup and recovery capability; update plans when services change | Continuity plans; test results and sign-off; recovery procedure documentation
Capacity Management | Forecast capacity demand; establish capacity thresholds and alerts | Monitor utilization; alert on threshold exceedance; plan for capacity additions before exhaustion | Capacity reports; monitoring alerts; capacity incident records; expansion planning documentation
Information Security Management | Identify security requirements and threats; design security controls into services | Monitor for security threats; respond to security incidents; verify security control effectiveness | Security policies; incident records; monitoring logs; control effectiveness evidence

 

Integration and Interdependency

The practices in Clauses 8.6 and 8.7 do not operate in isolation. They form an integrated ecosystem:

• Incidents reveal problems; problems drive changes; changes are tracked in the CMDB; the CMDB supports impact assessment for future changes
• Availability targets (Clause 8.7.1) inform incident priority (Clause 8.6.1) and define the SLAs that drive incident resolution timeframes
• Capacity incidents (Clause 8.7.3) may be classified as incidents and tracked through incident management (Clause 8.6.1)
• Security incidents (Clause 8.7.4) are a high-priority category of incident requiring escalated response
• Configuration accuracy (Clause 8.6.4) enables effective change impact assessment (Clause 8.6.5) and availability improvement (Clause 8.7.1)