Manage reliability with service levels

Within an organization, communication between stakeholders is often harder than it needs to be. IT departments usually don't speak in terms that the business-focused parts of an organization can understand, and business teams in turn don't normally communicate in terms that IT teams find useful. Resolving this language barrier helps prevent issues and improve reliability. Service level management (SLM), the practice of standardizing data into a universal language that all parties can understand, aids in doing just that.

Service level management best applies to the practice of Uptime, performance and reliability, but it also applies to the other practices of Customer experience, Innovation and growth, and Operational efficiency (learn more about these practices). Regardless of what area you want to improve, using SLM has two major areas of impact: business outcomes and operational outcomes.

The required business outcome in terms of reliability focuses on reducing the number of business-impacting incidents, their duration, and the number of people involved.

  • Reduce the number of business-disrupting incidents.
  • Reduce mean time to resolution (MTTR).
  • Reduce average number of people engaged (FTEs) per severe incident.
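
As a rough illustration, the three figures above can be computed from a list of incident records. The sketch below assumes a hypothetical `Incident` record with open and resolve timestamps, a responder count, and a business-impact flag. These names are made up for the example and are not part of any New Relic API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    # Hypothetical record shape, for illustration only.
    opened_at: datetime
    resolved_at: datetime
    responders: int            # people engaged on the incident
    business_impacting: bool


def reliability_kpis(incidents: list[Incident]) -> dict[str, float]:
    """Summarize business-impacting incidents: count, MTTR in minutes, average responders."""
    impacting = [i for i in incidents if i.business_impacting]
    if not impacting:
        return {"incident_count": 0, "mttr_minutes": 0.0, "avg_responders": 0.0}
    total_minutes = sum(
        (i.resolved_at - i.opened_at).total_seconds() / 60 for i in impacting
    )
    return {
        "incident_count": len(impacting),
        "mttr_minutes": total_minutes / len(impacting),
        "avg_responders": sum(i.responders for i in impacting) / len(impacting),
    }


# Made-up sample data: two business-impacting incidents and one minor one.
start = datetime(2024, 1, 1, 9, 0)
sample = [
    Incident(start, start + timedelta(minutes=30), 4, True),
    Incident(start, start + timedelta(minutes=90), 6, True),
    Incident(start, start + timedelta(minutes=10), 1, False),
]
kpis = reliability_kpis(sample)
print(kpis)
```

Tracking these three numbers over successive periods is one way to see whether the business outcome is moving in the intended direction.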

The required operational outcome in terms of reliability focuses on the health of your digital product.

  • Measure operational success by what percentage of critical product applications you cover with your standard service levels.
  • Examine the percentage of policy adoption by your primary stakeholders.
  • Focus on what's important to the stakeholders, ensure simplicity, and prove the effectiveness of service level management.
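
The two percentages above are simple ratios. The following minimal sketch shows one way they might be calculated, assuming you keep a set of critical application names, a set of applications that already have standard service levels, and a set of stakeholder groups that have adopted the policy. All names and sample values are hypothetical.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage, or 0.0 when whole is zero."""
    return 100.0 * part / whole if whole else 0.0


def coverage(required: set[str], achieved: set[str]) -> float:
    """Percentage of required items that also appear in the achieved set."""
    return percent(len(required & achieved), len(required))


# Hypothetical inventory of critical product applications.
critical_apps = {"checkout", "search", "login", "catalog"}
apps_with_service_levels = {"checkout", "login", "internal-tool"}

# Hypothetical primary stakeholder groups and those that adopted the policy.
stakeholders = {"engineering", "product", "executive"}
adopters = {"engineering", "product"}

service_level_coverage = coverage(critical_apps, apps_with_service_levels)
policy_adoption = coverage(stakeholders, adopters)
print(f"coverage={service_level_coverage:.1f}% adoption={policy_adoption:.1f}%")
```

Applications outside the critical set, such as `internal-tool` here, do not count toward coverage because the measure is defined over critical product applications only.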

Why use service level management?

Improving service levels and reliability requires all stakeholders of the service to adopt the practice, including engineering management, product management, and executive management. Quickly showing the power and value of service levels starts discussions about what matters to each group. The steps in this guide will get you to those meaningful discussions very quickly.

A common method first establishes output performance and input performance service levels for one digital product and its critical capabilities. This usually involves one overall output and input service level for each endpoint application (usually one or two), and then approximately 4-7 output performance service levels for assumed critical capabilities measured at the endpoint transaction.

This method doesn't survey each stakeholder for what should and shouldn't be measured, as surveys often take too long and have too many parts in scenarios like this. The important thing is to start with baselines and key transactions as "capabilities."

Successful implementations of service levels demonstrate the ability to easily measure and communicate overall system health. That initial demonstration will show the value in investing more time to refine what your service levels measure. The sooner you provide that complete demonstration, the sooner you can achieve broad adoption and begin the reliability improvement process in collaboration with all your stakeholders!

Let's take a look at some of the common terms and define our KPIs before proceeding.

Terminology

Prerequisites

  • Have data to monitor and establish SLIs for, such as through APM instrumentation of your primary application.
  • You can also use synthetics tests of your application to generate data to monitor without relying on customers.
  • While not required, we highly recommend you achieve basic skills with New Relic dashboards and NRQL.

What's in this series?

Establish an output performance SLI

Learn how to ensure proper measurement of your service output

Establish an input performance SLI

Learn how to create input indicators and objectives

Establish capability SLIs

Learn how to create service level scores on capabilities

Improve with service level management

Learn how to use SLM techniques to improve your reliability

Copyright © 2024 New Relic Inc.
