Guiding IT Performance with OKRs

by Adam
September 9, 2018

I recently read John Doerr’s book Measure What Matters on OKRs. (OKRs stands for Objectives and Key Results.) They’re part of a goal-setting process that sets goals with key results to measure your progress along the way. The process is simple, but not easy. (I highly recommend this talk on “simple but not easy,” even though it’s about software engineering.)

The book got me thinking about DevOps and how we constantly strive to measure and improve our own processes. Accelerate nails down the metrics behind IT performance. Pairing Accelerate’s metrics with OKRs creates a way for teams to move the needle. In this post, I want to discuss what OKRs would look like for improving IT performance metrics.

First, a Quick Recap

Accelerate boils IT performance down to four metrics: Lead Time, Deployment Frequency, MTTR, and Change Failure Rate. Lead time is how long it takes a change to go from development to production. Deployment frequency is how often changes are deployed to production. MTTR is the mean time to restore service after a production issue. Lastly, Change Failure Rate is the percentage of changes (i.e. deploys) that have negative consequences in production (e.g. a service outage or reduced functionality). OKRs provide a way forward regardless of where you are now. Let’s start with lead time.
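Before that, it may help to see how the four metrics could be computed from raw deploy records. This is only a minimal sketch: the `Deploy` record, its field names, and the `four_metrics` helper are my own assumptions rather than anything Accelerate prescribes, and it measures time to restore from the moment of the failed deploy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Optional


@dataclass
class Deploy:
    """Hypothetical record of one production deploy."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # caused an outage or reduced functionality
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def four_metrics(deploys: list, days_observed: int) -> dict:
    """Summarize the four metrics over an observation window."""
    if not deploys or days_observed <= 0:
        raise ValueError("need at least one deploy and a positive window")
    failures = [d for d in deploys if d.failed]
    # Assumption: restore time runs from the failed deploy to restoration.
    restores = [hours(d.restored_at - d.deployed_at) for d in failures if d.restored_at]
    return {
        "lead_time_hours": mean(hours(d.deployed_at - d.committed_at) for d in deploys),
        "deploys_per_day": len(deploys) / days_observed,
        "mttr_hours": mean(restores) if restores else 0.0,
        "change_failure_rate": len(failures) / len(deploys),
    }


# Made-up sample data: four deploys in a five-day window, one of which failed.
t0 = datetime(2018, 9, 3, 9, 0)
day, hour = timedelta(days=1), timedelta(hours=1)
deploys = [
    Deploy(t0, t0 + 2 * hour),
    Deploy(t0 + day, t0 + day + 4 * hour, failed=True,
           restored_at=t0 + day + 5 * hour),
    Deploy(t0 + 2 * day, t0 + 2 * day + 6 * hour),
    Deploy(t0 + 3 * day, t0 + 3 * day + 4 * hour),
]
metrics = four_metrics(deploys, days_observed=5)
print(metrics)
```

Numbers like these give a baseline, which is what makes a key result measurable later on.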

Lead Time & Deployment Frequency

Accelerate’s study finds that high performance teams see lead times under an hour. Low- and mid-range teams see lead times of between one week and one month. In other words, the best teams can ship a change multiple times a day, far more often than their low- and mid-range competition. Of course, it makes sense to try to improve lead times. Everyone would like to go faster, but what does that look like for teams deploying once a month? What are their objectives and key results for improving that metric?

Teams stuck in this rut are likely burdened by long-running branches, which make things harder to integrate and test. They have CI and the tools to deploy quickly, but struggle to get code out the door. So the key results need to be about eliminating bottlenecks in the process before tackling tooling. Here’s an OKR sheet for improving lead time:

Objective: Double team velocity.

Key Result: Merge and integrate all topic branches within three working days.

Key Result: Identify and fast track small changes by the end of the quarter.

This example assumes there is some upfront understanding of what the bottlenecks are and what can be done to improve them. Note that each key result requires a number and a date. In other words, reading the objective followed by “as measured by” the key results should make sense.

Undoubtedly, focusing on dropping lead times (or improving velocity) requires working in smaller batches that are easier to develop, test, and deploy. Accelerate finds high-performance teams deploy on demand while low and medium performers deploy between once a week and once a month. This means there’s a strong correlation between lead time and deployment frequency, which makes sense. Teams pushing things through development faster can naturally deploy faster as long as their pipelines are fast enough. Conversely, fast deployment pipelines don’t evolve without the back pressure from development. Identifying where your team is in this loop is required to move out of it. Teams stuck with slow deployment pipelines may be blocked by technical issues, not necessarily people or process issues. One set of OKRs can cover improving the deployment pipeline:

Objective: Continuously deliver value to our customers.

Key Result: Automate promotion through QA, staging, and production by the end of the quarter.

Key Result: Reduce the time of the longest pipeline step (for example, QA) by 30% by the end of the quarter.

Key Result: Parallelize test steps by the end of the month.

This isn’t to say that these are the only possible key results. The point is to illustrate what key results may look like for teams trying to improve their deployment frequency. Again, focus on bottlenecks. If you’re not addressing the bottleneck, then you’re not making progress. So ensure that your key results map to known and obvious bottlenecks. That will reduce lead time and increase deployment frequency which impacts the other two metrics.
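To make “focus on bottlenecks” concrete, here is a rough sketch of how a team might pick the pipeline step to target and turn the 30% key result into a number. The stage names and durations are made up for illustration, and `find_bottleneck` and `target_after_reduction` are hypothetical helpers, not part of any CI tool.

```python
def find_bottleneck(stage_minutes: dict) -> tuple:
    """Return the slowest pipeline stage and its duration in minutes."""
    stage = max(stage_minutes, key=stage_minutes.get)
    return stage, stage_minutes[stage]


def target_after_reduction(current: float, reduction_pct: float) -> float:
    """Duration to aim for after cutting `reduction_pct` percent."""
    return current * (100 - reduction_pct) / 100


# Illustrative average stage durations in minutes (assumed, not measured).
pipeline = {
    "build": 8.0,
    "unit tests": 12.0,
    "QA": 40.0,
    "staging": 15.0,
    "production": 5.0,
}

stage, minutes = find_bottleneck(pipeline)
target = target_after_reduction(minutes, 30)
print(f"Bottleneck: {stage} ({minutes:.0f} min); key result target: {target:.0f} min")
```

In practice the durations would come from your CI system’s build history rather than a hand-written dictionary.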

MTTR and Change Failure Rate

Accelerate finds that high performance teams achieve MTTR numbers of less than an hour and change failure rates between 0 and 15%. The goal is to move your team closer to those numbers. This begins by understanding the technical practices employed by high performance teams. High performance teams leverage high-value telemetry, version control, automation, and continuous delivery to name a few. My guess is that teams have the most to gain from improving their telemetry and continuous delivery pipeline. I argue that telemetry is the most important practice to focus on because telemetry is ultimately responsible for telling the team if something is wrong or not. That’s why it’s a great place to start with a wonderful (and actionable) set of OKRs:

Objective: Gain real-time insight into business and technical operations.

Key Result: Instrument all revenue-generating flows by end of the quarter.

Key Result: Create a real-time deploy dashboard with the top 5 operational metrics by end of the month.

Key Result: Test that telemetry covers the last 5 production regressions by end of the quarter.

Key Result: Create a page (pager alert) on spikes in customer support requests by Friday.
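As an illustration of the last key result, here is one simple way a spike check could work: compare the current count of support requests against a baseline built from recent history. This is a sketch under my own assumptions (hourly counts, a three-standard-deviation threshold, and a hypothetical `is_spike` helper); a real alert would live in your monitoring system.

```python
from statistics import mean, stdev


def is_spike(history: list, current: int, threshold: float = 3.0) -> bool:
    """True when `current` exceeds the historical mean by more than
    `threshold` standard deviations."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    baseline = mean(history)
    # Floor the spread at 1.0 so a perfectly flat history does not
    # turn every small bump into a page.
    spread = max(stdev(history), 1.0)
    return current > baseline + threshold * spread


# Made-up hourly counts of customer support requests.
recent_hours = [12, 15, 11, 14, 13, 12, 16]

for count in (17, 45):
    action = "page the on-call engineer" if is_spike(recent_hours, count) else "no action"
    print(f"{count} requests this hour: {action}")
```

The threshold is a tuning knob: too low and the team gets paged for noise, too high and real incidents slip through.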

I find that teams have the most to gain by improving their telemetry. Not only does it make problems easier to identify and resolve, it also helps all engineers better understand the system and informs future engineering decisions. Once telemetry improves, it’s possible to set OKRs for dropping change failure rate:

Objective: Become the most stable product our customers use.

Key Result: Use canary releases for production deployments by end of the month.

Key Result: Deploy 3 features behind a feature flipper by end of the quarter.

Key Result: Test that the deployment pipeline covers the last 3 major production regressions by the end of the month.

These examples are not exhaustive, but they are enough to frame an approach for improving your team’s performance. I think it’s effective because it makes decisions objective and focuses improvements on specific areas. More important, it clarifies goals and steers teams away from toiling away at minimal improvements.

What Are Your OKRs?

OKRs are a great model for achieving organizational goals and ultimately pushing teams to stretch beyond what they think possible. In fact, John Doerr calls this one of OKRs’ superpowers. That’s why I’m excited Accelerate provides the metrics and technical practices to go with OKRs. If you’re unsure where to start, then choose one of the four metrics and associated technical practices. Set an objective to change the metric and key results for changing the relevant technical or organizational practices. Just keep in mind that each key result must be measured (remember “as measured by”), time bound, and hopefully tied to a bottleneck. Those three things will keep you on track to your objective and ultimately a more productive, confident, and happy team.
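If it helps, those three checks can be written down as a tiny review script. This is only a sketch: the `KeyResult` and `Objective` classes, their fields, and the sample entries are hypothetical, not something from Doerr’s book or Accelerate.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class KeyResult:
    description: str
    target: Optional[float] = None    # the number that defines "done"
    due: Optional[date] = None        # the date it must be done by
    bottleneck: Optional[str] = None  # the bottleneck it addresses

    def problems(self) -> list:
        """List which of the three checks this key result fails."""
        issues = []
        if self.target is None:
            issues.append("not measurable")
        if self.due is None:
            issues.append("not time bound")
        if self.bottleneck is None:
            issues.append("not tied to a bottleneck")
        return issues


@dataclass
class Objective:
    title: str
    key_results: list = field(default_factory=list)

    def review(self) -> dict:
        """Map each weak key result to the checks it fails."""
        return {kr.description: kr.problems()
                for kr in self.key_results if kr.problems()}


# Hypothetical OKR sheet: one solid key result and one vague one.
okr = Objective(
    title="Continuously deliver value to our customers",
    key_results=[
        KeyResult("Reduce QA time by 30%", target=30,
                  due=date(2018, 12, 31), bottleneck="QA"),
        KeyResult("Make deploys better"),
    ],
)

review = okr.review()
for description, issues in review.items():
    print(f"{description}: {', '.join(issues)}")
```

A key result that shows up in the review isn’t necessarily wrong, but it probably won’t read well after “as measured by.”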

Adam Hawkins

IOD Expert

Adam is a backend/service engineer turned deployment and infrastructure engineer. His passion is building rock solid services and equally powerful deployment pipelines. He has been working with Docker for years and leads the SRE team at Saltside. Outside of work he's a traveller, beach bum, and trance addict.
