About this Course
100% online: start instantly and learn on your own schedule.
Flexible deadlines: reset deadlines in accordance with your schedule.
Level: Advanced
Hours to complete: approx. 16 hours
Suggested: 4 weeks of study, estimated 2 hours per week....
Available languages: English
Subtitles: English

What you will learn

  • How to make systems reliable
  • Understanding SLIs, SLOs and SLAs
  • Quantifying risks to and consequences of SLOs

Syllabus - What you will learn from this course

Week 1

Introduction to SRE (27 minutes to complete)

This module is intended to bring you up to speed on the concepts underpinning SRE, CRE, and SLOs. If you're already familiar with these concepts, you may still find new information and perspectives in this module, but it is not necessary to complete it. ...
9 videos (Total 15 min), 1 quiz

Videos:
  • Introduction (15s)
  • Intro (10s)
  • CRE's Three Reliability Principles (3m)
  • Reliability in the Cloud (3m)
  • How SLOs help your business make decisions (1m)
  • How SLOs help you build features faster (1m)
  • How SLOs help you balance operational and project work (1m)
  • Making SLOs work for your organization (59s)

Quiz (1 practice exercise):
  • DevOps/SRE (1m)

Targeting Reliability (1 hour to complete)

In this module we’re going to talk about how you measure the desired reliability of a service. We will address what to consider when setting SLOs for your application within your organization. We'll look at the three principles we use to measure the desired reliability of a service: figuring out what you want to promise and to whom, figuring out the metrics you care about that make your service reliability “good”, and finally, deciding how much reliability is good enough....
7 videos (Total 14 min), 4 quizzes

Videos:
  • SLOs vs SLAs (2m)
  • The happiness test (2m)
  • How do we measure reliability? (3m)
  • Edge cases (2m)
  • 100% is the wrong target (1m)
  • Iterating (1m)

Quizzes (4 practice exercises):
  • A working service (5m)
  • SLOs and SLAs (7m)
  • Reliability and iterating (1m)
  • Targeting Reliability Assessment (7m)

Operating for Reliability (1 hour to complete)

In this module, we’ll start by introducing a mechanism for quantifying unreliability using something called an error budget. We'll show how error budgets help you decide when to focus on making a service more reliable. And then we'll learn about some of the engineering and operational improvements that can help you do that....
7 videos (Total 19 min), 3 quizzes

Videos:
  • Error budgets (3m)
  • Everything is a trade-off (3m)
  • Error budgets: advanced concepts (2m)
  • Axes of improvement (4m)
  • Operational approach to increasing reliability (2m)
  • Module summary (50s)

Quizzes (3 practice exercises):
  • Error budgets (5m)
  • Increasing reliability (3m)
  • Operating for Reliability Assessment (5m)
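
The error budget idea named in this module can be illustrated with a little arithmetic. The following is a minimal sketch, assuming the common definition that the error budget is the share of events allowed to fail under the SLO (one minus the SLO target). The function names and the example figures are hypothetical and are not taken from the course material.

```python
def error_budget(slo_target: float, total_events: int) -> float:
    """Number of bad events the SLO tolerates over the measurement window."""
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be a fraction in (0, 1]")
    return (1.0 - slo_target) * total_events


def budget_remaining(slo_target: float, total_events: int, bad_events: int) -> float:
    """Fraction of the error budget still unspent (negative means overspent)."""
    budget = error_budget(slo_target, total_events)
    if budget == 0:
        raise ValueError("a 100% target leaves no error budget to spend")
    return 1.0 - bad_events / budget


# Hypothetical window: 1,000,000 requests against a 99.9% SLO, 250 failures so far.
print(error_budget(0.999, 1_000_000))           # roughly 1000 failures allowed
print(budget_remaining(0.999, 1_000_000, 250))  # roughly three quarters left
```

Under this sketch, a negative result from `budget_remaining` would be the signal to shift focus from feature work toward reliability work.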

Week 2

Choosing a Good SLI (1 hour to complete)

In this module we will start off by taking a look at some characteristics of monitoring metrics that can make them useful as SLIs and contrast these against other metrics that are less useful. Because the choice of where to measure an SLI is a key variable, we'll cover the five main ways you can measure an SLI and compare their pros and cons....
14 videos (Total 41 min), 3 quizzes

Videos:
  • User happiness in metric form (1m)
  • The properties of good SLI metrics (4m)
  • Ways of measuring SLIs (4m)
  • The SLI menu (2m)
  • The SLI equation (1m)
  • Request / Response SLIs (5m)
  • Data processing SLIs (6m)
  • "But my system is really complex!" (2m)
  • Managing complexity with aggregation (2m)
  • Managing complexity with bucketing (3m)
  • Achievable SLOs (1m)
  • Aspirational SLOs (1m)
  • Continuous improvement (1m)

Quizzes (3 practice exercises):
  • Measuring happiness (1m)
  • Commonly used SLIs (2m)
  • Correctness and Coverage (2m)
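
A common way to express an SLI in SRE practice is as the percentage of valid events that were good. The following is a minimal sketch assuming that ratio form. The course's own "SLI equation" video is not reproduced in this listing, and the `Request` record, the rule that 5xx responses count as bad, and the 300 ms latency threshold are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Request:
    status: int        # HTTP status code (hypothetical record layout)
    latency_ms: float  # time taken to serve the request


def sli_percent(good_events: int, valid_events: int) -> float:
    """SLI as the percentage of valid events that were good."""
    if valid_events <= 0:
        raise ValueError("need at least one valid event")
    return 100.0 * good_events / valid_events


def availability_sli(requests: Iterable[Request]) -> float:
    """Assumed rule: every request is valid, and good means no 5xx response."""
    reqs = list(requests)
    good = sum(1 for r in reqs if r.status < 500)
    return sli_percent(good, len(reqs))


def latency_sli(requests: Iterable[Request], threshold_ms: float) -> float:
    """Assumed rule: only served (non-5xx) requests are valid, good means fast enough."""
    served = [r for r in requests if r.status < 500]
    good = sum(1 for r in served if r.latency_ms <= threshold_ms)
    return sli_percent(good, len(served))


sample = [
    Request(200, 120.0),
    Request(200, 450.0),
    Request(500, 30.0),
    Request(200, 80.0),
]
print(availability_sli(sample))    # 3 of 4 requests were served
print(latency_sli(sample, 300.0))  # 2 of 3 served requests were fast enough
```

Keeping the good/valid split explicit makes it easy to reuse `sli_percent` for other SLI types, since only the definition of "valid" and "good" changes.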

Week 3

Developing SLOs and SLIs (5 hours to complete)

In this module, we'll start off with an overview of our four step process for developing SLOs and SLIs for a user journey. We'll introduce the fictional company that created our example mobile game, the infrastructure that we'll be working with, and the simple user journey we'll be applying the four step process to....
7 videos (Total 18 min), 4 quizzes

Videos:
  • The 4 step process (1m)
  • Our example game (1m)
  • Loading the profile page (1m)
  • Refining SLI specifications (4m)
  • Looking for observability gaps (2m)
  • Failure modes (4m)

Quizzes (2 practice exercises):
  • Postmortem! (15m)
  • Setting Achievable SLO targets (15m)

Week 4

Quantifying Risks to SLOs (4 hours to complete)

In this module we'll be taking a critical look at the availability risks for our example service. We want to answer the question: "are our SLO targets and error budgets realistic?" ...
4 videos (Total 20 min), 2 quizzes

Videos:
  • Is your error budget realistic? (3m)
  • Modeling risks in our spreadsheet (5m)
  • Analyzing risk (9m)
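
One way to ask whether an error budget is realistic is to estimate the downtime each known risk is expected to cost per year and compare the total with the budget. The following is a rough sketch under stated assumptions: the course's spreadsheet is not reproduced in this listing, so the `Risk` fields, the cost formula (time to detect plus time to resolve, scaled by the share of users affected and the yearly frequency), and the two example risks are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable

MINUTES_PER_YEAR = 365 * 24 * 60


@dataclass
class Risk:
    name: str
    minutes_to_detect: float
    minutes_to_resolve: float
    fraction_of_users_affected: float  # 0.0 to 1.0
    incidents_per_year: float

    def bad_minutes_per_year(self) -> float:
        """Expected user-weighted bad minutes this risk costs in a year."""
        outage = self.minutes_to_detect + self.minutes_to_resolve
        return outage * self.fraction_of_users_affected * self.incidents_per_year


def budget_minutes_per_year(slo_target: float) -> float:
    """Error budget for an availability SLO, expressed in minutes per year."""
    return (1.0 - slo_target) * MINUTES_PER_YEAR


def is_budget_realistic(risks: Iterable[Risk], slo_target: float) -> bool:
    """True if the expected cost of all listed risks fits inside the budget."""
    expected = sum(r.bad_minutes_per_year() for r in risks)
    return expected <= budget_minutes_per_year(slo_target)


risks = [
    Risk("bad release", 15, 45, 0.5, 12),      # 360 bad minutes per year
    Risk("database failover", 5, 25, 1.0, 4),  # 120 bad minutes per year
]
print(is_budget_realistic(risks, 0.999))   # budget is about 525.6 minutes
print(is_budget_realistic(risks, 0.9995))  # budget is about 262.8 minutes
```

In this made-up example the risks fit a 99.9% target but not a 99.95% one, which is the kind of comparison that informs whether to relax the target or reduce a risk's detection time, resolution time, impact, or frequency.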

Consequences of SLO Misses (1 hour to complete)

In this module, we'll cover best practices for documenting your SLOs, the rationale behind a formal error budget policy, and how best to create one. Finally, we'll look at an example error budget policy to understand the trade-offs and incentives that play out during negotiations over such a policy....
9 videos (Total 21 min), 3 quizzes

Videos:
  • No surprises (2m)
  • A dashboard example (1m)
  • Why an error budget policy? (2m)
  • Fundamentals of an error budget policy (3m)
  • How to draft an error budget policy (3m)
  • Example policy thresholds (3m)
  • A hypothetical policy scenario (3m)
  • Course conclusion and video wrap up (47s)

Quizzes (3 practice exercises):
  • Error budget policies (1m)
  • Error budget policy: considerations (2m)
  • Consequences of SLO Misses (1m)

About Google Cloud

We help millions of organizations empower their employees, serve their customers, and build what’s next for their businesses with innovative technology created in—and for—the cloud. Our products are engineered for security, reliability, and scalability, running the full stack from infrastructure to applications to devices and hardware. Our teams are dedicated to helping customers apply our technologies to create success....

Frequently Asked Questions

  • Once you enroll for a Certificate, you’ll have access to all videos, quizzes, and programming assignments (if applicable). Peer review assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing, you may not be able to access certain assignments.

  • When you purchase a Certificate you get access to all course materials, including graded assignments. Upon completing the course, your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

More questions? Visit the Learner Help Center.