Actionable Alerting for Site Reliability Engineers (class SRE implements DevOps)

September 4, 2018

Are you tired of getting paged in the middle of the night for noisy alerts or flapping systems, only to find no action can be taken? In this episode, Liz and a very sleepy Seth discuss how to build actionable alerts from your SLOs and SLIs as a Site Reliability Engineer. Alerting on low-level metrics such as CPU usage or disk space doesn’t actually show whether our users are experiencing issues with our product or service. Instead, we should build our alerts using our SLOs. By integrating our remaining error budget over time, we can see how outages or partial outages will affect our SLO. Liz discusses strategies for deciding when to alert, how to alert, and what to do with those old alerts.
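One way to picture alerting on an SLO rather than on CPU or disk metrics is the rough Python sketch below. It is not taken from the episode or from Stackdriver. The `Window` type, the function names, the 99.9% target, and the burn-rate threshold of 10 are all hypothetical values chosen for illustration. The sketch measures how much of the error budget has been spent over a period, and pages only when the recent failure rate would exhaust the budget far faster than the SLO allows.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Window:
    """Request counts observed in one measurement window (hypothetical shape)."""
    total_requests: int
    failed_requests: int


def error_budget_remaining(slo_target: float, windows: Iterable[Window]) -> float:
    """Fraction of the error budget left across all windows.

    1.0 means the budget is untouched, 0.0 means it is fully spent,
    and a negative value means the SLO has been violated.
    """
    windows = list(windows)
    total = sum(w.total_requests for w in windows)
    failed = sum(w.failed_requests for w in windows)
    if total == 0:
        return 1.0  # no traffic, so no budget has been consumed
    allowed_failures = (1.0 - slo_target) * total
    return 1.0 - failed / allowed_failures


def burn_rate(slo_target: float, window: Window) -> float:
    """How fast the budget is being spent; 1.0 is exactly the sustainable rate."""
    if window.total_requests == 0:
        return 0.0
    error_rate = window.failed_requests / window.total_requests
    return error_rate / (1.0 - slo_target)


def should_page(slo_target: float, recent: Window, threshold: float = 10.0) -> bool:
    """Page a human only when the recent burn rate meets an assumed threshold."""
    return burn_rate(slo_target, recent) >= threshold


if __name__ == "__main__":
    slo = 0.999  # assumed availability target
    history = [Window(60_000, 20), Window(40_000, 30)]
    print("budget remaining:", error_budget_remaining(slo, history))
    print("page now?", should_page(slo, Window(1_000, 20)))
```

Under these assumptions, a brief blip that barely dents the budget does not wake anyone up, while a sustained outage that threatens the SLO does.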

Actionable alerts tie closely into the DevOps principles of expecting failure and creating a blameless culture. This is why we say “class SRE implements DevOps”.

Reference Links:
Stackdriver Service Monitoring → http://bit.ly/2wJdVS7
Creating a Dashboard with Stackdriver SLI Monitoring Metrics → http://bit.ly/2wHwGWo

Have questions? Reach out to Liz and Seth on Twitter:
@sethvargo – twitter.com/sethvargo
@lizthegrey – twitter.com/lizthegrey

Watch more episodes here → http://bit.ly/2PPL6f0
Subscribe to the channel → http://bit.ly/GCloudPlatform

Duration: 5:13
Publisher: Google Cloud
