Improving Reliability with Error Budgets, Metrics, and Tracing in Stackdriver (Cloud Next ’18)

Bit Ninja


Learn about the Site Reliability Engineering principle of error budgets and best practices for SLO-focused alerting and focused debugging. Members of the Stackdriver and Customer Reliability Engineering teams demonstrate how Stackdriver tooling, inspired by the needs of SREs at Google, helps you run services more reliably and with fewer false-positive signals by tracking and alerting on error budgets, and by debugging with the exemplar technique during an outage.

Event schedule → http://g.co/next18

Watch more Infrastructure & Operations sessions here → http://bit.ly/2uEykpQ
Next ’18 All Sessions playlist → http://bit.ly/Allsessions

Subscribe to the Google Cloud channel! → http://bit.ly/NextSub


Duration: 49:2
Publisher: Google Cloud