From Site Reliability Engineering (SRE) to Customer Reliability Engineering (CRE) and Cloud Ops, there’s a lot involved in keeping the Google cloud running, scaling, and performing well, both across our organization and, by extension, for our customers. In this video, Mahesh Kallahalla, Luke Stone, and William Bonnell give you a close look at the internal procedures we use to continually improve reliability. They also discuss best practices for interacting with Google to reduce mean time to detect (MTTD) and mean time to recovery (MTTR).
Missed the conference? Watch all the talks here: https://goo.gl/c1Vs3h
Watch more talks about Infrastructure & Operations here: https://goo.gl/k2LOYG
Publisher: Google Cloud
You can also watch this video at the source.