Delivering Highly Reliable, High Availability Clusters in GKE (Cloud Next ’18)

What happens to your mission-critical applications if an asteroid hits your datacenter? Kubernetes has the power to keep your applications running through network failures, configuration mishaps, and outages, but it requires the right infrastructure and management to keep doing its job.

In this talk, we’ll show you the features in Google Kubernetes Engine that deliver High Availability (HA) Kubernetes with multi-zonal masters. We’ll demo the resiliency of HA clusters by taking down a GCP zone, and we’ll discuss the challenges of running a highly available distributed system. We’ll also take a peek under the covers at master upgrades and learn how the Site Reliability Engineers at Google approach the challenge of keeping GKE clusters running across 15 regions.

Duration: 46:10
Publisher: Google Cloud
