Building a data pipeline on Google Cloud is one of the most common things customers do, and increasingly they want those pipelines to span hybrid infrastructures. Streaming data over Apache Kafka across hybrid and multi-region environments is a common pattern for creating a consistent data fabric. With Kafka and Confluent, customers can integrate legacy systems with Google services like BigQuery and Dataflow in real time.
Learn how to build a robust, extensible data pipeline starting on-premises by streaming data from legacy systems into Kafka using the Kafka Connect framework.
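As an illustration of the Kafka Connect approach described above, a source connector can be registered with a Connect worker's REST API to pull rows from a legacy database into a Kafka topic. This is a minimal sketch, not the session's exact setup: the connector name, database URL, and topic prefix are hypothetical, while the connector class is Confluent's standard JDBC source connector.

```json
{
  "name": "legacy-orders-source",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:mysql://legacy-db:3306/orders",
    "mode": "incrementing",
    "incrementing.column.name": "id",
    "topic.prefix": "legacy-"
  }
}
```

A configuration like this is typically POSTed to the Connect REST endpoint (e.g. `POST /connectors`), after which the framework handles polling, offsets, and delivery into Kafka without custom code.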
This session highlights how to easily replicate streaming data from an on-premises Kafka cluster to a Kafka cluster running on Google Cloud. Doing this integrates legacy applications with analytics in the cloud, using Google services like AI Platform, AutoML, and BigQuery.
Speakers: Sarthak Gangopadhyay, Josh Treichel
Google Cloud Next ’20: OnAir → https://goo.gle/next2020
Subscribe to the GCP Channel → https://goo.gle/GCP
Publisher: Google Cloud