Exploring GitHub with BigQuery at GitHub


Felipe Hoffa meets Alyson La, Data Scientist at GitHub, to explore how she uses BigQuery and other big data tools in her work.

Featuring 3 open datasets: GitHub Archive (a timeline of all GitHub events, https://www.githubarchive.org/), GitHub Data (the contents of GitHub open source files, ready to be analyzed, https://cloud.google.com/bigquery/public-data/github), and GHTorrent (similar to GitHub Archive, plus additional tables, http://ghtorrent.org/).
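
To give a feel for the kind of question a timeline of events can answer, here is a minimal Python sketch that counts events by type. It is not taken from the video: the sample records are made up, the helper name count_event_types is hypothetical, and the only assumption about the data is that each event is a JSON object with a "type" field (such as PushEvent or WatchEvent), as in the GitHub event types docs.

```python
import json
from collections import Counter

# Made-up sample records, shaped like GitHub timeline events:
# one JSON object per line, each with a "type" field.
SAMPLE_LINES = [
    '{"type": "PushEvent", "repo": {"name": "octo/example"}}',
    '{"type": "WatchEvent", "repo": {"name": "octo/example"}}',
    '{"type": "PushEvent", "repo": {"name": "octo/other"}}',
    '{"type": "IssuesEvent", "repo": {"name": "octo/other"}}',
]


def count_event_types(lines):
    """Return a Counter mapping event type to number of occurrences."""
    counts = Counter()
    for line in lines:
        event = json.loads(line)
        counts[event["type"]] += 1
    return counts


if __name__ == "__main__":
    # Print event types from most to least frequent.
    for event_type, n in count_event_types(SAMPLE_LINES).most_common():
        print(event_type, n)
```

In BigQuery the same idea would be expressed as a count grouped by event type over the archive tables; the sketch only illustrates the aggregation on a few local records.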

On the tools side, we show how Alyson works with the BigQuery web UI, and the connections between BigQuery, Tableau, and Looker.

Sample queries from the GitHub Octoverse report: https://gist.github.com/alysonla/e14c01ec7a0d2823e7317f7b58b22926

GitHub Event Types & Payloads docs: https://developer.github.com/v3/activity/events/types/

Blog post: https://github.com/blog/2298-github-data-ready-for-you-to-explore


Duration: 19:09
Publisher: Google Cloud