Scientific Computing with Google Cloud Platform: Particle Physics & Earth Sciences (Cloud Next ’18)

Atmospheric and oceanographic scientists need to analyze vast quantities of data from satellite imagery and supercomputer simulations to better understand our changing climate. To support these scientists, the Pangeo project addresses two problems using Google Cloud Platform. First, it scales Python’s scientific ecosystem (NumPy, pandas, scikit-learn) with Dask and Xarray for distributed computing. Second, it scales human access by deploying a public JupyterHub instance on Google Kubernetes Engine, allowing anyone to log in and analyze public cloud-hosted data with scalable resources.
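
The chunked, parallel style of computation that Dask brings to array data can be illustrated with the Python standard library alone. The sketch below is not Dask or Xarray code and is not taken from the talk. It assumes a hypothetical in-memory list of temperature samples and shows only the general idea: split the data into chunks, reduce each chunk independently, then combine the partial results. All function and variable names are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(values, chunk_size):
    """Split a sequence into consecutive chunks of at most chunk_size items."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def partial_sum_and_count(chunk):
    """Reduce one chunk to a (sum, count) pair that can be combined later."""
    return sum(chunk), len(chunk)


def chunked_mean(values, chunk_size=1000, max_workers=4):
    """Compute a mean by reducing chunks independently, then combining them."""
    if not values:
        raise ValueError("cannot compute the mean of an empty sequence")
    chunks = split_into_chunks(values, chunk_size)
    # Each chunk is reduced on its own, so the work can be spread over workers.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(partial_sum_and_count, chunks))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


# Hypothetical stand-in for a large gridded dataset (assumption, not real data).
temperatures = [float(i % 50) for i in range(10_000)]
mean_temperature = chunked_mean(temperatures, chunk_size=1000)
```

A distributed framework applies the same split, reduce, and combine pattern across many machines and to data too large for one machine's memory. The thread pool here only stands in for that scheduling.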

Large-scale scientific experiments today leverage heterogeneous data centers distributed across the globe that belong to different administrative domains. ATLAS, one such experiment at the Large Hadron Collider (LHC) at CERN, has successfully integrated Google Cloud Platform into its data processing workflows using Rucio. Rucio is a free and open-source software framework, principally developed by CERN, that allows scientific collaborations to organise, manage, and access their data, at volumes from terabytes to exabytes, over private and public networks, with full experiment control. We show how ATLAS is using Rucio to dynamically and seamlessly extend its science data centers with Google Cloud Storage in order to boost its computational physics workflows.

Duration: 42:39
Publisher: Google Cloud