Big Data Discount: How UC Santa Cruz Uses Mesos & Amazon EC2 Spot to Enable Low Cost Cancer Research


On this episode of This Is My Architecture, Mary Goldman, Design and Outreach Engineer at the UC Santa Cruz Genomics Institute, explains how the institute processes genomic sequencing data on AWS. With a need to crunch data measured in petabytes, the team designed a low-cost solution using a combination of Docker containers and EC2 Spot instances. TOIL, the pipeline management system they built, is open source (link: https://github.com/BD2KGenomics/toil) and was recently published (link: http://dx.doi.org/10.1038/nbt.3772) in Nature Biotechnology.

Learn more about This Is My Architecture at http://amzn.to/2qfaOQc.


Duration: 6:20
Publisher: Amazon Web Services