Amazon Web Services (AWS) has announced the general availability of Amazon Elastic Compute Cloud (Amazon EC2) DL1 instances, a new instance type designed for training machine learning models. DL1 instances are powered by Gaudi accelerators from Habana Labs, an Intel company. According to AWS, they deliver up to 40% better price performance for training machine learning models than the latest GPU-powered Amazon EC2 instances.
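To put the 40% figure in concrete terms, here is a minimal sketch of the arithmetic. It assumes that "price performance" means training work completed per dollar, which is the usual reading but is not spelled out in the announcement. The function names and the $1,000 baseline are hypothetical.

```python
def relative_training_cost(price_performance_gain: float) -> float:
    """Cost of a fixed training job relative to the baseline.

    price_performance_gain is the fractional improvement in work per dollar,
    e.g. 0.40 for "40% better price performance" (assumed definition).
    """
    if price_performance_gain <= -1.0:
        raise ValueError("gain must be greater than -1.0")
    return 1.0 / (1.0 + price_performance_gain)


def savings_fraction(price_performance_gain: float) -> float:
    """Fraction of the baseline cost saved for the same amount of training."""
    return 1.0 - relative_training_cost(price_performance_gain)


if __name__ == "__main__":
    baseline_cost = 1000.0  # hypothetical cost in dollars on a GPU instance
    gain = 0.40             # the "up to 40%" upper bound from the announcement
    new_cost = baseline_cost * relative_training_cost(gain)
    print(f"Same job at up to {gain:.0%} better price performance: ${new_cost:.2f}")
    print(f"Savings: {savings_fraction(gain):.1%}")
```

Under this definition, 40% more work per dollar corresponds to roughly 29% lower cost for the same job, not 40% lower cost.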
With DL1 instances, AWS customers can train machine learning models for use cases including natural language processing, object detection and classification, fraud detection, recommendation and personalization engines, intelligent document processing, business forecasting, and more. DL1 instances are available on demand through a pay-as-you-go model with no upfront commitments.
The new DL1 instances use Gaudi accelerators, which are designed specifically to accelerate machine learning model training by offering better compute efficiency at a lower cost than general-purpose GPUs. Each DL1 instance includes up to eight Gaudi accelerators, 256 GB of high-bandwidth memory, 768 GB of system memory, custom 2nd generation Intel Xeon Scalable (Cascade Lake) processors, 400 Gbps of networking throughput, and up to 4 TB of local NVMe storage.
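The instance-level figures above can be broken down per accelerator with simple division. The sketch below assumes a fully populated instance with eight accelerators and an even split of memory across them. The class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dl1Specs:
    """Headline DL1 figures as quoted in the announcement."""
    accelerators: int = 8          # "up to eight" Gaudi accelerators
    hbm_total_gb: int = 256        # high-bandwidth memory across all accelerators
    system_memory_gb: int = 768    # host (CPU) memory
    network_gbps: int = 400
    local_nvme_tb: int = 4         # "up to 4 TB"

    def hbm_per_accelerator_gb(self) -> float:
        # Assumes the HBM is divided evenly among the accelerators.
        return self.hbm_total_gb / self.accelerators

    def system_memory_per_accelerator_gb(self) -> float:
        # Host memory available per accelerator if shared evenly (an assumption).
        return self.system_memory_gb / self.accelerators


if __name__ == "__main__":
    specs = Dl1Specs()
    print(f"HBM per accelerator: {specs.hbm_per_accelerator_gb():.0f} GB")
    print(f"System memory per accelerator: {specs.system_memory_per_accelerator_gb():.0f} GB")
```

The per-accelerator HBM figure is the one that matters most when judging whether a model and its batch fit on a single device.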
Machine Learning Models
For training typical machine learning models, AWS says these innovations translate to up to 40% better price performance than the latest GPU-powered Amazon EC2 instances. Customers can get started with DL1 instances using the included Habana SynapseAI SDK, which is integrated with leading machine learning frameworks such as TensorFlow and PyTorch. According to AWS, this lets them migrate existing machine learning models running on GPU- or CPU-based instances to DL1 instances with minimal code changes.
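As a rough illustration of what "minimal code changes" can look like in PyTorch, the sketch below picks a device and runs one training step on a toy model. It is not taken from the announcement. It assumes the Habana PyTorch bridge is installed as the habana_frameworks package, that the device is addressed as "hpu", and that habana_frameworks.torch.core provides mark_step(). These details depend on the SynapseAI release, so check the Habana documentation for the version in use. The helper names are hypothetical.

```python
import importlib.util


def hpu_software_present() -> bool:
    """True if the Habana PyTorch bridge package can be found.

    Assumption: the bridge is installed under the name "habana_frameworks".
    """
    return importlib.util.find_spec("habana_frameworks") is not None


def select_device(hpu_available: bool, cuda_available: bool) -> str:
    """Prefer Gaudi ("hpu"), then GPU ("cuda"), then CPU."""
    if hpu_available:
        return "hpu"
    if cuda_available:
        return "cuda"
    return "cpu"


def train_one_step(device_name: str) -> float:
    """Run a single SGD step on a tiny linear model and return the loss."""
    import torch  # imported lazily so the helpers above work without PyTorch

    htcore = None
    if device_name == "hpu":
        # Assumption: this module exists in the installed SynapseAI release.
        import habana_frameworks.torch.core as htcore

    device = torch.device(device_name)
    model = torch.nn.Linear(4, 1).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    inputs = torch.randn(8, 4, device=device)
    targets = torch.randn(8, 1, device=device)

    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(inputs), targets)
    loss.backward()
    if htcore is not None:
        htcore.mark_step()  # Gaudi-specific: flush the accumulated graph
    optimizer.step()
    if htcore is not None:
        htcore.mark_step()
    return loss.item()
```

The model, optimizer, and loss code are the same on every device. Only the device string and the two mark_step() calls are specific to Gaudi in this sketch.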
Developers and data scientists can also start with reference models optimized for Gaudi accelerators, which are available in Habana’s GitHub repository and include popular models for image classification, object detection, natural language processing, and recommendation systems, among other applications.
“The use of machine learning has skyrocketed. One of the challenges with training machine learning models, however, is that it is computationally intensive and can get expensive as customers refine and retrain their models,” said David Brown, Vice President of Amazon EC2 at AWS. “AWS already has the broadest choice of powerful compute for any machine learning project or application. The addition of DL1 instances featuring Gaudi accelerators provides the most cost-effective alternative to GPU-based instances in the cloud to date. Their optimal combination of price and performance makes it possible for customers to reduce the cost to train, train more models, and innovate faster.”
AWS Nitro System
Customers can launch DL1 instances using AWS Deep Learning AMIs, or using Amazon Elastic Kubernetes Service (Amazon EKS) or Amazon Elastic Container Service (Amazon ECS) for containerized applications. For a more managed experience, customers can access DL1 instances through Amazon SageMaker, which makes it easier and faster for developers and data scientists to build, train, and deploy machine learning models in the cloud and at the edge.
DL1 instances are built on the AWS Nitro System, a collection of building blocks that offloads many traditional virtualization functions to dedicated hardware and software, delivering high performance, high availability, and high security while reducing virtualization overhead. DL1 instances can be purchased as On-Demand Instances, with Savings Plans, as Reserved Instances, or as Spot Instances. DL1 instances are now available in the US East (N. Virginia) and US West (Oregon) AWS Regions.
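For readers who want to try a launch programmatically, the following sketch builds an EC2 request for a DL1 instance with boto3. It is an illustration, not an official AWS example. It assumes the dl1.24xlarge instance size, the region codes us-east-1 and us-west-2 for the two launch Regions, and valid AWS credentials. The AMI ID is a placeholder that must be replaced with a real Deep Learning AMI ID, and the helper names are hypothetical.

```python
from typing import Optional

DL1_INSTANCE_TYPE = "dl1.24xlarge"                # assumed DL1 instance size
DL1_LAUNCH_REGIONS = ("us-east-1", "us-west-2")   # assumed codes for the launch Regions


def build_dl1_request(ami_id: str, key_name: Optional[str] = None,
                      use_spot: bool = False) -> dict:
    """Build keyword arguments for the EC2 run_instances call."""
    params = {
        "ImageId": ami_id,
        "InstanceType": DL1_INSTANCE_TYPE,
        "MinCount": 1,
        "MaxCount": 1,
    }
    if key_name:
        params["KeyName"] = key_name
    if use_spot:
        # Request Spot capacity instead of the default On-Demand purchase option.
        params["InstanceMarketOptions"] = {"MarketType": "spot"}
    return params


def launch_dl1(params: dict, region: str = "us-west-2") -> str:
    """Launch the instance and return its ID. Requires boto3 and AWS credentials."""
    if region not in DL1_LAUNCH_REGIONS:
        raise ValueError(f"DL1 is assumed to be available only in {DL1_LAUNCH_REGIONS}")
    import boto3  # imported lazily so the request can be built without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.run_instances(**params)
    return response["Instances"][0]["InstanceId"]


if __name__ == "__main__":
    # "ami-PLACEHOLDER" is not a real AMI ID; substitute a Deep Learning AMI ID.
    request = build_dl1_request("ami-PLACEHOLDER", use_spot=True)
    print(request)
```

Building the request separately from the launch call makes it easy to inspect or log the parameters before any billable resource is created.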