At AWS re:Invent 2023 in Las Vegas, AWS announced the next generation of two of its in-house chip families: AWS Graviton4 and AWS Trainium2. The chips are designed to improve price performance and energy efficiency for a broad range of customer workloads, including generative artificial intelligence (AI) and machine learning (ML) training.
With each successive chip generation, AWS aims to deliver better price performance and energy efficiency. The new chips give customers even more options – alongside chip/instance combinations featuring the latest chips from third parties such as AMD, Intel, and NVIDIA – to run almost any application or workload on Amazon Elastic Compute Cloud (Amazon EC2).
Compared with current-generation Graviton3 processors, Graviton4 offers “up to 30% greater compute performance, 50% more cores, and 75% more memory bandwidth.” It is also intended to deliver strong price performance and energy efficiency for a wide variety of applications running on Amazon EC2.
Trainium2 is designed to deliver training up to four times faster than first-generation Trainium chips and can be deployed in EC2 UltraClusters of up to 100,000 chips. This lets customers train large language models (LLMs) and foundation models (FMs) in a fraction of the time while improving energy efficiency by up to two times.
Because every customer workload ultimately runs on silicon, David Brown, Vice President of Compute and Networking at AWS, described silicon as a crucial area of innovation for the company. “By focusing our chip designs on real workloads that matter to customers, we are able to provide them with the most advanced cloud infrastructure. In just five years, we have delivered four generations of chips, and Graviton4 is the most powerful and energy-efficient chip we have ever built for a broad range of workloads. Additionally, with the growing interest in generative AI, Trainium2 will help customers train their machine learning models faster, at lower cost, and more energy-efficiently.”
More than 50,000 customers – including the top 100 EC2 customers – use Graviton-based instances for their applications. AWS has built more than 2 million Graviton processors and offers more than 150 different Graviton-powered Amazon EC2 instance types globally at scale.
Graviton-based instances are used by customers such as Datadog, DirecTV, Discovery, Formula 1 (F1), NextRoll, Nielsen, Pinterest, SAP, Snowflake, Sprinklr, Stripe, and Zendesk to run a variety of workloads, including web servers, databases, analytics, batch processing, ad serving, application servers, and microservices. The advantages of Graviton are also available to customers of several AWS managed services, such as Amazon Aurora, Amazon ElastiCache, Amazon EMR, Amazon MemoryDB, Amazon OpenSearch, Amazon Relational Database Service (Amazon RDS), AWS Fargate, and AWS Lambda.
SAP HANA Cloud
SAP HANA Cloud, SAP’s cloud-native in-memory database, is the data management foundation of SAP Business Technology Platform (SAP BTP). “Customers rely on SAP HANA Cloud to run their mission-critical business processes and next-generation intelligent data applications in the cloud,” said Juergen Mueller, CTO and member of the Executive Board of SAP SE. “As part of the migration process of SAP HANA Cloud to AWS Graviton-based Amazon EC2 instances, we have already seen up to 35% better price performance for analytical workloads. In the coming months, we look forward to validating Graviton4, and the benefits it can bring to our joint customers.”
AWS Trainium2 Chips
Trainium2-based EC2 UltraClusters are designed to provide the highest-performance, most energy-efficient AI model training infrastructure available in the cloud.
The FMs and LLMs that power today’s generative AI applications are trained on large datasets. Customers use these models to create a wide range of new content, including text, music, images, video, and even software code. With parameters ranging from hundreds of billions to trillions, today’s most sophisticated FMs and LLMs require reliable, high-performance compute that can scale across tens of thousands of ML chips. AWS already offers the widest selection of Amazon EC2 instances with ML chips, including the newest NVIDIA GPUs, Trainium, and Inferentia2. Trainium’s characteristics make it well suited for customers training large-scale deep learning models; current Trainium customers include Databricks, Helixon, Money Forward, and the Amazon Search team.
More than 10,000 organizations worldwide – including Comcast, Condé Nast, and over 50% of the Fortune 500 – rely on Databricks to unify their data, analytics, and AI. “Thousands of customers have implemented Databricks on AWS, giving them the ability to use MosaicML to pre-train, fine-tune, and serve FMs for a variety of use cases,” said Naveen Rao, Vice President of Generative AI at Databricks. “AWS Trainium gives us the scale and high performance needed to train our Mosaic MPT models, and at a low cost. As we train our next generation Mosaic MPT models, Trainium2 will make it possible to build models even faster, allowing us to provide our customers unprecedented scale and performance so they can bring their own generative AI applications to market more rapidly.”