More news from NVIDIA today. Alongside its new DGX GH200 offering, it has introduced its latest networking innovation, NVIDIA Spectrum-X. This accelerated networking platform aims to improve the performance and efficiency of Ethernet-based AI clouds to meet the requirements of hyperscale generative AI services.
The integration of the NVIDIA Spectrum-4 Ethernet switch and the NVIDIA BlueField-3 DPU (Data Processing Unit) forms the basis of NVIDIA Spectrum-X. This “powerful” combination would enable the platform to deliver “1.7 times” greater overall AI performance and energy efficiency, while maintaining consistent and predictable performance in multi-tenant environments. Spectrum-X’s capabilities are further enhanced by NVIDIA acceleration software and software development kits (SDKs), which let developers build software-defined, cloud-native AI applications.
NVIDIA Spectrum-X enables network engineers, AI data scientists, and cloud service providers (CSPs) to achieve better outcomes and reach informed decisions faster by substantially reducing the run times of large transformer-based generative AI models. Some of the world’s most prominent hyperscalers, as well as notable cloud innovators, have already adopted the platform.
Ethernet that Adheres to All Standards
NVIDIA is constructing Israel-1, a hyperscale generative AI supercomputer that will be deployed in its Israeli data center, to demonstrate the capabilities of the Spectrum-X reference designs. This supercomputer will utilize Dell PowerEdge XE9680 servers equipped with an NVIDIA HGX H100 eight-GPU platform, BlueField-3 DPUs, and Spectrum-4 switches.
The Senior Vice President of Networking at NVIDIA, Gilad Shainer, emphasized the transformative nature of generative AI technologies and their influence on data center performance. He stated, “NVIDIA Spectrum-X is a new class of Ethernet networking that removes barriers for next-generation AI workloads that have the potential to transform entire industries.”
The NVIDIA Spectrum-X networking platform would offer remarkable versatility and can be leveraged across a wide range of AI applications. It uses fully standards-based Ethernet and interoperates with existing Ethernet-based systems, offering flexibility and compatibility.
At the core of the platform is Spectrum-4, which would be the first 51Tb/s Ethernet switch designed specifically for AI networks. Advanced RDMA over Converged Ethernet (RoCE) extensions, working across Spectrum-4 switches, BlueField-3 DPUs, and NVIDIA LinkX optics, create an AI-optimized 400GbE end-to-end network.
One of NVIDIA Spectrum-X’s primary strengths is its capacity to enhance multi-tenancy. The platform would guarantee performance isolation, enabling AI workloads from different tenants to run optimally and consistently. It would also provide enhanced visibility into AI performance, allowing the identification of performance bottlenecks. Finally, the platform includes a fully automated fabric validation procedure.
Spectrum-X’s acceleration capabilities are powered by a range of potent NVIDIA SDKs and tools, such as Cumulus Linux, pure SONiC, and NetQ, which contribute to the platform’s efficiency. Furthermore, the NVIDIA DOCA (Datacenter-on-a-Chip Architecture) software framework, which sits at the core of BlueField DPUs, adds to the platform’s functionality and efficiency.
NVIDIA Spectrum-X would offer unmatched scalability, supporting 256 ports of 200Gb/s on a single switch and up to 16,000 ports in a two-tier leaf-spine topology. This scalability is essential for accommodating the growth of AI clouds while maintaining high performance and minimizing network latency.
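A quick back-of-the-envelope check shows how these figures hang together. The sketch below is illustrative only (not NVIDIA tooling), and it assumes a common leaf-spine convention in which each leaf splits its radix evenly between host downlinks and spine uplinks; under that assumption, 128 leaves of 256 ports each yield 16,384 host-facing ports, in line with the quoted “up to 16,000.”

```python
# Illustrative arithmetic only -- checks the port and bandwidth figures
# quoted in the article under stated assumptions.

RADIX = 256        # ports per switch (at 200 Gb/s, per the article)
PORT_GBPS = 200    # per-port speed in Gb/s

# Aggregate switch bandwidth: 256 ports x 200 Gb/s = 51.2 Tb/s,
# consistent with the ~51 Tb/s figure quoted for Spectrum-4.
aggregate_tbps = RADIX * PORT_GBPS / 1000

def two_tier_host_ports(radix: int, leaves: int) -> int:
    """Host-facing ports in a two-tier leaf-spine fabric, assuming each
    leaf dedicates half its radix to host downlinks and half to uplinks."""
    downlinks_per_leaf = radix // 2
    return leaves * downlinks_per_leaf

print(aggregate_tbps)                            # 51.2
print(two_tier_host_ports(RADIX, leaves=128))    # 16384
```

With a larger leaf count the same formula scales further, so the “up to 16,000” figure reads as a conservative round number for this class of two-tier fabric.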
Companies offering NVIDIA Spectrum-X include Dell Technologies, Supermicro, and Lenovo.
NVIDIA Spectrum-X, Spectrum-4 switches, BlueField-3 DPUs, and 400G LinkX optics are available now.