Supermicro, a global vendor of computing, storage, networking, and server systems, has released new servers based on NVIDIA Ampere-architecture GPUs and 3rd Gen Intel Xeon Scalable processors with built-in AI accelerators (the Supermicro X12 series). These servers are built for high-performance AI applications that demand minimal latency and strong application performance.
With high-speed CPU-to-GPU and GPU-to-GPU interconnects, the 2U NVIDIA HGX A100 4-GPU system is well suited to building contemporary AI training clusters at scale. By pooling power supplies and cooling fans, the Supermicro 2U 2-Node system would save energy and money while also lowering carbon emissions.
It also supports a variety of discrete GPU accelerators that can be matched to the workload. The newest Intel Software Guard Extensions (Intel SGX) provide enhanced hardware security on each of these platforms.
“Supermicro engineers have created another extensive portfolio of high-performance GPU-based server systems that reduce costs, space, and power consumption compared to other designs in the market,” said Charles Liang, President and CEO of Supermicro. “With our innovative design, we can offer customers NVIDIA HGX A100 (code name Redstone) 4-GPU accelerators for AI and HPC workloads in dense 2U form factors. Also, our 2U 2-Node system is uniquely designed to share power and cooling components which reduce OPEX and the impact on the environment.”
Energy-Efficient Server Architecture
The 2U NVIDIA HGX A100 server is geared toward analytics, training, and inference workloads and is built on 3rd Gen Intel Xeon Scalable processors with Intel Deep Learning Boost technology. With four A100 GPUs fully interconnected via NVIDIA NVLink and up to 320GB of GPU memory to expedite breakthroughs in business data science and AI, the system can deliver up to 2.5 petaflops of AI performance. For sophisticated conversational AI models such as BERT-Large, inference on the system would be up to 4x faster than on prior-generation GPUs, while BERT-Large training gets a 3x performance increase, according to Supermicro.
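The 2.5-petaflops figure lines up with NVIDIA's published per-GPU tensor-core peaks. A minimal back-of-the-envelope sketch, assuming the commonly cited A100 figure of 624 TFLOPS (BF16/TF32 tensor operations with structured sparsity), which is not stated in the article itself:

```python
# Back-of-the-envelope check of the "2.5 petaflops of AI performance" claim.
# The per-GPU peak is NVIDIA's published A100 tensor-core figure with
# structured sparsity (an assumption here, not quoted in the article).
A100_TFLOPS_SPARSE = 624  # BF16/TF32 tensor throughput per A100, sparse
NUM_GPUS = 4              # HGX A100 4-GPU baseboard

total_tflops = NUM_GPUS * A100_TFLOPS_SPARSE
total_pflops = total_tflops / 1000

print(f"{total_tflops} TFLOPS ≈ {total_pflops:.1f} petaflops")  # ≈ 2.5 PFLOPS
```

Four GPUs at 624 TFLOPS each gives 2,496 TFLOPS, which rounds to the 2.5 petaflops Supermicro quotes.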
Furthermore, the improved thermal and cooling designs would make these server systems well suited to high-performance clusters that prioritize node density and power efficiency. These servers can also use liquid cooling, which can further reduce OPEX. The platform also supports Intel Optane Persistent Memory (PMem), which allows much larger models to be held in memory close to the CPU before being processed by the GPUs. The system can additionally be fitted with four NVIDIA ConnectX-6 200Gb/s InfiniBand cards, providing GPUDirect RDMA with a 1:1 GPU-to-NIC ratio for applications that require multi-system communication.
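In App Direct mode, applications typically reach PMem by memory-mapping files on a DAX-mounted filesystem, which is how large model data ends up "in memory close to the CPU." A minimal sketch; the mount point `/mnt/pmem` and file name are assumptions for illustration (any path works for the mapping itself, but only an fsdax mount gives true persistent-memory semantics):

```python
import mmap
import os

# Hypothetical fsdax mount point for a PMem namespace; adjust for your system.
PMEM_PATH = "/mnt/pmem/model_weights.bin"
SIZE = 1024 * 1024  # 1 MiB for the sketch; real model files would be far larger


def open_pmem_buffer(path: str, size: int) -> mmap.mmap:
    """Create (or open) a file of `size` bytes and memory-map it read/write.

    On an fsdax (DAX) mount, loads and stores through this mapping go
    directly to persistent memory, bypassing the page cache.
    """
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    os.ftruncate(fd, size)
    buf = mmap.mmap(fd, size)
    os.close(fd)  # the mapping keeps its own reference to the file
    return buf


if __name__ == "__main__":
    buf = open_pmem_buffer(PMEM_PATH, SIZE)
    buf[:5] = b"hello"  # store data directly in (persistent) memory
    print(bytes(buf[:5]))
    buf.close()
```

The same pattern is what libraries such as PMDK wrap with stronger persistence guarantees (explicit flushes and fences), which this sketch omits.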
The new 2U 2-Node server is a resource-saving, energy-efficient architecture that can handle up to three double-width GPUs per node. Each node also includes a single 3rd Gen Intel Xeon Scalable processor with up to 40 cores and built-in AI and HPC acceleration. This balance of CPUs and GPUs suits a wide range of AI, rendering, and VDI applications.
Thanks to Supermicro's Advanced I/O Module (AIOM) expansion slots for fast, flexible networking, the system can process massive data flows for demanding AI/ML applications, deep learning training, and inference while protecting workloads and learning models. It would also be well suited to multi-instance high-end cloud gaming and a variety of other compute-intensive VDI applications. Virtual Content Delivery Networks (vCDNs) will also be able to meet the growing demand for streaming services. Power-supply redundancy is built in: if a node's power supply fails, the node can draw on the power supply of the adjacent node.