In early September 2020, NVIDIA unveiled the GeForce RTX 30 family of graphics cards, its second generation of RTX hardware, built on the Ampere architecture. Traditionally, each new generation of NVIDIA cards has sold at a higher price than its predecessor, which meant that the cost of training models stayed more or less the same.
This time NVIDIA broke with tradition and priced the new, more powerful cards at the level the previous generation had at its launch. For AI developers this is significant: the RTX 30 cards offer performance comparable to the Titan RTX at a much lower price. Data science developers can now train models faster without increasing costs.
The flagship of the new series, the GeForce RTX 3090, has 10,496 NVIDIA CUDA cores with a base clock of 1.40 GHz (boost up to 1.70 GHz), 328 third-generation tensor cores, and 24 GB of GDDR6X graphics memory on a 384-bit bus. The more affordable GeForce RTX 3080 features 8,704 CUDA cores with a base clock of 1.44 GHz (boost up to 1.71 GHz), 272 tensor cores, and 10 GB of GDDR6X memory on a 320-bit bus.
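To put these figures in perspective, the sketch below estimates the theoretical peak FP32 throughput of both cards from the core counts above. It assumes NVIDIA's reference boost clocks (1.70 GHz for the RTX 3090, 1.71 GHz for the RTX 3080) and one fused multiply-add, counted as two operations, per CUDA core per clock. The function and dictionary names are illustrative, and real training throughput will be lower than this upper bound.

```python
# Reference specifications: CUDA core count and boost clock in GHz.
# The boost clocks are NVIDIA's reference values (an assumption here;
# partner cards may be clocked differently).
SPECS = {
    "RTX 3090": {"cuda_cores": 10496, "boost_ghz": 1.70},
    "RTX 3080": {"cuda_cores": 8704, "boost_ghz": 1.71},
}


def peak_fp32_tflops(cuda_cores, boost_ghz):
    """Theoretical peak: 2 ops (one FMA) per core per clock, in TFLOPS."""
    return 2 * cuda_cores * boost_ghz / 1000.0


results = {name: peak_fp32_tflops(**spec) for name, spec in SPECS.items()}

for name, tflops in results.items():
    print(f"{name}: {tflops:.1f} TFLOPS peak FP32")

# Ratio of the two cards' CUDA core counts.
core_ratio = SPECS["RTX 3090"]["cuda_cores"] / SPECS["RTX 3080"]["cuda_cores"]
print(f"RTX 3090 has {core_ratio:.2f}x the CUDA cores of the RTX 3080")
```

Under these assumptions the estimate comes to roughly 35.7 TFLOPS for the RTX 3090 and 29.8 TFLOPS for the RTX 3080, with the flagship carrying about 1.21 times as many CUDA cores.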
Despite the shortage of new video cards (NVIDIA even had to apologize for the shortage at launch), the first GPU servers based on them appeared in hosting providers' product lines in early October. Netherlands-based hosting provider HostKey was one of the first European providers to test and offer GPU servers based on the new GIGABYTE RTX 3090 / 3080 TURBO cards. Since October 26, configurations based on the RTX 3090 / Xeon E-2288G and RTX 3080 / AMD Ryzen 9 3900X have been available to all HostKey customers in its data centers in the Netherlands and Moscow.
NVIDIA RTX 30: the golden mean?
The manufacturer positions the RTX 3090 / 3080 cards as a higher-performance replacement for the RTX 20 series cards based on the previous Turing architecture. The new cards are, of course, also far more capable than the widely used budget GPU servers based on GTX 1080 / GTX 1080 Ti graphics cards, which are suitable for neural networks and other machine learning tasks, albeit with reservations, and are available at very affordable prices.
Above the NVIDIA RTX 30 series sit the more powerful solutions based on A100 / A40 (Ampere) cards with up to 432 third-generation tensor cores, Titan RTX / T4 (Turing) cards with up to 576 second-generation tensor cores, and V100 (Volta) cards with 640 first-generation tensor cores.
These cards, and rented GPU servers built on them, cost significantly more than their RTX 30 counterparts, so it is especially interesting to measure the actual performance gap in AI / ML tasks.
Case Study: Face Reenactment
One of the workloads used for hands-on testing of GPU servers based on the new RTX 3090 and RTX 3080 cards was a Face Reenactment task using a U-Net + ResNet neural network with SPADE (spatially adaptive normalization) and a patch discriminator. The framework was PyTorch 1.6 with its built-in Automatic Mixed Precision (AMP) mode and the torch.backends.cudnn.benchmark = True flag set.
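For readers unfamiliar with these two settings, here is a minimal sketch of how AMP and the cuDNN benchmark flag are typically enabled in a PyTorch 1.6-style training step. The tiny convolutional model, the L1 loss, the optimizer settings, and the random input tensors are placeholders chosen for illustration, not the Face Reenactment network or data from the test. The sketch falls back to full precision on the CPU when no GPU is present.

```python
import torch
import torch.nn as nn

# Let cuDNN time several convolution algorithms and cache the fastest one.
# This tends to help when input shapes stay the same between iterations.
torch.backends.cudnn.benchmark = True

use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

# Hypothetical stand-in model (NOT the U-Net + ResNet generator).
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 3, kernel_size=3, padding=1),
).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.L1Loss()

# GradScaler rescales the loss so small FP16 gradients do not underflow.
# With enabled=False it becomes a pass-through (used here on CPU).
scaler = torch.cuda.amp.GradScaler(enabled=use_cuda)


def train_step(inputs, targets):
    """Run one optimisation step with mixed precision; return the loss."""
    optimizer.zero_grad()
    # Inside autocast, eligible ops such as convolutions run in FP16.
    with torch.cuda.amp.autocast(enabled=use_cuda):
        loss = criterion(model(inputs), targets)
    scaler.scale(loss).backward()
    scaler.step(optimizer)  # unscales gradients; skips the step on inf/NaN
    scaler.update()         # adjusts the scale factor for the next step
    return loss.item()


# Random tensors standing in for a batch of 64x64 RGB images.
x = torch.randn(4, 3, 64, 64, device=device)
y = torch.randn(4, 3, 64, 64, device=device)
loss_value = train_step(x, y)
print(f"loss after one step: {loss_value:.4f}")
```

The torch.cuda.amp interface shown here is the one that shipped with PyTorch 1.6; more recent releases expose the same functionality under torch.amp and may emit a deprecation warning for the older names.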