GPU server simplifies and futureproofs, claims Super Micro Computer

Customisation is brought to AI, machine learning and high performance computing (HPC) applications with the Universal GPU server system, said Super Micro Computer. According to the company, the “revolutionary technology” simplifies large-scale GPU deployments and offers a futureproof design that supports yet-to-be-announced technologies.

The Universal GPU server combines technologies supporting multiple GPU form factors, CPU choices, and storage and networking options, optimised to deliver configured, highly scalable systems for specific AI, machine learning and HPC workloads, with the thermal headroom for the next generation of CPUs and GPUs, said the company.

Initially, the Universal GPU platform will support systems that combine third generation AMD EPYC 7003 processors with either AMD Instinct MI250 GPUs or the Nvidia A100 Tensor Core 4-GPU, and third generation Intel Xeon Scalable processors with built-in AI accelerators and the Nvidia A100 Tensor Core 4-GPU. These systems are designed with improved thermal capacity to accommodate GPUs of up to 700W.

The Supermicro Universal GPU platform is designed to work with a wide range of GPUs based on an open standards design. By adhering to an agreed-upon set of hardware design standards, such as the Universal Baseboard (UBB) and OCP Accelerator Module (OAM), as well as PCI-E and platform-specific interfaces, IT administrators can choose the GPU architecture best suited to their HPC or AI workloads. In addition to meeting those requirements, this will simplify the installation, testing, production and upgrading of GPUs, said Super Micro Computer. IT administrators will also be able to choose the right combination of CPUs and GPUs to create an optimal system based on the needs of their users.

The 4U or 5U Universal GPU server will be available for accelerators that use the UBB standard, as well as PCI-E 4.0, and soon PCI-E 5.0. In addition, 32 DIMM slots and a range of storage and networking options are available, which can also be connected using the PCI-E standard. 

The Supermicro Universal GPU server can accommodate GPUs using baseboards in the SXM or OAM form factors that use high speed GPU-to-GPU interconnects such as Nvidia NVLink or AMD Infinity Fabric (xGMI), or GPUs that connect directly via a PCI-E slot. All major current CPU and GPU platforms will be supported, confirmed the company.

The server is designed for maximum airflow and accommodates current and future CPUs and GPUs where the highest TDP (thermal design power) CPUs and GPUs are required for maximum application performance. Direct-to-chip liquid cooling options are available for the Supermicro Universal GPU server as CPU and GPU cooling requirements increase.

The modular design means specific subsystems of the server can be replaced or upgraded, extending the service life of the overall system and reducing the e-waste generated by complete replacement with every new CPU or GPU technology generation.

http://www.supermicro.com


Nvidia prepares for AI infrastructure with Hopper architecture

At GTC 2022, Nvidia announced its next generation accelerated computing platform “to power the next wave of AI data centres”. The Hopper architecture succeeds the Ampere architecture, launched in 2020.

The company also announced its first Hopper-based GPU, the NVIDIA H100, equipped with 80 billion transistors. Described as the world’s largest and most powerful accelerator, the H100 has a Transformer Engine and a scalable NVLink interconnect for advancing gigantic AI language models, deep recommender systems, genomics and complex digital twins. 

The Nvidia H100 GPU features major advances to accelerate AI, HPC, memory bandwidth, interconnect and communication, including nearly 5Tbytes per second of external connectivity. H100 is the first GPU to support PCIe Gen5 and the first to utilise HBM3, enabling 3Tbytes per second of memory bandwidth, claimed Nvidia. 

According to Jensen Huang, CEO, 20 H100 GPUs can sustain the equivalent of the entire world’s internet traffic, making it possible for customers to deliver advanced recommender systems and large language models running inference on data in real time. 

The Transformer Engine is built to speed up transformer networks by as much as six times compared with the previous generation, without losing accuracy.

Multi-instance GPU (MIG) technology allows a single GPU to be partitioned into up to seven smaller, fully isolated instances to handle different types of job. The Hopper architecture extends MIG capabilities by up to a factor of seven over the previous generation, offering secure multi-tenant configurations in cloud environments across each GPU instance.

According to the company, H100 is the first accelerator with confidential computing capabilities to protect AI models and customer data while they are being processed. Customers can also apply confidential computing to federated learning for privacy-sensitive industries like healthcare and financial services, as well as on shared cloud infrastructures. 

To accelerate the largest AI models, NVLink combines with an external NVLink Switch to extend NVLink as a scale-up network beyond the server, connecting up to 256 H100 GPUs at nine times higher bandwidth versus the previous generation using NVIDIA HDR Quantum InfiniBand. 

New DPX instructions accelerate dynamic programming by up to 40 times compared with CPUs and by up to a factor of seven compared with previous generation GPUs, said Nvidia. Examples include the Floyd-Warshall algorithm, used to find optimal routes for autonomous robot fleets, and the Smith-Waterman algorithm, used in sequence alignment for DNA and protein classification and folding. The Nvidia H100 will be available starting in Q3.
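As an illustration of the dynamic-programming pattern the DPX instructions target, the following is a minimal plain-Python sketch of the Floyd-Warshall all-pairs shortest-path algorithm; the graph and function names are hypothetical examples for clarity, not Nvidia code, and a GPU implementation would tile and parallelise these loops.

```python
# Floyd-Warshall: the classic dynamic-programming recurrence
# d[i][j] = min(d[i][j], d[i][k] + d[k][j]) over intermediate nodes k.
INF = float("inf")

def floyd_warshall(dist):
    """dist: n x n matrix of edge weights (INF where no edge exists).
    Returns the matrix of shortest-path distances between all pairs."""
    n = len(dist)
    d = [row[:] for row in dist]          # work on a copy
    for k in range(n):                    # allow paths routed through node k
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

# Small hypothetical example: 4 nodes, directed weighted edges
graph = [
    [0,   3,   INF, 7],
    [8,   0,   2,   INF],
    [5,   INF, 0,   1],
    [2,   INF, INF, 0],
]
print(floyd_warshall(graph))
```

The triple loop is exactly the kind of min-plus inner kernel that dynamic-programming accelerators speed up, since every cell update is a compare-and-select over two candidate path lengths.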

https://www.nvidia.com


AMD implements 3D die stacking for data centre CPU 

Claimed to be the world’s first data centre CPUs using 3D die stacking, the third generation AMD EPYC processors with AMD 3D V-Cache technology (codenamed Milan-X) have been released. The CPU family is built on the Zen 3 core architecture and can deliver up to 66 per cent performance uplift across a variety of targeted technical computing workloads versus comparable, non-stacked third gen AMD EPYC processors, reported the company.

The new processors are also claimed to feature the industry’s largest L3 cache, delivering the same socket, software compatibility and modern security features as third gen EPYC CPUs but with features for technical computing workloads such as computational fluid dynamics (CFD), finite element analysis (FEA), electronic design automation (EDA) and structural analysis, used in testing and validating designs.

“Building upon our momentum in the data centre . . .  [third generation] AMD EPYC processors with AMD 3D V-Cache technology . . . offer the industry’s first workload-tailored server processor with 3D die stacking technology,” said Dan McNamara, senior vice president and general manager, Server Business Unit, AMD. “Our latest processors with AMD 3D V-Cache technology provide breakthrough performance for mission-critical technical computing workloads leading to better designed products and faster time to market,” he added.

Technical computing workloads rely heavily on large data sets. These workloads benefit from increased cache size; however, 2D chip designs place physical limits on the amount of cache that can effectively be built onto the CPU. AMD 3D V-Cache technology addresses these physical challenges by bonding the cache module directly onto the AMD Zen 3 core complex, increasing the amount of L3 cache while minimising latency and increasing throughput, according to AMD. This represents a step forward in CPU design and packaging and enables breakthrough performance in targeted technical computing workloads, commented the company.
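The benefit of a larger last-level cache can be illustrated with a hypothetical micro-benchmark: random gathers over a working set that fits in cache are far cheaper per element than ones that spill to DRAM. This is a sketch only; the function name is invented and the absolute numbers depend entirely on the machine it runs on.

```python
# Illustrative sketch: time random gathers over small vs large working
# sets. Once the working set exceeds the last-level cache, each gather
# tends to pay a DRAM access rather than a cache hit.
import time

import numpy as np

def gather_time_per_element(n_elements, iters=3, seed=0):
    """Average seconds per gathered element for a random-access
    pattern over an array of n_elements int64 values."""
    data = np.arange(n_elements, dtype=np.int64)
    idx = np.random.default_rng(seed).integers(0, n_elements, size=n_elements)
    t0 = time.perf_counter()
    for _ in range(iters):
        data[idx].sum()                  # random gather + reduce
    return (time.perf_counter() - t0) / (iters * n_elements)

small = gather_time_per_element(1 << 16)  # ~1MB working set: cache-resident
large = gather_time_per_element(1 << 24)  # ~256MB working set: spills to DRAM
print(f"cache-resident: {small:.2e} s/elem, DRAM-bound: {large:.2e} s/elem")
```

On typical hardware the large working set shows a noticeably higher per-element cost, which is the gap a bigger stacked L3 cache narrows for cache-sensitive workloads such as EDA and CFD.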

The third generation AMD EPYC processors with AMD 3D V-Cache technology deliver faster time-to-results on targeted workloads. For example, the 16-core AMD EPYC 7373X CPU can deliver up to 66 per cent faster simulations on Synopsys VCS when compared to the EPYC 73F3 CPU.

The 64-core AMD EPYC 7773X processor can deliver, on average, 44 per cent more performance in Altair Radioss simulation applications compared to the competition’s top-of-stack processor, claimed AMD. In a third comparison, AMD reported that the 32-core AMD EPYC 7573X processor can solve an average of 88 per cent more CFD problems per day than a comparable competing 32-core processor, while running Ansys CFX.

These performance capabilities enable customers to deploy fewer servers and reduce power consumption in the data centre, helping to lower total cost of ownership (TCO), reduce carbon footprint and address environmental sustainability goals, said AMD. 

The third generation AMD EPYC processors with AMD 3D V-Cache technology are available today from OEM partners, including Atos, Cisco, Dell Technologies, Gigabyte, HPE, Lenovo, QCT and Supermicro.

Third generation AMD EPYC processors with AMD 3D V-Cache technology are also broadly supported by AMD software ecosystem partners, including Altair, Ansys, Cadence, Dassault Systèmes, Siemens and Synopsys.

Microsoft Azure HBv3 virtual machines (VMs) have now been fully upgraded to third generation AMD EPYC with AMD 3D V-Cache technology. 

http://www.amd.com


RF amplifier spans 380MHz to 6GHz for single-amplifier testing

Broadband RF amplifiers in the R&S BBA300 family span a wide frequency range, from 380MHz to 6GHz, continuously within a single amplifier. Rohde & Schwarz claimed they are the widest-band, most flexible broadband RF amplifiers available.

The BBA300 family provides a single amplifier that eliminates the need to switch instruments or bands during a test, making the testing of automotive and wideband wireless communication radios easier, said Rohde & Schwarz.

The continuous wide frequency range is believed to be an industry first. It addresses the requirements of EMC test centres, RF component designers and product design and validation. 

Two models are initially available: the BBA300-DE operates from 1.0GHz to 6.0GHz and the BBA300-CDE covers 380MHz to 6.0GHz.

The BBA300 family’s design and manufacture provide high linearity and a noise power density as low as -110dBm/Hz. It is also claimed to ensure excellent harmonic performance (down to -25dBc). Intelligent protection features ensure high availability, even in the event of component failure. Self-protection is built in to ensure the instruments are robust against RF mismatch up to a VSWR of 6:1.
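The quoted 6:1 VSWR tolerance can be put in perspective with the standard transmission-line relations between VSWR, reflection coefficient and reflected power. The sketch below applies those textbook formulas to the published figure; the function name is illustrative, not part of any Rohde & Schwarz tooling.

```python
# Standard mismatch relations:
#   |Γ| = (VSWR - 1) / (VSWR + 1)     reflection coefficient magnitude
#   reflected power fraction = |Γ|²
#   return loss (dB) = -20·log10(|Γ|)
import math

def mismatch_figures(vswr):
    gamma = (vswr - 1) / (vswr + 1)    # reflection coefficient |Γ|
    reflected = gamma ** 2             # fraction of forward power reflected
    return_loss_db = -20 * math.log10(gamma)
    return gamma, reflected, return_loss_db

gamma, reflected, rl = mismatch_figures(6.0)
print(f"|Γ| = {gamma:.3f}, reflected power = {reflected:.0%}, "
      f"return loss = {rl:.2f} dB")
```

A 6:1 VSWR load reflects roughly half the forward power back at the amplifier (|Γ| ≈ 0.71, return loss ≈ 2.9dB), which is why built-in self-protection matters at full output.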

The BBA300 models have smart settings and activation functions that adapt the instruments’ settings and behaviour to the application. The amplifier design makes it possible to realise a broad array of customer system set-ups for flexible, scalable working, with upgradable frequency and power.

The R&S BBA300-DE and R&S BBA300-CDE models are available now from Rohde & Schwarz and selected distribution partners.

The Rohde & Schwarz technology group provides products for test and measurement, technology systems, networks and cybersecurity. The group was founded more than 85 years ago and is headquartered in Munich, Germany.

http://www.rohde-schwarz.com 


About Smart Cities

This news story is brought to you by smartcitieselectronics.com, the specialist site dedicated to delivering information about what’s new in the Smart City Electronics industry, with daily news updates, new products and industry news. To stay up-to-date, register to receive the weekly newsletters and keep yourself informed on the latest technology news and new products from around the globe.