Intel engineers Flex series GPUs for the intelligent visual cloud

The data centre graphics processing unit (GPU) codenamed Arctic Sound-M has been unveiled as the Flex series. The Flex series GPUs are designed to meet the requirements of intelligent visual cloud workloads, said Intel. The Flex 170 is designed for maximum peak performance, while the Flex 140 targets maximum density.

The Flex series GPUs can process up to 68 simultaneous cloud gaming streams and handle workloads without requiring separate, discrete solutions or relying on silos or proprietary environments, said the company. This helps lower and optimise the total cost of ownership for diverse cloud workloads such as media delivery, cloud gaming, AI, metaverse and other emerging visual cloud use cases.

“We are in the midst of a pixel explosion driven by more consumers, more applications and higher resolutions,” explained Jeff McVeigh, Intel vice president and general manager of the Super Compute Group. “Today’s data centre infrastructure is under intense pressure to compute, encode, decode, move, store and display visual information.”

The Flex series GPUs feature what is claimed to be the first hardware-based AV1 encoder in a data centre GPU. The Flex series 140 GPU, for example, provides five times the media transcode throughput and twice the decode throughput of the Nvidia A10 at half the power, said Intel. According to the company, the series also delivers more than 30 per cent bandwidth improvement to reduce total cost of ownership, with broad support for popular media tools, APIs, frameworks and the latest codecs, including HEVC, AVC and VP9.

The GPUs are powered by Intel’s Xe-HPG architecture and allow AI inference workloads, from media analytics and smart cities to medical imaging, to scale between CPUs and GPUs without “locking developers into proprietary software”.

The video processing demands of video conferencing, streaming, and social media have transformed the compute resource requirements of the data centre. Increased media processing, media delivery, AI visual inference, cloud gaming and desktop virtualisation have presented a challenge for an industry largely dependent on proprietary, licensed coding models, such as CUDA for GPU programming, said Intel.

The Flex series GPU software stack includes support for oneAPI and OpenVINO. Developers can use Intel’s oneAPI tools, including the Intel oneAPI Video Processing Library (oneVPL) and Intel VTune Profiler, to deliver accelerated applications and services. This open alternative to proprietary language lock-in exposes the performance of the hardware through a set of tools that complement existing languages and parallel models. It allows users to develop open, portable code that takes maximum advantage of various combinations of Intel CPUs and GPUs, and means developers are not tied to proprietary programming models, which can be financially or technically restrictive, said Intel.

The Flex series GPU media architecture is powered by up to four Xe media engines for streaming density, delivering up to 36 streams of 1080p60 transcode throughput per card. It is also capable of delivering eight streams of 4K60 transcode throughput per card.

When scaled to 10 cards in a 4U server configuration, it can support up to 360 streams of HEVC-HEVC 1080p60 transcode throughput.
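As a rough illustration, the quoted densities translate into a simple capacity estimate. The figures below are taken from Intel's announcement; linear scaling across cards is an assumption for the sketch, not a vendor claim:

```python
# Capacity sketch using Intel's quoted Flex series transcode figures.
# Linear scaling across cards is assumed here for illustration only.
STREAMS_1080P60_PER_CARD = 36  # HEVC 1080p60 transcode streams per card
STREAMS_4K60_PER_CARD = 8      # 4K60 transcode streams per card
CARDS_PER_4U_SERVER = 10

def server_streams(per_card_streams: int, cards: int = CARDS_PER_4U_SERVER) -> int:
    """Aggregate transcode streams for a multi-card server, assuming
    each card sustains its single-card throughput."""
    return per_card_streams * cards

print(server_streams(STREAMS_1080P60_PER_CARD))  # 360, matching Intel's 4U figure
print(server_streams(STREAMS_4K60_PER_CARD))     # 80
```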

Leveraging the Intel Deep Link Hyper Encode feature, the Flex series 140 GPU, with two devices on a single card, can meet the industry’s one-second delay requirement while providing 8K60 real-time transcode, reported Intel. The capability is available for AV1 and HEVC HDR formats.

To meet the growth in Android cloud gaming, the GPUs are validated on nearly 90 of the most popular Google Play Android game titles. A single Flex series 170 GPU can achieve up to 68 streams of 720p30 while a single Flex series 140 GPU can achieve up to 46 streams of 720p30 (measured on select game titles).

When scaled with six Flex series 140 GPU cards, it can achieve up to 216 streams of 720p30.
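It is worth noting how the multi-card figure relates to the single-card figure: on Intel's own quoted numbers, six Flex series 140 cards sustain fewer streams per card than one card alone, so simple linear extrapolation would overestimate capacity. A quick check:

```python
# Intel's quoted Android cloud gaming densities (720p30, select titles).
FLEX_170_SINGLE_CARD = 68   # streams, one Flex series 170 GPU
FLEX_140_SINGLE_CARD = 46   # streams, one Flex series 140 GPU
FLEX_140_SIX_CARDS = 216    # streams, six Flex series 140 cards

per_card_at_scale = FLEX_140_SIX_CARDS / 6
print(per_card_at_scale)                         # 36.0 streams per card
print(per_card_at_scale < FLEX_140_SINGLE_CARD)  # True: scaling is sublinear
```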

Systems featuring Flex series GPUs will be available from providers including Dell Technologies, HPE, H3C, Inspur, Lenovo and Supermicro. Solutions with the Flex series GPU will ramp over the coming months, starting with media delivery and Android cloud gaming workloads, followed by Windows cloud gaming, AI and VDI workloads.

http://www.intel.com


Rutronik helps designers reduce time to market for IoT with LTE data card

Telit’s LN920 LTE data card, with an integrated Qualcomm Snapdragon X12+ LTE modem, is now available from Rutronik. The compact M.2 form factor measures just 30 x 42 x 2.4mm, making it suitable for laptops and tablets as well as enterprise routers and IIoT gateways.

The LN920 data card offers download speeds of 300Mbit per second in the LTE Cat. 6 variant (LN920A6-WW) and 600Mbit per second in the LTE Cat. 12 variant (LN920A12-WW); the distributor offers both variants. The card also features GNSS positioning, and broad frequency band support ensures it works worldwide, said Rutronik.

The data card supports all cellular bands between 600MHz and 3.7GHz, including Band 14 (First Responder) and Band 48 (CBRS). This makes it suitable for private and public LTE applications worldwide, confirmed Rutronik. Combined with pre-certifications by regulators and Tier 1 operators in Europe, Australia, the USA, Japan and Canada, the LN920 supports rapid development of global SKU (stock keeping unit) devices. 

Maintenance of Telit’s standard 3GPP and custom AT commands is included in both Cat. 6 and Cat. 12 variants. The data cards support security features qualified for enterprise applications, such as SELinux secure environment and secure boot.

Operating temperature range is -40 to +85 degrees C.

Rutronik Elektronische Bauelemente was founded in 1973 and is an independent, family-owned company which claims to be one of the world’s leading broadline distributors. Rutronik is represented worldwide with more than 80 offices and provides comprehensive customer support in Europe, Asia and the USA.

The broad product portfolio includes semiconductors, passive and electromechanical components as well as embedded boards, storage and displays, and wireless products. The distributor has automotive, embedded, power and smart divisions offering both specific products and services. The company also provides technical support for product development and design-in, right through to research, as well as individual logistics and supply chain management solutions.

http://www.rutronik.com


WE-BMS signal transformers manage battery systems

Signal transformers with galvanic isolation of 4,300V DC for one minute and operating voltages of up to 1,000V DC form part of Würth Elektronik’s WE-BMS components for battery management systems.

The WE-BMS series components also include at least one common-mode choke to filter common-mode interference. Würth Elektronik claimed its design offers longer physical creepage than comparable products on the market, despite the components’ compact size. The low-profile models, for example, are available with heights of 3.45mm.

The transformers are optimised for battery management systems, primarily to ensure reliable operation and provide information about the charging status. The individual cells of a battery pack are connected in series, as are the downstream battery management system controllers. Voltage differences and electromagnetic interference (EMI) may occur between series-connected components or boards. To counter this, the WE-BMS transformers isolate components and suppress EMI. Applications include storage systems for solar installations and uninterruptible power supplies (UPS). The WE-BMS series is AEC-Q200-qualified and is therefore suitable for energy storage in e-mobility applications, such as e-bikes and e-scooters. WE-BMS supports a variety of interfaces, including serial daisy chain, isoSPI and SPI.

The transformers are designed for the extended operating temperature range of -40 to +125 degrees C. The components are available from stock in various sizes, ranging from 7.6 x 9.5 x 5.5mm to 15.1 x 15.4 x 3.45mm. There is no minimum order quantity. In addition, Würth Elektronik provides free samples for engineers.

Würth Elektronik eiSos Group is a manufacturer of electronic and electromechanical components for the electronics industry and a technology company. Würth Elektronik eiSos says it is one of the largest European manufacturers of passive components and is active in 50 countries. Production sites are in Europe, Asia and North America. 

The company’s product range includes EMC components, inductors, transformers, RF components, varistors, capacitors, resistors, quartz crystals, oscillators, power modules, wireless power transfer, LEDs, sensors, connectors, power supply elements, switches, push-buttons, connection technology, fuse holders and solutions for wireless data transmission.

Würth Elektronik is part of the Würth Group, specialising in assembly and fastening technology. 

http://www.we-online.com


Untether AI introduces speedAI architecture

At this week’s Hot Chips 2022, Untether AI announced its second generation at-memory computation architecture for AI workloads. The speedAI architecture delivers 2 PetaFlops of performance at 30 TFlops per watt.

It is designed to meet neural network demands across a variety of markets, from financial technology, smart cities and retail to natural language processing, autonomous vehicles and scientific applications. These demanding applications require increasing levels of accuracy to ensure safety and quality of results, said the company.

Untether AI’s second generation speedAI architecture delivers energy efficiency, throughput, accuracy and scalability which are claimed to be unmatched by any other inference offering available today.

At-memory compute is significantly more energy efficient than traditional von Neumann architectures, said the company, with more TFlops performed for a given power envelope.

The speedAI architecture dramatically improves on the first generation (runAI) by delivering 30 TFlops per watt. This energy efficiency is a product of the second generation at-memory compute architecture, over 1,400 optimised RISC-V processors with custom instructions, energy-efficient dataflow, and the adoption of a new FP8 datatype, which quadruples efficiency compared with runAI.

The first member of the family, the speedAI240 device, provides 2 PetaFlops of FP8 performance and 1 PetaFlop of BF16 performance. This translates into industry-leading performance and efficiency on neural networks such as BERT-base, which the speedAI240 can run at over 750 queries per second per watt, 15 times greater than the current state of the art from leading GPUs, said Untether AI.

Each memory bank of the speedAI architecture has 512 processing elements with direct attachment to dedicated SRAM. These processing elements support INT4, FP8, INT8, and BF16 datatypes, along with zero-detect circuitry for energy conservation and support for 2:1 structured sparsity. Arranged in eight rows of 64 processing elements, each row has its own dedicated row controller and hardwired reduce functionality to allow flexibility in programming and efficient computation of transformer network functions such as Softmax and LayerNorm. The rows are managed by two RISC-V processors with over 20 custom instructions designed for inference acceleration. The flexibility of the memory bank allows it to adapt to a variety of neural network architectures, including convolutional, transformer, and recommendation networks as well as linear algebra models.

Two FP8 formats are claimed to provide the best mix of precision, range, and efficiency. A 4-mantissa version (FP8p, for “precision”) and a 3-mantissa version (FP8r, for “range”) were found to provide the best accuracy and throughput for inference across a variety of networks. For both convolutional networks such as ResNet-50 and transformer networks such as BERT-Base, Untether AI’s implementation of FP8 results in less than one tenth of one per cent accuracy loss compared with BF16 datatypes, together with a fourfold increase in throughput and energy efficiency.
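A minimal sketch of generic 8-bit float rounding may help illustrate the precision/range trade-off between the two formats. Assuming a 1-bit sign (so a 4-mantissa FP8p would be roughly a 3-exponent/4-mantissa layout and a 3-mantissa FP8r a 4-exponent/3-mantissa layout; the announcement does not specify exact bit layouts, bias values or special-value handling, so those details below are illustrative assumptions):

```python
import math

def quantize_fp8(x: float, exp_bits: int, man_bits: int) -> float:
    """Round x to the nearest value in a generic sign/exponent/mantissa
    8-bit float format. Illustrative only: no NaN/Inf handling, all
    exponent codes treated as normal, IEEE-style bias 2**(exp_bits-1)-1."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    x = abs(x)
    bias = 2 ** (exp_bits - 1) - 1
    e = max(math.floor(math.log2(x)), 1 - bias)  # clamp into subnormal range
    step = 2.0 ** (e - man_bits)                 # spacing of representable values
    q = round(x / step) * step                   # round to nearest step
    max_e = 2 ** exp_bits - 1 - bias
    max_val = (2.0 - 2.0 ** -man_bits) * 2.0 ** max_e
    return sign * min(q, max_val)                # saturate at the format maximum

# 4 mantissa bits give finer steps near 1.0; 3 mantissa bits give wider range.
print(quantize_fp8(1.3, exp_bits=3, man_bits=4))   # 1.3125 (closer to 1.3)
print(quantize_fp8(1.3, exp_bits=4, man_bits=3))   # 1.25
print(quantize_fp8(50.0, exp_bits=3, man_bits=4))  # 31.0 (saturates)
print(quantize_fp8(50.0, exp_bits=4, man_bits=3))  # 48.0 (still in range)
```

The sketch shows why a mixed deployment makes sense: values near unity quantise more accurately with the extra mantissa bit, while large activations need the extra exponent bit to avoid saturation.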

The speedAI240 device is designed to scale to large models. The memory architecture is multi-levelled, with 238Mbytes of SRAM dedicated to the processing elements offering 1 Petabyte per second of memory bandwidth, four 1Mbyte scratchpads, and two 64-bit wide ports of LPDDR5 providing up to 32Gbytes of external DRAM. Host and chip-to-chip connectivity is provided by high-speed PCI Express Gen5 interfaces.

The imAIgine software development kit provides a path to running networks at high performance, with push-button quantisation, optimisation, physical allocation and multi-chip partitioning. The imAIgine SDK also provides an extensive visualisation toolkit, a cycle-accurate simulator and a runtime API, and is available now.

The speedAI devices will be offered as standalone chips and as M.2 and PCI Express form factor cards. Sampling is expected to begin in the first half of 2023.

http://www.untether.ai


About Smart Cities

This news story is brought to you by smartcitieselectronics.com, the specialist site dedicated to delivering information about what’s new in the Smart City Electronics industry, with daily news updates, new products and industry news.