AMD EPYC 9004 Genoa

From HPCWIKI

EPYC Genoa

Epyc genoa.png

In brief: At OCP Summit 2022, AMD announced its next-generation, 4th-gen AMD EPYC 9004 CPU, code-named Genoa, which launched on November 10, 2022. Genoa supports up to 96 cores and 192 threads on a 5nm manufacturing process, with 12-channel DDR5 memory, Compute Express Link (CXL) 1.1 support, and expanded PCIe Gen5 capabilities. HPCMATE, together with local and global partners, is preparing HPC-optimized AMD-based servers for Q2 2023.

Genoa

Code-named AMD Genoa, the new line of CPUs supports 12 channels of DDR5-4800 (up to 6TB memory capacity per socket), 128 lanes of PCIe Gen5, AMD Infinity Fabric/Guard technology, and up to 96 cores. This makes them ideal for critical workloads across cloud, enterprise, and high-performance computing.

With up to 96 cores in a single processor, Genoa lets organizations reduce their physical footprint by deploying fewer, more powerful servers. HPE and Dell have each announced four systems: two single-socket (1 CPU) chassis and two dual-socket (2 CPU) chassis.

Key features

Socket SP5

To support the Zen 4-based EPYC 9004 Genoa, AMD introduced the SP5 (LGA 6096) socket, which has 2002 more contact pins than the SP3 socket (LGA 4094) to improve power delivery and signal integrity. SP5 can deliver a peak power of up to 700W.

Memory

Genoa brings key improvements in memory cost, which makes up 50% of a server's bill of materials (BOM). Support for both 72-bit and 80-bit DIMMs is noteworthy. Most servers will use 80-bit ECC, but some hyperscalers want to cut down to 72-bit, which saves one DRAM die otherwise used for parity checks.

The other important feature is reduced sensitivity to memory rank. With Milan and most Intel platforms, dual-rank memory is crucial to maximizing performance: on Milan, for example, the performance delta between dual-rank and single-rank memory is 25%. With Genoa, this drops to 4.5%. This is another considerable cost improvement because cheaper single-rank memory can be used.

Genoa has higher memory latency than Milan: 118ns versus 105ns. AMD's response is that only 3ns of the increase comes from the much larger IO die (73ns on Genoa versus 70ns on Milan). Most of the latency increase comes from the DDR5 memory device itself: 35ns on DDR5 versus 25ns on DDR4.
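The latency figures above can be checked with a few lines of arithmetic. The sketch below only uses the numbers quoted in the text; the variable and function names are illustrative, and it compares the Genoa-versus-Milan differences rather than trying to model the full latency path.

```python
# Latency figures in nanoseconds, as quoted in the text.
TOTAL_NS = {"Genoa": 118, "Milan": 105}
IO_DIE_NS = {"Genoa": 73, "Milan": 70}
DRAM_DEVICE_NS = {"Genoa": 35, "Milan": 25}  # DDR5 on Genoa, DDR4 on Milan


def delta(table: dict) -> int:
    """Genoa minus Milan for one latency component."""
    return table["Genoa"] - table["Milan"]


total_delta = delta(TOTAL_NS)
io_die_delta = delta(IO_DIE_NS)
dram_delta = delta(DRAM_DEVICE_NS)

print(f"total: +{total_delta} ns")
print(f"IO die: +{io_die_delta} ns, DRAM device: +{dram_delta} ns")
print(f"share of the increase from the DRAM device: {dram_delta / total_delta:.0%}")
```

The IO die and DRAM device differences (3ns and 10ns) add up to the full 13ns gap, which is consistent with the claim that most of the increase comes from DDR5 itself.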

The 4th Gen AMD EPYC processors are the first AMD x86 server processors to support DDR5 memory. The memory runs at speeds of up to 4,800 MT/s, which is 50 percent faster than the 3,200 MT/s that the previous 3rd Gen AMD EPYC processors supported.[1]
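To put the speed increase in context, theoretical peak memory bandwidth per socket can be estimated as channels x transfer rate x 8 bytes per transfer. The following is a minimal sketch: the 12 channels and the transfer rates come from the text, while the 8-channel figure for the 3rd Gen parts and the 64-bit data width per channel are assumptions added here, and real sustained bandwidth will be lower than this ceiling.

```python
BYTES_PER_TRANSFER = 8  # assumption: 64-bit data path per channel, ECC bits excluded


def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int) -> float:
    """Theoretical peak bandwidth for one socket, in GB/s (decimal)."""
    return channels * mega_transfers_per_s * BYTES_PER_TRANSFER / 1000


# 4th Gen: 12 channels of DDR5-4800 (from the text).
genoa = peak_bandwidth_gbs(channels=12, mega_transfers_per_s=4800)
# 3rd Gen: DDR4-3200 (from the text) on 8 channels (assumption, not stated above).
milan = peak_bandwidth_gbs(channels=8, mega_transfers_per_s=3200)

speed_ratio = 4800 / 3200  # the "50 percent faster" per-channel figure

print(f"Genoa peak: {genoa:.1f} GB/s, Milan peak: {milan:.1f} GB/s")
print(f"per-channel speed ratio: {speed_ratio:.2f}x, per-socket ratio: {genoa / milan:.2f}x")
```

Under these assumptions the per-socket gain is larger than the per-channel gain because the channel count also grows.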

Power management

Power management is enhanced. Genoa has two basic power management modes: performance determinism and power determinism. Performance determinism is for firms that want consistent performance: performance is kept stable, and the chip consumes less power when it can. Most customers will choose this option because stability is vital. Power determinism keeps power consumption stable and lets performance ramp up and down: depending on factors such as the silicon lottery, thermal budget, and workload, the chip raises and lowers its clock speeds. In addition to these modes, Genoa chips have a configurable TDP. Peak boost behavior varies depending on which option is chosen.[2]
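The difference between the two modes can be illustrated with a toy model. This is only a sketch under strong assumptions: power is treated as proportional to clock speed, and the part names, watts-per-GHz values, clock target, and power cap are all hypothetical numbers chosen for illustration, not AMD specifications or firmware behavior.

```python
from dataclasses import dataclass


@dataclass
class Part:
    name: str
    watts_per_ghz: float  # hypothetical efficiency; lower means better silicon


def performance_determinism(part: Part, target_ghz: float) -> tuple[float, float]:
    """Hold the clock fixed; power drawn differs from part to part."""
    return target_ghz, part.watts_per_ghz * target_ghz


def power_determinism(part: Part, power_cap_w: float, max_ghz: float) -> tuple[float, float]:
    """Hold power at the cap; the clock reached differs from part to part."""
    ghz = min(max_ghz, power_cap_w / part.watts_per_ghz)
    return ghz, part.watts_per_ghz * ghz


# Three hypothetical parts of the same model with different silicon quality.
parts = [Part("good", 100.0), Part("typical", 110.0), Part("poor", 120.0)]

for p in parts:
    perf_ghz, perf_w = performance_determinism(p, target_ghz=3.0)
    pow_ghz, pow_w = power_determinism(p, power_cap_w=360.0, max_ghz=3.7)
    print(f"{p.name:8s} perf-det: {perf_ghz:.2f} GHz @ {perf_w:.0f} W | "
          f"power-det: {pow_ghz:.2f} GHz @ {pow_w:.0f} W")
```

In this model, performance determinism gives every part the same clock while better silicon draws less power, and power determinism gives every part the same power while better silicon clocks higher.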

9004 highlight.png

CXL 2.0 for Type 3

AMD generally supports CXL 1.1 but supports CXL 2.0 for Type 3 memory devices. One noteworthy item is that the 64 lanes of CXL can be bifurcated into 16 x4 devices.

Type 3 support is what the ecosystem wanted, and Genoa was delayed by two quarters to add this feature. Intel Sapphire Rapids is not capable of CXL lane bifurcation: if one connects a x4 or x8 CXL device, it consumes all 16 lanes.
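The practical effect of bifurcation is easy to quantify. The sketch below counts how many devices fit in a lane budget when each device occupies at least a minimum slot width. The function name and parameters are illustrative; the 64-lane budget comes from the text, and applying that same budget to the no-bifurcation case is a hypothetical comparison, not a statement about any specific platform's lane count.

```python
def max_devices(total_lanes: int, device_width: int, min_slot_width: int) -> int:
    """Number of devices that fit when each one occupies at least min_slot_width lanes."""
    lanes_per_device = max(device_width, min_slot_width)
    return total_lanes // lanes_per_device


CXL_LANES = 64  # from the text

# Bifurcation down to x4: each x4 device uses exactly 4 lanes.
with_bifurcation = max_devices(CXL_LANES, device_width=4, min_slot_width=4)

# No bifurcation: a x4 or x8 device still consumes a full x16 link.
x4_without = max_devices(CXL_LANES, device_width=4, min_slot_width=16)
x8_without = max_devices(CXL_LANES, device_width=8, min_slot_width=16)

print(f"x4 devices with bifurcation: {with_bifurcation}")
print(f"x4 devices without bifurcation: {x4_without}")
print(f"x8 devices without bifurcation: {x8_without}")
```

With the same 64 lanes, bifurcation to x4 allows four times as many x4 memory devices as a design where every device takes a full x16 link.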

Hypervisors cannot change the memory assignment under the guest, which is a major benefit for users of CXL-attached memory in the cloud.[3]



In summary, the pillars of performance for AMD are per-socket performance leadership, per-core performance leadership, leadership across all workloads and market segments, and leadership in TCO and sustainability.

EPYC Genoa series[4]

Model           | CPU cores | Threads | Max. boost clock | All-core boost speed | Base clock | L3 cache | Default TDP
AMD EPYC™ 9654P | 96        | 192     | Up to 3.7GHz     | 3.55GHz              | 2.4GHz     | 384MB    | 360W
AMD EPYC™ 9654  | 96        | 192     | Up to 3.7GHz     | 3.55GHz              | 2.4GHz     | 384MB    | 360W
AMD EPYC™ 9634  | 84        | 168     | Up to 3.7GHz     | 3.1GHz               | 2.25GHz    | 384MB    | 290W
AMD EPYC™ 9554P | 64        | 128     | Up to 3.75GHz    | 3.75GHz              | 3.1GHz     | 256MB    | 360W
AMD EPYC™ 9554  | 64        | 128     | Up to 3.75GHz    | 3.75GHz              | 3.1GHz     | 256MB    | 360W
AMD EPYC™ 9534  | 64        | 128     | Up to 3.7GHz     | 3.55GHz              | 2.45GHz    | 256MB    | 280W
AMD EPYC™ 9474F | 48        | 96      | Up to 4.1GHz     | 3.95GHz              | 3.6GHz     | 256MB    | 360W
AMD EPYC™ 9454P | 48        | 96      | Up to 3.8GHz     | 3.65GHz              | 2.75GHz    | 256MB    | 290W
AMD EPYC™ 9454  | 48        | 96      | Up to 3.8GHz     | 3.65GHz              | 2.75GHz    | 256MB    | 290W
AMD EPYC™ 9374F | 32        | 64      | Up to 4.3GHz     | 4.1GHz               | 3.85GHz    | 256MB    | 320W
AMD EPYC™ 9354P | 32        | 64      | Up to 3.8GHz     | 3.75GHz              | 3.25GHz    | 256MB    | 280W
AMD EPYC™ 9354  | 32        | 64      | Up to 3.8GHz     | 3.75GHz              | 3.25GHz    | 256MB    | 280W
AMD EPYC™ 9334  | 32        | 64      | Up to 3.9GHz     | 3.85GHz              | 2.7GHz     | 128MB    | 210W
AMD EPYC™ 9274F | 24        | 48      | Up to 4.3GHz     | 4.1GHz               | 4.05GHz    | 256MB    | 320W
AMD EPYC™ 9254  | 24        | 48      | Up to 4.15GHz    | 3.9GHz               | 2.9GHz     | 128MB    | 200W
AMD EPYC™ 9224  | 24        | 48      | Up to 3.7GHz     | 3.65GHz              | 2.5GHz     | 64MB     | 200W
AMD EPYC™ 9174F | 16        | 32      | Up to 4.4GHz     | 4.15GHz              | 4.1GHz     | 256MB    | 320W
AMD EPYC™ 9124  | 16        | 32      | Up to 3.7GHz     | 3.6GHz               | 3.0GHz     | 64MB     | 200W
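When comparing models, per-core ratios are often more telling than the raw figures. The sketch below derives L3 cache per core and default TDP per core for a handful of rows copied from the table above. The selection of rows and the helper name are illustrative, and TDP per core is only a rough ratio, since the default TDP covers the whole package and not just the cores.

```python
# (model, cores, l3_cache_mb, default_tdp_w) copied from the table.
ROWS = [
    ("9654", 96, 384, 360),
    ("9554", 64, 256, 360),
    ("9374F", 32, 256, 320),
    ("9174F", 16, 256, 320),
    ("9124", 16, 64, 200),
]


def per_core(rows):
    """Map each model to (L3 MB per core, default TDP watts per core)."""
    return {model: (l3 / cores, tdp / cores) for model, cores, l3, tdp in rows}


ratios = per_core(ROWS)
for model, (l3_mb, watts) in ratios.items():
    print(f"EPYC {model}: {l3_mb:.1f} MB L3 per core, {watts:.2f} W per core")
```

The frequency-optimized F parts stand out here: the 16-core 9174F keeps the full 256MB of L3, which works out to four times the cache per core of the 96-core 9654.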

The big question, then, is obvious: is AMD Genoa worth the investment? These new CPUs have a lot to offer in terms of compute power, security, and efficiency.

The big thing to note is that when per-core software licensing costs come into play, Genoa's lead extends even further in TCO. This is best shown in the enterprise benchmark, which runs VMmark. VMmark runs 19 representative VMs per tile and measures how many tiles can be run as well as how fast they run. Genoa is faster and can handle more VMs.
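A short calculation shows why per-core licensing amplifies a consolidation advantage. The sketch below is purely illustrative: the 19 VMs per tile comes from the text, while the core count, license price, and tile counts are hypothetical values chosen for the example and are not benchmark results.

```python
VMS_PER_TILE = 19  # from the text


def total_vms(tiles: int) -> int:
    """VMs hosted when a server sustains the given number of tiles."""
    return tiles * VMS_PER_TILE


def license_cost_per_vm(cores: int, cost_per_core: float, tiles: int) -> float:
    """Per-core license spend divided across the VMs the server can host."""
    return cores * cost_per_core / total_vms(tiles)


# Hypothetical: two 64-core servers licensed at 1000 per core,
# one sustaining 10 tiles and the other 14 tiles.
slower = license_cost_per_vm(cores=64, cost_per_core=1000.0, tiles=10)
faster = license_cost_per_vm(cores=64, cost_per_core=1000.0, tiles=14)

print(f"VMs hosted: {total_vms(10)} vs {total_vms(14)}")
print(f"license cost per VM: {slower:.2f} vs {faster:.2f}")
```

With the license bill fixed by core count, every extra tile a server sustains lowers the license cost attributed to each VM.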

Reference