NVIDIA GPU
Revision as of 13:17, 18 March 2023
NVIDIA GPU Architecture
| Series | Architecture | Notable Models | Key Features |
|---|---|---|---|
| Tesla | Tesla | C870, C1060, S1070 | First unified-shader architecture and first dedicated GPGPU series; introduced CUDA |
| Fermi | Fermi | GTX 400, GTX 500, Tesla 20-series, Quadro 4000/5000 | First to support ECC memory and a full L1/L2 cache hierarchy |
| Kepler | Kepler | GTX 600, GTX 700, Tesla K-series, Quadro K-series | First to feature Dynamic Parallelism and Hyper-Q |
| Maxwell | Maxwell | GTX 750, GTX 900, Quadro M-series | Large gains in energy efficiency per CUDA core |
| Pascal | Pascal | GTX 1000, Quadro P-series | First to support simultaneous multi-projection |
| Volta | Volta | Titan V, Tesla V100, Quadro GV100 | First to feature Tensor Cores and NVLink 2.0 |
| Turing | Turing | RTX 2000, GTX 1600, Quadro RTX | First to feature Ray Tracing Cores and RTX technology |
| Ampere | Ampere | RTX 3000, A-series | Third-generation Tensor Cores and second-generation RT Cores |
| Lovelace | Ada Lovelace | RTX 4000 series, L40 | Fourth-generation Tensor Cores with up to 5X higher throughput (1.4 Tensor petaFLOPS) via the FP8 Transformer Engine, as on the H100 |
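The generations in the table above can be identified programmatically from a device's CUDA compute capability. The helper below is a minimal, illustrative sketch (the `architecture_name` function is not an NVIDIA API) based on NVIDIA's published compute-capability assignments:

```python
# Map a CUDA compute capability (major, minor) to the architecture
# generation named in the table above. Hopper (9.x) is included for
# completeness alongside Ada Lovelace (8.9).
def architecture_name(major: int, minor: int) -> str:
    if major == 1:
        return "Tesla"
    if major == 2:
        return "Fermi"
    if major == 3:
        return "Kepler"
    if major == 5:
        return "Maxwell"
    if major == 6:
        return "Pascal"
    if major == 7:
        # 7.0 and 7.2 are Volta; 7.5 is Turing
        return "Turing" if minor == 5 else "Volta"
    if major == 8:
        # 8.0, 8.6, and 8.7 are Ampere; 8.9 is Ada Lovelace
        return "Ada Lovelace" if minor == 9 else "Ampere"
    if major == 9:
        return "Hopper"
    return "unknown"

print(architecture_name(8, 0))  # Ampere (e.g. A100)
print(architecture_name(8, 9))  # Ada Lovelace (e.g. RTX 40 series)
```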
NVIDIA GPU Models
| Model | Architecture | CUDA Cores | Tensor Cores | RT Cores | Memory Size | Memory Type | Memory Bandwidth | TDP | Launch Date |
|---|---|---|---|---|---|---|---|---|---|
| Tesla C870 | Tesla | 128 | No | No | 1.5 GB GDDR3 | GDDR3 | 76.8 GB/s | 105W | Jun 2007 |
| Tesla C1060 | Tesla | 240 | No | No | 4 GB GDDR3 | GDDR3 | 102 GB/s | 238W | Dec 2008 |
| Tesla M1060 | Tesla | 240 | No | No | 4 GB GDDR3 | GDDR3 | 102 GB/s | 225W | Dec 2008 |
| Tesla M2050 | Fermi | 448 | No | No | 3 GB GDDR5 | GDDR5 | 148 GB/s | 225W | May 2010 |
| Tesla M2070 | Fermi | 448 | No | No | 6 GB GDDR5 | GDDR5 | 150 GB/s | 225W | May 2010 |
| Tesla K10 | Kepler | 3072 | No | No | 8 GB GDDR5 | GDDR5 | 320 GB/s | 225W | May 2012 |
| Tesla K20 | Kepler | 2496 | No | No | 5/6 GB GDDR5 | GDDR5 | 208 GB/s | 225W | Nov 2012 |
| Tesla K40 | Kepler | 2880 | No | No | 12 GB GDDR5 | GDDR5 | 288 GB/s | 235W | Nov 2013 |
| Tesla K80 | Kepler | 4992 | No | No | 24 GB GDDR5 | GDDR5 | 480 GB/s | 300W | Nov 2014 |
| Tesla M40 | Maxwell | 3072 | No | No | 12 GB GDDR5 | GDDR5 | 288 GB/s | 250W | Nov 2015 |
| Tesla P4 | Pascal | 2560 | No | No | 8 GB GDDR5 | GDDR5 | 192 GB/s | 75W | Sep 2016 |
| Tesla P40 | Pascal | 3840 | No | No | 24 GB GDDR5X | GDDR5X | 480 GB/s | 250W | Sep 2016 |
| Tesla V100 | Volta | 5120 | 640 | No | 16/32 GB HBM2 | HBM2 | 900 GB/s | 300W | May 2017 |
| Tesla T4 | Turing | 2560 | 320 | Yes | 16 GB GDDR6 | GDDR6 | 300 GB/s | 70W | Sep 2018 |
| A100 40GB PCIe | Ampere | 6912 | 432 | No | 40 GB HBM2 | HBM2 | 1555 GB/s | 250W | May 2020 |
| A100 40GB SXM4 | Ampere | 6912 | 432 | No | 40 GB HBM2 | HBM2 | 1555 GB/s | 400W | May 2020 |
| A30 | Ampere | 3584 | 224 | No | 24 GB HBM2 | HBM2 | 933 GB/s | 165W | Apr 2021 |
| A40 | Ampere | 10752 | 336 | Yes | 48 GB GDDR6 | GDDR6 | 696 GB/s | 300W | Oct 2020 |
| A10 | Ampere | 9216 | 288 | Yes | 24 GB GDDR6 | GDDR6 | 600 GB/s | 150W | Apr 2021 |
| A16 | Ampere | 5120 (4 x 1280) | 160 | Yes | 64 GB GDDR6 (4 x 16 GB) | GDDR6 | 800 GB/s (4 x 200 GB/s) | 250W | Apr 2021 |
| A100 80GB SXM4 | Ampere | 6912 | 432 | No | 80 GB HBM2e | HBM2e | 2039 GB/s | 400W | Nov 2020 |
| H100 PCIe | Hopper | 14592 | 456 | No | 80 GB HBM2e | HBM2e | 2000 GB/s | 350W | Mar 2022 |
| H100 SXM5 | Hopper | 16896 | 528 | No | 80 GB HBM3 | HBM3 | 3350 GB/s | 700W | Mar 2022 |
| RTX A5000 | Ampere | 8192 | 256 | Yes | 24 GB GDDR6 | GDDR6 | 768 GB/s | 230W | Apr 2021 |
| RTX A4000 | Ampere | 6144 | 192 | Yes | 16 GB GDDR6 | GDDR6 | 448 GB/s | 140W | Apr 2021 |
| RTX A3000 | Ampere | 3584 | 112 | Yes | N/A | GDDR6 | N/A | N/A | N/A |
| Titan RTX | Turing | 4608 | 576 | Yes | 24 GB GDDR6 | GDDR6 | 672 GB/s | 280W | Dec 2018 |
| GeForce RTX 3090 | Ampere | 10496 | 328 | Yes | 24 GB GDDR6X | GDDR6X | 936 GB/s | 350W | Sep 2020 |
| GeForce RTX 3080 Ti | Ampere | 10240 | 320 | Yes | 12 GB GDDR6X | GDDR6X | 912 GB/s | 350W | May 2021 |
| GeForce RTX 3080 | Ampere | 8704 | 272 | Yes | 10 GB GDDR6X | GDDR6X | 760 GB/s | 320W | Sep 2020 |
| GeForce RTX 3070 Ti | Ampere | 6144 | 192 | Yes | 8 GB GDDR6X | GDDR6X | 608 GB/s | 290W | Jun 2021 |
| GeForce RTX 3070 | Ampere | 5888 | 184 | Yes | 8 GB GDDR6 | GDDR6 | 448 GB/s | 220W | Oct 2020 |
| GeForce RTX 3060 Ti | Ampere | 4864 | 152 | Yes | 8 GB GDDR6 | GDDR6 | 448 GB/s | 200W | Dec 2020 |
| GeForce RTX 3060 | Ampere | 3584 | 112 | Yes | 12 GB GDDR6 | GDDR6 | 360 GB/s | 170W | Feb 2021 |
| Quadro RTX 8000 | Turing | 4608 | 576 | Yes | 48 GB GDDR6 | GDDR6 | 672 GB/s | 295W | Aug 2018 |
| Quadro RTX 6000 | Turing | 4608 | 576 | Yes | 24 GB GDDR6 | GDDR6 | 672 GB/s | 260W | Aug 2018 |
| Quadro RTX 5000 | Turing | 3072 | 384 | Yes | 16 GB GDDR6 | GDDR6 | 448 GB/s | 230W | Nov 2018 |
| Quadro RTX 4000 | Turing | 2304 | 288 | Yes | 8 GB GDDR6 | GDDR6 | 416 GB/s | 160W | Nov 2018 |
| Titan V | Volta | 5120 | 640 | No | 12 GB HBM2 | HBM2 | 652.8 GB/s | 250W | Dec 2017 |
| Tesla V100 (PCIe) | Volta | 5120 | 640 | No | 16 GB HBM2 | HBM2 | 900 GB/s | 250W | Jun 2017 |
| Tesla V100 (SXM2) | Volta | 5120 | 640 | No | 16 GB HBM2 | HBM2 | 900 GB/s | 300W | Jun 2017 |
| Quadro GV100 | Volta | 5120 | 640 | No | 32 GB HBM2 | HBM2 | 870 GB/s | 250W | Mar 2018 |
| Tesla V100 (SXM2, 32 GB) | Volta | 5120 | 640 | No | 32 GB HBM2 | HBM2 | 900 GB/s | 300W | Mar 2018 |
| DGX-1 (Volta, 8 x V100) | Volta | 8 x 5120 (40960 total) | 8 x 640 | No | 8 x 16/32 GB HBM2 (128/256 GB total) | HBM2 | 7.2 TB/s aggregate | 3200W | Mar 2018 |
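The memory-bandwidth column follows directly from a card's memory bus width and per-pin data rate. A quick sanity check, assuming the commonly published bus width and data rate for one entry in the table:

```python
def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Theoretical peak bandwidth in GB/s: (bus width in bytes) x (per-pin data rate)."""
    return bus_width_bits / 8 * data_rate_gbps

# GeForce RTX 3090: 384-bit bus, 19.5 Gbps GDDR6X (assumed published figures)
print(memory_bandwidth_gbs(384, 19.5))  # 936.0, matching the 936 GB/s in the table
```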
NVIDIA Grace Architecture
Grace is NVIDIA's Arm-based data-center CPU architecture, built on Arm Neoverse cores and designed for tight coupling with NVIDIA GPUs. NVIDIA has announced partnerships with server manufacturers such as HPE, Atos, and Supermicro to build servers pairing Grace CPUs with NVIDIA GPUs; these systems are expected to be available in the second half of 2023.
| Architecture | Key Features |
|---|---|
| Grace | Tight CPU-GPU integration; Arm Neoverse CPU cores; LPDDR5X memory; 900 GB/s NVLink-C2C CPU-GPU interconnect; PCIe 5.0 support; up to 10x performance improvement for certain HPC workloads; improved energy efficiency through a unified memory space |
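To put the 900 GB/s figure quoted for Grace's CPU-GPU link in context, here is a rough, illustrative comparison against PCIe 5.0 x16; the PCIe figure (about 64 GB/s per direction, roughly 128 GB/s bidirectional) is an assumption drawn from the public PCIe 5.0 specification, not from this article:

```python
# Rough comparison of CPU-GPU link bandwidth (illustrative figures).
nvlink_c2c_gbs = 900.0        # Grace NVLink-C2C, as stated above
pcie5_x16_bidir_gbs = 128.0   # ~64 GB/s per direction for PCIe 5.0 x16 (assumed)
print(f"NVLink-C2C is ~{nvlink_c2c_gbs / pcie5_x16_bidir_gbs:.1f}x PCIe 5.0 x16")
```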