NVIDIA GPU Architecture
| Series | Architecture | Notable Models | Key Features |
|---|---|---|---|
| Tesla | Multiple (Tesla through Ampere) | C1060, M2050, K80, P100, V100, A100 | First dedicated GPGPU product line |
| Fermi | Fermi | GTX 400, GTX 500, Tesla 20-series, Quadro 4000/5000 | First to support ECC memory; introduced the "CUDA core" naming |
| Kepler | Kepler | GTX 600, GTX 700, Tesla K-series, Quadro K-series | First to feature Dynamic Parallelism and Hyper-Q |
| Maxwell | Maxwell | GTX 750 Ti, GTX 900, Quadro M-series | First with HDMI 2.0 (4K at 60 Hz); introduced VR Direct |
| Pascal | Pascal | GTX 10 series, Quadro P-series | First to support simultaneous multi-projection |
| Volta | Volta | Titan V, Tesla V100, Quadro GV100 | First to feature Tensor Cores and NVLink 2.0 |
| Turing | Turing | RTX 20 series, GTX 16 series, Quadro RTX | First to feature Ray Tracing Cores and RTX technology |
| Ampere | Ampere | RTX 30 series, A-series | Third-generation Tensor Cores and second-generation RT Cores |
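The "first to feature" entries above imply a simple ordering rule: a generation has a feature if it is the one that introduced it or a later one. The Python sketch below encodes that rule. The architecture order and the feature-to-generation mapping come from the table; the names `ARCHITECTURES`, `FIRST_WITH`, and `supports` are hypothetical, and the rule is an architecture-level simplification that assumes no later generation drops a feature. Individual products may still lack one.

```python
# Architectures in the chronological order used by the table above.
ARCHITECTURES = ["Tesla", "Fermi", "Kepler", "Maxwell",
                 "Pascal", "Volta", "Turing", "Ampere"]

# Generation that introduced each feature, per the "Key Features" column.
FIRST_WITH = {
    "ecc_memory": "Fermi",
    "dynamic_parallelism": "Kepler",
    "tensor_cores": "Volta",
    "rt_cores": "Turing",
}


def supports(architecture: str, feature: str) -> bool:
    """Return True if `architecture` introduced `feature` or came later.

    Assumption: features persist in later generations at the architecture
    level; specific products may still omit them.
    """
    arch_index = ARCHITECTURES.index(architecture)
    first_index = ARCHITECTURES.index(FIRST_WITH[feature])
    return arch_index >= first_index


if __name__ == "__main__":
    # Print the features each generation has under this simplified rule.
    for arch in ARCHITECTURES:
        flags = [f for f in FIRST_WITH if supports(arch, f)]
        print(f"{arch:8s} {', '.join(flags) or '-'}")
```

A per-model lookup, such as the RT Cores column in a spec table, is still needed for exact answers, because this rule only describes what a generation can offer.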
NVIDIA GPU Models
| Model | Architecture | CUDA Cores | Tensor Cores | RT Cores | Memory Size | Memory Type | Memory Bandwidth | TDP | Launch Date |
|---|---|---|---|---|---|---|---|---|---|
| Tesla C870 | Tesla | 128 | No | No | 1.5 GB | GDDR3 | 76.8 GB/s | 170 W | Jun 2007 |
| Tesla C1060 | Tesla | 240 | No | No | 4 GB | GDDR3 | 102 GB/s | 188 W | Dec 2008 |
| Tesla M1060 | Tesla | 240 | No | No | 4 GB | GDDR3 | 102 GB/s | 225 W | Dec 2008 |
| Tesla M2050 | Fermi | 448 | No | No | 3 GB | GDDR5 | 148 GB/s | 225 W | May 2010 |
| Tesla M2070 | Fermi | 448 | No | No | 6 GB | GDDR5 | 150 GB/s | 225 W | May 2010 |
| Tesla K10 | Kepler | 3072 | No | No | 8 GB | GDDR5 | 320 GB/s | 225 W | May 2012 |
| Tesla K20 | Kepler | 2496 | No | No | 5 GB (6 GB on K20X) | GDDR5 | 208 GB/s | 225 W | Nov 2012 |
| Tesla K40 | Kepler | 2880 | No | No | 12 GB | GDDR5 | 288 GB/s | 235 W | Nov 2013 |
| Tesla K80 | Kepler | 4992 | No | No | 24 GB | GDDR5 | 480 GB/s | 300 W | Nov 2014 |
| Tesla M40 | Maxwell | 3072 | No | No | 12 GB | GDDR5 | 288 GB/s | 250 W | Nov 2015 |
| Tesla P4 | Pascal | 2560 | No | No | 8 GB | GDDR5 | 192 GB/s | 75 W | Sep 2016 |
| Tesla P40 | Pascal | 3840 | No | No | 24 GB | GDDR5 | 346 GB/s | 250 W | Sep 2016 |
| Tesla V100 | Volta | 5120 | 640 | No | 16/32 GB | HBM2 | 900 GB/s | 300 W | May 2017 |
| Tesla T4 | Turing | 2560 | 320 | Yes | 16 GB | | | | |
| A100 PCIe | Ampere | 6912 | 432 | No | 40 GB / 80 GB | HBM2 / HBM2e | 1555 GB/s (40 GB) | 250 W (40 GB) | May 2020 |
| A100 SXM4 | Ampere | 6912 | 432 | No | 40 GB / 80 GB | HBM2 / HBM2e | 1555 GB/s (40 GB) | 400 W | May 2020 |
| A30 | Ampere | 3584 | 224 | No | 24 GB | HBM2 | 933 GB/s | 165 W | Apr 2021 |
| A40 | Ampere | 10752 | 336 | Yes | 48 GB | GDDR6 | 696 GB/s | 300 W | Oct 2020 |
| A10 | Ampere | 9216 | 288 | Yes | 24 GB | GDDR6 | 600 GB/s | 150 W | Apr 2021 |
| A16 | Ampere | 5120 (4 x 1280) | 160 (4 x 40) | Yes | 64 GB (4 x 16 GB) | GDDR6 | 800 GB/s (4 x 200 GB/s) | 250 W | Apr 2021 |
| A100 80GB | Ampere | 6912 | 432 | No | 80 GB | HBM2e | 2039 GB/s | 400 W | Nov 2020 |
| A100 40GB | Ampere | 6912 | 432 | No | 40 GB | HBM2 | 1555 GB/s | 250 W | May 2020 |
| A5000 | Ampere | 8192 | 256 | Yes | 24 GB | GDDR6 | 768 GB/s | 230 W | Apr 2021 |
| A4000 | Ampere | 6144 | 192 | Yes | 16 GB | GDDR6 | 448 GB/s | 140 W | Apr 2021 |
| A3000 | Ampere | 3584 | 112 | Yes | 24 GB | | | | |
| Titan RTX (T-Rex) | Turing | 4608 | 576 | Yes | 24 GB | GDDR6 | 672 GB/s | 280 W | Dec 2018 |
| GeForce RTX 3090 | Ampere | 10496 | 328 | Yes | 24 GB | GDDR6X | 936 GB/s | 350 W | Sep 2020 |
| GeForce RTX 3080 Ti | Ampere | 10240 | 320 | Yes | 12 GB | GDDR6X | 912 GB/s | 350 W | May 2021 |
| GeForce RTX 3080 | Ampere | 8704 | 272 | Yes | 10 GB | GDDR6X | 760 GB/s | 320 W | Sep 2020 |
| GeForce RTX 3070 Ti | Ampere | 6144 | 192 | Yes | 8 GB | GDDR6X | 608 GB/s | 290 W | Jun 2021 |
| GeForce RTX 3070 | Ampere | 5888 | 184 | Yes | 8 GB | GDDR6 | 448 GB/s | 220 W | Oct 2020 |
| GeForce RTX 3060 Ti | Ampere | 4864 | 152 | Yes | 8 GB | GDDR6 | 448 GB/s | 200 W | Dec 2020 |
| GeForce RTX 3060 | Ampere | 3584 | 112 | Yes | 12 GB | GDDR6 | 360 GB/s | 170 W | Feb 2021 |
| Quadro RTX 8000 | Turing | 4608 | 576 | Yes | 48 GB | GDDR6 | 672 GB/s | 295 W | Aug 2018 |
| Quadro RTX 6000 | Turing | 4608 | 576 | Yes | 24 GB | GDDR6 | 672 GB/s | 260 W | Aug 2018 |
| Quadro RTX 5000 | Turing | 3072 | 384 | Yes | 16 GB | GDDR6 | 448 GB/s | 230 W | Nov 2018 |
| Quadro RTX 4000 | Turing | 2304 | 288 | Yes | 8 GB | GDDR6 | 416 GB/s | 160 W | Nov 2018 |
| Titan V | Volta | 5120 | 640 | No | 12 GB | HBM2 | 652.8 GB/s | 250 W | Dec 2017 |
| Tesla V100 (PCIe) | Volta | 5120 | 640 | No | 16 GB | HBM2 | 900 GB/s | 250 W | Jun 2017 |
| Tesla V100 (SXM2) | Volta | 5120 | 640 | No | 16 GB | HBM2 | 900 GB/s | 300 W | Jun 2017 |
| Quadro GV100 | Volta | 5120 | 640 | No | 32 GB | HBM2 | 870 GB/s | 250 W | Mar 2018 |
| Tesla V100 32 GB (SXM2) | Volta | 5120 | 640 | No | 32 GB | HBM2 | 900 GB/s | 300 W | Mar 2018 |
| DGX-1 (Volta, 8 x V100 system) | Volta | 5120 per GPU | 640 per GPU | No | 8 x 32 GB (256 GB total) | HBM2 | 2.7 TB/s | 3500 W | Mar 2018 |
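The bandwidth and TDP columns are stored as strings with units, and one row uses TB/s rather than GB/s, so comparing models needs a small normalization step first. The Python sketch below parses both columns and derives bandwidth per watt for three rows copied from the table. The helper names are hypothetical, 1 TB/s is assumed to equal 1000 GB/s, and the ratio is only an illustrative metric, not a benchmark result.

```python
import re


def parse_bandwidth_gbs(text: str) -> float:
    """Convert a string such as '900 GB/s' or '2.7 TB/s' to GB/s."""
    match = re.fullmatch(r"\s*([\d.]+)\s*(GB|TB)/s\s*", text)
    if match is None:
        raise ValueError(f"unrecognized bandwidth: {text!r}")
    value, unit = float(match.group(1)), match.group(2)
    # Assumption: decimal units, so 1 TB/s = 1000 GB/s.
    return value * 1000.0 if unit == "TB" else value


def parse_tdp_watts(text: str) -> float:
    """Convert a string such as '300W' or '300 W' to watts."""
    match = re.fullmatch(r"\s*([\d.]+)\s*W\s*", text)
    if match is None:
        raise ValueError(f"unrecognized TDP: {text!r}")
    return float(match.group(1))


def bandwidth_per_watt(bandwidth: str, tdp: str) -> float:
    """Memory bandwidth in GB/s divided by TDP in watts."""
    return parse_bandwidth_gbs(bandwidth) / parse_tdp_watts(tdp)


# A few (model, bandwidth, TDP) rows copied from the table above.
ROWS = [
    ("Tesla K80", "480 GB/s", "300W"),
    ("Tesla V100", "900 GB/s", "300W"),
    ("Titan RTX", "672 GB/s", "280W"),
]

if __name__ == "__main__":
    for model, bandwidth, tdp in ROWS:
        ratio = bandwidth_per_watt(bandwidth, tdp)
        print(f"{model}: {ratio:.2f} GB/s per W")
```

For these three rows the ratio comes out to 1.60, 3.00, and 2.40 GB/s per watt respectively. Rows with empty cells raise `ValueError`, which makes the truncated entries easy to spot.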
NVIDIA Grace Architecture
NVIDIA has announced partnerships with server manufacturers such as HPE, Atos, and Supermicro to build servers around Grace, its Arm-based data center CPU. These servers are expected to be available in the second half of 2023.
| Architecture | Key Features |
|---|---|
| Grace | CPU-GPU integration, Arm Neoverse CPU cores, LPDDR5X memory |
| | 900 GB/s NVLink-C2C bandwidth between CPU and GPU, support for PCIe 5.0 |
| | Up to 10x performance improvement claimed for certain HPC workloads |
| | Energy efficiency improvements through a unified memory space |