Nvidia-smi tips and tricks

From HPCWIKI
Revision as of 11:22, 26 July 2023

Output example

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.05              Driver Version: 535.86.05    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090        Off | 00000000:D8:00.0 Off |                  Off |
| 30%   42C    P8              38W / 450W |      2MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
Property name      Annotation  Meaning
Performance State  Perf        States range from P0 (maximum performance) to P12 (minimum performance).
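
The Perf column can also be read in a script-friendly form. The following is a minimal sketch, assuming the `pstate` field of `nvidia-smi --query-gpu` is available on the installed driver; `describe_pstate` is a hypothetical helper name, not part of nvidia-smi.

```shell
#!/bin/sh
# Hypothetical helper: turn one "index, pstate" CSV line into a short description.
describe_pstate() {
    line=$1
    idx=${line%%,*}      # text before the first comma: GPU index
    state=${line#*,}     # text after the first comma: performance state
    state=$(printf '%s' "$state" | tr -d ' ')
    case $state in
        P0)            note="maximum performance" ;;
        P12)           note="minimum performance" ;;
        P[1-9]|P1[01]) note="intermediate state" ;;
        *)             note="unknown state" ;;
    esac
    printf 'GPU %s: %s (%s)\n' "$idx" "$state" "$note"
}

# Only query the driver when nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,pstate --format=csv,noheader |
    while IFS= read -r line; do
        describe_pstate "$line"
    done
fi
```

For the idle RTX 4090 in the output example, the CSV line would be `0, P8`, which the helper reports as an intermediate state.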

Turn on / off ECC[1]

To turn off ECC RAM

# nvidia-smi -g 0 --ecc-config=0
(repeat with -g x for each GPU ID)

To turn ECC RAM back on

# nvidia-smi -g 0 --ecc-config=1
(repeat with -g x for each GPU ID)
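
Repeating the command for every GPU can be scripted. The sketch below is a dry run: the hypothetical helper `ecc_commands` only prints one command per GPU index and executes nothing. It assumes the `-i` (`--id`) GPU selector documented for current nvidia-smi releases, where the commands above use `-g`, and it assumes GPU indices can be listed with `--query-gpu=index`.

```shell
#!/bin/sh
# Hypothetical helper: print one ECC command per GPU index (dry run, nothing is executed).
# $1 = ECC mode (0 = off, 1 = on); remaining arguments = GPU indices.
ecc_commands() {
    mode=$1
    shift
    case $mode in
        0|1) ;;
        *) echo "ECC mode must be 0 or 1" >&2; return 1 ;;
    esac
    for gpu in "$@"; do
        printf 'nvidia-smi -i %s --ecc-config=%s\n' "$gpu" "$mode"
    done
}

# Example: list the installed GPUs and print the commands that would disable ECC on each.
if command -v nvidia-smi >/dev/null 2>&1; then
    # Word splitting of the index list is intentional here.
    ecc_commands 0 $(nvidia-smi --query-gpu=index --format=csv,noheader)
fi
```

The printed lines can be reviewed and then run as root. According to the nvidia-smi documentation, a changed ECC mode stays pending until the next reboot.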

To reset the ECC error counters[2]

# nvidia-smi -g 0 --reset-ecc-errors=TYPE
(TYPE is 0 or VOLATILE for the volatile counters, 1 or AGGREGATE for the aggregate counters)

Reset GPU

# nvidia-smi -g 0 --gpu-reset
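
A reset generally needs the GPU to be idle, so it can help to check for running compute processes first. This is a sketch under stated assumptions: `reset_command` is a hypothetical helper that only prints the reset command, `-i` is used as the GPU selector, and `--query-compute-apps=pid` is assumed to list compute processes only (graphics clients would not appear).

```shell
#!/bin/sh
# Hypothetical helper: decide whether a reset looks safe from a list of compute PIDs.
# $1 = GPU index, $2 = newline-separated PIDs (empty when the GPU is idle).
reset_command() {
    gpu=$1
    pids=$2
    if [ -n "$pids" ]; then
        echo "GPU $gpu is busy (PIDs: $(printf '%s' "$pids" | tr '\n' ' ')), not resetting" >&2
        return 1
    fi
    # Dry run: print the command instead of executing it.
    printf 'nvidia-smi -i %s --gpu-reset\n' "$gpu"
}

if command -v nvidia-smi >/dev/null 2>&1; then
    pids=$(nvidia-smi -i 0 --query-compute-apps=pid --format=csv,noheader)
    reset_command 0 "$pids" || true
fi
```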

Compute mode

# nvidia-smi -g 0 -c <mode number>

Number  Mode
0       Default
1       Exclusive_Thread
2       Prohibited
3       Exclusive_Process
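
The number-to-mode table above can be wrapped in a small lookup so scripts can use mode names. This is a minimal sketch: `compute_mode_number` and `compute_mode_command` are hypothetical helpers, the command is only printed, and both the `-i` selector and the `compute_mode` query field are assumed to be available on the installed nvidia-smi.

```shell
#!/bin/sh
# Hypothetical helper: map a mode name from the table above to its number.
compute_mode_number() {
    case $(printf '%s' "$1" | tr '[:lower:]' '[:upper:]') in
        DEFAULT)           echo 0 ;;
        EXCLUSIVE_THREAD)  echo 1 ;;
        PROHIBITED)        echo 2 ;;
        EXCLUSIVE_PROCESS) echo 3 ;;
        *) echo "unknown compute mode: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the command that sets a mode on one GPU.
# $1 = GPU index, $2 = mode name (case-insensitive).
compute_mode_command() {
    gpu=$1
    num=$(compute_mode_number "$2") || return 1
    printf 'nvidia-smi -i %s -c %s\n' "$gpu" "$num"
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # Show the current mode of each GPU, then the command for a new mode.
    nvidia-smi --query-gpu=index,compute_mode --format=csv,noheader
    compute_mode_command 0 Exclusive_Process
fi
```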

References