Nvidia-smi tips and tricks
Revision as of 11:16, 26 July 2023
Output example
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.05              Driver Version: 535.86.05    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090        Off | 00000000:D8:00.0 Off |                  Off |
| 30%   42C    P8              38W / 450W |      2MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
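The table above is meant to be read by eye. For scripts, the same values can be requested as CSV with --query-gpu. A minimal sketch, assuming a POSIX shell; summarize_gpu_csv is a hypothetical helper, and the sample line is built from the values in the table above, not from captured output:

```shell
# Hypothetical helper: turn one CSV line, as produced by
#   nvidia-smi --query-gpu=index,name,temperature.gpu,memory.used,memory.total \
#              --format=csv,noheader,nounits
# into a one-line summary.
summarize_gpu_csv() {
    # Split the line on commas; the last variable takes the remainder.
    IFS=',' read -r index name temp used total <<EOF
$1
EOF
    # Each field after the first keeps the space that follows the comma.
    name=${name# }
    temp=${temp# }
    used=${used# }
    total=${total# }
    printf 'GPU %s (%s): %s C, %s / %s MiB\n' "$index" "$name" "$temp" "$used" "$total"
}

# Sample line (values taken from the table above).
summarize_gpu_csv "0, NVIDIA GeForce RTX 4090, 42, 2, 24564"
# prints: GPU 0 (NVIDIA GeForce RTX 4090): 42 C, 2 / 24564 MiB
```

On a machine with a GPU, the query command in the comment can be piped line by line into the helper.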
Turn on / off ECC[1]
To turn off ECC RAM:
# nvidia-smi -g 0 --ecc-config=0
(repeat with -g x for each GPU ID)
To turn ECC RAM back on:
# nvidia-smi -g 0 --ecc-config=1
(repeat with -g x for each GPU ID)
The new setting is pending and normally takes effect after the next reboot.
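Repeating the command for each GPU ID can be scripted. A minimal sketch, assuming a POSIX shell; ecc_commands is a hypothetical helper that only prints the commands, so they can be reviewed before being run as root:

```shell
# Hypothetical helper: print one ECC command per GPU ID.
# Usage: ecc_commands <0|1> <gpu-id>...
ecc_commands() {
    mode=$1
    shift
    case "$mode" in
        0|1) ;;
        *) echo "mode must be 0 (off) or 1 (on)" >&2; return 1 ;;
    esac
    for id in "$@"; do
        echo "nvidia-smi -g $id --ecc-config=$mode"
    done
}

# Example: commands to turn ECC off on GPUs 0 and 1.
ecc_commands 0 0 1
# prints:
#   nvidia-smi -g 0 --ecc-config=0
#   nvidia-smi -g 1 --ecc-config=0
```

The GPU IDs present on a machine can be listed with nvidia-smi --query-gpu=index --format=csv,noheader.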
To reset the ECC error counters[2]:
# nvidia-smi -g 0 --reset-ecc-errors=TYPE
where TYPE is 0 (VOLATILE) or 1 (AGGREGATE).
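TYPE can be given by number or by name. As far as the nvidia-smi documentation describes it, volatile counters cover errors since the last driver load, while aggregate counters persist as a lifetime count. A minimal sketch, assuming a POSIX shell; reset_ecc_command is a hypothetical helper that validates the type and prints the command without running it:

```shell
# Hypothetical helper: accept the counter type by name or number and
# print the matching reset command for one GPU.
# Usage: reset_ecc_command <gpu-id> <0|1|VOLATILE|AGGREGATE>
reset_ecc_command() {
    case "$2" in
        0|VOLATILE|volatile)   kind=0 ;;
        1|AGGREGATE|aggregate) kind=1 ;;
        *) echo "unknown ECC counter type: $2" >&2; return 1 ;;
    esac
    echo "nvidia-smi -g $1 --reset-ecc-errors=$kind"
}

reset_ecc_command 0 VOLATILE
# prints: nvidia-smi -g 0 --reset-ecc-errors=0
```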
Reset GPU
# nvidia-smi -g 0 --gpu-reset
Compute mode
# nvidia-smi -g 0 -c <mode number>
Number | Mode
---|---
0 | Default
1 | Exclusive_Thread
2 | Prohibited
3 | Exclusive_Process
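The mode table can be turned into a small lookup so that scripts use names instead of bare numbers. A minimal sketch, assuming a POSIX shell; compute_mode_command is a hypothetical helper that prints the command without running it. Recent nvidia-smi help text appears to mark Exclusive_Thread as deprecated, so it may be best avoided on current drivers.

```shell
# Hypothetical helper: map a compute mode name from the table to its
# number and print the command for one GPU.
# Usage: compute_mode_command <gpu-id> <mode-name>
compute_mode_command() {
    case "$2" in
        Default)           number=0 ;;
        Exclusive_Thread)  number=1 ;;
        Prohibited)        number=2 ;;
        Exclusive_Process) number=3 ;;
        *) echo "unknown compute mode: $2" >&2; return 1 ;;
    esac
    echo "nvidia-smi -g $1 -c $number"
}

compute_mode_command 0 Exclusive_Process
# prints: nvidia-smi -g 0 -c 3
```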
References
[2] https://developer.download.nvidia.com/compute/DCGM/docs/nvidia-smi-367.38.pdf