Nvidia-smi tips and tricks
Revision as of 11:06, 9 August 2023
Output example
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.05              Driver Version: 535.86.05    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090        Off | 00000000:D8:00.0 Off |                  Off |
| 30%   42C    P8              38W / 450W |      2MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
| Property name | Annotation | Meaning |
|---|---|---|
| Performance State | Perf | States range from P0 (maximum performance) to P12 (minimum performance). |
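The Perf column can also be read in a script-friendly form instead of being picked out of the table output. The following is a minimal sketch, assuming the driver supports `nvidia-smi --query-gpu=index,pstate --format=csv,noheader` and prints lines such as `0, P8`. The function name `report_pstates` is hypothetical, and the sample line is taken from the output example above rather than from a live GPU.

```shell
# Read "index, pstate" CSV lines on stdin and print one summary line per GPU.
# report_pstates is a hypothetical helper name, not part of nvidia-smi.
# Expected input (assumption), e.g. from:
#   nvidia-smi --query-gpu=index,pstate --format=csv,noheader
report_pstates() {
    while IFS=', ' read -r index pstate; do
        level="${pstate#P}"   # strip the leading "P": P8 -> 8
        echo "GPU $index: $pstate (level $level, 0 = maximum performance)"
    done
}

# Sample input matching the output example above (GPU 0 in state P8).
printf '0, P8\n' | report_pstates
```

On a machine with the driver installed, the `printf` line would be replaced by the `nvidia-smi --query-gpu` command shown in the comment.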
Turn on / off ECC[1]
To turn off ECC RAM:
# nvidia-smi -g 0 --ecc-config=0 (repeat with -g x for each GPU ID)
To turn ECC RAM back on:
# nvidia-smi -g 0 --ecc-config=1 (repeat with -g x for each GPU ID)
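The "repeat with -g x for each GPU ID" step can be scripted. The sketch below is a dry run: it only prints the command for each GPU ID it is given, so the list can be reviewed before anything is run as root. The function name `ecc_commands` is hypothetical and is not part of nvidia-smi.

```shell
# Print (do not run) the ECC command for each GPU ID given.
# Usage: ecc_commands <0|1> <gpu-id>...
# ecc_commands is a hypothetical helper name, not part of nvidia-smi.
ecc_commands() {
    state="$1"
    shift
    case "$state" in
        0|1) ;;
        *) echo "state must be 0 (off) or 1 (on)" >&2; return 1 ;;
    esac
    for id in "$@"; do
        echo "nvidia-smi -g $id --ecc-config=$state"
    done
}

# Example: show the commands that would turn ECC off on GPUs 0 and 1.
ecc_commands 0 0 1
```

Once the printed commands look right, they can be run one by one as root. Note that a change of ECC mode normally takes effect only after the next reboot.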
To reset the ECC error counts[2]:
# nvidia-smi -g 0 --reset-ecc-errors=TYPE (0|VOLATILE or 1|AGGREGATE)
Reset GPU
# nvidia-smi -g 0 --gpu-reset
GPU mode
The mode of the GPU is established directly at power-on, from settings stored in the GPU’s non-volatile memory.
gpumodeswitch changes the mode of the GPU by updating the GPU’s non-volatile memory settings.
Compute mode is a configuration optimized for high-performance computing (HPC) applications. It can cause compatibility problems with the OS and hypervisors when the GPU is used primarily as a graphics device.
Compute mode
# nvidia-smi -g 0 -c <mode number>
| Number | Mode | Meaning |
|---|---|---|
| 0 | Default | The GPU can be shared by several jobs at the same time. |
| 1 | Exclusive_Thread | The GPU runs only one job at a time, and only one thread can use the GPU at a time. |
| 2 | Prohibited | The GPU is not allowed to run any job. |
| 3 | Exclusive_Process | The GPU runs only one job at a time, and only one process can use the GPU at a time. |
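Because the `-c` option takes a bare number, it is easy to set the wrong mode by accident. The following sketch maps the numbers in the table above to their names and prints the matching command without running it. Both function names, `compute_mode_name` and `compute_mode_command`, are hypothetical helpers for illustration.

```shell
# Map a compute mode number from the table above to its name.
# compute_mode_name is a hypothetical helper, not part of nvidia-smi.
compute_mode_name() {
    case "$1" in
        0) echo "Default" ;;
        1) echo "Exclusive_Thread" ;;
        2) echo "Prohibited" ;;
        3) echo "Exclusive_Process" ;;
        *) echo "unknown compute mode: $1" >&2; return 1 ;;
    esac
}

# Build (but do not run) the command that sets a mode on one GPU.
# Usage: compute_mode_command <gpu-id> <mode-number>
compute_mode_command() {
    name=$(compute_mode_name "$2") || return 1
    echo "nvidia-smi -g $1 -c $2   # $name"
}

# Example: the command for Exclusive_Process on GPU 0.
compute_mode_command 0 3
```

Printing the mode name next to the command makes it possible to check the choice before running the command as root.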