FAQ
Revision as of 15:23, 24 April 2023
Failed to load plugin io.containerd and could not use snapshotter
Reason: this is a warning or informational message from a snapshotter (the containerd component that stores image layers; see https://dev.to/napicella/what-is-a-containerd-snapshotters-3eo2). containerd ships several snapshotters and logs this message for each one it cannot load on the host.
Impact: the warning does not affect the operation of the system.
Solution:
1. Disable the snapshotter plugins you don't need in the containerd config file, then restart containerd. For example:
# /etc/containerd/config.toml
disabled_plugins = ["cri", "btrfs"]
Keep "cri" in this list only if nothing on the host (for example Kubernetes) relies on containerd's CRI plugin.
2. To use ZFS, mount a ZFS dataset on /var/lib/containerd/io.containerd.snapshotter.v1.zfs
3. To use btrfs, mount a btrfs filesystem on /var/lib/containerd/io.containerd.snapshotter.v1.btrfs
4. To use aufs, load its kernel module with modprobe, as the error log explains
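To see which snapshotters containerd actually skipped, the plugin table printed by `ctr plugins ls` can be filtered. The helper below is a minimal sketch, not part of containerd: the function name `skipped_snapshotters` is made up for illustration, and it assumes the usual four-column layout of that table (TYPE, ID, PLATFORMS, STATUS).

```shell
# Print the IDs of snapshotter plugins whose status is not "ok".
# Reads `ctr plugins ls` output on stdin.
# Assumption: the last column is the status (ok / skip / error).
skipped_snapshotters() {
    awk '$1 == "io.containerd.snapshotter.v1" && $NF != "ok" { print $2 }'
}

# Typical use on a host running containerd (usually needs root):
#   sudo ctr plugins ls | skipped_snapshotters
```

Any ID it prints is a candidate for disabled_plugins if that snapshotter is not needed on the host.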
Could not select device driver "" with capabilities: GPU
- Reason: nvidia-container-toolkit is not installed, or the installed package is corrupt
- Solution: install or reinstall nvidia-container-toolkit, then restart the Docker daemon
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
&& curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
$ sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
$ sudo systemctl restart docker
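Before reinstalling, it can help to confirm whether the package is really missing. The following is a rough sketch for Debian/Ubuntu hosts (it assumes `dpkg-query` is available); the function name `toolkit_status` is hypothetical, and the container check in the trailing comment assumes a working NVIDIA driver on the host.

```shell
# Report whether the nvidia-container-toolkit package is fully installed.
# Prints "installed" or "missing"; a half-removed package counts as missing.
toolkit_status() {
    if dpkg-query -W -f='${Status}' nvidia-container-toolkit 2>/dev/null \
        | grep -q 'install ok installed'; then
        echo "installed"
    else
        echo "missing"
    fi
}

echo "nvidia-container-toolkit: $(toolkit_status)"

# If the package is installed but suspected corrupt, reinstall it:
#   sudo apt-get install --reinstall -y nvidia-container-toolkit
# After restarting Docker, a throwaway container can confirm GPU access:
#   sudo docker run --rm --gpus all ubuntu nvidia-smi
```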
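PyTorch FAQ
- How to get the CUDA compute capability of a GPU?
- $ python -c "import torch; print(torch.cuda.get_device_capability())"
- To list the CUDA architectures the installed PyTorch build was compiled for instead: $ python -c "import torch; print(torch.cuda.get_arch_list())"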
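`torch.cuda.get_arch_list()` reports architectures as names such as `sm_86`, while a GPU's compute capability is usually written as `8.6`. To compare the two, the capability can be converted to the same naming. This is a small sketch; `cap_to_arch` is a hypothetical helper name, and the `nvidia-smi` query in the comment assumes a driver recent enough to support the `compute_cap` field.

```shell
# Convert a compute capability such as "8.6" to the "sm_86" form
# used in PyTorch's architecture list.
cap_to_arch() {
    printf 'sm_%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

cap_to_arch 8.6   # prints sm_86

# Possible use with a real GPU (assumes nvidia-smi supports compute_cap):
#   cap_to_arch "$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)"
```

If the resulting name does not appear in the arch list, the installed PyTorch build was probably not compiled for that GPU.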
Show List Of Network Cards on Linux
- lspci command : List all PCI devices.
# lspci | grep -Ei --color 'network|ethernet'
# lspci | grep -Ei --color 'network|ethernet|wireless|wi-fi'
- lshw command : Identify Ethernet interfaces and NIC hardware.
# lshw -class network
$ sudo lshw -class network -short
- dmidecode command : List all hardware data from BIOS.
- ifconfig command : Outdated network configuration tool.
$ ifconfig -a
- ip command : Recommended replacement for ifconfig.
$ ip link show
$ ip a
$ ip a show wlp82s0
$ ip -br -c link show # list all interfaces with link status, MAC address, etc.
$ ip -br -c addr show # same list with IP addresses instead of MAC addresses
- hwinfo command : Probe for network cards.
$ sudo hwinfo --network --short
- ethtool command : Show the NIC driver and settings.
$ sudo ethtool -i eno1
$ sudo ethtool -i enp0s31f6
- /proc/net/dev file : This pseudo-file contains network device status information: the number of received and sent packets, the number of errors and collisions, and other basic statistics.
$ cat /proc/net/dev
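The raw /proc/net/dev table is wide and hard to read. As an illustration, the sketch below pulls out the packet, error, and collision counters per interface. The function name `netdev_summary` is made up for this example, and the column positions assume the standard layout of the file (eight receive columns followed by eight transmit columns).

```shell
# Summarise per-interface counters from /proc/net/dev (or a copy of it).
# Usage: netdev_summary [file]   (defaults to /proc/net/dev)
netdev_summary() {
    awk 'NR > 2 {
        sub(/:/, " ")   # split "eth0:123" into "eth0 123"
        printf "%s rx_packets=%s rx_errs=%s tx_packets=%s tx_errs=%s colls=%s\n",
               $1, $3, $4, $11, $12, $15
    }' "${1:-/proc/net/dev}"
}

# Print the summary for this host when the pseudo-file is available.
if [ -r /proc/net/dev ]; then
    netdev_summary
fi
```

The first two lines of the file are headers, which is why the awk program starts at the third line.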
Failed to set iommu for container: Invalid argument
A VM configured with a vGPU that supports SR-IOV may fail to start with this error. The issue occurs when PCIe AER (Advanced Error Reporting) support is disabled in the server's BIOS settings; enabling AER in the BIOS should resolve it.
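One rough way to check the AER state from a Linux hypervisor host, without rebooting into the BIOS setup, is to look at the kernel log for the ACPI _OSC line that lists the PCIe features the firmware handed to the OS. This is only a sketch under assumptions: the helper name `aer_granted` is hypothetical, the message wording ("_OSC: OS now controls [...]") is what recent kernels typically print, and AER can also be missing from that list for other reasons, such as the `pci=noaer` kernel parameter.

```shell
# Succeeds (exit 0) if the kernel log on stdin shows the firmware granting
# AER control to the OS via ACPI _OSC; fails otherwise.
aer_granted() {
    awk '/_OSC: OS now controls/ && /[^A-Za-z]AER[^A-Za-z]/ { found = 1 }
         END { exit !found }'
}

# Typical use on the host (reading the kernel log may need root):
#   if sudo dmesg | aer_granted; then echo "AER granted"; else echo "AER not granted"; fi
```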