Docker with nvidia GPU-2

I have to use Docker for my job, but compared to Ubuntu I cannot find a good instruction page for running Docker with a GPU on Manjaro or Arch.
I do not like Ubuntu at all.

Here is my report on installing nvidia-container-toolkit to use an NVIDIA GPU in Docker.
I hope Arch creates a new pkg.

I found older posts and an Arch page, but I cannot find nvidia-container-toolkit in the Arch repo or the AUR now.
Something has changed in the Arch repo.

I followed the instructions on the ref-1 page and solved the nvidia-container-cli error with ref-2.

Reference.
ref-1
ref-2

preparation


We cannot use precompiled pkgs and have to build manually at the moment.
I copied some content from ref-1 and modified it for my condition.

1. Install libnvidia-container-tools first.

If you install nvidia-container-toolkit first, you may get this message:

“Missing dependencies: → libnvidia-container-tools>=1.9.0 AUR”.
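
The message above is a version constraint (libnvidia-container-tools >= 1.9.0). Here is a minimal sketch for comparing a version against it. version_ge is a hypothetical helper name, not part of pacman or yay, and it assumes GNU sort with the -V (version sort) option.

```shell
#!/usr/bin/env bash
# version_ge A B: succeeds when version A >= version B.
# Hypothetical helper; relies on GNU sort -V (version sort).
version_ge() {
    # The smaller of the two versions sorts first; A >= B when that is B.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

For example, version_ge 1.11.0 1.9.0 succeeds and version_ge 1.8.1 1.9.0 fails. pacman also ships its own vercmp tool, which is probably the better choice on a real Arch or Manjaro system.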

wget https://aur.archlinux.org/cgit/aur.git/snapshot/libnvidia-container.tar.gz
tar xvf libnvidia-container.tar.gz && cd libnvidia-container/
makepkg

sudo pacman -U libnvidia-container-1.11.0-1-x86_64.pkg.tar.zst
sudo pacman -U libnvidia-container-tools-1.11.0-1-x86_64.pkg.tar.zst
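
The file names in the pacman -U commands above contain the version that makepkg built on my machine, so they will probably differ on yours. A small sketch that picks up whatever file was built, without hard-coding the version. built_pkg is a hypothetical helper, and it assumes the packages use the .pkg.tar.zst extension as above.

```shell
#!/usr/bin/env bash
# built_pkg DIR NAME: print the newest DIR/NAME-<version>.pkg.tar.zst.
# Hypothetical helper. The [0-9] after the dash keeps "libnvidia-container"
# from also matching "libnvidia-container-tools".
built_pkg() {
    local dir=$1 name=$2
    # ls -t lists newest first; print nothing if no file matches.
    ls -t "$dir/$name"-[0-9]*.pkg.tar.zst 2>/dev/null | head -n 1
}
```

Inside the build directory this could be used as: sudo pacman -U "$(built_pkg . libnvidia-container)" "$(built_pkg . libnvidia-container-tools)"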

2. Install nvidia-container-toolkit.

wget https://aur.archlinux.org/cgit/aur.git/snapshot/nvidia-container-toolkit.tar.gz
tar xvf nvidia-container-toolkit.tar.gz && cd nvidia-container-toolkit/
makepkg

yay -U nvidia-container-toolkit-1.14.6-1-x86_64.pkg.tar.zst

3. Check the following file.

Make sure no-cgroups is set to false.

vi /etc/nvidia-container-runtime/config.toml
##
no-cgroups = false
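
To check the setting without opening an editor, something like the following may help. It is only a sketch: show_key is a hypothetical helper, and I am assuming the key sits on its own line and may be commented out with a leading #.

```shell
#!/usr/bin/env bash
# show_key FILE KEY: print every line that sets KEY, commented or not,
# prefixed with its line number. Returns non-zero if KEY is not found.
show_key() {
    grep -nE "^[[:space:]]*#?[[:space:]]*$2[[:space:]]*=" "$1"
}
```

Example: show_key /etc/nvidia-container-runtime/config.toml no-cgroups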

4. Restart Docker.

sudo systemctl restart docker

5. Run Docker, but it fails with an error.

docker run --rm --gpus all nvidia/cuda:12.3.2-base-ubuntu22.04 nvidia-smi
##
nvidia-container-cli: initialization error: nvml error: insufficient permissions: unknown.
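
My guess is that "insufficient permissions" means the user and group that nvidia-container-cli runs as cannot open the NVIDIA device nodes. A sketch for looking at who owns them, which helps when choosing the group in the next step. node_owner is a hypothetical helper and assumes GNU coreutils stat.

```shell
#!/usr/bin/env bash
# node_owner PATH...: print "owner:group mode path" for each path.
# Assumes GNU coreutils stat (-c format option).
node_owner() {
    stat -c '%U:%G %a %n' "$@"
}
```

Example: node_owner /dev/nvidia*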

6. Modify config.toml.

sudo vi /etc/nvidia-container-runtime/config.toml

## add or modify as follows
user = "root:vglusers"
## or
user = "root:root"
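
The same edit can be scripted instead of done in vi. This is a sketch, not a tested installer step: set_cli_user is a hypothetical helper, it assumes GNU sed, and it assumes the file already contains a user line (possibly commented out, such as #user = "root:video"). If no such line exists it changes nothing and returns an error, because appending the key at the end could put it in the wrong section of the file.

```shell
#!/usr/bin/env bash
# set_cli_user FILE VALUE: rewrite the existing (possibly commented) user
# line in FILE to user = "VALUE". Returns 1 if FILE has no user line.
# Hypothetical helper; assumes GNU sed (-i, -E).
set_cli_user() {
    local file=$1 value=$2
    local pattern='^[[:space:]]*#?[[:space:]]*user[[:space:]]*=.*'
    grep -qE "$pattern" "$file" || return 1
    sed -i -E "s|$pattern|user = \"$value\"|" "$file"
}
```

Run it from a root shell, or on a copy of the file that is then copied back with sudo cp, for example: set_cli_user ./config.toml root:vglusers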

7. Restart Docker.

sudo systemctl restart docker

8. Run Docker with the GPU.

It works! :grin:

docker run --rm --gpus all nvidia/cuda:12.3.2-base-ubuntu22.04 nvidia-smi

Thu May 16 06:28:13 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.14              Driver Version: 550.54.14      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1080        Off |   00000000:01:00.0 Off |                  N/A |
|  0%   40C    P8             10W /  240W |      13MiB /   8192MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|

My condition is as follows.

I use a slightly old driver and an old GPU.

yay -Qs cuda

local/cuda 11.8.0-1
    NVIDIA's GPU programming toolkit

local/cudnn 8.6.0.163-1
    NVIDIA CUDA Deep Neural Network library
 
yay -Qs nvidia-container-toolkit

local/nvidia-container-toolkit 1.14.6-1
    NVIDIA container runtime toolkit

inxi -F

System:
  Host: ***  Kernel: 5.15.150-1-MANJARO arch: x86_64 bits: 64
  Desktop: IceWM v: N/A Distro: Manjaro Linux

CPU:
  Info: quad core model: Intel Core i7-4790K bits: 64 type: MT MCP cache:
    L2: 1024 KiB

Graphics:
  Device-1: NVIDIA GP104 [GeForce GTX 1080] driver: nvidia v: 550.54.14
  Display: server: X.org v: 1.21.1.11 driver: X: loaded: nvidia gpu: nvidia
    resolution: 3840x1080
  API: EGL v: 1.5 drivers: kms_swrast,nvidia,swrast,zink
    platforms: gbm,x11,surfaceless,device
  API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: mesa v: 24.0.2-manjaro1.1
    renderer: llvmpipe (LLVM 16.0.6 256 bits)

Reference.


ref-1:

ref-2: