Docker with nvidia GPU-2

I have to use Docker for my job, but compared to Ubuntu I cannot find a good instruction page for running Docker with a GPU on Manjaro or Arch.
I do not like Ubuntu at all.

Here is my report on installing nvidia-container-toolkit to use an NVIDIA GPU in Docker.
I hope Arch creates a new package.

I found a post and the Arch page, but I can no longer find nvidia-container-toolkit in the Arch repo or the AUR.
Something has changed in the Arch repo.

I followed the instructions on the ref-1 page and solved the nvidia-container-cli errors with ref-2.



We cannot use precompiled packages at the moment and have to build them manually.
I copied some content from ref-1 and modified it for my setup.

1. Install libnvidia-container-tools first.

If you install nvidia-container-toolkit first, you may get this message:

“Missing dependencies: → libnvidia-container-tools>=1.9.0 AUR”.

tar xvf libnvidia-container.tar.gz && cd libnvidia-container/

sudo pacman -U libnvidia-container-1.11.0-1-x86_64.pkg.tar.zst
sudo pacman -U libnvidia-container-tools-1.11.0-1-x86_64.pkg.tar.zst

2. Install nvidia-container-toolkit.

tar xvf nvidia-container-toolkit.tar.gz && cd nvidia-container-toolkit/

yay -U nvidia-container-toolkit-1.14.6-1-x86_64.pkg.tar.zst

3. Check the following file.

Make sure no-cgroups is set to false.

vi /etc/nvidia-container-runtime/config.toml
no-cgroups = false
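
If you would rather check the value without opening an editor, here is a minimal sketch. The function name show_no_cgroups is my own, and it assumes the setting is written on one line as `no-cgroups = <value>`; commented-out lines are ignored.

```shell
#!/bin/sh
# Print the active no-cgroups value from a config.toml-style file.
# show_no_cgroups is a hypothetical helper name, not part of the toolkit.
show_no_cgroups() {
    conf="${1:-/etc/nvidia-container-runtime/config.toml}"
    # Match e.g. 'no-cgroups = false'; lines starting with '#' do not match.
    sed -n 's/^[[:space:]]*no-cgroups[[:space:]]*=[[:space:]]*\([a-z]*\).*$/\1/p' "$conf"
}
```

Calling `show_no_cgroups` with no argument reads the file from this step. If it prints nothing, the line is missing or commented out.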

4. Restart Docker.

sudo systemctl restart docker

5. Run Docker; this fails with an error.

docker run --rm --gpus all nvidia/cuda:12.3.2-base-ubuntu22.04 nvidia-smi
nvidia-container-cli: initialization error: nvml error: insufficient permissions: unknown.
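
Since the fix in step 6 changes the user and group, one plausible cause (my assumption, not confirmed by ref-2) is that the NVIDIA device nodes are restricted to a particular group. A small sketch to look at their owner, group and mode; list_nvidia_nodes is a name I made up, and it relies on GNU stat.

```shell
#!/bin/sh
# List owner:group, octal mode and path of NVIDIA device nodes.
# list_nvidia_nodes is a hypothetical helper; the directory defaults to /dev.
list_nvidia_nodes() {
    dir="${1:-/dev}"
    for node in "$dir"/nvidia*; do
        # Skip the unexpanded glob when nothing matches.
        [ -e "$node" ] || continue
        stat -c '%U:%G %a %n' "$node"
    done
}
```

Running `list_nvidia_nodes` on the host shows which group owns the nodes, which may help when choosing the group for the user line in step 6.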

6. Modify config.toml.

sudo vi /etc/nvidia-container-runtime/config.toml

## add or modify as follows
user = "root:vglusers"
## or
user = "root:root"
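
The same edit can be scripted instead of done in vi. This is only a sketch: set_cli_user is a hypothetical helper, it assumes a `user = ...` line (commented or not) already exists in the file, and it uses GNU sed. Try it on a copy of config.toml first.

```shell
#!/bin/sh
# Rewrite the 'user = ...' line (commented or not) in a config.toml-style file.
# Usage: set_cli_user FILE USER:GROUP   (hypothetical helper name)
set_cli_user() {
    conf="$1"
    value="$2"
    # Fail if there is no user line; appending one could land in the wrong TOML section.
    grep -Eq '^[[:space:]]*#?[[:space:]]*user[[:space:]]*=' "$conf" || return 1
    # Replace the whole line with an uncommented 'user = "USER:GROUP"'.
    sed -i -E "s|^[[:space:]]*#?[[:space:]]*user[[:space:]]*=.*|user = \"$value\"|" "$conf"
}
```

For example, `set_cli_user ./config.toml root:root` on a copy, then compare with the original before replacing the real file as root.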

7. Restart Docker.

sudo systemctl restart docker

8. Run Docker with the GPU.

It works! :grin:

docker run --rm --gpus all nvidia/cuda:12.3.2-base-ubuntu22.04 nvidia-smi

Thu May 16 06:28:13 2024
| NVIDIA-SMI 550.54.14              Driver Version: 550.54.14      CUDA Version: 12.4     |
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA GeForce GTX 1080        Off |   00000000:01:00.0 Off |                  N/A |
|  0%   40C    P8             10W /  240W |      13MiB /   8192MiB |      0%      Default |
|                                         |                        |                  N/A |

| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
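
As an extra sanity check, you can compare the driver version seen inside the container with the one on the host. A minimal sketch, assuming the banner line keeps the "Driver Version: x.y.z" form shown above; driver_version is my own helper name.

```shell
#!/bin/sh
# Extract the driver version from nvidia-smi banner text read on stdin.
# driver_version is a hypothetical helper name.
driver_version() {
    sed -n 's/.*Driver Version:[[:space:]]*\([0-9.]*\).*/\1/p'
}
```

For example, `nvidia-smi | driver_version` on the host and `docker run --rm --gpus all nvidia/cuda:12.3.2-base-ubuntu22.04 nvidia-smi | driver_version` should print the same version.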

My setup is as follows.

I use a slightly old driver and an old GPU.

yay -Qs cuda

local/cuda 11.8.0-1
    NVIDIA's GPU programming toolkit

    NVIDIA CUDA Deep Neural Network library

yay -Qs nvidia-container-toolkit

local/nvidia-container-toolkit 1.14.6-1
    NVIDIA container runtime toolkit

inxi -F

  Host: ***  Kernel: 5.15.150-1-MANJARO arch: x86_64 bits: 64
  Desktop: IceWM v: N/A Distro: Manjaro Linux

  Info: quad core model: Intel Core i7-4790K bits: 64 type: MT MCP cache:
    L2: 1024 KiB

  Device-1: NVIDIA GP104 [GeForce GTX 1080] driver: nvidia v: 550.54.14
  Display: server: v: driver: X: loaded: nvidia gpu: nvidia
    resolution: 3840x1080
  API: EGL v: 1.5 drivers: kms_swrast,nvidia,swrast,zink
    platforms: gbm,x11,surfaceless,device
  API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: mesa v: 24.0.2-manjaro1.1
    renderer: llvmpipe (LLVM 16.0.6 256 bits)