Cannot use cuda with cudnn and leela chess zero (lc0)

After this update I still cannot use CUDA with cuDNN and Leela Chess Zero (lc0).
This is what I get:

[vm@manjaro Lc0] ./lc0 
lc0 v0.26.2 built Sep  9 2020
go nodes 100
Found pb network file: ./J92-100
Creating backend [cudnn-auto]...
Switching to [cudnn]...
CUDA Runtime version: 11.0.0
Cudnn version: 8.0.2
Latest version of CUDA supported by the driver: 11.0.0
GPU: GeForce GTX 960M
GPU memory: 3.95123 Gb
GPU clock frequency: 1176 MHz
GPU compute capability: 5.0
error CUDA error: no kernel image is available for execution on the device (../../src/neural/cuda/winograd_helper.inc:363)

I’ve also tried sudo nvidia-modprobe, and successfully recompiled lc0 from source after the update, with no luck. Never had trouble with earlier versions of cuda. Any help or ideas appreciated…

Otherwise a trouble-free update. Everything else I tried (LO, gimp, krita, blender …) works fine.
Thank you manjaro team!

Vas

You should create a separate topic for that here. The output of inxi --system --verbosity=7 --filter --no-host --admin, nvidia-smi and mhwd --listinstalled would be welcome there. (There! Not here, where it’ll get lost in all the other blabber, noise, topics!)

:innocent: :grin:

Thank you, Fabby.
Meanwhile I found the solution to my problem, so I am posting it here :slight_smile:

Since I compiled lc0 from source, the problem was inside the meson.build file, which could not detect my GPU architecture automatically! However, no errors were shown during the compile. So inside the meson.build file, I had to change the line:

files += cuda_gen.process(cuda_files_nvcc_common)

to

files += cuda_gen.process(cuda_files_nvcc_common, extra_args: ['-arch=compute_50', '-code=sm_50'])

to tell meson explicitly to compile for the “Maxwell” architecture (this is done with the '-arch=compute_50' and '-code=sm_50' arguments).
My laptop has the NVIDIA GTX 960M chip, which is Maxwell (compute capability 5.0, as reported in the lc0 output above).
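
For anyone with a different card: the "50" in those two flags matches the "GPU compute capability: 5.0" line that lc0 prints, with the dot dropped. Below is a minimal Python sketch of that mapping. The function name nvcc_arch_flags is made up for illustration (it is not part of lc0 or meson), and the sketch does not check whether your CUDA toolkit still supports the capability you pass in.

```python
def nvcc_arch_flags(compute_capability: str) -> list:
    """Turn a compute capability string such as "5.0" into nvcc -arch/-code flags.

    Hypothetical helper for illustration only; it just formats strings and
    does not verify that the installed CUDA toolkit supports the target.
    """
    major, minor = compute_capability.strip().split(".")
    # "5.0" -> "50", "6.1" -> "61"
    cc = f"{int(major)}{int(minor)}"
    return [f"-arch=compute_{cc}", f"-code=sm_{cc}"]


# The GTX 960M in this thread reports compute capability 5.0.
print(nvcc_arch_flags("5.0"))
```

The two strings it produces are the ones to put into extra_args in meson.build.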

So… Problem Solved! A moderator could move this to a dedicated topic with an appropriate title, so other people could benefit from this solution.

Vas

It’s actually better if you copy-paste your original question in a more appropriate section and then copy-paste this answer below it, flesh out both of them and then push the Solution button over there so that the next person that has the exact same problem you just had will benefit from your research.

Because to me it’s unclear which problem you’re solving (450 series driver? 440? Is it the source of lc0 you’ve changed? Which approximate line number was that, … )

:scream:

P.S. Also: I’m quite dumb and know very little about a lot, so I wouldn’t be able to do that for you… :sob:

Fair enough.
So I’ll provide more details, and later (when I have time), I’ll copy-paste the problem-solution in a separate thread.

  1. GPU: NVIDIA GTX 960M (driver 450.xx)
    The solution is specific to this GPU series only (Maxwell architecture)

  2. I changed line 449 of the meson.build file, which is inside the lc0 source directory (after unpacking lc0-0.26.2.tar.gz, which contains the source code)

from:

files += cuda_gen.process(cuda_files_nvcc_common)

to:

files += cuda_gen.process(cuda_files_nvcc_common, extra_args: ['-arch=compute_50', '-code=sm_50'])

  3. I ran ./build.sh, which is also inside the lc0 directory, to do the compile

The rest is easy…
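
If you prefer not to edit the file by hand, the change in step 2 can be sketched as a small Python script. This is only a sketch under assumptions: it looks for the exact line text quoted above rather than relying on line 449 (the line number may differ in other lc0 versions), the function names are made up for illustration, and the flags are hard-coded for Maxwell (compute capability 5.0).

```python
from pathlib import Path

# Exact line text from lc0 0.26.2's meson.build, as quoted in this thread.
OLD = "files += cuda_gen.process(cuda_files_nvcc_common)"
# Same call with the Maxwell flags added; adjust 50 for other GPUs.
NEW = ("files += cuda_gen.process(cuda_files_nvcc_common, "
       "extra_args: ['-arch=compute_50', '-code=sm_50'])")


def patch_meson_text(text: str) -> str:
    """Return the meson.build text with the nvcc arch flags added (hypothetical helper)."""
    if NEW in text:
        return text  # already patched, nothing to do
    if OLD not in text:
        raise ValueError("expected cuda_gen.process line not found; patch by hand")
    # Replace only the first occurrence and keep the line's indentation.
    return text.replace(OLD, NEW, 1)


def patch_meson_file(path: str = "meson.build") -> None:
    """Patch the file in place; run this from inside the lc0 source directory."""
    p = Path(path)
    p.write_text(patch_meson_text(p.read_text()))
```

Calling patch_meson_file() from inside the lc0 source directory and then running ./build.sh corresponds to steps 2 and 3 above. Keep a copy of the original meson.build in case something goes wrong.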
