I got a Lenovo ThinkStation A4000 GPU for graphics, rendering and machine learning. It is confirmed to be undamaged and runs Blender and benchmarks under Windows. However, the vast majority of AI and machine learning packages are not optimized for Windows, so I have tried Fedora LTS and Manjaro, and I always run into problems setting up this GPU by following the instructions.

Right out of the box the system can display the desktop, but the onboard VGA drivers lack all CUDA and OptiX capabilities, which I verified by starting Blender 3+ and looking at the render acceleration settings. So I tried installing every combination of NVIDIA drivers from Add or Remove Software, but after a reboot the system always gets stuck at the boot-time wall of text and can only be operated in CLI mode from tty2.

I tried sudo mhwd -a pci nonfree 0300, but it says it is skipping the installation because an appropriate driver is already installed. I also tried installing from NVIDIA-Linux-x86_64-460.80.run, but it fails with the error “Your kernel headers for kernel 5.19.1-3-MANJARO cannot be found at /usr/lib/modules/5.19.1-3-MANJARO/build or /usr/lib/modules/5.19.1-3-MANJARO/source”, even though I have altered nothing in the system.
uname -r reports:
5.15.60-1-MANJARO
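The installer error names kernel 5.19.1-3-MANJARO while uname -r reports 5.15.60-1-MANJARO, so it may be worth checking whether headers exist for the kernel that is actually running. The following is only a minimal sketch: the function names are hypothetical helpers, and the package name pattern linux<major><minor>-headers is an assumption about how Manjaro names its header packages, not something confirmed on this machine.

```shell
#!/bin/sh
# Sketch: report whether kernel headers exist for the running kernel.

# Print the two header directories named in the NVIDIA installer error.
header_dirs_for() {
    printf '%s\n' "/usr/lib/modules/$1/build" "/usr/lib/modules/$1/source"
}

# Assumption: Manjaro header packages are named linux<major><minor>-headers.
headers_pkg_for() {
    # Keep "major.minor" from e.g. 5.15.60-1-MANJARO, then drop the dot.
    mm=$(printf '%s\n' "$1" | cut -d. -f1,2 | tr -d '.')
    printf 'linux%s-headers\n' "$mm"
}

release=$(uname -r)
for dir in $(header_dirs_for "$release"); do
    if [ -e "$dir" ]; then
        echo "found:   $dir"
    else
        echo "missing: $dir"
    fi
done
echo "candidate headers package: $(headers_pkg_for "$release")"
```

If both directories are reported missing, the candidate package name printed on the last line is the one to look up in the package manager before rerunning the installer.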
mhwd -l && mhwd -li reports:
[details="Spoiler"]
> 0000:08:00.0 (0300:10de:24b0) Display controller nVidia Corporation:
--------------------------------------------------------------------------------
NAME VERSION FREEDRIVER TYPE
--------------------------------------------------------------------------------
video-linux 2018.05.04 true PCI
video-modesetting 2020.01.13 true PCI
video-vesa 2017.03.12 true PCI
> Installed PCI configs:
--------------------------------------------------------------------------------
NAME VERSION FREEDRIVER TYPE
--------------------------------------------------------------------------------
video-linux 2018.05.04 true PCI
Warning: No installed USB configs!
[/details]
lspci -vga reports:
08:00.0 VGA compatible controller: NVIDIA Corporation GA104GL [RTX A4000] (rev a1)
So basically the system can see the GPU and displays an image at first, but after installing the driver it no longer sees any available displays (xinit refuses to run, saying EE: No displays found), and nvidia-smi also refuses to run, saying that the problem is in the software.
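To narrow down whether any driver is attached to the card at all, one option is to ask lspci which kernel driver is bound to the GPU's slot. This is a rough sketch rather than a definitive procedure: the slot 08:00.0 is taken from the lspci line above, and driver_in_use is a hypothetical helper that just parses the text printed by lspci -k.

```shell
#!/bin/sh
# Sketch: show which kernel driver, if any, is bound to the GPU.

# Read `lspci -k` text on stdin and print the bound driver name, or "none".
driver_in_use() {
    awk -F': ' '/Kernel driver in use/ { print $2; found = 1 }
                END { if (!found) print "none" }'
}

slot="08:00.0"   # bus ID taken from the lspci line above

if command -v lspci >/dev/null 2>&1; then
    echo "driver bound to $slot: $(lspci -k -s "$slot" | driver_in_use)"
fi

# List loaded modules that could be competing for the card.
if command -v lsmod >/dev/null 2>&1; then
    lsmod | grep -E '^(nvidia|nouveau)' || echo "neither nvidia nor nouveau is loaded"
fi
```

A result of "none" would mean no kernel module has claimed the card, which would be consistent with xinit finding no displays.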
Here is the inxi -Fza output.
[details="Spoiler"]
System:
Kernel: 5.15.60-1-MANJARO arch: x86_64 bits: 64 compiler: gcc v: 12.1.1
parameters: BOOT_IMAGE=/boot/vmlinuz-5.15-x86_64 root=UUID=c1bd0cd3-ca24-41b1-93ca-d420de7bca18
rw quiet udev.log_priority=3
Console: tty 2 Distro: Manjaro Linux base: Arch Linux
Machine:
Type: Desktop Mobo: ASUSTeK model: PRIME X570-PRO v: Rev X.0x serial: <filter>
UEFI: American Megatrends v: 3603 date: 03/20/2021
CPU:
Info: model: AMD Ryzen 7 PRO 2700 socket: AM4 bits: 64 type: MT MCP arch: Zen+ gen: 2 level: v3
built: 2018-21 process: GF 12nm family: 0x17 (23) model-id: 8 stepping: 2 microcode: 0x800820D
Topology: cpus: 1x cores: 8 tpc: 2 threads: 16 smt: enabled cache: L1: 768 KiB desc: d-8x32
KiB; i-8x64 KiB L2: 4 MiB desc: 8x512 KiB L3: 16 MiB desc: 2x8 MiB
Speed (MHz): avg: 1653 high: 3200 min/max: 1550/3200 boost: enabled base/boost: 3200/4100
scaling: driver: acpi-cpufreq governor: schedutil volts: 1.1 V ext-clock: 100 MHz cores:
1: 3200 2: 1550 3: 1550 4: 1550 5: 1550 6: 1550 7: 1550 8: 1550 9: 1550 10: 1550 11: 1550
12: 1550 13: 1550 14: 1550 15: 1550 16: 1550 bogomips: 102254
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Vulnerabilities:
Type: itlb_multihit status: Not affected
Type: l1tf status: Not affected
Type: mds status: Not affected
Type: meltdown status: Not affected
Type: mmio_stale_data status: Not affected
Type: retbleed mitigation: untrained return thunk; SMT vulnerable
Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via prctl and seccomp
Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer sanitization
Type: spectre_v2 mitigation: Retpolines, IBPB: conditional, STIBP: disabled, RSB filling,
PBRSB-eIBRS: Not affected
Type: srbds status: Not affected
Type: tsx_async_abort status: Not affected
Graphics:
Device-1: NVIDIA GA104GL [RTX A4000] vendor: Lenovo driver: N/A alternate: nouveau
non-free: 515.xx+ status: current (as of 2022-08) arch: Ampere code: GAxxx process: TSMC n7
(7nm) built: 2020-22 pcie: gen: 3 speed: 8 GT/s lanes: 16 link-max: gen: 4 speed: 16 GT/s
bus-ID: 08:00.0 chip-ID: 10de:24b0 class-ID: 0300
Display: server: X.org v: 1.21.1.4 driver: X: loaded: nouveau unloaded: modesetting
alternate: fbdev,nv,vesa gpu: N/A tty: 160x45
Message: GL data unavailable in console for root.
Audio:
Device-1: NVIDIA GA104 High Definition Audio vendor: Lenovo driver: snd_hda_intel v: kernel
pcie: gen: 3 speed: 8 GT/s lanes: 16 link-max: gen: 4 speed: 16 GT/s bus-ID: 08:00.1
chip-ID: 10de:228b class-ID: 0403
Device-2: AMD Family 17h HD Audio vendor: ASUSTeK driver: snd_hda_intel v: kernel pcie:
gen: 3 speed: 8 GT/s lanes: 16 bus-ID: 0a:00.3 chip-ID: 1022:1457 class-ID: 0403
Sound Server-1: ALSA v: k5.15.60-1-MANJARO running: yes
Sound Server-2: JACK v: 1.9.21 running: no
Sound Server-3: PulseAudio v: 16.1 running: no
Sound Server-4: PipeWire v: 0.3.56 running: no
Network:
Device-1: Intel I211 Gigabit Network vendor: ASUSTeK driver: igb v: kernel pcie: gen: 1
speed: 2.5 GT/s lanes: 1 port: f000 bus-ID: 04:00.0 chip-ID: 8086:1539 class-ID: 0200
IF: enp4s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
Device-2: Ralink MT7601U Wireless Adapter type: USB driver: mt7601u bus-ID: 5-4:2
chip-ID: 148f:7601 class-ID: 0000 serial: <filter>
IF: wlp9s0f3u4 state: down mac: <filter>
Drives:
Local Storage: total: 2.29 TiB used: 10.1 GiB (0.4%)
ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Crucial model: CT500P2SSD8 size: 465.76 GiB
block-size: physical: 512 B logical: 512 B speed: 31.6 Gb/s lanes: 4 type: SSD serial: <filter>
rev: P2CR012 temp: 45.9 C scheme: GPT
SMART: yes health: PASSED on: 7d 18h cycles: 94 read-units: 1,236,647 [633 GB]
written-units: 1,960,729 [1.00 TB]
ID-2: /dev/sda maj-min: 8:0 vendor: Western Digital model: WD20EZAZ-00GGJB0
family: Blue (SMR) size: 1.82 TiB block-size: physical: 4096 B logical: 512 B sata: 3.1
speed: 6.0 Gb/s type: HDD rpm: 5400 serial: <filter> rev: 0A80 temp: 31 C scheme: MBR
SMART: yes state: enabled health: PASSED on: 22d 20h cycles: 205
ID-3: /dev/sdb maj-min: 8:16 type: USB vendor: Transcend model: JetFlash 16GB size: 14.96 GiB
block-size: physical: 512 B logical: 512 B type: SSD serial: <filter> rev: 8.01 scheme: MBR
SMART Message: Unknown USB bridge. Flash drive/Unsupported enclosure?
Partition:
ID-1: / raw-size: 89.76 GiB size: 87.8 GiB (97.81%) used: 10.06 GiB (11.5%) fs: ext4
block-size: 4096 B dev: /dev/nvme0n1p6 maj-min: 259:6
ID-2: /boot/efi raw-size: 100 MiB size: 96 MiB (96.00%) used: 38.3 MiB (39.9%) fs: vfat
block-size: 512 B dev: /dev/nvme0n1p1 maj-min: 259:1
Swap:
Alert: No swap data was found.
Sensors:
System Temperatures: cpu: 43.8 C mobo: N/A
Fan Speeds (RPM): N/A
Info:
Processes: 269 Uptime: 0m wakeups: 0 Memory: 31.26 GiB used: 758 MiB (2.4%) Init: systemd
v: 251 default: graphical tool: systemctl Compilers: gcc: 12.1.1 clang: 14.0.6 Packages:
pm: pacman pkgs: 1184 libs: 317 tools: pamac pm: flatpak pkgs: 0 Shell: Bash (sudo) v: 5.1.16
running-in: tty 2 inxi: 3.3.21
[/details]