About a year ago, I made this post, trying to figure out how to isolate my dGPU for pass-through to a VM. I never managed to make it work, so I have been using a workaround: unplugging the monitor cable from the dGPU until I need it.
Originally I thought the problem was that both my iGPU and dGPU were AMD, so the same driver could not be loaded both first (for the iGPU) and last (for the dGPU) at the same time. Therefore, I decided to replace my 6900XT with an Nvidia RTX 4080. Unfortunately, this didn’t solve the issue either.
Later on, I moved my main screen to the right side and got an Alienware as my new main monitor. To my surprise, it behaved better: I didn’t have to unplug the cable, I only had to switch the monitor on after the boot selection screen. Much more convenient, but it didn’t last.
That monitor kept failing, so I had to return it for a refund and replace it with an Asus, which sent me back to square one, where I have to unplug the cable again. That made me mad, so I decided to solve this issue once and for all.
I know that the problem is in the motherboard firmware. I filed an incident ticket (INC) with Gigabyte back then, but they suggested it is a software issue and sent me a video of it working fine on Windows 10. However, I am sure it is firmware-related, because it also happens before any OS is loaded: when I enter the BIOS settings, the screen appears on the monitor connected to my dGPU.
I am not very optimistic about getting a fix from their side, so here I am again.
Since Windows works, there should be a way to make Linux work too. I have been trying multiple configurations since last week, when I got the new monitor, but no luck. I have re-installed Manjaro multiple times just to clean things up, and I keep trying whatever solutions I find online, but none of them work.
My current setup:
GRUB_CMDLINE_LINUX_DEFAULT="resume=UUID=ff3443f9-1bb6-4670-81e5-2c14956e7a45 udev.log_priority=3 video=efifb:off amdgpu.dc=1 amdgpu.modeset=0 nouveau.modeset=1"
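To rule out the case where GRUB was not regenerated and the running kernel never saw these parameters, the live command line can be compared against the line above. This is only a minimal sketch: `parse_cmdline` is a helper name I made up, and it assumes the simple `name=value` tokens shown here (a repeated parameter keeps only its last value).

```python
import shlex


def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Bare flags (no '=') map to None. If a parameter is repeated,
    the last occurrence wins, which is a simplification.
    """
    params = {}
    for token in shlex.split(cmdline):
        # Split on the first '=' only, so "resume=UUID=..." keeps "UUID=..." as value.
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def missing_params(expected, actual):
    """Return the expected parameters that are absent or differ in the running kernel."""
    return {k: v for k, v in expected.items() if actual.get(k) != v}
```

For example, `missing_params(parse_cmdline(grub_line), parse_cmdline(open("/proc/cmdline").read()))` should return an empty dict if everything from `GRUB_CMDLINE_LINUX_DEFAULT` reached the kernel, where `grub_line` is the quoted value copied from `/etc/default/grub`.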
mkinitcpio.conf
MODULES=(amdgpu)
HOOKS=(base udev autodetect kms modconf block keyboard keymap consolefont plymouth resume filesystems fsck)
modprobe.d/vfio.conf
blacklist nouveau
blacklist nvidia
blacklist nvidia-drm
blacklist nvidia-modeset
softdep nvidia pre: vfio-pci
options vfio_pci ids=10de:2704,10de:22bb
This is the result, but only if I boot without the cable attached:
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation AD103 [GeForce RTX 4080] [10de:2704] (rev a1)
Subsystem: Gigabyte Technology Co., Ltd Device [1458:40bc]
Kernel driver in use: vfio-pci
Kernel modules: nouveau, nvidia_drm, nvidia
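Since `vfio.conf` lists two IDs (`10de:2704` and `10de:22bb`), it may be worth checking that both of them end up on vfio-pci, not just the VGA function shown above. The sketch below parses text in the `lspci -nnk` format; `drivers_in_use` and `not_on_vfio` are hypothetical helper names, and the parsing assumes the default output without the `-D` domain prefix.

```python
import re

# First line of a device entry, e.g.
# "01:00.0 VGA compatible controller [0300]: ... [10de:2704] (rev a1)".
# The greedy ".*" makes the match land on the last [vendor:device] pair in the line.
DEVICE_RE = re.compile(
    r"^[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]\s.*\[(?P<id>[0-9a-f]{4}:[0-9a-f]{4})\]",
    re.IGNORECASE,
)
DRIVER_RE = re.compile(r"^\s*Kernel driver in use:\s*(?P<driver>\S+)")


def drivers_in_use(lspci_text):
    """Map each vendor:device ID to its bound driver (None if no driver is bound)."""
    result = {}
    current = None
    for line in lspci_text.splitlines():
        device = DEVICE_RE.match(line)
        if device:
            current = device.group("id").lower()
            result[current] = None
            continue
        driver = DRIVER_RE.match(line)
        if driver and current is not None:
            result[current] = driver.group("driver")
    return result


def not_on_vfio(lspci_text, ids):
    """Return the IDs from `ids` that are missing or not bound to vfio-pci."""
    bound = drivers_in_use(lspci_text)
    return [i for i in ids if bound.get(i.lower()) != "vfio-pci"]
```

Feeding it the output of `lspci -nnk` together with `["10de:2704", "10de:22bb"]` should give an empty list when both functions of the card are isolated.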
If I attach the cable, it gets stuck before the login screen. Here are some interesting lines from my log with the cable attached:
Sep 02 20:32:31.689115 wizzy-am5-manjaro-kde6 kernel: pci 0000:01:00.0: vgaarb: setting as boot VGA device
Sep 02 20:32:31.689177 wizzy-am5-manjaro-kde6 kernel: pci 0000:01:00.0: vgaarb: bridge control possible
Sep 02 20:32:31.689240 wizzy-am5-manjaro-kde6 kernel: pci 0000:01:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
Sep 02 20:32:31.689306 wizzy-am5-manjaro-kde6 kernel: pci 0000:15:00.0: vgaarb: setting as boot VGA device (overriding previous)
Sep 02 20:32:31.689369 wizzy-am5-manjaro-kde6 kernel: pci 0000:15:00.0: vgaarb: bridge control possible
Sep 02 20:32:31.689432 wizzy-am5-manjaro-kde6 kernel: pci 0000:15:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
Sep 02 20:32:31.689438 wizzy-am5-manjaro-kde6 kernel: vgaarb: loaded
...
Sep 02 20:32:31.725515 wizzy-am5-manjaro-kde6 kernel: amdgpu 0000:15:00.0: [drm] REG_WAIT timeout 1us * 100000 tries - optc1_wait_for_state line:839
...
Sep 02 20:32:31.725958 wizzy-am5-manjaro-kde6 kernel: amdgpu 0000:15:00.0: [drm] REG_WAIT timeout 1us * 100000 tries - optc1_wait_for_state line:839
...
Sep 02 20:32:31.727036 wizzy-am5-manjaro-kde6 kernel: amdgpu 0000:15:00.0: [drm] REG_WAIT timeout 1us * 100000 tries - optc1_wait_for_state line:839
...
Sep 02 20:32:32.329760 wizzy-am5-manjaro-kde6 kernel: nvidia: loading out-of-tree module taints kernel.
Sep 02 20:32:32.329773 wizzy-am5-manjaro-kde6 kernel: nvidia: module license 'NVIDIA' taints kernel.
Sep 02 20:32:32.329783 wizzy-am5-manjaro-kde6 kernel: Disabling lock debugging due to kernel taint
Sep 02 20:32:32.329796 wizzy-am5-manjaro-kde6 kernel: nvidia: module verification failed: signature and/or required key missing - tainting kernel
Sep 02 20:32:32.329810 wizzy-am5-manjaro-kde6 kernel: nvidia: module license taints kernel.
...
Sep 02 20:32:32.667118 wizzy-am5-manjaro-kde6 kernel: nvidia: unknown parameter 'modset' ignored
Sep 02 20:32:32.667127 wizzy-am5-manjaro-kde6 kernel: iwlwifi 0000:0e:00.0: Detected Intel(R) Wi-Fi 6 AX210 160MHz, REV=0x420
Sep 02 20:32:32.667240 wizzy-am5-manjaro-kde6 kernel: thermal thermal_zone0: failed to read out thermal zone (-61)
Sep 02 20:32:32.667326 wizzy-am5-manjaro-kde6 kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 508
Sep 02 20:32:32.667335 wizzy-am5-manjaro-kde6 kernel: NVRM: GPU 0000:01:00.0 is already bound to vfio-pci.
Sep 02 20:32:32.667347 wizzy-am5-manjaro-kde6 kernel: NVRM: The NVIDIA probe routine was not called for 1 device(s).
Sep 02 20:32:32.667356 wizzy-am5-manjaro-kde6 kernel: NVRM: This can occur when another driver was loaded and
NVRM: obtained ownership of the NVIDIA device(s).
Sep 02 20:32:32.667366 wizzy-am5-manjaro-kde6 kernel: NVRM: Try unloading the conflicting kernel module (and/or
NVRM: reconfigure your kernel without the conflicting
NVRM: driver(s)), then try loading the NVIDIA kernel module
NVRM: again.
Sep 02 20:32:32.667374 wizzy-am5-manjaro-kde6 kernel: NVRM: No NVIDIA devices probed.
Sep 02 20:32:32.667382 wizzy-am5-manjaro-kde6 kernel: nvidia-nvlink: Unregistered Nvlink Core, major device number 508
...
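To compare boots with and without the cable, the vgaarb lines can be pulled out of the kernel log automatically. This is a rough sketch that assumes the message format shown in the log above and treats the last "setting as boot VGA device" message as the kernel's final choice; `boot_vga_history` and `final_boot_vga` are names I made up.

```python
import re

# Matches e.g. "pci 0000:01:00.0: vgaarb: setting as boot VGA device"
# (with or without the "(overriding previous)" suffix).
BOOT_VGA_RE = re.compile(
    r"pci (?P<addr>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"vgaarb: setting as boot VGA device"
)


def boot_vga_history(log_text):
    """Return PCI addresses in the order vgaarb selected them as boot VGA."""
    return [m.group("addr") for m in BOOT_VGA_RE.finditer(log_text)]


def final_boot_vga(log_text):
    """Return the last device vgaarb selected, or None if no such line exists."""
    history = boot_vga_history(log_text)
    return history[-1] if history else None
```

The input can be the text of `journalctl -k -b`. For the log above, the history is 0000:01:00.0 (the RTX 4080) followed by 0000:15:00.0 (the amdgpu device), so the iGPU overrides the dGPU as boot VGA in that boot. The same information is exposed per device in the `boot_vga` file under `/sys/bus/pci/devices/`.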