Cannot isolate dVGA for VFIO

I recently upgraded my system from AM4 with two dVGAs to AM5 with an iGPU and a dVGA.
Since then I have had trouble isolating my dVGA from Linux so that I can pass it through to my VM (QEMU and KVM).

I posted my issues to reddit and Level1Forums, and most people blame Manjaro, so here I am. Let’s prove them wrong!

Setup:
3 monitors. 2 connected on iGPU (DP & HDMI) and 1 to my 6900XT DP.
3rd monitor is off, until I boot my VM

Problem:
If the DP cable is connected to the 6900XT during boot, Linux ignores my kernel parameters and initializes the dVGA as normal.
As soon as I modify “/etc/modprobe.conf” with MODULES="vfio_pci vfio", I boot to a blank screen with access only to the console.

Troubleshooting:
I have run literally hundreds of tests, but I am willing to do more in order to make it work.
Here is a long thread if somebody wants to read it.

Workaround:
I boot with the cable disconnected from the DP port (disconnecting it on the monitor side didn’t work), and I only connect it after I boot the VM. This works, but it is very annoying: sometimes I forget to disconnect it and have to reboot, and if I want to reboot just the VM, I have to reboot the whole system.

Kernel:
I am using 6.1, which has the ACS patch embedded. I am having issues with 6.2 and 6.3.

BIOS:
iGPU defined as primary output

inxi -F

System:
  Host: wizzy-am5-manjaro Kernel: 6.1.31-2-MANJARO arch: x86_64 bits: 64
    Desktop: KDE Plasma v: 5.27.5 Distro: Manjaro Linux
Machine:
  Type: Desktop System: Gigabyte product: X670E AORUS MASTER v: -CF
    serial: <superuser required>
  Mobo: Gigabyte model: X670E AORUS MASTER v: x.x
    serial: <superuser required> UEFI: American Megatrends LLC. v: F10d
    date: 05/05/2023
CPU:
  Info: 16-core model: AMD Ryzen 9 7950X3D bits: 64 type: MT MCP cache:
    L2: 16 MiB
  Speed (MHz): avg: 3129 min/max: 3000/5759 cores: 1: 3000 2: 3000 3: 4200
    4: 3000 5: 3000 6: 3000 7: 3000 8: 3000 9: 3000 10: 3000 11: 3000 12: 3000
    13: 3000 14: 3000 15: 3000 16: 3000 17: 4200 18: 3000 19: 3599 20: 2880
    21: 3000 22: 3000 23: 3000 24: 4200 25: 3000 26: 3000 27: 3056 28: 3000
    29: 3000 30: 3000 31: 3000 32: 3000
Graphics:
  Device-1: AMD Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] driver: vfio-pci
    v: N/A
  Device-2: AMD Raphael driver: amdgpu v: kernel
  Device-3: Logitech HD Pro Webcam C920 driver: snd-usb-audio,uvcvideo
    type: USB
  Display: x11 server: X.Org v: 21.1.8 with: Xwayland v: 23.1.1 driver: X:
    loaded: amdgpu unloaded: modesetting,radeon dri: radeonsi gpu: amdgpu
    resolution: 1: 1200x1920~60Hz 2: 2560x1440~60Hz
  API: OpenGL v: 4.6 Mesa 23.0.4 renderer: AMD Radeon Graphics (gfx1036
    LLVM 15.0.7 DRM 3.49 6.1.31-2-MANJARO)
Audio:
  Device-1: AMD Navi 21/23 HDMI/DP Audio driver: vfio-pci
  Device-2: AMD Rembrandt Radeon High Definition Audio driver: snd_hda_intel
  Device-3: AMD Family 17h/19h HD Audio driver: snd_hda_intel
  Device-4: Creative Sound Blaster X5
    driver: cdc_acm,hid-generic,snd-usb-audio,usbhid type: USB
  Device-5: Logitech HD Pro Webcam C920 driver: snd-usb-audio,uvcvideo
    type: USB
  API: ALSA v: k6.1.31-2-MANJARO status: kernel-api
  Server-1: PulseAudio v: 16.1 status: active
Network:
  Device-1: Intel Ethernet I225-V driver: igc
  IF: enp15s0 state: up speed: 1000 Mbps duplex: full mac: 74:56:3c:4b:74:7e
  Device-2: Intel Wi-Fi 6 AX210/AX211/AX411 160MHz driver: iwlwifi
  IF: wlp16s0 state: down mac: 1a:7b:5e:33:8f:04
Bluetooth:
  Device-1: Intel AX210 Bluetooth driver: btusb type: USB
  Report: rfkill ID: hci0 rfk-id: 0 state: down bt-service: enabled,running
    rfk-block: hardware: no software: yes address: see --recommends
Drives:
  Local Storage: total: 5.52 TiB used: 1.12 TiB (20.2%)
  ID-1: /dev/nvme0n1 vendor: Kingston model: SKC3000D2048G size: 1.86 TiB
  ID-2: /dev/nvme1n1 vendor: Seagate model: XPG GAMMIX S50 Lite
    size: 953.87 GiB
  ID-3: /dev/nvme2n1 vendor: Samsung model: SSD 970 EVO 500GB
    size: 465.76 GiB
  ID-4: /dev/sda vendor: Crucial model: CT2000BX500SSD1 size: 1.82 TiB
  ID-5: /dev/sdb vendor: Samsung model: SSD 850 EVO 500GB size: 465.76 GiB
Partition:
  ID-1: / size: 448.43 GiB used: 34.27 GiB (7.6%) fs: ext4 dev: /dev/nvme2n1p2
  ID-2: /boot/efi size: 299.4 MiB used: 312 KiB (0.1%) fs: vfat
    dev: /dev/nvme2n1p1
Swap:
  ID-1: swap-1 type: partition size: 8.8 GiB used: 10 MiB (0.1%)
    dev: /dev/nvme2n1p3
Sensors:
  System Temperatures: cpu: 39.6 C mobo: N/A gpu: amdgpu temp: 34.0 C
  Fan Speeds (RPM): N/A
Info:
  Processes: 523 Uptime: 30m Memory: available: 61.95 GiB
  used: 38.42 GiB (62.0%) Shell: Zsh inxi: 3.3.27

Kernel parameters:

GRUB_CMDLINE_LINUX_DEFAULT="resume=UUID=eb319a47-23e2-4b2b-ad27-4924407771e0 udev.log_priority=3 amd_iommu=force_enable iommu=pt amdgpu.sg_display=0 vfio-pci.ids=1002:73bf,1002:ab28,10ec:8161 hugepages=16384 systemd.unified_cgroup_hierarchy=1 kvm.ignore_msrs=1 video=efifb:off,vesafb:off pcie_acs_override=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1"

IOMMU groups:

IOMMU Group 0:
        00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14da]
IOMMU Group 1:
        00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:14db]
IOMMU Group 10:
        00:08.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:14dd]
IOMMU Group 11:
        00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 71)
        00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU Group 12:
        00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e0]
        00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e1]
        00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e2]
        00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e3]
        00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e4]
        00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e5]
        00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e6]
        00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14e7]
IOMMU Group 13:
        01:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch [1002:1478] (rev c0)
IOMMU Group 14:
        02:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch [1002:1479]
IOMMU Group 15:
        03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] [1002:73bf] (rev c0)
IOMMU Group 16:
        03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller [1002:ab28]
IOMMU Group 17:
        03:00.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:73a6]
IOMMU Group 18:
        03:00.3 Serial bus controller [0c80]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 USB [1002:73a4]
IOMMU Group 19:
        04:00.0 Non-Volatile memory controller [0108]: Kingston Technology Company, Inc. Device [2646:5013] (rev 01)
IOMMU Group 2:
        00:01.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:14db]
IOMMU Group 20:
        05:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f4] (rev 01)
IOMMU Group 21:
        06:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
IOMMU Group 22:
        06:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
IOMMU Group 23:
        06:05.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
IOMMU Group 24:
        06:06.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
IOMMU Group 25:
        06:08.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
IOMMU Group 26:
        06:0c.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
IOMMU Group 27:
        06:0d.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
IOMMU Group 28:
        0b:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f4] (rev 01)
IOMMU Group 29:
        0c:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
IOMMU Group 3:
        00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14da]
IOMMU Group 30:
        0c:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
IOMMU Group 31:
        0c:06.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
IOMMU Group 32:
        0c:07.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
IOMMU Group 33:
        0c:08.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
IOMMU Group 34:
        0c:0c.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
IOMMU Group 35:
        0c:0d.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f5] (rev 01)
IOMMU Group 36:
        0f:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller I225-V [8086:15f3] (rev 01)
IOMMU Group 37:
        10:00.0 Network controller [0280]: Intel Corporation Wi-Fi 6 AX210/AX211/AX411 160MHz [8086:2725] (rev 1a)
IOMMU Group 38:
        11:00.0 Non-Volatile memory controller [0108]: ADATA Technology Co., Ltd. XPG GAMMIX S50 NVMe SSD [1cc1:5350] (rev 03)
IOMMU Group 39:
        12:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f7] (rev 01)
IOMMU Group 4:
        00:02.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:14db]
IOMMU Group 40:
        13:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f6] (rev 01)
IOMMU Group 41:
        14:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f7] (rev 01)
IOMMU Group 42:
        15:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] Device [1022:43f6] (rev 01)
IOMMU Group 43:
        16:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 [144d:a808]
IOMMU Group 44:
        17:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Raphael [1002:164e] (rev c9)
IOMMU Group 45:
        17:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt Radeon High Definition Audio Controller [1002:1640]
IOMMU Group 46:
        17:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] VanGogh PSP/CCP [1022:1649]
IOMMU Group 47:
        17:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:15b6]
IOMMU Group 48:
        17:00.4 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:15b7]
IOMMU Group 49:
        17:00.6 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h/19h HD Audio Controller [1022:15e3]
IOMMU Group 5:
        00:02.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:14db]
IOMMU Group 50:
        18:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:15b8]
IOMMU Group 6:
        00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14da]
IOMMU Group 7:
        00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14da]
IOMMU Group 8:
        00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14da]
IOMMU Group 9:
        00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:14dd]
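
For anyone who wants the same listing in numeric order (the output above sorts "Group 10" before "Group 2"), here is a minimal sketch that walks sysfs directly. It assumes the usual /sys/kernel/iommu_groups/<N>/devices/<address> layout; the function names `iommu_groups` and `group_of` are made up for this example and are not part of any tool.

```python
from pathlib import Path
from typing import Optional


def iommu_groups(root: Path = Path("/sys/kernel/iommu_groups")) -> dict[int, list[str]]:
    """Map each IOMMU group number to the PCI addresses it contains."""
    groups: dict[int, list[str]] = {}
    if not root.is_dir():
        # IOMMU disabled, or not a Linux host: nothing to report.
        return groups
    for group_dir in root.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    # Sort by the numeric group id, not by the directory name as a string.
    return dict(sorted(groups.items()))


def group_of(address: str, groups: dict[int, list[str]]) -> Optional[int]:
    """Return the group number holding `address` (e.g. "0000:03:00.0"), if any."""
    for number, devices in groups.items():
        if address in devices:
            return number
    return None


if __name__ == "__main__":
    for number, devices in iommu_groups().items():
        print(f"IOMMU Group {number}: {' '.join(devices)}")
```

On a host like the one above, `group_of("0000:03:00.0", iommu_groups())` should report the group that holds the 6900XT's VGA function.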

Gigabyte X670 Aorus Master manuals

Anyone?
Anything to suggest for me to try?

I cannot accept that others, with very similar setups, have no issues, in other distros. There must be a way.

Here it is correctly isolated…

video: 1002:73bf
audio: 1002:ab28
whatever that is: 10ec:8161, it is not listed.
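
A quick way to spot leftover IDs like that one is to compare the vfio-pci.ids list on the kernel command line with the [vendor:device] pairs that lspci -nn actually shows. This is only a sketch: the helper names `vfio_ids` and `stale_ids` are invented for the example, and the sample strings are shortened copies of what was posted above.

```python
import re


def vfio_ids(cmdline: str) -> list[str]:
    """Return the vendor:device pairs given via vfio-pci.ids= on a kernel command line."""
    for token in cmdline.split():
        key, _, value = token.partition("=")
        # The kernel treats '-' and '_' in parameter names as equivalent.
        if key.replace("_", "-") == "vfio-pci.ids":
            return [v.lower() for v in value.split(",") if v]
    return []


def stale_ids(cmdline: str, lspci_nn: str) -> list[str]:
    """IDs requested on the command line that do not appear in `lspci -nn` output."""
    present = {
        m.lower()
        for m in re.findall(r"\[([0-9a-fA-F]{4}:[0-9a-fA-F]{4})\]", lspci_nn)
    }
    return [i for i in vfio_ids(cmdline) if i not in present]


cmdline = "amd_iommu=force_enable iommu=pt vfio-pci.ids=1002:73bf,1002:ab28,10ec:8161 hugepages=16384"
lspci_nn = """\
03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] [1002:73bf] (rev c0)
03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller [1002:ab28]
"""
print(stale_ids(cmdline, lspci_nn))  # ['10ec:8161']
```

A stale ID is harmless by itself, since vfio-pci simply never finds a matching device, but removing it keeps the command line easier to read.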

The configuration is fully normal, apart from your special parameters.

Maybe order is wrong?

No :see_no_evil:

MODULES=(vfio_pci vfio vfio_iommu_type1 amdgpu) → /etc/mkinitcpio.conf → sudo mkinitcpio -P

The order matters here. If it initializes the iGPU with the amdgpu module first, that is a sign that amdgpu is loaded before vfio, but vfio needs to claim the dVGA first.

Quote:

If you also have another driver loaded this way for early modesetting (such as nouveau, radeon, amdgpu, i915, etc.), all of the aforementioned VFIO modules must precede it.
PCI passthrough via OVMF - ArchWiki
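
The rule in that quote can be checked mechanically against a MODULES line. The sketch below is an illustration only: `parse_modules` and `vfio_precedes_gpu` are hypothetical helpers, the two module sets are my own assumption about which names count, and passing the check says nothing about which driver actually binds the card at boot.

```python
import re

# Assumed module names for this sketch; adjust to whatever your system uses.
VFIO_MODULES = {"vfio", "vfio_pci", "vfio_iommu_type1"}
GPU_MODULES = {"amdgpu", "radeon", "nouveau", "i915"}


def parse_modules(line: str) -> list[str]:
    """Extract module names from MODULES=(...) or MODULES="..."."""
    match = re.match(r'\s*MODULES=[("]([^)"]*)[)"]', line)
    return match.group(1).split() if match else []


def vfio_precedes_gpu(line: str) -> bool:
    """True if every listed VFIO module comes before every listed GPU module."""
    modules = parse_modules(line)
    vfio_pos = [i for i, m in enumerate(modules) if m in VFIO_MODULES]
    gpu_pos = [i for i, m in enumerate(modules) if m in GPU_MODULES]
    if not vfio_pos or not gpu_pos:
        # Only one kind of module is listed, so there is no ordering to violate.
        return True
    return max(vfio_pos) < min(gpu_pos)


print(vfio_precedes_gpu("MODULES=(vfio_pci vfio vfio_iommu_type1 amdgpu)"))  # True
print(vfio_precedes_gpu("MODULES=(amdgpu vfio_pci vfio vfio_iommu_type1)"))  # False
```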

That is a 2nd NIC, which I removed, but I forgot to change the kernel params.

Yes, sorry, the mkinitcpio.conf is the correct file. This is the line:

MODULES="crc32c vfio_pci vfio vfio_iommu_type1"

And, I don’t have any other driver, but amdgpu is loaded later for my iGPU

I changed this to

MODULES="crc32c vfio_pci vfio vfio_iommu_type1 amdgpu"

And now I can log in normally, but it initializes the screen that is attached to the 6900XT

and lspci -nnv returns

03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] [1002:73bf] (rev c0) (prog-if 00 [VGA controller])
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Radeon RX 6900 XT [1002:0e3a]
        Flags: bus master, fast devsel, latency 0, IRQ 59, IOMMU group 15
        Memory at 1400000000 (64-bit, prefetchable) [size=16G]
        Memory at 1200000000 (64-bit, prefetchable) [size=2M]
        I/O ports at f000 [size=256]
        Memory at fca00000 (32-bit, non-prefetchable) [size=1M]
        Expansion ROM at fcb00000 [disabled] [size=128K]
        Capabilities: <access denied>
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu

03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller [1002:ab28]
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller [1002:ab28]
        Flags: fast devsel, IRQ 255, IOMMU group 16
        Memory at fcb24000 (32-bit, non-prefetchable) [disabled] [size=16K]
        Capabilities: <access denied>
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel
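
To compare several boots without reading the whole lspci dump each time, the "Kernel driver in use" lines can be pulled out per device. A minimal sketch, assuming the `lspci -nnv` layout shown above; `drivers_in_use` is a made-up helper name and the sample is trimmed to the relevant lines.

```python
import re

# A device header starts at column 0 with bus:device.function,
# optionally prefixed by a PCI domain (as printed by lspci -D).
HEADER = re.compile(r"^((?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s")


def drivers_in_use(lspci_output: str) -> dict[str, str]:
    """Map each PCI address to its bound kernel driver, or "(none)"."""
    drivers: dict[str, str] = {}
    current = None
    for line in lspci_output.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            drivers[current] = "(none)"
        elif current and "Kernel driver in use:" in line:
            drivers[current] = line.split(":", 1)[1].strip()
    return drivers


sample = """\
03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] [1002:73bf] (rev c0) (prog-if 00 [VGA controller])
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu

03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller [1002:ab28]
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel
"""
print(drivers_in_use(sample))
```

For the output above this reports amdgpu on 03:00.0 and vfio-pci on 03:00.1, which is exactly the mismatch: the audio function is isolated, the VGA function is not.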

Could you add:

sudo dmesg | grep -i vfio
sudo grep -E "^\s+linux\s+" /boot/grub/grub.cfg

These are from boot without the cable attached:

[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.1-x86_64 root=UUID=fe399164-95ff-466f-b402-5c4f8aa899f7 rw resume=UUID=eb319a47-23e2-4b2b-ad27-4924407771e0 udev.log_priority=3 amd_iommu=force_enable iommu=pt amdgpu.sg_display=0 vfio-pci.ids=1002:73bf,1002:ab28 hugepages=16384 systemd.unified_cgroup_hierarchy=1 kvm.ignore_msrs=1 video=efifb:off,vesafb:off pcie_acs_override=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1
[    0.047413] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.1-x86_64 root=UUID=fe399164-95ff-466f-b402-5c4f8aa899f7 rw resume=UUID=eb319a47-23e2-4b2b-ad27-4924407771e0 udev.log_priority=3 amd_iommu=force_enable iommu=pt amdgpu.sg_display=0 vfio-pci.ids=1002:73bf,1002:ab28 hugepages=16384 systemd.unified_cgroup_hierarchy=1 kvm.ignore_msrs=1 video=efifb:off,vesafb:off pcie_acs_override=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1
[    7.818330] VFIO - User Level meta-driver version: 0.3
[    7.827018] vfio_pci: add [1002:73bf[ffffffff:ffffffff]] class 0x000000/00000000
[    7.827144] vfio_pci: add [1002:ab28[ffffffff:ffffffff]] class 0x000000/00000000
[   30.694983]  acpi_cpufreq mac_hid dm_multipath crypto_user fuse loop dm_mod bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 usbhid nvme crc32c_intel nvme_core xhci_pci xhci_pci_renesas nvme_common vfio_pci vfio_pci_core irqbypass vfio_virqfd vfio_iommu_type1 vfio amdgpu drm_ttm_helper ttm video wmi gpu_sched drm_buddy drm_display_helper cec
[  141.282797]  acpi_cpufreq mac_hid dm_multipath crypto_user fuse loop dm_mod bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 usbhid nvme crc32c_intel nvme_core xhci_pci xhci_pci_renesas nvme_common vfio_pci vfio_pci_core irqbypass vfio_virqfd vfio_iommu_type1 vfio amdgpu drm_ttm_helper ttm video wmi gpu_sched drm_buddy drm_display_helper cec

grub.cfg:

        linux   /boot/vmlinuz-6.1-x86_64 root=UUID=fe399164-95ff-466f-b402-5c4f8aa899f7 rw  resume=UUID=eb319a47-23e2-4b2b-ad27-4924407771e0 udev.log_priority=3 amd_iommu=force_enable iommu=pt amdgpu.sg_display=0 vfio-pci.ids=1002:73bf,1002:ab28 hugepages=16384 systemd.unified_cgroup_hierarchy=1 kvm.ignore_msrs=1 video=efifb:off,vesafb:off pcie_acs_override=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1
                linux   /boot/vmlinuz-6.1-x86_64 root=UUID=fe399164-95ff-466f-b402-5c4f8aa899f7 rw  resume=UUID=eb319a47-23e2-4b2b-ad27-4924407771e0 udev.log_priority=3 amd_iommu=force_enable iommu=pt amdgpu.sg_display=0 vfio-pci.ids=1002:73bf,1002:ab28 hugepages=16384 systemd.unified_cgroup_hierarchy=1 kvm.ignore_msrs=1 video=efifb:off,vesafb:off pcie_acs_override=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1
                linux   /boot/vmlinuz-6.1-x86_64 root=UUID=fe399164-95ff-466f-b402-5c4f8aa899f7 rw  resume=UUID=eb319a47-23e2-4b2b-ad27-4924407771e0 udev.log_priority=3 amd_iommu=force_enable iommu=pt amdgpu.sg_display=0 vfio-pci.ids=1002:73bf,1002:ab28 hugepages=16384 systemd.unified_cgroup_hierarchy=1 kvm.ignore_msrs=1 video=efifb:off,vesafb:off pcie_acs_override=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1
        linux /boot/memtest86+/memtest.efi 

Let me know if you need the same, after booting with the cable on

Yes, I thought you booted with cable?

Since it didn’t work as expected, I rebooted “normal” :smiling_face:

sudo dmesg | grep -i vfio

[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.1-x86_64 root=UUID=fe399164-95ff-466f-b402-5c4f8aa899f7 rw resume=UUID=eb319a47-23e2-4b2b-ad27-4924407771e0 udev.log_priority=3 amd_iommu=force_enable iommu=pt amdgpu.sg_display=0 vfio-pci.ids=1002:73bf,1002:ab28 hugepages=16384 systemd.unified_cgroup_hierarchy=1 kvm.ignore_msrs=1 video=efifb:off,vesafb:off pcie_acs_override=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1
[    0.047389] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.1-x86_64 root=UUID=fe399164-95ff-466f-b402-5c4f8aa899f7 rw resume=UUID=eb319a47-23e2-4b2b-ad27-4924407771e0 udev.log_priority=3 amd_iommu=force_enable iommu=pt amdgpu.sg_display=0 vfio-pci.ids=1002:73bf,1002:ab28 hugepages=16384 systemd.unified_cgroup_hierarchy=1 kvm.ignore_msrs=1 video=efifb:off,vesafb:off pcie_acs_override=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1
[    8.100049] VFIO - User Level meta-driver version: 0.3
[    8.108572] vfio_pci: add [1002:73bf[ffffffff:ffffffff]] class 0x000000/00000000
[    8.108695] vfio_pci: add [1002:ab28[ffffffff:ffffffff]] class 0x000000/00000000

sudo grep -E "^\s+linux\s+" /boot/grub/grub.cfg

linux   /boot/vmlinuz-6.1-x86_64 root=UUID=fe399164-95ff-466f-b402-5c4f8aa899f7 rw  resume=UUID=eb319a47-23e2-4b2b-ad27-4924407771e0 udev.log_priority=3 amd_iommu=force_enable iommu=pt amdgpu.sg_display=0 vfio-pci.ids=1002:73bf,1002:ab28 hugepages=16384 systemd.unified_cgroup_hierarchy=1 kvm.ignore_msrs=1 video=efifb:off,vesafb:off pcie_acs_override=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1
                linux   /boot/vmlinuz-6.1-x86_64 root=UUID=fe399164-95ff-466f-b402-5c4f8aa899f7 rw  resume=UUID=eb319a47-23e2-4b2b-ad27-4924407771e0 udev.log_priority=3 amd_iommu=force_enable iommu=pt amdgpu.sg_display=0 vfio-pci.ids=1002:73bf,1002:ab28 hugepages=16384 systemd.unified_cgroup_hierarchy=1 kvm.ignore_msrs=1 video=efifb:off,vesafb:off pcie_acs_override=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1
                linux   /boot/vmlinuz-6.1-x86_64 root=UUID=fe399164-95ff-466f-b402-5c4f8aa899f7 rw  resume=UUID=eb319a47-23e2-4b2b-ad27-4924407771e0 udev.log_priority=3 amd_iommu=force_enable iommu=pt amdgpu.sg_display=0 vfio-pci.ids=1002:73bf,1002:ab28 hugepages=16384 systemd.unified_cgroup_hierarchy=1 kvm.ignore_msrs=1 video=efifb:off,vesafb:off pcie_acs_override=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1
        linux /boot/memtest86+/memtest.efi 

hm… ok… at this moment I have no idea. I clearly see the issue, but vfio is not to blame here, since it works as intended. It is probably because of the amdgpu module, or because a HOOK runs too late?

At this moment, I am out of ideas. In your situation, I would go and debug the kernel and look at when amdgpu claims the device and when vfio does.

Also check that xorg or wayland ignores the monitor.

No idea how to do that though…

I do have Wayland installed, but my work requires VM Horizon, which does not work with it, only Xorg, even with xwayland installed.

You could start by increasing the log output (General troubleshooting - ArchWiki): add debug to the kernel parameters.

That should increase the output of dmesg as well.
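
Once dmesg is more verbose, the "who touches the card first" question can be answered by comparing timestamps. The sketch below is a rough aid only: `first_seen` is a hypothetical helper, and the two sample lines are copied from dmesg output posted in this thread. It compares when each message first appears; it does not prove which driver ends up bound.

```python
import re
from typing import Optional

# dmesg lines look like "[    5.031887] message text".
LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def first_seen(dmesg: str, pattern: str) -> Optional[float]:
    """Timestamp of the first dmesg line whose message matches `pattern`."""
    for line in dmesg.splitlines():
        m = LINE.match(line)
        if m and re.search(pattern, m.group(2)):
            return float(m.group(1))
    return None


sample = """\
[    5.031887] amdgpu 0000:03:00.0: enabling device (0006 -> 0007)
[    7.846474] vfio_pci: add [1002:73bf[ffffffff:ffffffff]] class 0x000000/00000000
"""
amdgpu_at = first_seen(sample, r"^amdgpu 0000:03:00\.0:")
vfio_at = first_seen(sample, r"^vfio_pci: add \[1002:73bf")
print(amdgpu_at, vfio_at)
if amdgpu_at is not None and vfio_at is not None and amdgpu_at < vfio_at:
    print("amdgpu touched 0000:03:00.0 before vfio_pci registered its IDs")
```

To run it on a real boot, save the log with sudo dmesg > dmesg.txt and pass the file contents instead of `sample`.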

First the boot without the cable; the one with the cable is below.

sudo dmesg | grep -i vfio

[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.1-x86_64 root=UUID=fe399164-95ff-466f-b402-5c4f8aa899f7 rw debug resume=UUID=eb319a47-23e2-4b2b-ad27-4924407771e0 udev.log_priority=3 amd_iommu=force_enable iommu=pt amdgpu.sg_display=0 vfio-pci.ids=1002:73bf,1002:ab28 hugepages=16384 systemd.unified_cgroup_hierarchy=1 kvm.ignore_msrs=1 video=efifb:off,vesafb:off pcie_acs_override=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1
[    0.047655] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.1-x86_64 root=UUID=fe399164-95ff-466f-b402-5c4f8aa899f7 rw debug resume=UUID=eb319a47-23e2-4b2b-ad27-4924407771e0 udev.log_priority=3 amd_iommu=force_enable iommu=pt amdgpu.sg_display=0 vfio-pci.ids=1002:73bf,1002:ab28 hugepages=16384 systemd.unified_cgroup_hierarchy=1 kvm.ignore_msrs=1 video=efifb:off,vesafb:off pcie_acs_override=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1
[    7.838402] VFIO - User Level meta-driver version: 0.3
[    7.846474] vfio_pci: add [1002:73bf[ffffffff:ffffffff]] class 0x000000/00000000
[    7.846616] vfio_pci: add [1002:ab28[ffffffff:ffffffff]] class 0x000000/00000000
[   72.309441]  bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 usbhid nvme nvme_core xhci_pci crc32c_intel xhci_pci_renesas nvme_common vfio_pci vfio_pci_core irqbypass vfio_virqfd vfio_iommu_type1 vfio amdgpu drm_ttm_helper ttm video wmi gpu_sched drm_buddy drm_display_helper cec
[   81.503712]  acpi_cpufreq mac_hid dm_multipath crypto_user fuse loop dm_mod bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 usbhid nvme nvme_core xhci_pci crc32c_intel xhci_pci_renesas nvme_common vfio_pci vfio_pci_core irqbypass vfio_virqfd vfio_iommu_type1 vfio amdgpu drm_ttm_helper ttm video wmi gpu_sched drm_buddy drm_display_helper cec
[   93.829279]  acpi_cpufreq mac_hid dm_multipath crypto_user fuse loop dm_mod bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 usbhid nvme nvme_core xhci_pci crc32c_intel xhci_pci_renesas nvme_common vfio_pci vfio_pci_core irqbypass vfio_virqfd vfio_iommu_type1 vfio amdgpu drm_ttm_helper ttm video wmi gpu_sched drm_buddy drm_display_helper cec

sudo dmesg | grep -i amdgpu

[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.1-x86_64 root=UUID=fe399164-95ff-466f-b402-5c4f8aa899f7 rw debug resume=UUID=eb319a47-23e2-4b2b-ad27-4924407771e0 udev.log_priority=3 amd_iommu=force_enable iommu=pt amdgpu.sg_display=0 vfio-pci.ids=1002:73bf,1002:ab28 hugepages=16384 systemd.unified_cgroup_hierarchy=1 kvm.ignore_msrs=1 video=efifb:off,vesafb:off pcie_acs_override=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1
[    0.047655] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.1-x86_64 root=UUID=fe399164-95ff-466f-b402-5c4f8aa899f7 rw debug resume=UUID=eb319a47-23e2-4b2b-ad27-4924407771e0 udev.log_priority=3 amd_iommu=force_enable iommu=pt amdgpu.sg_display=0 vfio-pci.ids=1002:73bf,1002:ab28 hugepages=16384 systemd.unified_cgroup_hierarchy=1 kvm.ignore_msrs=1 video=efifb:off,vesafb:off pcie_acs_override=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1
[    5.028601] amdgpu: unknown parameter 'sg_display' ignored
[    5.028798] [drm] amdgpu kernel modesetting enabled.
[    5.028813] amdgpu: vga_switcheroo: detected switching method \_SB_.PCI0.GP17.VGA_.ATPX handle
[    5.031827] amdgpu: Ignoring ACPI CRAT on non-APU system
[    5.031829] amdgpu: Virtual CRAT table created for CPU
[    5.031833] amdgpu: Topology: Add CPU node
[    5.031887] amdgpu 0000:03:00.0: enabling device (0006 -> 0007)
[    5.033327] amdgpu 0000:03:00.0: amdgpu: Fetched VBIOS from VFCT
[    5.033328] amdgpu: ATOM BIOS: 113-D4120100-100
[    5.033349] amdgpu 0000:03:00.0: amdgpu: Trusted Memory Zone (TMZ) feature disabled as experimental (default)
[    5.033372] amdgpu 0000:03:00.0: amdgpu: MEM ECC is not presented.
[    5.033373] amdgpu 0000:03:00.0: amdgpu: SRAM ECC is not presented.
[    5.033390] amdgpu 0000:03:00.0: BAR 2: releasing [mem 0xfcf0000000-0xfcf01fffff 64bit pref]
[    5.033392] amdgpu 0000:03:00.0: BAR 0: releasing [mem 0xfce0000000-0xfcefffffff 64bit pref]
[    5.033419] amdgpu 0000:03:00.0: BAR 0: assigned [mem 0x1400000000-0x17ffffffff 64bit pref]
[    5.033426] amdgpu 0000:03:00.0: BAR 2: assigned [mem 0x1200000000-0x12001fffff 64bit pref]
[    5.033474] amdgpu 0000:03:00.0: amdgpu: VRAM: 16368M 0x0000008000000000 - 0x00000083FEFFFFFF (16368M used)
[    5.033476] amdgpu 0000:03:00.0: amdgpu: GART: 512M 0x0000000000000000 - 0x000000001FFFFFFF
[    5.033477] amdgpu 0000:03:00.0: amdgpu: AGP: 267894784M 0x0000008400000000 - 0x0000FFFFFFFFFFFF
[    5.033515] [drm] amdgpu: 16368M of VRAM memory ready
[    5.033516] [drm] amdgpu: 31719M of GTT memory ready.
[    5.033771] amdgpu 0000:03:00.0: amdgpu: PSP runtime database doesn't exist
[    5.033773] amdgpu 0000:03:00.0: amdgpu: PSP runtime database doesn't exist
[    6.978030] amdgpu 0000:03:00.0: amdgpu: STB initialized to 2048 entries
[    6.978520] amdgpu 0000:03:00.0: amdgpu: Will use PSP to load VCN firmware
[    7.187689] amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[    7.187730] amdgpu 0000:03:00.0: amdgpu: smu driver if version = 0x00000040, smu fw if version = 0x00000041, smu fw program = 0, version = 0x003a5600 (58.86.0)
[    7.187732] amdgpu 0000:03:00.0: amdgpu: SMU driver if version not matched
[    7.187760] amdgpu 0000:03:00.0: amdgpu: use vbios provided pptable
[    7.262258] amdgpu 0000:03:00.0: amdgpu: SMU is initialized successfully!
[    7.276655] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
[    7.276810] amdgpu: sdma_bitmap: ffff
[    7.311884] amdgpu: HMM registered 16368MB device memory
[    7.312102] amdgpu: SRAT table not found
[    7.312103] amdgpu: Virtual CRAT table created for GPU
[    7.312266] amdgpu: Topology: Add dGPU node [0x73bf:0x1002]
[    7.312270] kfd kfd: amdgpu: added device 1002:73bf
[    7.312293] amdgpu 0000:03:00.0: amdgpu: SE 4, SH per SE 2, CU per SH 10, active_cu_number 80
[    7.312340] amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[    7.312342] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[    7.312343] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[    7.312344] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[    7.312345] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[    7.312346] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[    7.312347] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[    7.312348] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[    7.312349] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[    7.312350] amdgpu 0000:03:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[    7.312351] amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[    7.312352] amdgpu 0000:03:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
[    7.312353] amdgpu 0000:03:00.0: amdgpu: ring sdma2 uses VM inv eng 14 on hub 0
[    7.312354] amdgpu 0000:03:00.0: amdgpu: ring sdma3 uses VM inv eng 15 on hub 0
[    7.312354] amdgpu 0000:03:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
[    7.312355] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
[    7.312356] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
[    7.312357] amdgpu 0000:03:00.0: amdgpu: ring vcn_dec_1 uses VM inv eng 5 on hub 1
[    7.312358] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_1.0 uses VM inv eng 6 on hub 1
[    7.312359] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_1.1 uses VM inv eng 7 on hub 1
[    7.312360] amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 8 on hub 1
[    7.313117] amdgpu 0000:03:00.0: amdgpu: Using BACO for runtime pm
[    7.313323] [drm] Initialized amdgpu 3.49.0 20150101 for 0000:03:00.0 on minor 0
[    7.319624] amdgpu 0000:03:00.0: [drm] Cannot find any crtc or sizes
[    7.332888] amdgpu 0000:17:00.0: enabling device (0006 -> 0007)
[    7.333752] amdgpu 0000:17:00.0: amdgpu: Fetched VBIOS from VFCT
[    7.333754] amdgpu: ATOM BIOS: 102-RAPHAEL-008
[    7.333765] amdgpu 0000:17:00.0: vgaarb: deactivate vga console
[    7.333767] amdgpu 0000:17:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
[    7.333770] amdgpu 0000:17:00.0: amdgpu: PCIE atomic ops is not supported
[    7.333807] amdgpu 0000:17:00.0: amdgpu: VRAM: 512M 0x000000F400000000 - 0x000000F41FFFFFFF (512M used)
[    7.333814] amdgpu 0000:17:00.0: amdgpu: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF
[    7.333817] amdgpu 0000:17:00.0: amdgpu: AGP: 267419648M 0x000000F800000000 - 0x0000FFFFFFFFFFFF
[    7.333848] [drm] amdgpu: 512M of VRAM memory ready
[    7.333850] [drm] amdgpu: 31719M of GTT memory ready.
[    7.334093] amdgpu 0000:17:00.0: amdgpu: PSP runtime database doesn't exist
[    7.334096] amdgpu 0000:17:00.0: amdgpu: PSP runtime database doesn't exist
[    7.334782] amdgpu 0000:17:00.0: amdgpu: Will use PSP to load VCN firmware
[    7.424420] amdgpu 0000:17:00.0: amdgpu: RAS: optional ras ta ucode is not available
[    7.430405] amdgpu 0000:17:00.0: amdgpu: RAP: optional rap ta ucode is not available
[    7.430406] amdgpu 0000:17:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[    7.430447] amdgpu 0000:17:00.0: amdgpu: smu driver if version = 0x00000004, smu fw if version = 0x00000005, smu fw program = 0, smu fw version = 0x00544fda (84.79.218)
[    7.430449] amdgpu 0000:17:00.0: amdgpu: SMU driver if version not matched
[    7.431579] amdgpu 0000:17:00.0: amdgpu: SMU is initialized successfully!
[    7.522665] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
[    7.522717] amdgpu: sdma_bitmap: 3
[    7.544772] amdgpu: HMM registered 512MB device memory
[    7.545400] amdgpu: SRAT table not found
[    7.545400] amdgpu: Virtual CRAT table created for GPU
[    7.545635] amdgpu: Topology: Add dGPU node [0x164e:0x1002]
[    7.545637] kfd kfd: amdgpu: added device 1002:164e
[    7.545645] amdgpu 0000:17:00.0: amdgpu: SE 1, SH per SE 1, CU per SH 2, active_cu_number 2
[    7.545689] amdgpu 0000:17:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[    7.545691] amdgpu 0000:17:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[    7.545692] amdgpu 0000:17:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[    7.545693] amdgpu 0000:17:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[    7.545694] amdgpu 0000:17:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[    7.545695] amdgpu 0000:17:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[    7.545696] amdgpu 0000:17:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[    7.545697] amdgpu 0000:17:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[    7.545698] amdgpu 0000:17:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[    7.545699] amdgpu 0000:17:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[    7.545700] amdgpu 0000:17:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[    7.545701] amdgpu 0000:17:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
[    7.545702] amdgpu 0000:17:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
[    7.545702] amdgpu 0000:17:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
[    7.545703] amdgpu 0000:17:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
[    7.546070] [drm] Initialized amdgpu 3.49.0 20150101 for 0000:17:00.0 on minor 1
[    7.550207] fbcon: amdgpudrmfb (fb0) is primary device
[    7.788850] amdgpu 0000:17:00.0: [drm] fb0: amdgpudrmfb frame buffer device
[   14.923827] amdgpu 0000:03:00.0: amdgpu: free PSP TMR buffer
[   64.610713] snd_hda_intel 0000:17:00.1: bound 0000:17:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
[   64.772748] amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[   64.773601] amdgpu 0000:03:00.0: amdgpu: SMU is resuming...
[   64.774007] amdgpu 0000:03:00.0: amdgpu: smu driver if version = 0x00000040, smu fw if version = 0x00000041, smu fw program = 0, version = 0x003a5600 (58.86.0)
[   64.774675] amdgpu 0000:03:00.0: amdgpu: SMU driver if version not matched
[   64.775379] amdgpu 0000:03:00.0: amdgpu: dpm has been enabled
[   64.780805] amdgpu 0000:03:00.0: amdgpu: SMU is resumed successfully!
[   64.816421] amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[   64.817068] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[   64.817704] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[   64.818321] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[   64.818928] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[   64.819524] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[   64.820113] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[   64.820467] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[   64.820988] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[   64.821568] amdgpu 0000:03:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[   64.822167] amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[   64.822762] amdgpu 0000:03:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
[   64.823349] amdgpu 0000:03:00.0: amdgpu: ring sdma2 uses VM inv eng 14 on hub 0
[   64.823799] amdgpu 0000:03:00.0: amdgpu: ring sdma3 uses VM inv eng 15 on hub 0
[   64.824334] amdgpu 0000:03:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
[   64.824908] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
[   64.825494] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
[   64.826021] amdgpu 0000:03:00.0: amdgpu: ring vcn_dec_1 uses VM inv eng 5 on hub 1
[   64.826344] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_1.0 uses VM inv eng 6 on hub 1
[   64.826633] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_1.1 uses VM inv eng 7 on hub 1
[   64.827024] amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 8 on hub 1
[   64.834873] amdgpu 0000:03:00.0: [drm] Cannot find any crtc or sizes
[   64.835513] amdgpu 0000:03:00.0: [drm] Cannot find any crtc or sizes
[   72.309114] WARNING: CPU: 1 PID: 222 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:656 amdgpu_irq_put+0x46/0x70 [amdgpu]
[   72.309441]  bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 usbhid nvme nvme_core xhci_pci crc32c_intel xhci_pci_renesas nvme_common vfio_pci vfio_pci_core irqbypass vfio_virqfd vfio_iommu_type1 vfio amdgpu drm_ttm_helper ttm video wmi gpu_sched drm_buddy drm_display_helper cec
[   72.311912] RIP: 0010:amdgpu_irq_put+0x46/0x70 [amdgpu]
[   72.315825]  ? amdgpu_irq_put+0x46/0x70 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   72.316432]  ? amdgpu_irq_put+0x46/0x70 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   72.317842]  ? amdgpu_irq_put+0x46/0x70 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   72.318179]  ? smu_v11_0_enable_thermal_alert+0x60/0x60 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   72.318527]  smu_smc_hw_cleanup+0x50/0x2f0 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   72.318872]  smu_suspend+0x5f/0xe0 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   72.319205]  amdgpu_device_ip_suspend_phase2+0x101/0x1a0 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   72.319530]  amdgpu_device_suspend+0xcd/0x150 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   72.319858]  amdgpu_pmops_runtime_suspend+0xbe/0x1a0 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   72.324660] amdgpu 0000:03:00.0: amdgpu: Fail to disable thermal alert!
[   72.324912] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -22
[   72.350322] amdgpu 0000:03:00.0: amdgpu: free PSP TMR buffer
[   76.010780] amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[   76.011010] amdgpu 0000:03:00.0: amdgpu: SMU is resuming...
[   76.011232] amdgpu 0000:03:00.0: amdgpu: smu driver if version = 0x00000040, smu fw if version = 0x00000041, smu fw program = 0, version = 0x003a5600 (58.86.0)
[   76.011459] amdgpu 0000:03:00.0: amdgpu: SMU driver if version not matched
[   76.011701] amdgpu 0000:03:00.0: amdgpu: dpm has been enabled
[   76.015448] amdgpu 0000:03:00.0: amdgpu: SMU is resumed successfully!
[   76.047745] amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[   76.048112] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[   76.048572] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[   76.049019] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[   76.049414] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[   76.049767] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[   76.050116] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[   76.050486] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[   76.050849] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[   76.051199] amdgpu 0000:03:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[   76.051551] amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[   76.051888] amdgpu 0000:03:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
[   76.052230] amdgpu 0000:03:00.0: amdgpu: ring sdma2 uses VM inv eng 14 on hub 0
[   76.052570] amdgpu 0000:03:00.0: amdgpu: ring sdma3 uses VM inv eng 15 on hub 0
[   76.052909] amdgpu 0000:03:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
[   76.053252] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
[   76.053580] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
[   76.053910] amdgpu 0000:03:00.0: amdgpu: ring vcn_dec_1 uses VM inv eng 5 on hub 1
[   76.054228] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_1.0 uses VM inv eng 6 on hub 1
[   76.054543] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_1.1 uses VM inv eng 7 on hub 1
[   76.054863] amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 8 on hub 1
[   76.060441] amdgpu 0000:03:00.0: [drm] Cannot find any crtc or sizes
[   76.060972] amdgpu 0000:03:00.0: [drm] Cannot find any crtc or sizes
[   76.189160] amdgpu 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[   76.189163] amdgpu 0000:17:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[   81.503599] WARNING: CPU: 6 PID: 257 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:656 amdgpu_irq_put+0x46/0x70 [amdgpu]
[   81.503712]  acpi_cpufreq mac_hid dm_multipath crypto_user fuse loop dm_mod bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 usbhid nvme nvme_core xhci_pci crc32c_intel xhci_pci_renesas nvme_common vfio_pci vfio_pci_core irqbypass vfio_virqfd vfio_iommu_type1 vfio amdgpu drm_ttm_helper ttm video wmi gpu_sched drm_buddy drm_display_helper cec
[   81.503730] RIP: 0010:amdgpu_irq_put+0x46/0x70 [amdgpu]
[   81.503813]  ? amdgpu_irq_put+0x46/0x70 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   81.503882]  ? amdgpu_irq_put+0x46/0x70 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   81.503956]  ? amdgpu_irq_put+0x46/0x70 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   81.504021]  ? smu_v11_0_enable_thermal_alert+0x60/0x60 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   81.504104]  smu_smc_hw_cleanup+0x50/0x2f0 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   81.504179]  smu_suspend+0x5f/0xe0 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   81.504249]  amdgpu_device_ip_suspend_phase2+0x101/0x1a0 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   81.504313]  amdgpu_device_suspend+0xcd/0x150 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   81.504378]  amdgpu_pmops_runtime_suspend+0xbe/0x1a0 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   81.504465] amdgpu 0000:03:00.0: amdgpu: Fail to disable thermal alert!
[   81.504466] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -22
[   81.529294] amdgpu 0000:03:00.0: amdgpu: free PSP TMR buffer
[   86.041304] amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[   86.041306] amdgpu 0000:03:00.0: amdgpu: SMU is resuming...
[   86.041308] amdgpu 0000:03:00.0: amdgpu: smu driver if version = 0x00000040, smu fw if version = 0x00000041, smu fw program = 0, version = 0x003a5600 (58.86.0)
[   86.041310] amdgpu 0000:03:00.0: amdgpu: SMU driver if version not matched
[   86.041326] amdgpu 0000:03:00.0: amdgpu: dpm has been enabled
[   86.045334] amdgpu 0000:03:00.0: amdgpu: SMU is resumed successfully!
[   86.076640] amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[   86.076641] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[   86.076642] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[   86.076642] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[   86.076643] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[   86.076643] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[   86.076643] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[   86.076644] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[   86.076644] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[   86.076645] amdgpu 0000:03:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[   86.076645] amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[   86.076646] amdgpu 0000:03:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
[   86.076646] amdgpu 0000:03:00.0: amdgpu: ring sdma2 uses VM inv eng 14 on hub 0
[   86.076646] amdgpu 0000:03:00.0: amdgpu: ring sdma3 uses VM inv eng 15 on hub 0
[   86.076647] amdgpu 0000:03:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
[   86.076647] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
[   86.076648] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
[   86.076648] amdgpu 0000:03:00.0: amdgpu: ring vcn_dec_1 uses VM inv eng 5 on hub 1
[   86.076649] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_1.0 uses VM inv eng 6 on hub 1
[   86.076649] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_1.1 uses VM inv eng 7 on hub 1
[   86.076649] amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 8 on hub 1
[   86.081725] amdgpu 0000:03:00.0: [drm] Cannot find any crtc or sizes
[   86.081728] amdgpu 0000:03:00.0: [drm] Cannot find any crtc or sizes
[   93.829162] WARNING: CPU: 31 PID: 543 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:656 amdgpu_irq_put+0x46/0x70 [amdgpu]
[   93.829279]  acpi_cpufreq mac_hid dm_multipath crypto_user fuse loop dm_mod bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 usbhid nvme nvme_core xhci_pci crc32c_intel xhci_pci_renesas nvme_common vfio_pci vfio_pci_core irqbypass vfio_virqfd vfio_iommu_type1 vfio amdgpu drm_ttm_helper ttm video wmi gpu_sched drm_buddy drm_display_helper cec
[   93.829296] RIP: 0010:amdgpu_irq_put+0x46/0x70 [amdgpu]
[   93.829371]  ? amdgpu_irq_put+0x46/0x70 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   93.829435]  ? amdgpu_irq_put+0x46/0x70 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   93.829504]  ? amdgpu_irq_put+0x46/0x70 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   93.829565]  ? smu_v11_0_enable_thermal_alert+0x60/0x60 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   93.829642]  smu_smc_hw_cleanup+0x50/0x2f0 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   93.829711]  smu_suspend+0x5f/0xe0 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   93.829774]  amdgpu_device_ip_suspend_phase2+0x101/0x1a0 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   93.829833]  amdgpu_device_suspend+0xcd/0x150 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   93.829891]  amdgpu_pmops_runtime_suspend+0xbe/0x1a0 [amdgpu 3105ce22dec4cce778fe470f35fd6901136b2724]
[   93.829974] amdgpu 0000:03:00.0: amdgpu: Fail to disable thermal alert!
[   93.829975] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -22
[   93.853809] amdgpu 0000:03:00.0: amdgpu: free PSP TMR buffer

sudo dmesg | grep -i vfio

[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.1-x86_64 root=UUID=fe399164-95ff-466f-b402-5c4f8aa899f7 rw debug resume=UUID=eb319a47-23e2-4b2b-ad27-4924407771e0 udev.log_priority=3 amd_iommu=force_enable iommu=pt amdgpu.sg_display=0 vfio-pci.ids=1002:73bf,1002:ab28 hugepages=16384 systemd.unified_cgroup_hierarchy=1 kvm.ignore_msrs=1 video=efifb:off,vesafb:off pcie_acs_override=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1
[    0.047611] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.1-x86_64 root=UUID=fe399164-95ff-466f-b402-5c4f8aa899f7 rw debug resume=UUID=eb319a47-23e2-4b2b-ad27-4924407771e0 udev.log_priority=3 amd_iommu=force_enable iommu=pt amdgpu.sg_display=0 vfio-pci.ids=1002:73bf,1002:ab28 hugepages=16384 systemd.unified_cgroup_hierarchy=1 kvm.ignore_msrs=1 video=efifb:off,vesafb:off pcie_acs_override=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1
[    8.084899] VFIO - User Level meta-driver version: 0.3
[    8.094219] vfio_pci: add [1002:73bf[ffffffff:ffffffff]] class 0x000000/00000000
[    8.095312] vfio_pci: add [1002:ab28[ffffffff:ffffffff]] class 0x000000/00000000

sudo dmesg | grep -i amdgpu

[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.1-x86_64 root=UUID=fe399164-95ff-466f-b402-5c4f8aa899f7 rw debug resume=UUID=eb319a47-23e2-4b2b-ad27-4924407771e0 udev.log_priority=3 amd_iommu=force_enable iommu=pt amdgpu.sg_display=0 vfio-pci.ids=1002:73bf,1002:ab28 hugepages=16384 systemd.unified_cgroup_hierarchy=1 kvm.ignore_msrs=1 video=efifb:off,vesafb:off pcie_acs_override=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1
[    0.047611] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.1-x86_64 root=UUID=fe399164-95ff-466f-b402-5c4f8aa899f7 rw debug resume=UUID=eb319a47-23e2-4b2b-ad27-4924407771e0 udev.log_priority=3 amd_iommu=force_enable iommu=pt amdgpu.sg_display=0 vfio-pci.ids=1002:73bf,1002:ab28 hugepages=16384 systemd.unified_cgroup_hierarchy=1 kvm.ignore_msrs=1 video=efifb:off,vesafb:off pcie_acs_override=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1
[    5.026920] amdgpu: unknown parameter 'sg_display' ignored
[    5.027120] [drm] amdgpu kernel modesetting enabled.
[    5.027133] amdgpu: vga_switcheroo: detected switching method \_SB_.PCI0.GP17.VGA_.ATPX handle
[    5.030231] amdgpu: Ignoring ACPI CRAT on non-APU system
[    5.030233] amdgpu: Virtual CRAT table created for CPU
[    5.030237] amdgpu: Topology: Add CPU node
[    5.030299] amdgpu 0000:03:00.0: enabling device (0006 -> 0007)
[    5.031736] amdgpu 0000:03:00.0: amdgpu: Fetched VBIOS from VFCT
[    5.031737] amdgpu: ATOM BIOS: 113-D4120100-100
[    5.031754] amdgpu 0000:03:00.0: vgaarb: deactivate vga console
[    5.031756] amdgpu 0000:03:00.0: amdgpu: Trusted Memory Zone (TMZ) feature disabled as experimental (default)
[    5.031774] amdgpu 0000:03:00.0: amdgpu: MEM ECC is not presented.
[    5.031775] amdgpu 0000:03:00.0: amdgpu: SRAM ECC is not presented.
[    5.031790] amdgpu 0000:03:00.0: BAR 2: releasing [mem 0xfcf0000000-0xfcf01fffff 64bit pref]
[    5.031792] amdgpu 0000:03:00.0: BAR 0: releasing [mem 0xfce0000000-0xfcefffffff 64bit pref]
[    5.031817] amdgpu 0000:03:00.0: BAR 0: assigned [mem 0x1400000000-0x17ffffffff 64bit pref]
[    5.031823] amdgpu 0000:03:00.0: BAR 2: assigned [mem 0x1200000000-0x12001fffff 64bit pref]
[    5.031869] amdgpu 0000:03:00.0: amdgpu: VRAM: 16368M 0x0000008000000000 - 0x00000083FEFFFFFF (16368M used)
[    5.031871] amdgpu 0000:03:00.0: amdgpu: GART: 512M 0x0000000000000000 - 0x000000001FFFFFFF
[    5.031872] amdgpu 0000:03:00.0: amdgpu: AGP: 267894784M 0x0000008400000000 - 0x0000FFFFFFFFFFFF
[    5.031903] [drm] amdgpu: 16368M of VRAM memory ready
[    5.031905] [drm] amdgpu: 31719M of GTT memory ready.
[    5.032148] amdgpu 0000:03:00.0: amdgpu: PSP runtime database doesn't exist
[    5.032150] amdgpu 0000:03:00.0: amdgpu: PSP runtime database doesn't exist
[    6.985910] amdgpu 0000:03:00.0: amdgpu: STB initialized to 2048 entries
[    6.986512] amdgpu 0000:03:00.0: amdgpu: Will use PSP to load VCN firmware
[    7.195596] amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[    7.195614] amdgpu 0000:03:00.0: amdgpu: smu driver if version = 0x00000040, smu fw if version = 0x00000041, smu fw program = 0, version = 0x003a5600 (58.86.0)
[    7.195616] amdgpu 0000:03:00.0: amdgpu: SMU driver if version not matched
[    7.195645] amdgpu 0000:03:00.0: amdgpu: use vbios provided pptable
[    7.269378] amdgpu 0000:03:00.0: amdgpu: SMU is initialized successfully!
[    7.523795] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
[    7.523951] amdgpu: sdma_bitmap: ffff
[    7.571767] amdgpu: HMM registered 16368MB device memory
[    7.572171] amdgpu: SRAT table not found
[    7.572172] amdgpu: Virtual CRAT table created for GPU
[    7.572332] amdgpu: Topology: Add dGPU node [0x73bf:0x1002]
[    7.572335] kfd kfd: amdgpu: added device 1002:73bf
[    7.572358] amdgpu 0000:03:00.0: amdgpu: SE 4, SH per SE 2, CU per SH 10, active_cu_number 80
[    7.572409] amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[    7.572411] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[    7.572412] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[    7.572413] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[    7.572414] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[    7.572415] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[    7.572416] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[    7.572417] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[    7.572417] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[    7.572418] amdgpu 0000:03:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[    7.572419] amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[    7.572420] amdgpu 0000:03:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
[    7.572421] amdgpu 0000:03:00.0: amdgpu: ring sdma2 uses VM inv eng 14 on hub 0
[    7.572422] amdgpu 0000:03:00.0: amdgpu: ring sdma3 uses VM inv eng 15 on hub 0
[    7.572423] amdgpu 0000:03:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
[    7.572424] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
[    7.572425] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
[    7.572426] amdgpu 0000:03:00.0: amdgpu: ring vcn_dec_1 uses VM inv eng 5 on hub 1
[    7.572427] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_1.0 uses VM inv eng 6 on hub 1
[    7.572428] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_1.1 uses VM inv eng 7 on hub 1
[    7.572428] amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 8 on hub 1
[    7.573178] amdgpu 0000:03:00.0: amdgpu: Using BACO for runtime pm
[    7.573383] [drm] Initialized amdgpu 3.49.0 20150101 for 0000:03:00.0 on minor 0
[    7.582270] fbcon: amdgpudrmfb (fb0) is primary device
[    7.776784] amdgpu 0000:03:00.0: [drm] fb0: amdgpudrmfb frame buffer device
[    7.803729] amdgpu 0000:17:00.0: enabling device (0006 -> 0007)
[    7.804967] amdgpu 0000:17:00.0: amdgpu: Fetched VBIOS from VFCT
[    7.805002] amdgpu: ATOM BIOS: 102-RAPHAEL-008
[    7.808109] amdgpu 0000:17:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
[    7.808679] amdgpu 0000:17:00.0: amdgpu: PCIE atomic ops is not supported
[    7.810160] amdgpu 0000:17:00.0: amdgpu: VRAM: 512M 0x000000F400000000 - 0x000000F41FFFFFFF (512M used)
[    7.810815] amdgpu 0000:17:00.0: amdgpu: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF
[    7.811434] amdgpu 0000:17:00.0: amdgpu: AGP: 267419648M 0x000000F800000000 - 0x0000FFFFFFFFFFFF
[    7.813354] [drm] amdgpu: 512M of VRAM memory ready
[    7.813996] [drm] amdgpu: 31719M of GTT memory ready.
[    7.816374] amdgpu 0000:17:00.0: amdgpu: PSP runtime database doesn't exist
[    7.816932] amdgpu 0000:17:00.0: amdgpu: PSP runtime database doesn't exist
[    7.820561] amdgpu 0000:17:00.0: amdgpu: Will use PSP to load VCN firmware
[    7.911200] amdgpu 0000:17:00.0: amdgpu: RAS: optional ras ta ucode is not available
[    7.918070] amdgpu 0000:17:00.0: amdgpu: RAP: optional rap ta ucode is not available
[    7.919004] amdgpu 0000:17:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[    7.920032] amdgpu 0000:17:00.0: amdgpu: smu driver if version = 0x00000004, smu fw if version = 0x00000005, smu fw program = 0, smu fw version = 0x00544fda (84.79.218)
[    7.920663] amdgpu 0000:17:00.0: amdgpu: SMU driver if version not matched
[    7.923130] amdgpu 0000:17:00.0: amdgpu: SMU is initialized successfully!
[    8.014265] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
[    8.015280] amdgpu: sdma_bitmap: 3
[    8.032101] amdgpu: HMM registered 512MB device memory
[    8.033386] amdgpu: SRAT table not found
[    8.034368] amdgpu: Virtual CRAT table created for GPU
[    8.036056] amdgpu: Topology: Add dGPU node [0x164e:0x1002]
[    8.037032] kfd kfd: amdgpu: added device 1002:164e
[    8.037428] amdgpu 0000:17:00.0: amdgpu: SE 1, SH per SE 1, CU per SH 2, active_cu_number 2
[    8.038446] amdgpu 0000:17:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[    8.039235] amdgpu 0000:17:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[    8.040347] amdgpu 0000:17:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[    8.041211] amdgpu 0000:17:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[    8.042049] amdgpu 0000:17:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[    8.042924] amdgpu 0000:17:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[    8.043967] amdgpu 0000:17:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[    8.045020] amdgpu 0000:17:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[    8.045988] amdgpu 0000:17:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[    8.047041] amdgpu 0000:17:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[    8.048039] amdgpu 0000:17:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[    8.049091] amdgpu 0000:17:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
[    8.050090] amdgpu 0000:17:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
[    8.051137] amdgpu 0000:17:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
[    8.052141] amdgpu 0000:17:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
[    8.053749] [drm] Initialized amdgpu 3.49.0 20150101 for 0000:17:00.0 on minor 1
[    8.057830] amdgpu 0000:17:00.0: [drm] fb1: amdgpudrmfb frame buffer device
[   14.744927] snd_hda_intel 0000:17:00.1: bound 0000:17:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
[   25.551598] amdgpu 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[   25.551602] amdgpu 0000:17:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none

Also, I don’t think Wayland will make any difference, as it “grabs” the screen already in the bootloader.
When the cable is connected, the 6900XT is initialized as the primary display immediately after the POST beep.

Well, look at the timestamps. As I thought, amdgpu loads first (at about 5.03 s, while vfio only shows up at about 8.08 s). That is the real problem. This blog post seems to address your problem to some degree: Kernel 6.X: Prevent loading any other driver than vfio-pci in order to passthrough graphic card. | Szymon Niedźwiedź
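
The load order can be checked straight from the log. The following is only a minimal sketch: `first_seen` is a hypothetical helper name (not part of any tool), and it assumes dmesg-style lines with a leading `[ seconds ]` timestamp. It skips the echoed kernel command line, because that line mentions both drivers at time 0.

```shell
#!/bin/sh
# first_seen PATTERN
# Reads dmesg-style lines on stdin and prints the timestamp of the first
# line that matches PATTERN (give the pattern in lowercase).
# Lines that merely echo the kernel command line are ignored.
first_seen() {
    awk -v pat="$1" '
        /[Cc]ommand line:/ { next }       # skip the echoed boot parameters
        tolower($0) ~ pat {
            line = $0
            sub(/^\[ */, "", line)        # drop the opening "[" and padding
            sub(/\].*/, "", line)         # drop everything after the timestamp
            print line
            exit
        }'
}
```

Usage would be something like `sudo dmesg | first_seen amdgpu` followed by `sudo dmesg | first_seen vfio`. If the first number is smaller, amdgpu got to the card before vfio-pci was even loaded.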

You need to create an extra hook that prevents amdgpu from claiming the output. It is written for Arch Linux, but as far as I can see it applies 1:1 to Manjaro as well.

I have seen that, but it looks too complicated… Not just to do it once, but to redo it every time I upgrade my system, which is relatively often. However, my understanding is that this is either a kernel bug, or it happens because the iGPU and the dGPU both use the amdgpu module, so it is loaded for both cards.

Also, I am not 100% sure it is related to my problem, as my setup works without the cable. And this is what confuses me. I mean, the dGPU is there, installed… What difference does the cable make? Why is amdgpu loaded early when the cable is connected?

Complicated? Look… the only thing that needs to be changed is DEVS=; everything else is static. However, I know of no other workaround, and I believe other distros will have the same problem when using Linux 6.x.
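
To show why only DEVS= is machine-specific, here is a rough sketch of what such an early-boot hook typically does. It is not the blog's script. It assumes the hook works through the kernel's `driver_override` sysfs attribute, which is a common approach. `0000:03:00.0` is the 6900XT address from the logs above. `0000:03:00.1` for its HDMI audio function is an assumption, and `SYSFS_ROOT` is a hypothetical knob that only exists so the function can be dry-run against a fake directory tree.

```shell
#!/bin/sh
# Per-machine setting: the PCI functions to reserve for vfio-pci.
# 0000:03:00.0 comes from the logs; 0000:03:00.1 is an assumption.
DEVS="${DEVS:-0000:03:00.0 0000:03:00.1}"
# Hypothetical override for dry runs; on a real system this stays /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Mark every function in DEVS so that only vfio-pci is allowed to bind to it.
set_vfio_override() {
    for dev in $DEVS; do
        node="$SYSFS_ROOT/bus/pci/devices/$dev/driver_override"
        if [ -e "$node" ]; then
            echo "vfio-pci" > "$node"
            echo "override set: $dev"
        else
            echo "skipped (no such device): $dev" >&2
        fi
    done
}
```

In a real hook, `set_vfio_override` would run as root before amdgpu is loaded, followed by `modprobe -i vfio-pci`. The ordering is the whole point: once amdgpu has bound to the card, writing the override does not unbind it.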

Why are there two HW IDs for the 6900XT VGA device? (not the HDMI audio)

03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] [1002:73bf] (rev c0) (prog-if 00 [VGA controller])
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Radeon RX 6900 XT [1002:0e3a]

1002:73bf:1002:0e3a

vendor:device:subvendor:subdevice

The vendor is the same here; usually the device and subdevice IDs differ. The subvendor ID often identifies a specific, customized device from a specific manufacturer.

In your case, it comes directly from AMD (Advanced Micro Devices).

Like that:

00:02.0 VGA compatible controller [0300]: Intel Corporation CoffeeLake-S GT2 [UHD Graphics 630] [8086:3e92]
	DeviceName: Onboard IGD
	Subsystem: Hewlett-Packard Company CoffeeLake-S GT2 [UHD Graphics 630] [103c:845a]
	Kernel driver in use: i915
	Kernel modules: i915

The chip comes from Intel, but it is used by HP. The device does not need to be modified substantially; it could be just the name, but the subvendor ID and subdevice ID change.
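
The same four IDs can also be read from sysfs instead of lspci. This is a small sketch under assumptions: `pci_ids` is a made-up helper name, and `SYSFS_ROOT` is a hypothetical override used only for dry runs. The attribute files `vendor`, `device`, `subsystem_vendor` and `subsystem_device` are the standard ones under `/sys/bus/pci/devices/<address>/`.

```shell
#!/bin/sh
# pci_ids ADDR
# Prints vendor:device:subvendor:subdevice for one PCI function,
# e.g. ADDR = 0000:03:00.0. SYSFS_ROOT defaults to /sys.
pci_ids() {
    dir="${SYSFS_ROOT:-/sys}/bus/pci/devices/$1"
    for attr in vendor device subsystem_vendor subsystem_device; do
        read -r val < "$dir/$attr" || return 1   # files hold e.g. "0x1002"
        printf '%s' "${val#0x}"                  # strip the 0x prefix
        [ "$attr" = subsystem_device ] || printf ':'
    done
    printf '\n'
}
```

Only the first pair (vendor:device) is what `vfio-pci.ids=` matches on. The subsystem pair just tells you who built the board.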

Ok, I tried it, but it didn’t work…

I followed the steps and made all the modifications, but I end up without a desktop environment when the cable is connected. It works fine when it is not.

My command line params:

GRUB_CMDLINE_LINUX_DEFAULT="debug resume=UUID=eb319a47-23e2-4b2b-ad27-4924407771e0 udev.log_priority=3 amd_iommu=force_enable iommu=pt hugepages=16384 systemd.unified_cgroup_hierarchy=1 kvm.ignore_msrs=1 pcie_acs_override=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1"

I removed the IDs from here, as the blog instructs.

And then, I end up here:

I pressed Ctrl+Alt+F2 and ran “lspci -nvv”, and as you can see, the vfio-pci driver is loaded…
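
For a quicker check than reading the full lspci output, the bound driver can be read from the `driver` symlink in sysfs. A minimal sketch, assuming the 6900XT sits at `0000:03:00.0` as in the logs above. `bound_driver` is a made-up helper name, and `SYSFS_ROOT` is a hypothetical override for dry runs.

```shell
#!/bin/sh
# bound_driver ADDR
# Prints the name of the kernel driver currently bound to a PCI function,
# or "none" if no driver has claimed it. SYSFS_ROOT defaults to /sys.
bound_driver() {
    link="${SYSFS_ROOT:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"   # the symlink target ends in the driver name
    else
        echo "none"
    fi
}
```

`bound_driver 0000:03:00.0` should print `vfio-pci` when the isolation worked and `amdgpu` when it did not.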

Well… I see progress here. Now the vfio-pci driver claims the device as intended, even when the cable is connected.

What does this “Virtualization daemon” do? That is now the part where it gets stuck. :wink: