Stuck on clean with free drivers using a Virtual Machine

Hi there,

So recently I have been testing out pci passthrough with this guide:

https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF
(don’t know if that has anything to do with my problem but posting it anyway)

and I’ve heard somewhere that the non-free drivers won’t work with that, which seems to be reflected in the output of sudo dmesg | grep -i vfio below:

[    2.760850] VFIO - User Level meta-driver version: 0.3
[    2.764723] vfio-pci 0000:26:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[    2.779032] vfio_pci: add [10de:13c2[ffffffff:ffffffff]] class 0x000000/00000000
[    2.795792] vfio_pci: add [10de:0fbb[ffffffff:ffffffff]] class 0x000000/00000000

…when booting with the non-free drivers installed (the output doesn’t include the enable line).
So now I’ve tried switching to the free drivers, and the system won’t boot correctly.
It’s stuck at clean, but my WiFi hotspot miraculously turns on during that phase - I don’t know how that works.

I’ve been working on getting my setup to work for a few days now, so here is what I have already tried on my own, in case mentioning it helps:

  • I’m running 2 GPUs on 2 monitors, so I tried switching the labels around with the BusID in the config files located in the X11 folders, to make Manjaro display things via my second GPU, because neither nvidia-settings nor the display settings could activate any connections on the second GPU (works now, I don’t know why or how)
  • a few months ago I already ran through the whole installation, hoping I could ignore an error I had (I’m unsure about the cause because I can’t remember it well any more)

I think I’ve narrowed down my problem to my graphics driver keeping my machine from booting, so I’m posting here.

I really hope that someone can help me - I don’t want to run Windows outside of a VM any more

System:
  Kernel: 5.12.19-1-MANJARO x86_64 bits: 64 compiler: gcc v: 11.1.0 
  parameters: BOOT_IMAGE=/boot/vmlinuz-5.12-x86_64 
  root=UUID=a27f0baf-14fa-41db-844a-9ff44f8c8f6f rw quiet apparmor=1 
  security=apparmor resume=UUID=d0dac5d0-5548-41e7-8572-14de99a109cc 
  udev.log_priority=3 iommu=pt 
  Desktop: Xfce 4.16.0 tk: Gtk 3.24.29 info: xfce4-panel wm: xfwm 4.16.1 vt: 7 
  dm: LightDM 1.30.0 Distro: Manjaro Linux base: Arch Linux 
Machine:
  Type: Desktop System: Micro-Star product: MS-7B86 v: 4.0 serial: <filter> 
  Mobo: Micro-Star model: B450-A PRO MAX (MS-7B86) v: 4.0 serial: <filter> 
  BIOS: American Megatrends LLC. v: M.D1 date: 04/12/2021 
Battery:
  Device-1: hidpp_battery_0 model: Logitech Wireless Mouse MX Master 2S 
  serial: <filter> charge: 100% (should be ignored) rechargeable: yes 
  status: Discharging 
Memory:
  RAM: total: 15.55 GiB used: 4.97 GiB (31.9%) 
  RAM Report: permissions: Unable to run dmidecode. Root privileges required. 
CPU:
  Info: 8-Core model: AMD Ryzen 7 3700X bits: 64 type: MT MCP arch: Zen 2 
  family: 17 (23) model-id: 71 (113) stepping: 0 microcode: 8701021 cache: 
  L2: 4 MiB bogomips: 115232 
  Speed: 4328 MHz min/max: 2200/3600 MHz boost: enabled Core speeds (MHz): 
  1: 4328 2: 2175 3: 2186 4: 3750 5: 2200 6: 2181 7: 2179 8: 3828 9: 2177 
  10: 4192 11: 2167 12: 2136 13: 2199 14: 2454 15: 4353 16: 2178 
  Flags: 3dnowprefetch abm adx aes aperfmperf apic arat avic avx avx2 bmi1 
  bmi2 bpext cat_l3 cdp_l3 clflush clflushopt clwb clzero cmov cmp_legacy 
  constant_tsc cpb cpuid cqm cqm_llc cqm_mbm_local cqm_mbm_total cqm_occup_llc 
  cr8_legacy cx16 cx8 de decodeassists extapic extd_apicid f16c flushbyasid 
  fma fpu fsgsbase fxsr fxsr_opt ht hw_pstate ibpb ibs irperf lahf_lm lbrv lm 
  mba mca mce misalignsse mmx mmxext monitor movbe msr mtrr mwaitx nonstop_tsc 
  nopl npt nrip_save nx osvw overflow_recov pae pat pausefilter pclmulqdq 
  pdpe1gb perfctr_core perfctr_llc perfctr_nb pfthreshold pge pni popcnt pse 
  pse36 rdpid rdpru rdrand rdseed rdt_a rdtscp rep_good sep sev sev_es sha_ni 
  skinit smap smca sme smep ssbd sse sse2 sse4_1 sse4_2 sse4a ssse3 stibp 
  succor svm svm_lock syscall tce topoext tsc tsc_scale umip v_vmsave_vmload 
  vgif vmcb_clean vme vmmcall wbnoinvd wdt xgetbv1 xsave xsavec xsaveerptr 
  xsaveopt xsaves 
  Vulnerabilities: Type: itlb_multihit status: Not affected 
  Type: l1tf status: Not affected 
  Type: mds status: Not affected 
  Type: meltdown status: Not affected 
  Type: spec_store_bypass 
  mitigation: Speculative Store Bypass disabled via prctl and seccomp 
  Type: spectre_v1 
  mitigation: usercopy/swapgs barriers and __user pointer sanitization 
  Type: spectre_v2 mitigation: Full AMD retpoline, IBPB: conditional, STIBP: 
  conditional, RSB filling 
  Type: srbds status: Not affected 
  Type: tsx_async_abort status: Not affected 
Graphics:
  Device-1: NVIDIA GP107 [GeForce GTX 1050] 
  vendor: PC Partner Limited / Sapphire driver: nvidia v: 470.63.01 
  alternate: nouveau,nvidia_drm bus-ID: 25:00.0 chip-ID: 10de:1c81 
  class-ID: 0300 
  Device-2: NVIDIA GM204 [GeForce GTX 970] vendor: Gigabyte driver: vfio-pci 
  v: 0.2 alternate: nouveau,nvidia_drm,nvidia bus-ID: 26:00.0 
  chip-ID: 10de:13c2 class-ID: 0300 
  Display: x11 server: X.Org 1.20.13 compositor: xfwm4 v: 4.16.1 driver: 
  loaded: vfio-pci note: n/a (using device driver) failed: nvidia 
  display-ID: :0.0 screens: 1 
  Screen-1: 0 s-res: 3840x1351 s-dpi: 96 s-size: 1016x357mm (40.0x14.1") 
  s-diag: 1077mm (42.4") 
  Monitor-1: HDMI-0 res: 1920x1080 hz: 60 dpi: 92 size: 531x298mm (20.9x11.7") 
  diag: 609mm (24") 
  Monitor-2: DP-1 res: 1920x1080 hz: 60 dpi: 92 size: 531x299mm (20.9x11.8") 
  diag: 609mm (24") 
  OpenGL: renderer: NVIDIA GeForce GTX 1050/PCIe/SSE2 
  v: 4.6.0 NVIDIA 470.63.01 direct render: Yes 
Audio:
  Device-1: NVIDIA GP107GL High Definition Audio 
  vendor: PC Partner Limited / Sapphire driver: snd_hda_intel v: kernel 
  bus-ID: 25:00.1 chip-ID: 10de:0fb9 class-ID: 0403 
  Device-2: NVIDIA GM204 High Definition Audio vendor: Gigabyte 
  driver: vfio-pci v: 0.2 alternate: snd_hda_intel bus-ID: 26:00.1 
  chip-ID: 10de:0fbb class-ID: 0403 
  Device-3: AMD Starship/Matisse HD Audio vendor: Micro-Star MSI 
  driver: snd_hda_intel v: kernel bus-ID: 28:00.4 chip-ID: 1022:1487 
  class-ID: 0403 
  Sound Server-1: ALSA v: k5.12.19-1-MANJARO running: yes 
  Sound Server-2: JACK v: 1.9.19 running: no 
  Sound Server-3: PulseAudio v: 15.0 running: yes 
  Sound Server-4: PipeWire v: 0.3.34 running: yes 
Network:
  Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet 
  vendor: Micro-Star MSI driver: r8169 v: kernel port: e000 bus-ID: 22:00.0 
  chip-ID: 10ec:8168 class-ID: 0200 
  IF: enp34s0 state: up speed: 1000 Mbps duplex: full mac: <filter> 
  IP v4: <filter> type: dynamic noprefixroute scope: global 
  broadcast: <filter> 
  IP v6: <filter> type: dynamic noprefixroute scope: global 
  IP v6: <filter> type: dynamic noprefixroute scope: global 
  IP v6: <filter> type: noprefixroute scope: link 
  Device-2: Realtek RTL8812AU 802.11a/b/g/n/ac 2T2R DB WLAN Adapter type: USB 
  driver: rtl8812au bus-ID: 3-2:2 chip-ID: 0bda:8812 class-ID: 0000 
  serial: <filter> 
  IF: wlp40s0f3u2 state: up mac: <filter> 
  IP v4: <filter> type: noprefixroute scope: global broadcast: <filter> 
  IP v6: <filter> type: noprefixroute scope: link 
  IF-ID-1: virbr0 state: down mac: <filter> 
  IP v4: <filter> scope: global broadcast: <filter> 
  WAN IP: <filter> 
Bluetooth:
  Message: No bluetooth data found. 
Logical:
  Message: No logical block device data found. 
RAID:
  Message: No RAID data found. 
Drives:
  Local Storage: total: 3.87 TiB used: 1.26 TiB (32.5%) 
  SMART Message: Required tool smartctl not installed. Check --recommends 
  ID-1: /dev/sda maj-min: 8:0 vendor: Samsung model: SSD 860 QVO 1TB 
  size: 931.51 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s 
  type: SSD serial: <filter> rev: 2B6Q scheme: MBR 
  ID-2: /dev/sdb maj-min: 8:16 vendor: Samsung model: SSD 870 QVO 2TB 
  size: 1.82 TiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s 
  type: SSD serial: <filter> rev: 1B6Q scheme: GPT 
  ID-3: /dev/sdc maj-min: 8:32 vendor: Samsung model: SSD 850 EVO 250GB 
  size: 232.89 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s 
  type: SSD serial: <filter> rev: 2B6Q scheme: MBR 
  ID-4: /dev/sdd maj-min: 8:48 vendor: Western Digital model: WD10EZEX-00WN4A0 
  size: 931.51 GiB block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s 
  type: HDD rpm: 7200 serial: <filter> rev: 1A01 scheme: MBR 
  Message: No optical or floppy data found. 
Partition:
  ID-1: / raw-size: 922.71 GiB size: 907.15 GiB (98.31%) 
  used: 525.94 GiB (58.0%) fs: ext4 dev: /dev/sda1 maj-min: 8:1 label: N/A 
  uuid: a27f0baf-14fa-41db-844a-9ff44f8c8f6f 
  ID-2: /run/media/jonas/5757cf41-d650-4984-9e58-01ab341fba59 
  raw-size: 1.82 TiB size: 1.79 TiB (98.37%) used: 756.94 GiB (41.3%) fs: ext4 
  dev: /dev/sdb1 maj-min: 8:17 label: N/A 
  uuid: 5757cf41-d650-4984-9e58-01ab341fba59 
Swap:
  Kernel: swappiness: 60 (default) cache-pressure: 100 (default) 
  ID-1: swap-1 type: partition size: 8.8 GiB used: 2.76 GiB (31.4%) 
  priority: -2 dev: /dev/sda2 maj-min: 8:2 label: N/A 
  uuid: d0dac5d0-5548-41e7-8572-14de99a109cc 
Unmounted:
  ID-1: /dev/sdc1 maj-min: 8:33 size: 50 MiB fs: ntfs label: System Reserved 
  uuid: DCC8CC0BC8CBE23E 
  ID-2: /dev/sdc2 maj-min: 8:34 size: 232.35 GiB fs: ntfs label: N/A 
  uuid: 440ACCBD0ACCACEC 
  ID-3: /dev/sdc3 maj-min: 8:35 size: 499 MiB fs: ntfs label: N/A 
  uuid: 602278FB2278D80C 
  ID-4: /dev/sdd1 maj-min: 8:49 size: 931.51 GiB fs: ntfs 
  label: Lokaler Datentr\xc3\xa4ger uuid: 40ACC3BCACC3AB2C 
USB:
  Hub-1: 1-0:1 info: Full speed (or root) Hub ports: 10 rev: 2.0 
  speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900 
  Device-1: 1-1:2 info: Logitech Unifying Receiver type: Keyboard,Mouse,HID 
  driver: logitech-djreceiver,usbhid interfaces: 3 rev: 2.0 speed: 12 Mb/s 
  power: 98mA chip-ID: 046d:c52b class-ID: 0300 
  Device-2: 1-2:3 info: ROCCAT Ryos MK Glow Keyboard type: Keyboard,HID 
  driver: ryos,usbhid interfaces: 2 rev: 2.0 speed: 12 Mb/s power: 500mA 
  chip-ID: 1e7d:31ce class-ID: 0300 
  Device-3: 1-4:27 info: QinHeng CH340 serial converter 
  type: <vendor specific> driver: ch341,ch341-uart interfaces: 1 rev: 1.1 
  speed: 12 Mb/s power: 98mA chip-ID: 1a86:7523 class-ID: ff00 
  Device-4: 1-8:18 info: Lakeview Research Saleae Logic 
  type: <vendor specific> driver: usbfs interfaces: 1 rev: 2.0 speed: 480 Mb/s 
  power: 100mA chip-ID: 0925:3881 class-ID: ff00 serial: <filter> 
  Hub-2: 2-0:1 info: Full speed (or root) Hub ports: 4 rev: 3.1 speed: 10 Gb/s 
  chip-ID: 1d6b:0003 class-ID: 0900 
  Hub-3: 3-0:1 info: Full speed (or root) Hub ports: 4 rev: 2.0 
  speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900 
  Device-1: 3-2:2 
  info: Realtek RTL8812AU 802.11a/b/g/n/ac 2T2R DB WLAN Adapter type: Network 
  driver: rtl8812au interfaces: 1 rev: 2.0 speed: 480 Mb/s power: 500mA 
  chip-ID: 0bda:8812 class-ID: 0000 serial: <filter> 
  Hub-4: 4-0:1 info: Full speed (or root) Hub ports: 4 rev: 3.1 speed: 10 Gb/s 
  chip-ID: 1d6b:0003 class-ID: 0900 
Sensors:
  System Temperatures: cpu: 53.8 C mobo: N/A gpu: nvidia temp: 40 C 
  Fan Speeds (RPM): N/A gpu: nvidia fan: 49% 
Info:
  Processes: 379 Uptime: 4h 18m wakeups: 6 Init: systemd v: 248 
  tool: systemctl Compilers: gcc: 11.1.0 clang: 12.0.1 Packages: 1247 
  pacman: 1238 lib: 330 flatpak: 9 Shell: Bash v: 5.1.8 
  running-in: xfce4-terminal inxi: 3.3.06

:+1: Welcome to Manjaro! :+1:

  1. Please read this:
    How to provide good information
    and press the three dots below your post and press the :pencil2: to give us more information so we can see what’s really going on.
    Now we know the symptom of the disease, but we need some more probing to know where the origin lies… :grin:
  2. An inxi --admin --verbosity=7 --filter --no-host --width would be the minimum required information for us to be able to help you. (Personally Identifiable Information like serial numbers and MAC addresses will be filtered out by the above command)
    Also, please copy-paste that output in-between 3 backticks ``` at the beginning and end of the code/text.
  3. Just to be clear: you’re trying to do a GPU passthrough on Manjaro as Host and Windows as guest, using which VM software?
  4. Please post the contents of /etc/X11/mhwd.d/nvidia.conf

:+1:

# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig:  version 470.63.01

# nvidia-settings: X configuration file generated by nvidia-settings
# nvidia-settings:  version 470.63.01

Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0" 0 0
    Screen      1  "Screen1" RightOf "Screen0"
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
    Option         "Xinerama" "0"
EndSection

Section "Files"
EndSection

Section "InputDevice"

    # generated from default
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/psaux"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"

    # generated from default
    Identifier     "Keyboard0"
    Driver         "kbd"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "BenQ BL2405"
    HorizSync       30.0 - 83.0
    VertRefresh     50.0 - 76.0
    Option         "DPMS"
EndSection

Section "Monitor"
    Identifier     "Monitor1"
    VendorName     "Unknown"
    ModelName      "Acer G246HL"
    HorizSync       30.0 - 80.0
    VertRefresh     55.0 - 76.0
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "NVIDIA GeForce GTX 970"
    BusID          "PCI:37:0:0"
	Option "NoLogo" "1"
EndSection

Section "Device"
    Identifier     "Device1"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "NVIDIA GeForce GTX 1050"
    BusID          "PCI:38:0:0"
	Option "NoLogo" "1"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "Stereo" "0"
    Option         "nvidiaXineramaInfoOrder" "DFP-4"
    Option         "metamodes" "DVI-I-1: nvidia-auto-select +0+0, DVI-D-0: nvidia-auto-select +1920+271"
    Option         "SLI" "Off"
    Option         "MultiGPU" "Off"
    Option         "BaseMosaic" "off"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

Section "Screen"
    Identifier     "Screen1"
    Device         "Device1"
    Monitor        "Monitor1"
    DefaultDepth    24
    Option         "Stereo" "0"
    Option         "metamodes" "nvidia-auto-select +0+0 {AllowGSYNC=Off}"
    Option         "SLI" "Off"
    Option         "MultiGPU" "Off"
    Option         "BaseMosaic" "off"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

Section "Extensions"
    Option         "COMPOSITE" "Enable"
EndSection

 
Section "InputClass"
    Identifier          "Keyboard Defaults"
    MatchIsKeyboard        "yes"
    Option              "XkbOptions" "terminate:ctrl_alt_bksp"
EndSection

Yes, Manjaro host with Windows guest.
I will be using Virtual Machine Manager, but right now I’m just trying to boot with only the video-linux drivers installed.

I just saw that I didn’t switch the BoardName around when I switched the BusID of the two devices.
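
One thing that makes these entries easy to mix up: inxi and lspci print the bus number in hexadecimal, while the BusID in an Xorg config is normally written in decimal. A minimal sketch of the conversion, using the two addresses from the inxi output above; the function name `to_xorg_busid` is made up for this example and is not an existing tool:

```shell
#!/bin/sh
# Convert an lspci/inxi style address (hex "bus:device.function") into the
# decimal "PCI:bus:device:function" form used for BusID in an Xorg config.
# to_xorg_busid is a hypothetical helper name.
to_xorg_busid() {
    local addr=${1#0000:}   # drop an optional "0000:" PCI domain prefix
    local bus=${addr%%:*}   # text before the first ":"
    local rest=${addr#*:}   # "device.function"
    local dev=${rest%%.*}
    local fn=${rest##*.}
    # printf reads arguments with a 0x prefix as hexadecimal numbers
    printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$fn"
}

to_xorg_busid 25:00.0   # GTX 1050 according to inxi
to_xorg_busid 26:00.0   # GTX 970 according to inxi
```

With those addresses, 25:00.0 comes out as PCI:37:0:0 and 26:00.0 as PCI:38:0:0, so in the config above PCI:37:0:0 would be the GTX 1050 and PCI:38:0:0 the GTX 970, regardless of what the BoardName line says.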

Kernel 5.12 is EOL so please install both 5.4 and 5.10 LTS (Long Term Support) and see which one of the two troubles you least.

So the GTX 970 is for the VM, right? And that is not hooked up to anything, right?

:thinking:


Will try both kernels now, thanks for the next steps.

Yes.

My monitor is occasionally hooked up to my GTX 970 via DVI because I have to switch GPUs to navigate through my BIOS. I haven’t tried disconnecting it from all monitors during boot after accessing my BIOS though, nor do I plan on doing that in the future. The best-case scenario is that both Manjaro and the VM get fired up at some point during bootup and I can just switch between them using my monitor settings.


Okay, I’m done testing.

5.10 was the same as 5.12 and 5.14:

  • did not get past clean, but I could still access a terminal with Alt + F2 and switch my graphics drivers back to non-free

5.4, however, was not the same:

  • I encountered nothing but a black screen and could not access the terminal, and could not even navigate through it blindly

Good, so now that you’re back on non-free, does the VM start?

:thinking:

No, it does not :confused:

The CPU usage graph has a short spike and then goes flat, with no picture on DVI.

Hello @Shoxx98 :wink:

Let me summarize…

This is the GPU for the host:

and this is the one for the VM:

and it failed to load the nvidia driver and loaded vfio-pci instead:

OK, that looks normal:

But why did you add the GPU that should be used for the VM to the Xorg config?

I mean, the GPU should not be configured by the host and should just be passed through to the VM, or what is your intention? You can’t configure a GPU for the host and pass it through to the VM. Not possible.


Oh right, so it trying to load the non-free drivers might not even be the issue, as long as it loads the vfio drivers instead.

It would give me a black screen if I didn’t label the 1050 as Device0.

The board names are not in order.
The 970 is PCI:38:0:0.

I think I had to add a second device myself. That is how I selected my “primary GPU” for boot.

If I don’t do that, Manjaro will boot trying to display something with the GPU closer to my CPU, which is the 970, and just black-screen instead of using the 1050 at all.

It could be, however, that the Device1 entry is obsolete, as you said, if Device0 is the only one that gets used anyway.
I will try deleting that tomorrow.


Hey,
Just a note: The primary BIOS GPU, the one connected on the main port, will always be initialized first, hence xorg will try to match that. If you want to passthrough the NVIDIA GeForce GTX 1050 that you defined as Device1 in nvidia.conf, but is actually Device-1 by inxi, it means is on the primary port of the mainboard.
Defining the GeForce GTX 970 as Device0 doesn’t quite work because initializes last, reason why inxi will show it as Device-2. At least this is how i understand it.
Switch the GPU slots, regenerate the nvidia.conf and then add the one you want to passthrough.
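
If you want to check which card the firmware actually initialized as the boot display, rather than inferring it from the inxi ordering, the kernel exposes a boot_vga attribute for VGA devices in sysfs. A rough sketch, assuming a POSIX shell; `boot_vga_devices` is a name made up here, and the optional argument only exists so the function can be pointed at a different directory:

```shell
#!/bin/sh
# Print the PCI address of every device whose boot_vga attribute is 1,
# i.e. the GPU the firmware used as the primary display at boot.
# boot_vga_devices is a hypothetical helper name.
boot_vga_devices() {
    local root=${1:-/sys/bus/pci/devices}
    local f dev
    for f in "$root"/*/boot_vga; do
        [ -r "$f" ] || continue          # no VGA devices found
        if [ "$(cat "$f")" = "1" ]; then
            dev=${f%/boot_vga}           # strip the attribute name
            echo "${dev##*/}"            # keep only the PCI address
        fi
    done
}

boot_vga_devices
```

The address it prints can be compared with the bus-ID values in the inxi output to see which of the two cards is the primary one.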

Yeah, there are 2 problems with that:

  1. my case isn’t big enough
  2. my IOMMU group for my second GPU looks like this:
IOMMU Group 13:
	03:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller [1022:43d5] (rev 01)
	03:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller [1022:43c8] (rev 01)
	03:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Bridge [1022:43c6] (rev 01)
	20:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
	20:01.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
	20:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
	22:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 15)
	25:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107 [GeForce GTX 1050] [10de:1c81] (rev a1)
	25:00.1 Audio device [0403]: NVIDIA Corporation GP107GL High Definition Audio Controller [10de:0fb9] (rev a1)

and I don’t know how to do the IOMMU regrouping thing with a Manjaro kernel.
As far as I understand it, linux-vfio is built on the standard Arch kernel, so there could be compatibility issues, and they make it quite scary to use because you have to “READ THE WIKI AND UNDERSTAND HOW TO USE MAKEPKG AND EVERYTHING IT ENTAILS”.
And I don’t know how to build a kernel with the ACS patch or Zen/Liquorix myself. I haven’t found a guide to use yet, but I didn’t really search for one either.

I would be fine with just passing through the 1050 for now and then doing some modding to my PC later, with riser cables and things like that, to connect my 970, but separating the IOMMU group members would have to come first.
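
For anyone following along, a listing like the one above can be produced by walking /sys/kernel/iommu_groups, much like the script in the guide linked in the first post. This is only a sketch, assuming a POSIX shell; `list_iommu_groups` and its optional directory argument are illustrative names, not part of any package:

```shell
#!/bin/sh
# List every IOMMU group and the PCI devices inside it.
# list_iommu_groups is a hypothetical helper; the optional argument lets you
# point it at a directory other than the real sysfs tree.
list_iommu_groups() {
    local root=${1:-/sys/kernel/iommu_groups}
    local group dev addr desc
    for group in $(ls "$root" 2>/dev/null | sort -n); do
        echo "IOMMU Group $group:"
        for dev in "$root/$group"/devices/*; do
            [ -e "$dev" ] || continue       # skip groups without devices
            addr=${dev##*/}                 # e.g. 0000:25:00.0
            # Use lspci for a readable description when it is available
            desc=$(lspci -nns "$addr" 2>/dev/null)
            printf '\t%s\n' "${desc:-$addr}"
        done
    done
}

list_iommu_groups
```

In the listing above the GTX 1050 shares group 13 with the chipset USB, SATA and Ethernet controllers, which is why the ACS override comes up at all: everything in a group normally has to be handed to the guest together.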

update:

  • regenerated the nvidia.conf. After I edited it the first time, I reinstalled the drivers a couple of times and I believe the BusIDs got switched back in the process, so the 970 is probably BusID 37? I don’t know any more, honestly
  • this is my setup now:
    GRUB_CMDLINE_LINUX_DEFAULT="quiet apparmor=1 security=apparmor resume=UUID=d0dac5d0-5548-41e7-8572-14de99a109cc udev.log_priority=3 iommu=pt" in /etc/default/grub (if I got that correctly, there should be a pcie_acs_override=... option here in the final setup)
  • MODULES="" in mkinitcpio.conf

and

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "NVIDIA GeForce GTX 970"
    BusID          "PCI:37:0:0"
	Option "NoLogo" "1"
EndSection

Section "Device"
    Identifier     "Device1"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "NVIDIA GeForce GTX 1050"
    BusID          "PCI:38:0:0"
	Option "NoLogo" "1"
EndSection

in nvidia.conf
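
For comparison, the guide linked in the first post binds the passthrough card to vfio-pci by its vendor:device IDs, for example through a vfio-pci.ids= kernel parameter next to iommu=pt. A small sketch of how that parameter is put together, using the GTX 970 IDs from the dmesg output in the first post; `vfio_ids_param` is a name made up here, and whether the 970 is still the card to pass through is an assumption:

```shell
#!/bin/sh
# Build a vfio-pci.ids= kernel parameter from a list of vendor:device IDs.
# vfio_ids_param is a hypothetical helper name.
vfio_ids_param() {
    local IFS=,                 # "$*" joins the arguments with the first IFS character
    echo "vfio-pci.ids=$*"
}

# GTX 970 video and audio functions, as shown by dmesg in the first post
vfio_ids_param 10de:13c2 10de:0fbb
```

The resulting string would go inside GRUB_CMDLINE_LINUX_DEFAULT, followed by regenerating the GRUB config. The vfio modules also have to be loaded early enough to claim the card before the nvidia driver does, so check the guide for the MODULES entries that match your kernel.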

So I’m just testing the vfio kernel now and will see what works, because:

I’ve found this thread
https://www.reddit.com/r/VFIO/comments/kdelvx/need_some_help_with_the_acs_patch_on_manjaro_5818/
which describes using the patches from the vfio kernel linux-vfio to build my own Manjaro kernel.
I want to use the LTS kernel, but for now I just used the add-acs-overrides.patch file from linux-vfio (the 5.13 one) and tested the built kernel.
The patch file is the same according to https://text-compare.com/
On bootup, after clean, it tells me that it failed to start load kernel module.
I will try the LTS patches now:
0001-ZEN-Add-sysctl-and-CONFIG-to-disallow-unprivileged-C.patch
0002-gcc-plugins-modern-gcc-plugin-infrastructure-requres.patch
together with:
add-acs-overrides.patch
like so in the PKGBUILD:

source=("https://www.kernel.org/pub/linux/kernel/v5.x/linux-${_basekernel}.tar.xz"
        "https://www.kernel.org/pub/linux/kernel/v5.x/patch-${pkgver}.xz"
        # the main kernel config files
        'config' 'config.anbox'
        # ARCH Patches
        '0001-ZEN-Add-sysctl-and-CONFIG-to-disallow-unprivileged-CLONE_NEWUSER.patch'
        '0002-HID-quirks-Add-Apple-Magic-Trackpad-2-to-hid_have_special_driver-list.patch'
        # Temp Fixes
        # MANJARO Patches
        '0101-i2c-nuvoton-nc677x-hwmon-driver.patch'
        '0102-iomap-iomap_bmap-should-accept-unwritten-maps.patch'
        '0103-futex.patch'
        '0104-revert-xhci-Add-support-for-Renesas-controller-with-memory.patch'
        # '0105-ucsi-acpi.patch'
        # '0106-ucsi.patch'
        '0107-quirk-kernel-org-bug-210681-firmware_rome_error.patch'
        # Lenovo + AMD
        '0302-lenovo-wmi2.patch'
        # Bootsplash
        '0401-revert-fbcon-remove-now-unusued-softback_lines-cursor-argument.patch'        
        '0402-revert-fbcon-remove-no-op-fbcon_set_origin.patch'
        '0403-revert-fbcon-remove-soft-scrollback-code.patch'
        '0501-bootsplash.patch'
        '0502-bootsplash.patch'
        '0503-bootsplash.patch'
        '0504-bootsplash.patch'
        '0505-bootsplash.patch'
        '0506-bootsplash.patch'
        '0507-bootsplash.patch'
        '0508-bootsplash.patch'
        '0509-bootsplash.patch'
        '0510-bootsplash.patch'
        '0511-bootsplash.patch'
        '0512-bootsplash.patch'
        '0513-bootsplash.gitpatch'
        '0001-ZEN-Add-sysctl-and-CONFIG-to-disallow-unprivileged-C.patch'
        '0002-gcc-plugins-modern-gcc-plugin-infrastructure-requres.patch'
        'add-acs-overrides.patch'
        )

That probably will not work, because the important files are identical, I think.

update:

  • had to delete the 001-ZEN... line because it would throw an error during the build, and I am building now (no edit for that because I made some progress on it while writing this update)
    I did my best deciding on the right options for the gcc infrastructure patch and will just see what comes out at the end.

I’ve lost track of which one you want to pass through to the VM, but if it’s the 970, that section should be removed from there, as otherwise it will be in use by the host and cannot be passed through to the guest, as MegaVolt tried to tell you:

:face_with_monocle:

update:

  • second try failed, but no kernel module loading error this time
  • also tried to just run linux-vfio-lts in the process, and that didn’t get past clean (it probably didn’t have the setup described earlier, since I tried linux-vfio before posting that description)
  • tried building zen via Add/Remove Software, didn’t work.

Okay, so I’ll just delete Device0?


Since, apparently, I should not use the second GPU on my host anyway, according to bogdancovaciu, I will have to pass through my 1050 now, which sits in slot 2 (the lower one).
I cannot use this GPU because Manjaro doesn’t let me.
I have it set up like this:
The displays will not activate, not even without passthrough enabled!
“Apply what is possible” doesn’t do anything here.

I switched the Identifier around because that would let Manjaro use my other GPU.
I just let the 970 stay in there; there was no conscious decision to remove it or to leave it in. It just worked like that for now and I let it be.
I could probably have left it out of the configs entirely and it would’ve been fine, according to all of you.

I’m honestly lost right now and don’t know what to do next.
All of you just care about my nvidia config, but I really feel like it does hardly anything, if anything at all, since this is the config file with the same setup as above (I saved the config file beforehand using the nvidia GUI):

Section "Monitor"
    # HorizSync source: edid, VertRefresh source: edid
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Acer G246HL"
    HorizSync       30.0 - 80.0
    VertRefresh     55.0 - 76.0
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "NVIDIA GeForce GTX 1050"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "Stereo" "0"
    Option         "nvidiaXineramaInfoOrder" "DFP-3"
    Option         "metamodes" "DP-1: nvidia-auto-select +1920+191, HDMI-0: nvidia-auto-select +0+0"
    Option         "SLI" "Off"
    Option         "MultiGPU" "Off"
    Option         "BaseMosaic" "off"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

I’ve tested so much that I don’t even know what I did to get the setup running that had the 1050 displaying things, so it might not even have been the switched-around Identifier but something else entirely, such as the 970 being unplugged together with a forced reinstall of the nvidia drivers.

just imagine those pictures being there

https://ibb.co/5vwB20C
https://ibb.co/xYNXfcw

edit:
Since I don’t plan on using the 1050 for Manjaro any more, I have decided that going back to just using my GTX 970 for my host would be fine.

Well, that’s because you can’t use the pass-through GPU in the host and that seems to be what you were doing…

:man_shrugging:

So it doesn’t matter which one you choose, but choose one, don’t activate the nVidia drivers on that one in Manjaro and then pass it through to the VM and the VM will have it.

:face_with_monocle:

Okay, how do I not activate the nvidia drivers on that one?

Because it can’t be through the nvidia config. There’s no mention of the 970 in that one, and the 970 is the only one that works right now.

There are 2 things here:

  1. not in the nvidia config
  2. not bound to the adapter.

For the last one:

inxi --admin --graphics --filter --no-host --width
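
In that output, the driver: field of each Device entry shows what is currently bound to the card. The same information can be read straight from sysfs; a rough sketch, assuming a POSIX shell, where `bound_driver` is a made-up helper name and the second argument only exists so it can be pointed at a different directory:

```shell
#!/bin/sh
# Print the kernel driver currently bound to a PCI device, or "none".
# bound_driver is a hypothetical helper name.
bound_driver() {
    local addr=$1
    local root=${2:-/sys/bus/pci/devices}
    local link="$root/$addr/driver"
    if [ -L "$link" ]; then
        # The driver entry is a symlink whose last component is the driver name
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

bound_driver 0000:25:00.0   # GTX 1050 address from the first post
bound_driver 0000:26:00.0   # GTX 970 address from the first post
```

The card meant for the VM should report vfio-pci here, and the host card nvidia or nouveau.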

:+1: