IOMMU: GPU passthrough problem because of identical GPUs

#1

Hello!

First: sorry for my English! I am a native German speaker.

I have this System:

Mainboard ASUS ROG Zenith Extreme
CPU AMD Threadripper 1950x
RAM 64GB Corsair Dominator Platinum
2x NVMe SSDs

And as GPUs (installed in this order):
1x ASUS Strix 1080ti
1x AMD Radeon Pro WX3100
1x ASUS Strix 1080ti

I bought the AMD Radeon Pro last week because I want to use Linux as a desktop system again. I tried it with one of the Nvidia cards first, but with the proprietary driver, windows lagged badly while being resized. Everything else was fast; only resizing was extremely slow. I did not try much to fix it, because I wanted to use the GPUs in a different way anyway. So I bought the AMD card and installed it.

I am only using the AMD card. I want to use one of the Nvidia GPUs for a VM, but I can't pass it through because the two Nvidia cards have the same ID.
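To make the "same ID" problem visible, the vendor:device IDs can be read straight from sysfs. This is only a minimal sketch, assuming the usual sysfs layout with `class`, `vendor` and `device` files per PCI device; `list_gpus` and its optional directory argument are hypothetical names made up for illustration.

```shell
#!/bin/sh
# List display-class PCI devices together with their vendor:device IDs.
# list_gpus is a hypothetical helper; the optional argument points it at a
# different directory (default: the real sysfs PCI device tree).
list_gpus() {
    pci_root="${1:-/sys/bus/pci/devices}"
    for dev in "$pci_root"/*; do
        # Skip entries without a readable class file (also covers an unmatched glob)
        [ -r "$dev/class" ] || continue
        class="$(cat "$dev/class")"
        case "$class" in
            # PCI base class 0x03 is "display controller" (VGA, 3D, ...)
            0x03*)
                printf '%s %s:%s\n' "${dev##*/}" \
                    "$(cat "$dev/vendor")" "$(cat "$dev/device")"
                ;;
        esac
    done
}

list_gpus
```

Piping the result through `awk '{print $2}' | sort | uniq -d` prints every ID that is shared by more than one card, which is the case that ID-based binding cannot tell apart.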

I have already read the wiki article https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF#Using_identical_guest_and_host_GPUs. But it doesn't help me, because it isn't working.

Because this script

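#!/bin/sh
# Bind every GPU that is not the boot VGA device (plus its audio function) to vfio-pci.
# Writing to driver_override in sysfs requires root.

for i in /sys/bus/pci/devices/*/boot_vga; do
	# boot_vga reads 0 for every GPU except the one the firmware booted on
	if [ "$(cat "$i")" -eq 0 ]; then
		GPU="${i%/boot_vga}"
		# The HDMI audio function has the same address with a trailing 1 instead of 0
		AUDIO="$(echo "$GPU" | sed -e "s/0$/1/")"
		echo "vfio-pci" > "$GPU/driver_override"
		if [ -d "$AUDIO" ]; then
			echo "vfio-pci" > "$AUDIO/driver_override"
		fi
	fi
done

modprobe -i vfio-pci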

gives me a "Permission denied" on the boot screen. I tried chmod 777 and chmod +x. Even then it doesn't work.
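It is not clear from this post what exactly was denied. One thing that is certain is that chmod only changes the permissions of the script file, not the privileges the script runs with, and writing to `driver_override` needs root. A minimal sketch of a more talkative write step, assuming the standard sysfs `driver_override` file; `set_override` is a hypothetical helper name, not part of the wiki script.

```shell
#!/bin/sh
# Hypothetical helper: set driver_override for one PCI device directory and
# report why it failed instead of printing a bare "Permission denied".
set_override() {
    dev_dir="$1"
    driver="${2:-vfio-pci}"
    target="$dev_dir/driver_override"
    if [ ! -e "$target" ]; then
        echo "skip: $target does not exist" >&2
        return 1
    fi
    if [ ! -w "$target" ]; then
        echo "cannot write $target as uid $(id -u); run this as root" >&2
        return 1
    fi
    # Same write the original script does with echo
    printf '%s\n' "$driver" > "$target"
}
```

Called as `set_override /sys/bus/pci/devices/0000:08:00.0` (the address is taken from the specs posted further down), it either writes `vfio-pci` or says which file could not be written and under which user ID.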

I hope somebody can give me a step-by-step guide or some help.

After work, I can send some logs and info such as lspci output.

#2

I'm home now. Here are the detailed specs:

System:
  Host: colossus Kernel: 5.1.0-mainline x86_64 bits: 64 compiler: gcc 
  v: 8.3.0 Console: tty 0 Distro: Manjaro Linux 
Machine:
  Type: Desktop Mobo: ASUSTeK model: ROG ZENITH EXTREME v: Rev 1.xx 
  serial: <filter> UEFI: American Megatrends v: 1701 date: 01/09/2019 
CPU:
  Topology: 16-Core (2-Die) model: AMD Ryzen Threadripper 1950X bits: 64 
  type: MT MCP MCM arch: Zen rev: 1 L2 cache: 8192 KiB 
  flags: lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm 
  bogomips: 256070 
  Speed: 2000 MHz min/max: 2200/4000 MHz Core speeds (MHz): 1: 3120 2: 2000 
  3: 2200 4: 2200 5: 2000 6: 2000 7: 2000 8: 3999 9: 3223 10: 2000 11: 3999 
  12: 2000 13: 3999 14: 2200 15: 2200 16: 2200 17: 2200 18: 2200 19: 2200 
  20: 2200 21: 2200 22: 2000 23: 2000 24: 2000 25: 2000 26: 3999 27: 2200 
  28: 2200 29: 2200 30: 2200 31: 2200 32: 2200 
Graphics:
  Device-1: NVIDIA GP102 [GeForce GTX 1080 Ti] vendor: ASUSTeK 
  driver: nouveau v: kernel bus ID: 08:00.0 
  Device-2: AMD Lexa XT [Radeon PRO WX 3100] driver: amdgpu v: kernel 
  bus ID: 42:00.0 
  Device-3: NVIDIA GP102 [GeForce GTX 1080 Ti] vendor: ASUSTeK 
  driver: nouveau v: kernel bus ID: 43:00.0 
  Display: server: X.org 1.20.4 driver: nouveau 
  resolution: <xdpyinfo missing> 
  OpenGL: renderer: AMD Radeon Pro WX3100 (POLARIS12 DRM 3.30.0 
  5.1.0-mainline LLVM 8.0.0) 
  v: 4.5 Mesa 19.0.3 direct render: Yes 
Audio:
  Device-1: NVIDIA GP102 HDMI Audio vendor: ASUSTeK driver: snd_hda_intel 
  v: kernel bus ID: 08:00.1 
  Device-2: AMD Family 17h HD Audio vendor: ASUSTeK driver: snd_hda_intel 
  v: kernel bus ID: 0a:00.3 
  Device-3: AMD Baffin HDMI/DP Audio [Radeon RX 550 640SP / RX 560/560X] 
  driver: snd_hda_intel v: kernel bus ID: 42:00.1 
  Device-4: NVIDIA GP102 HDMI Audio vendor: ASUSTeK driver: snd_hda_intel 
  v: kernel bus ID: 43:00.1 
  Sound Server: ALSA v: k5.1.0-mainline 
Network:
  Device-1: Intel I211 Gigabit Network vendor: ASUSTeK driver: igb 
  v: 5.6.0-k port: 2000 bus ID: 03:00.0 
  IF: enp3s0 state: up speed: 1000 Mbps duplex: full mac: <filter> 
  Device-2: Aquantia AQC107 NBase-T/IEEE 802.3bz Ethernet [AQtion] 
  vendor: ASUSTeK driver: atlantic v: 2.0.4.0-kern port: 2000 
  bus ID: 05:00.0 
  IF: enp5s0 state: up speed: 10000 Mbps duplex: full mac: <filter> 
Drives:
  Local Storage: total: 2.73 TiB used: 85.24 GiB (3.1%) 
  ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 970 EVO 1TB size: 931.51 GiB 
  ID-2: /dev/nvme1n1 vendor: Samsung model: SSD 970 EVO 1TB size: 931.51 GiB 
  ID-3: /dev/sda vendor: Western Digital model: WD10EADS-42P6B0 
  size: 931.51 GiB 
  ID-4: /dev/sdb vendor: Apple model: HDD WD10EZES-40UFAA0 size: 931.51 GiB 
Partition:
  ID-1: / size: 915.60 GiB used: 85.24 GiB (9.3%) fs: ext4 
  dev: /dev/nvme1n1p2 
Sensors:
  System Temperatures: cpu: 30.9 C mobo: N/A 
  Fan Speeds (RPM): cpu: 0 
  GPU: device: nouveau temp: 24 C fan: 0 device: nouveau temp: 23 C fan: 0 
  device: amdgpu temp: 41 C fan: 1916 
Info:
  Processes: 499 Uptime: 5m Memory: 62.83 GiB used: 4.42 GiB (7.0%) 
  Init: systemd Compilers: gcc: 8.3.0 clang: 8.0.0 Shell: bash v: 5.0.3 
  inxi: 3.0.33 

#3

Does nobody have any ideas?

#4

This doesn't work either :(

#5

Run sudo su, enter your password when prompted, then run the script.

#6

It does nothing. It only prints a new empty line and the cursor keeps blinking. I can only abort with Ctrl+C to get a new command line.

#7

Sorry, I get the following error:

#8

Somehow I got it working.

I still get the error from the screenshot in my last post, but it works anyway.

But starting the VM is very slow. Even the Windows boot spinner animates very choppily.

#9

This thread can be closed. KVM performance is still slow (I only have Windows 10 as a guest), but that is a separate problem.

I have since reinstalled and switched to Arch Linux.

The solution for my specific passthrough problem: simply do nothing. Don't enable IOMMU with kernel parameters. Don't do anything else. Only(!!) blacklist all drivers for the Nvidia cards. After a restart no driver is using the GPUs, and I can add them in virt-manager without any problem...
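For anyone who wants to try the same thing, here is a minimal sketch of what the blacklist could look like. The module names are assumptions: nouveau is the driver shown in the specs above, and the nvidia* names would only matter if the proprietary driver is installed. `make_blacklist` is a hypothetical helper that just prints `blacklist` lines in modprobe.d syntax.

```shell
#!/bin/sh
# Print modprobe.d "blacklist" lines for the given kernel module names.
# make_blacklist is a hypothetical helper; it only prints, it writes nothing.
make_blacklist() {
    for mod in "$@"; do
        printf 'blacklist %s\n' "$mod"
    done
}

# Preview the lines for the open-source and proprietary Nvidia drivers
# (module list is an assumption, adjust it to what is actually loaded).
make_blacklist nouveau nvidia nvidia_drm nvidia_modeset nvidia_uvm
```

The output would go into a file under /etc/modprobe.d/ as root, for example with `make_blacklist nouveau | sudo tee /etc/modprobe.d/blacklist-nvidia.conf` (the file name is an assumption). After a reboot, `lspci -k` shows whether a kernel driver is still in use for the two cards.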