GNOME (41.1): internal display laggy when eGPU is used with external display

Hi all, I’ve recently switched to GNOME because I heard that Wayland currently has (slightly) better support for eGPUs, on both KDE and GNOME. I was having issues with KDE Plasma on Xorg, so I tried Plasma’s Wayland session, which was still problematic (there was no way to set the primary GPU, as far as I could tell), whereas GNOME does offer this.

With that additional udev rule (which sets the primary GPU), I was able to use GNOME on Wayland with my eGPU. However, the laptop’s internal display (which I’m using as my secondary monitor) is now extremely laggy and choppy.

Occasionally, a boot results in a much smoother experience on the second display. I saved the journalctl log of one such boot and compared it with the log of a laggy boot. The problematic boot had a lot of the following messages:

i915 0000:00:02.0: cannot be used for peer-to-peer DMA as the client and provider (0000:3a:00.0) do not share an upstream bridge or whitelisted host bridge

The boot with no problems on the secondary (internal) monitor did not have any of those messages.
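The comparison between the two boots can be repeated with a few lines of Python. This is only a minimal sketch: the marker string is taken from the message quoted above, while the function name and the two tiny sample logs are made up for illustration (they are stitched together from lines quoted in this post, not copied from a real journal).

```python
# Substring taken from the i915 message quoted in the post.
P2P_MARKER = "cannot be used for peer-to-peer DMA"


def count_p2p_messages(log_text: str) -> int:
    """Count the log lines that contain the peer-to-peer DMA warning."""
    return sum(1 for line in log_text.splitlines() if P2P_MARKER in line)


# Illustrative stand-ins for the two saved boot logs (hypothetical content).
p2p_line = (
    "i915 0000:00:02.0: cannot be used for peer-to-peer DMA as the client "
    "and provider (0000:3a:00.0) do not share an upstream bridge or "
    "whitelisted host bridge"
)
laggy_log = "\n".join([
    "GPU /dev/dri/card1 selected primary given udev rule",
    p2p_line,
    p2p_line,
])
smooth_log = "Disabling DMA buffer screen sharing for driver 'amdgpu'."

print("laggy boot: ", count_p2p_messages(laggy_log))
print("smooth boot:", count_p2p_messages(smooth_log))
```

For real logs, the text could come from a file saved with something like `journalctl -b -1 > previous-boot.log` and read with `open(path).read()`; the boot offset and file name here are assumptions and depend on which boots are being compared.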

I’m also aware of this merge request, which is meant to fix many of Mutter’s framerate issues, but I’m not sure whether my problem falls within its scope.

glxgears reports ~60fps for my external (primary) monitor, and ~30fps for my internal (secondary) monitor.

My inxi output:

  Kernel: 5.14.18-1-MANJARO x86_64 bits: 64 compiler: gcc v: 11.1.0
  parameters: BOOT_IMAGE=/boot/vmlinuz-5.14-x86_64
  root=UUID=5a221c44-509c-4529-8d41-43ed70a124b9 rw quiet splash apparmor=1
  security=apparmor resume=UUID=78565509-fb86-4433-aa87-75fcb00a1811
  udev.log_priority=3 intel_iommu=igfx_off
  Desktop: GNOME 41.1 tk: GTK 3.24.30 wm: gnome-shell dm: GDM 41.0
  Distro: Manjaro Linux base: Arch Linux
  Type: Convertible System: LENOVO product: 81C4 v: Lenovo YOGA C930-13IKB
  serial: <superuser required> Chassis: type: 31 v: Lenovo YOGA C930-13IKB
  serial: <superuser required>
  Mobo: LENOVO model: LNVNB161216 v: SDK0J40709 WIN
  serial: <superuser required> UEFI: LENOVO v: 8GCN37WW date: 11/23/2020
  ID-1: BAT1 charge: 43.6 Wh (98.2%) condition: 44.4/60.0 Wh (74.0%)
  volts: 8.5 min: 7.7 model: Simplo BASE-BAT type: Li-poly serial: <filter>
  status: Unknown
  Info: Quad Core model: Intel Core i7-8550U bits: 64 type: MT MCP
  arch: Kaby Lake note: check family: 6 model-id: 8E (142) stepping: A (10)
  microcode: EA cache: L1: 256 KiB L2: 1024 KiB L3: 8 MiB
  flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
  bogomips: 32012
  Speed: 856 MHz min/max: 400/4000 MHz Core speeds (MHz): 1: 800 2: 800
  3: 1062 4: 830 5: 800 6: 800 7: 800 8: 800
  Vulnerabilities: Type: itlb_multihit status: KVM: VMX disabled
  Type: l1tf
  mitigation: PTE Inversion; VMX: conditional cache flushes, SMT vulnerable
  Type: mds mitigation: Clear CPU buffers; SMT vulnerable
  Type: meltdown mitigation: PTI
  Type: spec_store_bypass
  mitigation: Speculative Store Bypass disabled via prctl and seccomp
  Type: spectre_v1
  mitigation: usercopy/swapgs barriers and __user pointer sanitization
  Type: spectre_v2 mitigation: Full generic retpoline, IBPB: conditional,
  IBRS_FW, STIBP: conditional, RSB filling
  Type: srbds mitigation: Microcode
  Type: tsx_async_abort status: Not affected
  Device-1: Intel UHD Graphics 620 vendor: Lenovo driver: i915 v: kernel
  bus-ID: 00:02.0 chip-ID: 8086:5917 class-ID: 0300
  Device-2: AMD Ellesmere [Radeon RX 470/480/570/570X/580/580X/590]
  vendor: XFX Pine driver: amdgpu v: kernel bus-ID: 3a:00.0 chip-ID: 1002:67df
  class-ID: 0300
  Device-3: Acer Integrated Camera type: USB driver: uvcvideo bus-ID: 1-1:2
  chip-ID: 5986:2115 class-ID: 0e02
  Device-4: Logitech Webcam C270 type: USB driver: snd-usb-audio,uvcvideo
  bus-ID: 1-6.1.2:7 chip-ID: 046d:0825 class-ID: 0102 serial: <filter>
  Display: wayland server: compositor: gnome-shell driver:
  loaded: amdgpu note: n/a (using device driver) - try sudo/root display-ID: 0
  resolution: <missing: xdpyinfo>
  OpenGL: renderer: AMD Radeon RX 580 Series (POLARIS10 DRM 3.42.0
  5.14.18-1-MANJARO LLVM 13.0.0)
  v: 4.6 Mesa 21.2.5 direct render: Yes
  Device-1: Intel Sunrise Point-LP HD Audio vendor: Lenovo driver: snd_soc_skl
  v: kernel alternate: snd_hda_intel bus-ID: 00:1f.3 chip-ID: 8086:9d71
  class-ID: 0401
  Device-2: AMD Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590]
  vendor: XFX Pine driver: snd_hda_intel v: kernel bus-ID: 3a:00.1
  chip-ID: 1002:aaf0 class-ID: 0403
  Device-3: Logitech Webcam C270 type: USB driver: snd-usb-audio,uvcvideo
  bus-ID: 1-6.1.2:7 chip-ID: 046d:0825 class-ID: 0102 serial: <filter>
  Sound Server-1: ALSA v: k5.14.18-1-MANJARO running: yes
  Sound Server-2: JACK v: 1.9.19 running: no
  Sound Server-3: PulseAudio v: 15.0 running: yes
  Sound Server-4: PipeWire v: 0.3.40 running: yes
  Device-1: Intel Wireless-AC 9260 driver: iwlwifi v: kernel bus-ID: 6b:00.0
  chip-ID: 8086:2526 class-ID: 0280
  IF: wlp107s0 state: up mac: <filter>
  Device-1: Intel Wireless-AC 9260 Bluetooth Adapter type: USB driver: btusb
  v: 0.8 bus-ID: 1-8:5 chip-ID: 8087:0025 class-ID: e001
  Report: rfkill ID: hci0 rfk-id: 3 state: down bt-service: enabled,running
  rfk-block: hardware: no software: yes address: see --recommends
  Local Storage: total: 462.05 GiB used: 15.89 GiB (3.4%)
  SMART Message: Required tool smartctl not installed. Check --recommends
  ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Samsung model: MZVLB256HAHQ-000L2
  size: 238.47 GiB block-size: physical: 512 B logical: 512 B speed: 31.6 Gb/s
  lanes: 4 type: SSD serial: <filter> rev: 0L1QEXD7 temp: 29.9 C scheme: GPT
  ID-2: /dev/sda maj-min: 8:0 type: USB vendor: Kingston model: SA400S37240
  size: 223.57 GiB block-size: physical: 512 B logical: 512 B type: N/A
  serial: <filter> rev: SBFK scheme: GPT
  ID-1: / raw-size: 217.26 GiB size: 212.85 GiB (97.97%)
  used: 11.07 GiB (5.2%) fs: ext4 dev: /dev/nvme0n1p2 maj-min: 259:2
  ID-2: /boot/efi raw-size: 300 MiB size: 299.4 MiB (99.80%)
  used: 288 KiB (0.1%) fs: vfat dev: /dev/nvme0n1p1 maj-min: 259:1
  Kernel: swappiness: 60 (default) cache-pressure: 100 (default)
  ID-1: swap-1 type: partition size: 20.92 GiB used: 0 KiB (0.0%) priority: -2
  dev: /dev/nvme0n1p3 maj-min: 259:3
  System Temperatures: cpu: 29.8 C pch: 41.0 C mobo: 27.8 C gpu: amdgpu
  temp: 47.0 C
  Fan Speeds (RPM): N/A gpu: amdgpu fan: 944
  Processes: 321 Uptime: 36m wakeups: 8 Memory: 15.38 GiB
  used: 2.88 GiB (18.8%) Init: systemd v: 249 tool: systemctl Compilers:
  gcc: 11.1.0 Packages: pacman: 1182 lib: 327 flatpak: 0 Shell: Zsh v: 5.8
  running-in: gnome-terminal inxi: 3.3.09

I compared the two journalctl logs (from the problem-free boot and the problematic boot) some more and found that the problem-free boot perhaps did not use the eGPU as the primary GPU. I tried to recreate this by removing the udev rule (switching the primary GPU back to the iGPU), but the external monitor then seemed to have a worse framerate than it did in the problem-free boot.

Specifically, I’m talking about this line from the problematic boot:

GPU /dev/dri/card1 selected primary given udev rule

The problem-free boot did not have such a line (even though I expected it to).

Switching back to the integrated GPU did remove the "cannot be used for peer-to-peer DMA as the client and provider (0000:3a:00.0) do not share an upstream bridge or whitelisted host bridge" messages, so that explains those.
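For reference, the two PCI addresses in that message line up with the bus IDs of the two GPUs in the inxi output above: 00:02.0 is the Intel iGPU and 3a:00.0 is the AMD eGPU. A rough sketch of that lookup follows; the device table is copied by hand from the inxi output, and the function name and regular expression are just illustrative choices.

```python
import re

# Bus IDs and device names copied by hand from the inxi output in this post.
DEVICES = {
    "00:02.0": "Intel UHD Graphics 620 (i915)",
    "3a:00.0": "AMD Ellesmere (amdgpu)",
}

# The kernel message quoted earlier in the post.
MESSAGE = (
    "i915 0000:00:02.0: cannot be used for peer-to-peer DMA as the client "
    "and provider (0000:3a:00.0) do not share an upstream bridge or "
    "whitelisted host bridge"
)

# Full PCI address is domain:bus:device.function; capture everything after
# the 4-digit domain so it matches the bus-ID format that inxi prints.
PCI_ADDR = re.compile(r"\b[0-9a-f]{4}:([0-9a-f]{2}:[0-9a-f]{2}\.[0-7])")


def devices_in_message(message: str) -> list[str]:
    """Return the known device names for each PCI address in the message."""
    return [DEVICES.get(bus_id, "unknown device")
            for bus_id in PCI_ADDR.findall(message)]


for name in devices_in_message(MESSAGE):
    print(name)
```

So the message seems to concern the iGPU and the eGPU as a pair, which would fit with it disappearing once the eGPU is no longer the primary GPU.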

I do wonder whether DMA would help with this stuttering, as I saw the following in the journalctl log:

Disabling DMA buffer screen sharing for driver 'amdgpu'.

I tried to enable "dma-buf-screen-sharing" in Mutter’s experimental features, but this resulted in (what feel like) contradictory messages:

gnome-shell[990]: Disabling DMA buffer screen sharing for driver 'amdgpu'.
gnome-shell[1796]: Enabling experimental feature 'dma-buf-screen-sharing'
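One detail worth checking is that the two lines carry different PIDs (990 and 1796), so they may come from two separate gnome-shell processes rather than one process contradicting itself; that cannot be confirmed from these two lines alone. A small sketch for grouping journal lines by PID, assuming the usual `name[pid]: message` prefix (the function name and regular expression are illustrative):

```python
import re
from collections import defaultdict

# Matches the "name[pid]: message" part of a journal line, even when a
# timestamp and hostname come before it.
ENTRY = re.compile(r"(?P<name>[\w.-]+)\[(?P<pid>\d+)\]:\s*(?P<msg>.*)")


def messages_by_pid(lines):
    """Group log messages by the PID of the process that wrote them."""
    grouped = defaultdict(list)
    for line in lines:
        match = ENTRY.search(line)
        if match:
            grouped[int(match.group("pid"))].append(match.group("msg"))
    return dict(grouped)


# The two lines quoted in the post.
lines = [
    "gnome-shell[990]: Disabling DMA buffer screen sharing for driver 'amdgpu'.",
    "gnome-shell[1796]: Enabling experimental feature 'dma-buf-screen-sharing'",
]

by_pid = messages_by_pid(lines)
for pid, messages in sorted(by_pid.items()):
    print(pid, messages)
```

If each PID only ever logs one of the two messages across the whole journal, the "Disabling" line may belong to a different gnome-shell instance than the one that picked up the experimental feature.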

Full disclosure: I have no experience with these things, so I could be massively misunderstanding something, and this may all be completely irrelevant to my issue. Feel free to correct me.