Amdgpu crashing apps and KDE Plasma when playing 7 Days to Die

Greetings,

Since installing the update [Stable Update] 2023-08-11 - Kernels, Plasma, Nvidia, Firefox, Thunderbird, Pamac, Pipewire, Mesa, I have been having issues while playing the game 7 Days to Die. Since then the game has had visual bugs, crashes many of my applications, and turns my desktop environment into a mess that requires a reboot to get back to a usable state. I waited for a fix and installed the update [Stable Update] 2023-08-29 - Kernels, Mesa, Deepin, Firefox, KDE Gear & Frameworks, Nvidia, Budgie, but the issue remains; the only difference is that there are now missing textures and more visual bugs. I checked whether a 7DTD update was to blame, but the problems only started after I updated my packages, and no game update was downloaded around the time they began. I tried both the native Linux version and the Windows version through Proton, and both lead to the same outcome. The journal shows amdgpu errors.

How to reproduce

Visual bugs appear starting from the menu, then things really go downhill once I start playing (new solo game, continued solo game, multiplayer).

inxi -b

System:
  Host: utilisateur-Manjaro Kernel: 6.1.49-1-MANJARO arch: x86_64 bits: 64
    Desktop: KDE Plasma v: 5.27.7 Distro: Manjaro Linux
Machine:
  Type: Desktop System: ASUS product: N/A v: N/A serial: <superuser required>
  Mobo: ASUSTeK model: TUF GAMING B450M-PLUS II v: Rev X.0x
    serial: <superuser required> BIOS: American Megatrends v: 4010
    date: 03/14/2023
CPU:
  Info: 8-core AMD Ryzen 7 5700G with Radeon Graphics [MT MCP] speed (MHz):
    avg: 1557 min/max: 1400/4672
Graphics:
  Device-1: AMD Cezanne [Radeon Vega Series / Radeon Mobile Series]
    driver: amdgpu v: kernel
  Display: x11 server: X.Org v: 21.1.8 with: Xwayland v: 23.2.0 driver: X:
    loaded: amdgpu unloaded: modesetting dri: radeonsi gpu: amdgpu
    resolution: 1680x1050~60Hz
  API: OpenGL v: 4.6 Mesa 23.1.6-2 renderer: AMD Radeon Graphics (renoir
    LLVM 15.0.7 DRM 3.49 6.1.49-1-MANJARO)
Network:
  Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
    driver: r8169
  Device-2: D-Link DWA-140 RangeBooster N Adapter(rev.B3) [Ralink RT5372]
    driver: rt2800usb type: USB
Drives:
  Local Storage: total: 1.13 TiB used: 462.08 GiB (40.0%)
Info:
  Processes: 349 Uptime: 20m Memory: total: 28 GiB note: est.
  available: 29.17 GiB used: 4.89 GiB (16.8%) Shell: Zsh inxi: 3.3.29

Journal

kernel	amdgpu 0000:07:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process 7DaysToDie.exe pid 3995 thread 7DaysToDie:cs0 pid 4054)
kernel	amdgpu 0000:07:00.0: amdgpu:   in page starting at address 0x0000800279991000 from IH client 0x1b (UTCL2)
kernel	amdgpu 0000:07:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00601031
kernel	amdgpu 0000:07:00.0: amdgpu: 	 Faulty UTCL2 client ID: TCP (0x8)
kernel	amdgpu 0000:07:00.0: amdgpu: 	 MORE_FAULTS: 0x1
kernel	amdgpu 0000:07:00.0: amdgpu: 	 WALKER_ERROR: 0x0
kernel	amdgpu 0000:07:00.0: amdgpu: 	 PERMISSION_FAULTS: 0x3
kernel	amdgpu 0000:07:00.0: amdgpu: 	 MAPPING_ERROR: 0x0
kernel	amdgpu 0000:07:00.0: amdgpu: 	 RW: 0x0
kernel	amdgpu 0000:07:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process 7DaysToDie.exe pid 3995 thread 7DaysToDie:cs0 pid 4054)
kernel	amdgpu 0000:07:00.0: amdgpu:   in page starting at address 0x0000800279a2d000 from IH client 0x1b (UTCL2)
kernel	amdgpu 0000:07:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
kernel	amdgpu 0000:07:00.0: amdgpu: 	 Faulty UTCL2 client ID: CB (0x0)
kernel	amdgpu 0000:07:00.0: amdgpu: 	 MORE_FAULTS: 0x0
kernel	amdgpu 0000:07:00.0: amdgpu: 	 WALKER_ERROR: 0x0
kernel	amdgpu 0000:07:00.0: amdgpu: 	 PERMISSION_FAULTS: 0x0
kernel	amdgpu 0000:07:00.0: amdgpu: 	 MAPPING_ERROR: 0x0
kernel	amdgpu 0000:07:00.0: amdgpu: 	 RW: 0x0
kernel	[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
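The decoded fields that the kernel prints under each VM_L2_PROTECTION_FAULT_STATUS line can be reproduced from the raw register value. The following is a minimal Python sketch, assuming the bit layout used for GFX9-family hardware (the shifts and widths in FIELDS are an assumption, not taken from this thread, and should be checked against the amdgpu source for the kernel in use). The function name decode_fault_status is hypothetical.

```python
# Assumed bit layout (shift, width) of VM_L2_PROTECTION_FAULT_STATUS on
# GFX9-family GPUs. This is an assumption; verify it against the amdgpu
# driver source before relying on it.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),   # the "Faulty UTCL2 client ID" in the log
    "RW": (18, 1),
    "VMID": (20, 4),
}


def decode_fault_status(status: int) -> dict[str, int]:
    """Split a raw status value into its named bit fields."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


if __name__ == "__main__":
    # First status value from the journal excerpt above.
    for name, value in decode_fault_status(0x00601031).items():
        print(f"{name}: {value:#x}")
```

Under this assumed layout, 0x00601031 decodes to the same values the kernel printed for the first fault (client ID 0x8, PERMISSION_FAULTS 0x3, MORE_FAULTS 0x1) and to the vmid:6 shown on the page fault line. The second fault reports a status of 0x00000000, so every field is zero there.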

Thanks for reading.

So 7DTD did not have the problem before the 2023-08-11 update?

Have you tried other games to see whether they show the same problem? Or does only 7DTD cause it?

I had played the game for a few hours without issues before updating my packages. After updating and running into issues, I checked whether there had been an update to the game, but there hadn’t.
The Linux native and Windows versions have the same problem, but the game works fine under Windows 10.
The latest package update made the issues look different, so I suspect the drivers.
I would like to try undoing the updates, but I have no idea how to do it with so many packages involved.
I tried a few other 3D games and they don’t have any problems. I will test some more games, though.

Two stable releases broke Diablo 4 for me, see below:

2023-08-11 - Working fine
2023-08-29 - Has issues
2023-09-10 - Has issues

So I’m using Timeshift to roll back and stay on 2023-08-11 until the issue is fixed at some point. Did you try rolling back to a version where the game was working fine for you?

Did you check which package is causing the problem? Kernel or mesa?

Current kernel is 6.1.51-1-MANJARO, update your system. There were some AMD issues with previous releases. If that doesn’t help try the latest 6.5 kernel.

In my case I tried the 6.5 kernel without success. I am currently using 6.1.

I also tried different Proton-GE versions on Lutris without any difference. Besides, nothing had changed on the Lutris side when I checked (DXVK, etc.).

It could be something related to mesa. I couldn’t dig deeper because I lack the knowledge.

Maybe Nevado has a different case.

StarCraft 2 was still working fine in all cases; only Diablo 4 was breaking for me.

Sorry for the delay.

I don’t play Diablo 4, so I can’t try it out, sorry.

I don’t use Timeshift and I don’t know how I could roll back the packages, but I’m afraid it would mess up my system even more anyway, so I’m just going to leave it as it is.

When playing 7DTD:
2023-08-11 System hangs, apps crashing, system turning into a mess
2023-08-29 Same but worse with graphical bugs
2023-09-10 Reverted more or less to the issues from 2023-08-11

The logs are still the same, with a bunch of amdgpu VM_L2_PROTECTION_FAULT_STATUS entries and [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
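To compare how often these errors occur from one update to the next, the journal text can be tallied rather than eyeballed. Below is a small Python sketch that counts page faults per process, faulty client IDs, and ring timeouts. The regular expressions are written against the line formats quoted earlier in this thread and are an assumption about what other kernels print; the function name summarize and the SAMPLE text are illustrative only.

```python
import re
from collections import Counter

# Patterns modelled on the journal lines quoted in this thread (assumption:
# other kernel versions may word these messages differently).
PAGE_FAULT = re.compile(r"no-retry page fault .*for process (?P<proc>\S+) pid \d+")
CLIENT = re.compile(r"Faulty UTCL2 client ID: (?P<client>\w+) \(0x[0-9a-fA-F]+\)")
TIMEOUT = re.compile(r"ring (?P<ring>\w+) timeout")


def summarize(journal_text: str) -> dict[str, dict[str, int]]:
    """Count amdgpu page faults, faulty clients and ring timeouts."""
    faults, clients, timeouts = Counter(), Counter(), Counter()
    for line in journal_text.splitlines():
        if m := PAGE_FAULT.search(line):
            faults[m["proc"]] += 1
        elif m := CLIENT.search(line):
            clients[m["client"]] += 1
        elif m := TIMEOUT.search(line):
            timeouts[m["ring"]] += 1
    return {
        "page_faults": dict(faults),
        "clients": dict(clients),
        "ring_timeouts": dict(timeouts),
    }


# Shortened copy of the journal excerpt from the first post.
SAMPLE = """\
kernel amdgpu 0000:07:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process 7DaysToDie.exe pid 3995 thread 7DaysToDie:cs0 pid 4054)
kernel amdgpu 0000:07:00.0: amdgpu:   Faulty UTCL2 client ID: TCP (0x8)
kernel amdgpu 0000:07:00.0: amdgpu: [gfxhub0] no-retry page fault (src_id:0 ring:24 vmid:6 pasid:32786, for process 7DaysToDie.exe pid 3995 thread 7DaysToDie:cs0 pid 4054)
kernel amdgpu 0000:07:00.0: amdgpu:   Faulty UTCL2 client ID: CB (0x0)
kernel [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

In practice the text would come from the kernel journal of the current boot, for example the output of journalctl -k -b saved to a file, instead of the SAMPLE string.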

Feels like it’s touching memory it shouldn’t but I’m certainly no expert on that.

I’ll try to post a video, maybe it’ll help. If I find out how to post one.

Alright I made a video, should be easier this way.

I guess I didn’t stress the game enough last time, because I still have big graphical bugs in the game on top of the hangs/freezes.

I couldn’t attach the video and logs, so I uploaded them to an external host:

System:
  Host: utilisateur-Manjaro Kernel: 6.1.51-1-MANJARO arch: x86_64 bits: 64
    Desktop: KDE Plasma v: 5.27.7 Distro: Manjaro Linux
Machine:
  Type: Desktop System: ASUS product: N/A v: N/A serial: <superuser required>
  Mobo: ASUSTeK model: TUF GAMING B450M-PLUS II v: Rev X.0x
    serial: <superuser required> BIOS: American Megatrends v: 4010
    date: 03/14/2023
CPU:
  Info: 8-core AMD Ryzen 7 5700G with Radeon Graphics [MT MCP] speed (MHz):
    avg: 1960 min/max: 1400/4672
Graphics:
  Device-1: AMD Cezanne [Radeon Vega Series / Radeon Mobile Series]
    driver: amdgpu v: kernel
  Display: x11 server: X.Org v: 21.1.8 with: Xwayland v: 23.2.0 driver: X:
    loaded: amdgpu unloaded: modesetting dri: radeonsi gpu: amdgpu
    resolution: 1680x1050~60Hz
  API: OpenGL v: 4.6 Mesa 23.1.6-3 renderer: AMD Radeon Graphics (renoir
    LLVM 16.0.6 DRM 3.49 6.1.51-1-MANJARO)
Network:
  Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
    driver: r8169
  Device-2: D-Link DWA-140 RangeBooster N Adapter(rev.B3) [Ralink RT5372]
    driver: rt2800usb type: USB
Drives:
  Local Storage: total: 1.13 TiB used: 533.64 GiB (46.2%)
Info:
  Processes: 385 Uptime: 15m Memory: total: 28 GiB note: est.
  available: 29.17 GiB used: 5.54 GiB (19.0%) Shell: Zsh inxi: 3.3.29

  • Can you answer my question?
  • Try installing a different Proton version for the Windows version of 7DTD.
  • Try downgrading mesa to see if it helps.

Dota 2 is one of them; I play it every day. I tried quite a few games (Combat Master, Dead or Alive 5 Last Round, Dota 2, Foxhole, Increlution, PICO PARK, Sim Companies, Tale of Immortal, and a few more) and 7DTD is the only one I have issues with (I tried various video settings).

I tried Proton Stable 8.0.3, 7.0.6, Proton Experimental, and GE-8.14. I also tried these launch options: PROTON_NO_D3D11=1 %command%, PROTON_NO_D3D10=1 %command%, PROTON_NO_ESYNC=1 PROTON_NO_FSYNC=1 %command%.

I’ve never downgraded packages before so I’m unsure about doing it correctly without breaking my system. Do you have any recommended way (safe & easy) to do it?

Addendum: I’ve installed the latest stable update (2023-09-18) and my issues are still ongoing. I’ll look into downgrading the mesa package.

I forgot your 17 day old unclear answer, sorry.

  1. Install manjaro-downgrade

  2. How to downgrade mesa:
    $ DOWNGRADE_FROM_ALA=1 sudo manjaro-downgrade mesa
    when you are in the stable branch.

I downgraded mesa, but the system fell back to “soft rendering”, even after a few reboots, so a lot of things no longer worked as usual. So I went back to the current version.
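One way to confirm a fallback to software rendering is to look at the OpenGL renderer string, which inxi prints under "API: OpenGL" and glxinfo -B prints as "OpenGL renderer string". The Python sketch below checks that string for the names of Mesa's software rasterizers. The marker list and the function name is_software_renderer are assumptions for illustration, and the llvmpipe string in the example is illustrative rather than taken from this thread.

```python
# Names of Mesa software rasterizers that typically appear in the renderer
# string when no hardware driver is in use (assumed list, not exhaustive).
SOFTWARE_MARKERS = ("llvmpipe", "softpipe", "swrast")


def is_software_renderer(renderer: str) -> bool:
    """Return True if the OpenGL renderer string names a software rasterizer."""
    name = renderer.lower()
    return any(marker in name for marker in SOFTWARE_MARKERS)


if __name__ == "__main__":
    # Renderer string reported by inxi in the first post (hardware radeonsi).
    hardware = "AMD Radeon Graphics (renoir LLVM 15.0.7 DRM 3.49 6.1.49-1-MANJARO)"
    # Illustrative example of a software renderer string.
    software = "llvmpipe (LLVM 16.0.6, 256 bits)"
    print(is_software_renderer(hardware), is_software_renderer(software))
```

The hardware string also contains "LLVM", which is why the check looks for "llvmpipe" and not for "llvm" alone.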

As for an update: I’ve seen that the game has two renderers.

  • GLCore (default)
  • Vulkan (not fully supported, according to the game options)

I’ve just tried out the Vulkan renderer (on the native game version) for a few minutes and haven’t had any issues so far, though I thought it would be buggy as well.

I’ll stress test it more over the next few days by playing on the Vulkan renderer to see whether it is playable.
It’s clearly not a fix, though, as there still seems to be an issue with GLCore. I’ve read a few bug reports on mesa with much the same errors; I’ll try out the related games once my internet gets back to normal.