Reinstall amdgpu (or switch to amdgpu-pro)

I’m experiencing constant driver crashes with a Vega 56 in games, only on Manjaro (not in Windows 10), and I don’t know why. I’ve tried probably every solution ever suggested (module parameters, switching kernels (I haven’t gone further than 6.1, though), resetting UV/OC and even downclocking the card), and nothing works. I’m pretty sure I’ve already corrupted something on my drives, because 90% of the time the system can’t even recover from a crash and I have to hold down the power button.

As a last resort I want to reinstall amdgpu, and if it doesn’t help then I’d try different drivers, such as amdgpu-pro. How exactly can I do it on Manjaro? Instructions online vary quite a bit.

If you play games, you need OpenCL/Vulkan support. Have you installed the OpenCL/Vulkan drivers?
https://archlinux.org/packages/extra/x86_64/rocm-opencl-sdk/
https://archlinux.org/packages/extra/x86_64/amdvlk/

Welcome to the forum! :wave:

Please see the following and post your system details as outlined:

Such as?

There’s nothing to reinstall. The amdgpu module is included in the kernel.
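
Since the module ships with the kernel, what can be inspected is its runtime state rather than a package. A minimal sketch, assuming the standard sysfs layout (`/sys/module/<name>/parameters/`); the function name `read_module_params` is made up for illustration:

```python
from pathlib import Path


def read_module_params(module: str, sysfs_root: str = "/sys/module") -> dict[str, str]:
    """Return {parameter: current value} for a loaded kernel module."""
    params_dir = Path(sysfs_root) / module / "parameters"
    if not params_dir.is_dir():
        # Module is not loaded, or it exposes no parameters.
        return {}
    values = {}
    for entry in sorted(params_dir.iterdir()):
        try:
            values[entry.name] = entry.read_text().strip()
        except OSError:
            # Some parameters are readable by root only.
            values[entry.name] = "<unreadable>"
    return values


if __name__ == "__main__":
    for name, value in read_module_params("amdgpu").items():
        print(f"{name} = {value}")
```

An empty result would suggest the module is not loaded at all; otherwise the output shows which values are actually in effect, which can differ from what was put on the kernel command line.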

That won’t solve anything. AMDGPU PRO is not recommended for gaming.

More info:


I’d suggest you focus on repairing possible damage related to your quoted comment, rather than assuming another graphic driver will fix it.

The first thing you could try is to create a new user profile to see whether these crashes may be only limited to your current profile.

Additionally, if you provide system information and error logs, for example, this would likely help others to assist you.

As a new user, please take some time to familiarise yourself with Forum requirements; in particular, the many ways to use the forum effectively. The following links will help you achieve that.

And last, but not least, the Stable Update Announcements, which you should check frequently for important update related information.

I hope this helps. Cheers.

I use a Xeon 2640v3 with 16GB REG DDR4 and a Vega 56 GPU. Manjaro runs off an ADATA SU650 SATA SSD.

The system is perfectly stable on Windows even with quite aggressive undervolting, but I can’t get rid of amdgpu crashes on Linux at all. The only thing that made them rarer (I guess?) is maxing out the GPU voltage. I found a post with the exact same issue (I also get artifacts from time to time after the driver attempts to recover), but the problem persisted for the OP in Windows, while I can’t get the driver to crash in Windows at all without serious overclocking. And I even use a third-party beta driver there, which isn’t really supposed to be stable. Additionally, I use LACT on Linux to tweak my card, but it’s the same with CoreCtrl as well, and disabling both doesn’t seem to help.

I googled the problem and saw some people with Vega cards on the Arch forums suggest using these kernel options, for the CPU and the amdgpu module respectively:

processor.max_cstate=1
rcu_nocbs=0-15
idle=nomwait
pcie_aspm=off
iommu=pt
amdgpu.lockup_timeout=0
amdgpu.dc=1
amdgpu.vm_update_mode=0
amdgpu.dpm=-1
amdgpu.ppfeaturemask=0xffffffff
amdgpu.vm_fault_stop=2
amdgpu.vm_debug=1
amdgpu.gpu_recovery=0
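
Long option lists like this are easy to mistype when pasted into the bootloader configuration. The following is a rough sketch, not an official tool, that splits a kernel command line and reports `amdgpu.*` tokens that look malformed or that are set more than once. The function names are hypothetical, and splitting on whitespace ignores quoted values, which is assumed to be acceptable here:

```python
from collections import defaultdict


def check_cmdline(cmdline: str, prefix: str = "amdgpu.") -> tuple[dict[str, list[str]], list[str]]:
    """Inspect command-line options starting with `prefix`.

    Returns (values_by_key, malformed_tokens).
    """
    values = defaultdict(list)
    malformed = []
    for token in cmdline.split():
        if not token.startswith(prefix):
            continue
        key, sep, value = token.partition("=")
        # Expect "module.param=value" with the module prefix appearing exactly once.
        if not sep or not value or key.count(prefix) != 1:
            malformed.append(token)
            continue
        values[key].append(value)
    return dict(values), malformed


def duplicates(values: dict[str, list[str]]) -> dict[str, list[str]]:
    """Keep only options that were given more than once."""
    return {key: vals for key, vals in values.items() if len(vals) > 1}


if __name__ == "__main__":
    # /proc/cmdline holds the command line the running kernel was booted with.
    with open("/proc/cmdline") as handle:
        parsed, bad = check_cmdline(handle.read())
    print("malformed:", bad)
    print("duplicated:", duplicates(parsed))
    print("effective options:", parsed)
```

Checking `/proc/cmdline` instead of the bootloader config file shows what the running kernel really received, which helps rule out the case where an edit never made it into the generated boot entry.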

Obviously, it hasn’t helped. I tried it with kernels 6.7, 6.6 and 6.1, and even tried downloading an old (20.45) linux-firmware package and replacing my amdgpu folder in /usr/lib/firmware with the one it contained.
As for the logs, after a crash I usually see something like this in journalctl:

[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_low timeout, signaled seq=7773, emitted seq=7775
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process firefox pid 4018 thread firefox:cs0 pid 4176
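
To see whether the timeouts cluster on one ring or one process, the two message shapes above can be pulled out of a saved log. This is a small sketch that assumes the messages always look like the two lines quoted here; the regular expressions and the name `parse_timeouts` are illustrative, not part of any driver tooling:

```python
import re
import sys

TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)
PROCESS_RE = re.compile(r"Process information: process (?P<process>\S+) pid (?P<pid>\d+)")


def parse_timeouts(lines):
    """Collect ring timeouts and attach the process line that follows each one."""
    events = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            events.append({
                "ring": match["ring"],
                "signaled": int(match["signaled"]),
                "emitted": int(match["emitted"]),
                "process": None,
                "pid": None,
            })
            continue
        match = PROCESS_RE.search(line)
        # Attach process info only to a timeout that does not have it yet.
        if match and events and events[-1]["process"] is None:
            events[-1]["process"] = match["process"]
            events[-1]["pid"] = int(match["pid"])
    return events


if __name__ == "__main__":
    # Example: journalctl -k -b -1 | python parse_timeouts.py
    for event in parse_timeouts(sys.stdin):
        print(event)
```

`journalctl -k -b -1` prints kernel messages from the previous boot, which is usually the one that ended in the hard reset.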

It appears out of the blue, without anything remarkable preceding it; usually I had simply started the process a few minutes earlier. Crashes can occur anywhere the GPU is somewhat utilized, but they happen especially often with one game (Redout 2), and I think spikes in GPU utilization have something to do with it, since that game has a lot of them. But again, it has all these spikes on Windows and nothing crashes there. Other than this, my gaming performance on Linux is perfect, and I do have all the Vulkan and OpenCL drivers installed; I even reinstalled them yesterday.
I’d also mention that I tried switching to Mesa-git and the problem persists.

I tried a new profile, opened Steam, went away from the PC for about 10 minutes, and this happened. I doubt it would have worked in games.

янв 26 15:06:42 x99 kernel: PM: suspend exit
янв 26 15:06:42 x99 kwin_wayland[7878]: kwin_libinput: Libinput: client bug: timer event7 debounce: scheduled expiry is in the past (-424ms), your system is too slow
янв 26 15:06:42 x99 bluetoothd[562]: Controller resume with wake event 0x0
янв 26 15:06:42 x99 kwin_wayland[7878]: kwin_libinput: Libinput: client bug: timer event7 debounce: scheduled expiry is in the past (-401ms), your system is too slow
янв 26 15:06:42 x99 systemd-sleep[23447]: System returned from sleep operation 'suspend'.
янв 26 15:06:42 x99 systemd[1]: systemd-suspend.service: Deactivated successfully.
янв 26 15:06:42 x99 systemd[1]: Finished System Suspend.
янв 26 15:06:42 x99 systemd[1]: systemd-suspend.service: Consumed 2.320s CPU time.
янв 26 15:06:42 x99 systemd[1]: Stopped target Sleep.
янв 26 15:06:42 x99 systemd[1]: Reached target Suspend.
янв 26 15:06:42 x99 systemd-logind[564]: Operation 'sleep' finished.
янв 26 15:06:42 x99 systemd[1]: Stopped target Suspend.
янв 26 15:06:42 x99 ModemManager[619]: <msg> [sleep-monitor-systemd] system is resuming
янв 26 15:06:42 x99 lact[646]: 2024-01-26T10:06:42.626589Z  INFO lact_daemon::suspend: suspend/resume event detected, reloading config
янв 26 15:06:42 x99 NetworkManager[560]: <info>  [1706263602.6267] manager: sleep: wake requested (sleeping: yes  enabled: yes)
янв 26 15:06:42 x99 NetworkManager[560]: <info>  [1706263602.6268] device (enp9s0): state change: disconnected -> unmanaged (reason 'sleeping', sys-iface-state: 'managed')
янв 26 15:06:42 x99 kdeconnectd[2781]: Error sending UDP packet: QAbstractSocket::NetworkError
янв 26 15:06:42 x99 kdeconnectd[8388]: Error sending UDP packet: QAbstractSocket::NetworkError
янв 26 15:06:42 x99 lact[646]: 2024-01-26T10:06:42.627998Z  INFO lact_daemon::server::handler: could not find GPU with id 1002:687F-1002:6B76-0000:05:00.0 defined in configuration
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 NetworkManager[560]: <info>  [1706263602.6455] device (enp9s0): state change: unmanaged -> unavailable (reason 'managed', sys-iface-state: 'external')
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: Generic FE-GE Realtek PHY r8169-0-900:00: attached PHY driver (mii_bus:phy_addr=r8169-0-900:00, irq=MAC)
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: r8169 0000:09:00.0 enp9s0: Link is Down
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 NetworkManager[560]: <info>  [1706263602.8926] manager: NetworkManager state is now DISCONNECTED
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:136 vmid:0 pasid:0, for process  pid 0 thread  pid 0)
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x00000000011b4000 from IH client 0x12 (VMC)
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00040110
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: MP0 (0x0)
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x0
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x1
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x1
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x1
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:42 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:128 vmid:0 pasid:0, for process  pid 0 thread  pid 0)
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x00000000011e6000 from IH client 0x12 (VMC)
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00040100
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: MP0 (0x0)
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x0
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x1
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x1
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:43 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
янв 26 15:06:44 x99 kernel: amdgpu 0000:05:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
янв 26 15:06:44 x99 systemd[1]: NetworkManager-dispatcher.service: Deactivated successfully.
янв 26 15:06:45 x99 kernel: r8169 0000:09:00.0 enp9s0: Link is Up - 1Gbps/Full - flow control rx/tx
янв 26 15:06:45 x99 NetworkManager[560]: <info>  [1706263605.9052] device (enp9s0): carrier: link connected
янв 26 15:06:45 x99 NetworkManager[560]: <info>  [1706263605.9055] device (enp9s0): state change: unavailable -> disconnected (reason 'carrier-changed', sys-iface-state: 'managed')
янв 26 15:06:45 x99 NetworkManager[560]: <info>  [1706263605.9067] policy: auto-activating connection 'Wired connection 1' (743ce79d-0bc3-38f9-8b9d-d22890a55a71)
янв 26 15:06:45 x99 NetworkManager[560]: <info>  [1706263605.9074] device (enp9s0): Activation: starting connection 'Wired connection 1' (743ce79d-0bc3-38f9-8b9d-d22890a55a71)
янв 26 15:06:45 x99 NetworkManager[560]: <info>  [1706263605.9075] device (enp9s0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
янв 26 15:06:45 x99 NetworkManager[560]: <info>  [1706263605.9080] manager: NetworkManager state is now CONNECTING
янв 26 15:06:45 x99 NetworkManager[560]: <info>  [1706263605.9086] device (enp9s0): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
янв 26 15:06:45 x99 NetworkManager[560]: <info>  [1706263605.9101] device (enp9s0): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
янв 26 15:06:45 x99 NetworkManager[560]: <info>  [1706263605.9109] dhcp4 (enp9s0): activation: beginning transaction (timeout in 45 seconds)
янв 26 15:06:45 x99 kernel: r8169 0000:09:00.0 enp9s0: Link is Up - 1Gbps/Full - flow control off
янв 26 15:06:45 x99 kernel: r8169 0000:09:00.0 enp9s0: Link is Down
янв 26 15:06:46 x99 ModemManager[619]: <msg> [base-manager] couldn't check support for device '/sys/devices/pci0000:00/0000:00:1c.2/0000:09:00.0': not supported by any plugin
янв 26 15:06:46 x99 rtkit-daemon[878]: Supervising 19 threads of 9 processes of 2 users.
янв 26 15:06:46 x99 rtkit-daemon[878]: Supervising 19 threads of 9 processes of 2 users.
янв 26 15:06:49 x99 NetworkManager[560]: <info>  [1706263609.3497] device (enp9s0): carrier: link connected
янв 26 15:06:49 x99 kernel: r8169 0000:09:00.0 enp9s0: Link is Up - 1Gbps/Full - flow control off
янв 26 15:06:50 x99 NetworkManager[560]: <info>  [1706263610.9773] dhcp4 (enp9s0): error parsing DHCP option 6 (domain_name_servers): address 0.0.0.0 is ignored
янв 26 15:06:50 x99 NetworkManager[560]: <info>  [1706263610.9774] dhcp4 (enp9s0): state changed new lease, address=192.168.0.100
янв 26 15:06:50 x99 NetworkManager[560]: <info>  [1706263610.9779] policy: set 'Wired connection 1' (enp9s0) as default for IPv4 routing and DNS
янв 26 15:06:50 x99 dnsmasq[753]: setting upstream servers from DBus
янв 26 15:06:50 x99 dnsmasq[753]: using nameserver 192.168.0.1#53(via enp9s0)
янв 26 15:06:50 x99 dnsmasq[753]: using nameserver 192.168.0.1#53 for domain 0.168.192.in-addr.arpa
янв 26 15:06:50 x99 dnsmasq[753]: cleared cache
янв 26 15:06:50 x99 NetworkManager[560]: <info>  [1706263610.9851] device (enp9s0): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
янв 26 15:06:51 x99 systemd[1]: Starting Network Manager Script Dispatcher Service...
янв 26 15:06:52 x99 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring page1 timeout, signaled seq=10804, emitted seq=10806
янв 26 15:06:52 x99 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process  pid 0 thread  pid 0

Out of curiosity, I decided to install Manjaro KDE from scratch, this time on my NVMe drive.
I installed Steam, LACT, gamemode and mangohud, and after an hour of gaming with the undervolt applied I couldn’t get it to crash at all. So it looks like either I broke something in my previous install, or my SATA drive is giving up.
I’ll post again if the issue comes back, and hopefully I’ll be able to track down the cause.

Well, this is interesting. The VRAM overclock didn’t actually apply when I first tested the fresh install, and after reapplying it I got a crash with the exact same log within about 5 minutes. It’s fine with just the undervolt, though.
I flipped the BIOS switch on my card to run it on the Vega 64 BIOS (945 MHz VRAM by default); I had initially tested the Vega 56 BIOS with 800 MHz VRAM and a 945 MHz overclock. Again, it works perfectly fine with an undervolt and crashes when I push the VRAM further manually.
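
Since the overclock silently failed to apply the first time, it can help to check what the driver actually reports. Below is a minimal Python sketch that reads amdgpu's `pp_dpm_mclk` sysfs file and shows the memory clock levels and the one currently marked active. The `card0` path is an assumption (the index can differ on other systems), and `parse_dpm_levels` is just an illustrative helper name, not part of any tool mentioned in this thread.

```python
import re
from pathlib import Path

# Assumption: the Vega is card0; on other systems the index may differ.
MCLK_PATH = Path("/sys/class/drm/card0/device/pp_dpm_mclk")


def parse_dpm_levels(text):
    """Parse pp_dpm_mclk-style text such as '3: 945Mhz *'.

    Returns (levels, active): a list of clocks in MHz and the clock
    marked with '*', or None if no level is marked.
    """
    levels = []
    active = None
    for line in text.splitlines():
        m = re.match(r"\s*(\d+):\s*(\d+)\s*[Mm][Hh][Zz]\s*(\*)?", line)
        if not m:
            continue  # skip anything that is not a level line
        mhz = int(m.group(2))
        levels.append(mhz)
        if m.group(3):
            active = mhz
    return levels, active


if __name__ == "__main__":
    if MCLK_PATH.exists():
        levels, active = parse_dpm_levels(MCLK_PATH.read_text())
        print("memory clock levels (MHz):", levels)
        print("currently active (MHz):", active)
    else:
        print(f"{MCLK_PATH} not found; adjust the card index")
```

If the highest level listed does not match the value set in LACT, the overclock most likely was not applied.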

My conclusion is that amdgpu’s DPM doesn’t play nicely with manual VRAM overclocks on Vega cards. I ruled out VRAM degradation by running OCCT’s VRAM test on Windows, and I also ran memtest-vulkan for an hour in Manjaro at 945 MHz on the 56 BIOS and 1075 MHz on the 64 BIOS. In both cases not a single error appeared and nothing crashed, yet crashes happen quickly in games, usually during utilization spikes. I don’t know how to work around it without disabling DPM entirely, so I guess I’ll just live with the 64 BIOS for now. Maybe in the future I’ll try modding my overclock into the BIOS manually.
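
For anyone wanting to confirm that their crashes share the same signature, here is a rough Python sketch that scans saved kernel log text for the `amdgpu_job_timedout` lines shown in the log above and pulls out the ring name and sequence numbers. The regular expression is based only on the message format in this thread; other kernel versions may word it differently, so treat it as an assumption. `find_ring_timeouts` is an illustrative name.

```python
import re
import sys

# Matches lines like:
# [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring page1 timeout,
#   signaled seq=10804, emitted seq=10806
# Assumption: format taken from the log in this thread only.
TIMEOUT_RE = re.compile(
    r"amdgpu_job_timedout.*ring (\S+) timeout, "
    r"signaled seq=(\d+), emitted seq=(\d+)"
)


def find_ring_timeouts(lines):
    """Return one dict per ring timeout found in the given log lines."""
    events = []
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            events.append(
                {
                    "ring": m.group(1),
                    "signaled": int(m.group(2)),
                    "emitted": int(m.group(3)),
                }
            )
    return events


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Read a log file previously saved to disk.
        with open(sys.argv[1], encoding="utf-8", errors="replace") as f:
            for ev in find_ring_timeouts(f):
                print(ev["ring"], ev["signaled"], ev["emitted"])
    else:
        print("usage: pass the path of a saved kernel log")
```

One way to use it is to save the kernel messages from the previous boot with `journalctl -k -b -1 > kernel.log` and pass `kernel.log` as the argument.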

This topic was automatically closed 36 hours after the last reply. New replies are no longer allowed.