So far I’ve only had this issue when playing Counter-Strike 2. Some sessions pass without a single crash, but overall it crashes in roughly 1 of every 5 games.
When it crashes, it’s as if I get logged out: the screen goes black and I end up on the login screen. After logging back in, I find that all graphical processes have been killed.
I then start everything back up and it’s fine again.
Here are my specs using inxi:
System:
Kernel: 6.6.7-4-MANJARO arch: x86_64 bits: 64 compiler: gcc v: 13.2.1
Desktop: KDE Plasma v: 5.27.10 Distro: Manjaro Linux base: Arch Linux
Machine:
Type: Desktop Mobo: ASUSTeK model: PRIME X370-PRO v: Rev X.0x
serial: <superuser required> BIOS: American Megatrends v: 6203
date: 07/27/2023
CPU:
Info: 8-core model: AMD Ryzen 7 1800X bits: 64 type: MT MCP arch: Zen rev: 1
cache: L1: 768 KiB L2: 4 MiB L3: 16 MiB
Speed (MHz): avg: 2426 high: 3988 min/max: 2200/4000 boost: disabled
cores: 1: 1962 2: 3879 3: 1855 4: 1987 5: 1917 6: 3946 7: 2200 8: 2200
9: 1780 10: 3988 11: 1937 12: 1935 13: 1778 14: 3878 15: 1802 16: 1777
bogomips: 128045
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Graphics:
Device-1: AMD Navi 32 [Radeon RX 7700 XT / 7800 XT] vendor: XFX
driver: amdgpu v: kernel arch: RDNA-3 bus-ID: 0c:00.0
Display: x11 server: X.Org v: 21.1.10 with: Xwayland v: 23.2.3 driver: X:
loaded: amdgpu unloaded: modesetting,radeon dri: radeonsi gpu: amdgpu
resolution: 2560x1440
API: EGL v: 1.5 drivers: radeonsi,swrast platforms:
active: x11,surfaceless,device inactive: gbm,wayland
API: OpenGL v: 4.6 compat-v: 4.5 vendor: amd mesa v: 23.1.9-manjaro1.1
glx-v: 1.4 direct-render: yes renderer: AMD Radeon RX 7800 XT (gfx1101 LLVM
16.0.6 DRM 3.54 6.6.7-4-MANJARO)
API: Vulkan v: 1.3.269 drivers: radv surfaces: xcb,xlib devices: 1
Audio:
Device-1: AMD Navi 31 HDMI/DP Audio driver: snd_hda_intel v: kernel
bus-ID: 0c:00.1
Device-2: AMD Family 17h HD Audio vendor: ASUSTeK driver: snd_hda_intel
v: kernel bus-ID: 0e:00.3
Device-3: Logitech G933 Wireless Headset Dongle
driver: hid-generic,snd-usb-audio,usbhid type: USB bus-ID: 5-4:4
API: ALSA v: k6.6.7-4-MANJARO status: kernel-api
Server-1: JACK v: 1.9.22 status: off
Server-2: PipeWire v: 1.0.0 status: off
Server-3: PulseAudio v: 16.1 status: active
Network:
Device-1: Intel I211 Gigabit Network vendor: ASUSTeK driver: igb v: kernel
port: e000 bus-ID: 08:00.0
IF: enp8s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
Drives:
Local Storage: total: 1.82 TiB used: 785.07 GiB (42.1%)
ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 970 EVO Plus 2TB
size: 1.82 TiB temp: 49.9 C
Partition:
ID-1: / size: 1.79 TiB used: 785.07 GiB (42.8%) fs: ext4 dev: /dev/dm-0
mapped: luks-d98cd8d2-e273-4dda-808f-bc3d6ee962a8
Swap:
Alert: No swap data was found.
Sensors:
System Temperatures: cpu: 48.0 C mobo: N/A gpu: amdgpu temp: 50.0 C
Fan Speeds (rpm): N/A gpu: amdgpu fan: 0
Info:
Processes: 381 Uptime: 9h 29m Memory: total: 32 GiB available: 31.25 GiB
used: 7.93 GiB (25.4%) Init: systemd Compilers: gcc: 13.2.1 clang: 16.0.6
Packages: 1447 Shell: Zsh v: 5.9 inxi: 3.3.31
Here is the related error from the logs, which I saved a few days ago:
Dec 16 17:17:01 thomas-pc kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=11563292, emitted seq=11563294
Dec 16 17:17:01 thomas-pc kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process cs2 pid 45713 thread VKRenderThread pid 45744
Dec 16 17:17:01 thomas-pc kernel: amdgpu 0000:0c:00.0: amdgpu: GPU reset begin!
Dec 16 17:17:02 thomas-pc kernel: amdgpu 0000:0c:00.0: amdgpu: IP block:gfx_v11_0 is hung!
Dec 16 17:17:02 thomas-pc kernel: amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0xfff4a800200 flags=0x0020]
Dec 16 17:17:02 thomas-pc kernel: amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0xfff4a800224 flags=0x0020]
Dec 16 17:17:02 thomas-pc kernel: amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0xfff4a800244 flags=0x0020]
Dec 16 17:17:02 thomas-pc kernel: amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0xfff4a800264 flags=0x0020]
Dec 16 17:17:02 thomas-pc kernel: amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0xfff4a800284 flags=0x0020]
Dec 16 17:17:02 thomas-pc kernel: amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0xfff4a8002a0 flags=0x0020]
Dec 16 17:17:02 thomas-pc kernel: amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0xfff4a8002c0 flags=0x0020]
Dec 16 17:17:02 thomas-pc kernel: amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0xfff4a8002e0 flags=0x0020]
Dec 16 17:17:02 thomas-pc kernel: amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0xfff4a800210 flags=0x0020]
Dec 16 17:17:02 thomas-pc kernel: amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0xfff4a800200 flags=0x0020]
Dec 16 17:17:02 thomas-pc kernel: Failed to wait all pipes clean
Dec 16 17:17:02 thomas-pc kernel: amdgpu 0000:0c:00.0: amdgpu: soft reset failed, will fallback to full reset!
Dec 16 17:17:02 thomas-pc kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
Dec 16 17:17:02 thomas-pc kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
Dec 16 17:17:02 thomas-pc kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
Dec 16 17:17:02 thomas-pc kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
Dec 16 17:17:03 thomas-pc kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
Dec 16 17:17:03 thomas-pc kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
Dec 16 17:17:03 thomas-pc kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
Dec 16 17:17:03 thomas-pc kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
Dec 16 17:17:03 thomas-pc kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
Dec 16 17:17:03 thomas-pc kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
Dec 16 17:17:03 thomas-pc kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
Dec 16 17:17:03 thomas-pc kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
Dec 16 17:17:03 thomas-pc kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
Dec 16 17:17:03 thomas-pc kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
Dec 16 17:17:03 thomas-pc kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
Dec 16 17:17:03 thomas-pc kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
Dec 16 17:17:03 thomas-pc kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
Dec 16 17:17:03 thomas-pc kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
Dec 16 17:17:04 thomas-pc kernel: [drm:gfx_v11_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
Dec 16 17:17:04 thomas-pc kernel: amdgpu 0000:0c:00.0: amdgpu: MODE1 reset
Dec 16 17:17:04 thomas-pc kernel: amdgpu 0000:0c:00.0: amdgpu: GPU mode1 reset
Dec 16 17:17:04 thomas-pc kernel: amdgpu 0000:0c:00.0: amdgpu: GPU smu mode1 reset
Dec 16 17:17:04 thomas-pc kernel: amdgpu 0000:0c:00.0: amdgpu: GPU reset succeeded, trying to resume
Dec 16 17:17:04 thomas-pc kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000F00000).
Dec 16 17:17:04 thomas-pc kernel: [drm] VRAM is lost due to GPU reset!
Dec 16 17:17:04 thomas-pc kernel: [drm] PSP is resuming...
Dec 16 17:17:04 thomas-pc kernel: [drm] reserve 0xa700000 from 0x83e0000000 for PSP TMR
Dec 16 17:17:04 thomas-pc kernel: amdgpu 0000:0c:00.0: amdgpu: RAP: optional rap ta ucode is not available
Dec 16 17:17:04 thomas-pc kernel: amdgpu 0000:0c:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
Dec 16 17:17:04 thomas-pc kernel: amdgpu 0000:0c:00.0: amdgpu: SMU is resuming...
Dec 16 17:17:04 thomas-pc kernel: amdgpu 0000:0c:00.0: amdgpu: smu driver if version = 0x0000003d, smu fw if version = 0x0000003f, smu fw program = 0, smu fw vers>
Dec 16 17:17:04 thomas-pc kernel: amdgpu 0000:0c:00.0: amdgpu: SMU driver if version not matched
Dec 16 17:17:05 thomas-pc kernel: amdgpu 0000:0c:00.0: amdgpu: SMU is resumed successfully!
Dec 16 17:17:05 thomas-pc kernel: [drm] DMUB hardware initialized: version=0x07002400
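For anyone comparing their own logs against this one, here is a minimal Python sketch that pulls out the ring timeout and the process the kernel blames for it. The regular expressions are modelled only on the two `amdgpu_job_timedout` lines above, and the function name `summarize_timeouts` is made up for illustration, so treat it as a rough filter rather than a complete parser.

```python
import re

# Patterns modelled on the two amdgpu_job_timedout lines in the log above.
RING_RE = re.compile(r"ring (\S+) timeout, signaled seq=(\d+), emitted seq=(\d+)")
PROC_RE = re.compile(
    r"Process information: process (\S+) pid (\d+) thread (\S+) pid (\d+)"
)


def summarize_timeouts(lines):
    """Return one dict per ring timeout, with the blamed process if logged."""
    events = []
    for line in lines:
        ring = RING_RE.search(line)
        if ring:
            events.append({
                "ring": ring.group(1),
                "signaled": int(ring.group(2)),
                "emitted": int(ring.group(3)),
                "process": None,
                "thread": None,
            })
            continue
        proc = PROC_RE.search(line)
        # Attach process details to the most recent timeout that lacks them.
        if proc and events and events[-1]["process"] is None:
            events[-1]["process"] = proc.group(1)
            events[-1]["thread"] = proc.group(3)
    return events


sample = [
    "Dec 16 17:17:01 thomas-pc kernel: [drm:amdgpu_job_timedout [amdgpu]] "
    "*ERROR* ring gfx_0.0.0 timeout, signaled seq=11563292, emitted seq=11563294",
    "Dec 16 17:17:01 thomas-pc kernel: [drm:amdgpu_job_timedout [amdgpu]] "
    "*ERROR* Process information: process cs2 pid 45713 thread VKRenderThread pid 45744",
]

for event in summarize_timeouts(sample):
    print(event)
```

To run it on real data, the `sample` list could be replaced with the lines of saved `journalctl -k` output.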
I have searched online for this error message and found that other people have had the same type of timeout, where it throws them out of their desktop. The causes seem to differ, though, and some reports involve different hardware. Most of the suggestions don’t seem relevant, and none are confirmed to work.
The only thing of interest I could find was a suggestion to disable something related to dynamic power management, which automatically adjusts the clocks and voltages of the GPU. I’ve not tried anything yet, though.
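I am not sure which setting that suggestion meant, but one candidate is the amdgpu sysfs file `power_dpm_force_performance_level`, which is `auto` by default and can be set to `high` to pin the clocks. Here is a minimal Python sketch for reading and changing it. It assumes the GPU is `card0` (it may be `card1` or higher on other systems), and the helper names are made up for illustration. Writing needs root and the change does not survive a reboot.

```python
from pathlib import Path

# Assumption: the GPU is card0; on some systems it is card1 or higher.
DPM_LEVEL = Path("/sys/class/drm/card0/device/power_dpm_force_performance_level")

# Only a subset of the levels the driver accepts, to keep the sketch simple.
ALLOWED_LEVELS = {"auto", "low", "high"}


def read_dpm_level(path=DPM_LEVEL):
    """Return the current level, e.g. 'auto'."""
    return Path(path).read_text().strip()


def set_dpm_level(level, path=DPM_LEVEL):
    """Write a new level; needs root and is reset on reboot."""
    if level not in ALLOWED_LEVELS:
        raise ValueError(f"unsupported level: {level!r}")
    Path(path).write_text(level + "\n")


if __name__ == "__main__":
    if DPM_LEVEL.exists():
        print("current level:", read_dpm_level())
    else:
        print("file not found; the card index is probably different")
```

Calling `set_dpm_level("high")` as root before a game and `set_dpm_level("auto")` afterwards would be one way to test whether clock switching is involved.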
Someone else suggested that it was due to a CPU bottleneck. But the problem does not seem to affect anything other than CS2, which may point to Vulkan, since the other games I’ve played may not use it.
Thought I would raise a topic to see if anyone has any ideas or has come across this before.
Edit:
The Arch Wiki suggests that although the vulkan-radeon package has fewer issues overall, this error is one that can occur with it.
https://wiki.archlinux.org/title/Vulkan#AMDGPU_-_Hangs_when_playing_DirectX_Vulkan_games
It therefore suggests trying amdvlk instead. I suppose I can only try it and hope that all my other applications and games are just as stable with amdvlk.
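Before and after installing amdvlk, it may help to check which Vulkan drivers the loader can actually see. Here is a minimal Python sketch that lists the ICD manifests and the library each one points at. It assumes the manifests live in `/usr/share/vulkan/icd.d`, which I believe is the usual location on Arch-based systems, and the function name is made up for illustration. As far as I know, the RADV manifest is typically named `radeon_icd.*.json` and the AMDVLK one `amd_icd*.json`, but check the actual output rather than relying on that.

```python
import json
from pathlib import Path

# Assumption: the usual system-wide manifest directory on Arch-based distros.
ICD_DIR = Path("/usr/share/vulkan/icd.d")


def list_vulkan_icds(directory=ICD_DIR):
    """Map each ICD manifest file name to the driver library it points at."""
    found = {}
    for manifest in sorted(Path(directory).glob("*.json")):
        try:
            data = json.loads(manifest.read_text())
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or malformed manifests
        if not isinstance(data, dict):
            continue
        icd = data.get("ICD")
        if isinstance(icd, dict) and icd.get("library_path"):
            found[manifest.name] = icd["library_path"]
    return found


if __name__ == "__main__":
    for name, library in list_vulkan_icds().items():
        print(f"{name}: {library}")
```

If both drivers show up, one of them can be selected per game by pointing the Vulkan loader's `VK_ICD_FILENAMES` environment variable at the matching manifest in the game's launch options.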