I updated my system and tried to play some games today, only to find that they all freeze after a short while. This doesn’t seem to be closely tied to GPU load, since it crashed twice while I was in a game’s menu with low GPU usage. I’m not sure whether the update to Mesa 21.0.1 or the kernel update caused this.
System:
Kernel: 5.11.10-1-MANJARO x86_64 bits: 64 compiler: gcc v: 10.2.0
parameters: BOOT_IMAGE=/boot/vmlinuz-5.11-x86_64
root=UUID=fe9c5e23-5885-4d31-af95-391198cbd234 rw
systemd.unified_cgroup_hierarchy=1 intel_pstate=active udev.log_priority=3
Desktop: GNOME 3.38.4 tk: GTK 3.24.28 wm: gnome-shell dm: GDM 3.38.2.1
Distro: Manjaro Linux base: Arch Linux
Machine:
Type: Desktop Mobo: Gigabyte model: B85M-D3H v: x.x serial: <filter>
UEFI: American Megatrends v: FB date: 06/19/2014
CPU:
Info: Dual Core model: Intel Core i3-4150 bits: 64 type: MT MCP
arch: Haswell family: 6 model-id: 3C (60) stepping: 3 microcode: 28 cache:
L2: 3 MiB
flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
bogomips: 27944
Speed: 1018 MHz min/max: 800/3500 MHz Core speeds (MHz): 1: 1018 2: 923
3: 959 4: 935
Vulnerabilities: Type: itlb_multihit status: KVM: VMX disabled
Type: l1tf
mitigation: PTE Inversion; VMX: conditional cache flushes, SMT vulnerable
Type: mds mitigation: Clear CPU buffers; SMT vulnerable
Type: meltdown mitigation: PTI
Type: spec_store_bypass
mitigation: Speculative Store Bypass disabled via prctl and seccomp
Type: spectre_v1
mitigation: usercopy/swapgs barriers and __user pointer sanitization
Type: spectre_v2 mitigation: Full generic retpoline, IBPB: conditional,
IBRS_FW, STIBP: conditional, RSB filling
Type: srbds mitigation: Microcode
Type: tsx_async_abort status: Not affected
Graphics:
Device-1: Intel 4th Generation Core Processor Family Integrated Graphics
vendor: Gigabyte driver: i915 v: kernel bus-ID: 00:02.0 chip-ID: 8086:041e
class-ID: 0300
Device-2: AMD Baffin [Radeon RX 460/560D / Pro 450/455/460/555/555X/560/560X] vendor: ASUSTeK driver: amdgpu v: kernel bus-ID: 01:00.0 chip-ID: 1002:67ef
class-ID: 0300
Display: wayland server: X.org 1.20.10 compositor: gnome-shell driver:
loaded: amdgpu,ati,intel unloaded: modesetting alternate: fbdev,vesa
display-ID: 0 resolution: <missing: xdpyinfo>
OpenGL: renderer: AMD Radeon RX 460 Graphics (POLARIS11 DRM 3.40.0
5.11.10-1-MANJARO LLVM 11.1.0)
v: 4.6 Mesa 21.0.1 direct render: Yes
Audio:
Device-1: Intel Xeon E3-1200 v3/4th Gen Core Processor HD Audio
driver: snd_hda_intel v: kernel bus-ID: 00:03.0 chip-ID: 8086:0c0c
class-ID: 0403
Device-2: Intel 8 Series/C220 Series High Definition Audio vendor: Gigabyte
driver: snd_hda_intel v: kernel bus-ID: 00:1b.0 chip-ID: 8086:8c20
class-ID: 0403
Device-3: AMD Baffin HDMI/DP Audio [Radeon RX 550 640SP / RX 560/560X]
vendor: ASUSTeK driver: snd_hda_intel v: kernel bus-ID: 01:00.1
chip-ID: 1002:aae0 class-ID: 0403
Sound Server-1: ALSA v: k5.11.10-1-MANJARO running: yes
Sound Server-2: JACK v: 0.125.0 running: no
Sound Server-3: PulseAudio v: 14.2 running: yes
Sound Server-4: PipeWire v: 0.3.24 running: yes
Network:
Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
vendor: Gigabyte driver: r8169 v: kernel port: d000 bus-ID: 03:00.0
chip-ID: 10ec:8168 class-ID: 0200
IF: enp3s0 state: down mac: <filter>
Device-2: Realtek RTL8192CU 802.11n WLAN Adapter type: USB driver: rtl8xxxu
bus-ID: 3-10:6 chip-ID: 0bda:8178 class-ID: 0000 serial: <filter>
IF: wlp0s20u10 state: up mac: <filter>
Bluetooth:
Device-1: Cambridge Silicon Radio Bluetooth Dongle (HCI mode) type: USB
driver: btusb v: 0.8 bus-ID: 3-12:7 chip-ID: 0a12:0001 class-ID: e001
Drives:
Local Storage: total: 1.82 TiB used: 1.54 TiB (84.5%)
SMART Message: Unable to run smartctl. Root privileges required.
ID-1: /dev/sda maj-min: 8:0 vendor: Western Digital model: WD10EZEX-00WN4A0
size: 931.51 GiB block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s
rotation: 7200 rpm serial: <filter> rev: 1A01 scheme: GPT
ID-2: /dev/sdb maj-min: 8:16 vendor: Seagate model: ST1000DM003-1ER162
size: 931.51 GiB block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s
rotation: 7200 rpm serial: <filter> rev: CC43 scheme: MBR
Partition:
ID-1: / raw-size: 99.5 GiB size: 97.44 GiB (97.93%) used: 21.59 GiB (22.2%)
fs: ext4 dev: /dev/sda2 maj-min: 8:2
ID-2: /boot/efi raw-size: 1024 MiB size: 1022 MiB (99.80%)
used: 312 KiB (0.0%) fs: vfat dev: /dev/sda1 maj-min: 8:1
ID-3: /home raw-size: 354.17 GiB size: 347.62 GiB (98.15%)
used: 276.87 GiB (79.6%) fs: ext4 dev: /dev/sda5 maj-min: 8:5
Swap:
Kernel: swappiness: 60 (default) cache-pressure: 100 (default)
ID-1: swap-1 type: zram size: 490.3 MiB used: 0 KiB (0.0%) priority: 32767
dev: /dev/zram0
ID-2: swap-2 type: zram size: 490.3 MiB used: 0 KiB (0.0%) priority: 32767
dev: /dev/zram1
ID-3: swap-3 type: zram size: 490.3 MiB used: 0 KiB (0.0%) priority: 32767
dev: /dev/zram2
ID-4: swap-4 type: zram size: 490.3 MiB used: 0 KiB (0.0%) priority: 32767
dev: /dev/zram3
Sensors:
System Temperatures: cpu: 50.0 C mobo: 27.8 C gpu: amdgpu temp: 38.0 C
Fan Speeds (RPM): N/A gpu: amdgpu fan: 1818
Info:
Processes: 243 Uptime: 27m wakeups: 0 Memory: 7.66 GiB
used: 1.98 GiB (25.8%) Init: systemd v: 247 tool: systemctl Compilers:
gcc: 10.2.0 clang: 11.1.0 Packages: 1525 pacman: 1508 lib: 432 flatpak: 17
Shell: Bash v: 5.1.0 running-in: gnome-terminal inxi: 3.3.03
This is the output of running sudo journalctl -b -1 | grep -i amd:
Apr 11 20:24:43 abrar-desktop kernel: AMD AuthenticAMD
Apr 11 20:24:43 abrar-desktop kernel: RAMDISK: [mem 0x36875000-0x37431fff]
Apr 11 20:24:43 abrar-desktop kernel: AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
Apr 11 20:24:43 abrar-desktop kernel: AMD-Vi: AMD IOMMUv2 functionality not available on this system
Apr 11 20:24:47 abrar-desktop kernel: [drm] amdgpu kernel modesetting enabled.
Apr 11 20:24:47 abrar-desktop kernel: amdgpu: Topology: Add CPU node
Apr 11 20:24:47 abrar-desktop kernel: fb0: switching to amdgpudrmfb from EFI VGA
Apr 11 20:24:47 abrar-desktop kernel: amdgpu 0000:01:00.0: vgaarb: deactivate vga console
Apr 11 20:24:47 abrar-desktop kernel: amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
Apr 11 20:24:47 abrar-desktop kernel: amdgpu 0000:01:00.0: No more image in the PCI ROM
Apr 11 20:24:47 abrar-desktop kernel: amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from ROM BAR
Apr 11 20:24:47 abrar-desktop kernel: amdgpu: ATOM BIOS: 115-C994PI00-100
Apr 11 20:24:48 abrar-desktop kernel: amdgpu 0000:01:00.0: amdgpu: VRAM: 2048M 0x000000F400000000 - 0x000000F47FFFFFFF (2048M used)
Apr 11 20:24:48 abrar-desktop kernel: amdgpu 0000:01:00.0: amdgpu: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
Apr 11 20:24:48 abrar-desktop kernel: [drm] amdgpu: 2048M of VRAM memory ready
Apr 11 20:24:48 abrar-desktop kernel: [drm] amdgpu: 3072M of GTT memory ready.
Apr 11 20:24:48 abrar-desktop kernel: amdgpu: hwmgr_sw_init smu backed is polaris10_smu
Apr 11 20:24:48 abrar-desktop kernel: snd_hda_intel 0000:01:00.1: bound 0000:01:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
Apr 11 20:24:48 abrar-desktop kernel: amdgpu: Topology: Add dGPU node [0x67ef:0x1002]
Apr 11 20:24:48 abrar-desktop kernel: amdgpu 0000:01:00.0: amdgpu: SE 2, SH per SE 1, CU per SH 8, active_cu_number 14
Apr 11 20:24:48 abrar-desktop kernel: fbcon: amdgpudrmfb (fb0) is primary device
Apr 11 20:24:48 abrar-desktop kernel: amdgpu 0000:01:00.0: [drm] fb0: amdgpudrmfb frame buffer device
Apr 11 20:24:48 abrar-desktop kernel: amdgpu 0000:01:00.0: amdgpu: Using BACO for runtime pm
Apr 11 20:24:48 abrar-desktop kernel: [drm] Initialized amdgpu 3.40.0 20150101 for 0000:01:00.0 on minor 1
Apr 11 20:25:08 abrar-desktop gnome-shell[1091]: Disabling DMA buffer screen sharing for driver 'amdgpu'.
Apr 11 20:25:23 abrar-desktop gnome-shell[1544]: Disabling DMA buffer screen sharing for driver 'amdgpu'.
Apr 11 21:08:16 abrar-desktop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=452681, emitted seq=452683
Apr 11 21:08:16 abrar-desktop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process eurotrucks2.exe pid 7405 thread eurotrucks:cs0 pid 7419
Apr 11 21:08:16 abrar-desktop kernel: amdgpu 0000:01:00.0: amdgpu: GPU reset begin!
Apr 11 21:08:16 abrar-desktop kernel: amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Apr 11 21:08:16 abrar-desktop kernel: [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
Apr 11 21:08:16 abrar-desktop kernel: amdgpu: cp is busy, skip halt cp
Apr 11 21:08:16 abrar-desktop kernel: amdgpu: rlc is busy, skip halt rlc
Apr 11 21:08:16 abrar-desktop kernel: amdgpu 0000:01:00.0: amdgpu: BACO reset
Apr 11 21:08:17 abrar-desktop kernel: amdgpu 0000:01:00.0: amdgpu: GPU reset succeeded, trying to resume
Apr 11 21:08:17 abrar-desktop kernel: amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring gfx test failed (-110)
Apr 11 21:08:17 abrar-desktop kernel: [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v8_0> failed -110
Apr 11 21:08:17 abrar-desktop kernel: amdgpu 0000:01:00.0: amdgpu: GPU reset(2) failed
Apr 11 21:08:17 abrar-desktop kernel: amdgpu 0000:01:00.0: amdgpu: GPU reset end with ret = -110
Apr 11 21:08:27 abrar-desktop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
Apr 11 21:08:37 abrar-desktop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
This log was taken after a crash while playing ETS 2 through Steam Proton 6.3-2.
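The case-insensitive match on "amd" also picks up unrelated lines such as RAMDISK and AuthenticAMD. A narrower filter makes the failure sequence easier to read. This is a minimal sketch, assuming the interesting lines always mention amdgpu and contain either *ERROR* or "GPU reset"; filter_gpu_errors is a hypothetical helper name, not an existing tool.

```shell
# Reads journalctl-style text on stdin and keeps only amdgpu lines
# that report an error or a GPU reset.
# filter_gpu_errors is a hypothetical helper name (an assumption for this sketch).
filter_gpu_errors() {
    grep 'amdgpu' | grep -E '\*ERROR\*|GPU reset'
}
```

It could be used as sudo journalctl -b -1 -k | filter_gpu_errors, where -k limits the journal output to kernel messages.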
EDIT: I fixed the issue by deleting the Mesa shader cache, which is located at ~/.cache/mesa_shader_cache/
EDIT 2: Nope! I spoke too soon; it’s happening again. SSH and sound still work, but the picture hangs and then the GPU stops outputting a signal. Dynamic sound effects like gunshots also freeze and glitch, but the background music keeps playing without any issues.
Apr 12 16:10:39 abrar-desktop kernel: AMD AuthenticAMD
Apr 12 16:10:39 abrar-desktop kernel: RAMDISK: [mem 0x36875000-0x37431fff]
Apr 12 16:10:39 abrar-desktop kernel: AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
Apr 12 16:10:39 abrar-desktop kernel: AMD-Vi: AMD IOMMUv2 functionality not available on this system
Apr 12 16:10:44 abrar-desktop kernel: [drm] amdgpu kernel modesetting enabled.
Apr 12 16:10:44 abrar-desktop kernel: amdgpu: Topology: Add CPU node
Apr 12 16:10:44 abrar-desktop kernel: fb0: switching to amdgpudrmfb from EFI VGA
Apr 12 16:10:44 abrar-desktop kernel: amdgpu 0000:01:00.0: vgaarb: deactivate vga console
Apr 12 16:10:44 abrar-desktop kernel: amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
Apr 12 16:10:44 abrar-desktop kernel: amdgpu 0000:01:00.0: No more image in the PCI ROM
Apr 12 16:10:44 abrar-desktop kernel: amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from ROM BAR
Apr 12 16:10:44 abrar-desktop kernel: amdgpu: ATOM BIOS: 115-C994PI00-100
Apr 12 16:10:44 abrar-desktop kernel: amdgpu 0000:01:00.0: amdgpu: VRAM: 2048M 0x000000F400000000 - 0x000000F47FFFFFFF (2048M used)
Apr 12 16:10:44 abrar-desktop kernel: amdgpu 0000:01:00.0: amdgpu: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
Apr 12 16:10:44 abrar-desktop kernel: [drm] amdgpu: 2048M of VRAM memory ready
Apr 12 16:10:44 abrar-desktop kernel: [drm] amdgpu: 3072M of GTT memory ready.
Apr 12 16:10:44 abrar-desktop kernel: amdgpu: hwmgr_sw_init smu backed is polaris10_smu
Apr 12 16:10:44 abrar-desktop kernel: snd_hda_intel 0000:01:00.1: bound 0000:01:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
Apr 12 16:10:44 abrar-desktop kernel: amdgpu: Topology: Add dGPU node [0x67ef:0x1002]
Apr 12 16:10:44 abrar-desktop kernel: amdgpu 0000:01:00.0: amdgpu: SE 2, SH per SE 1, CU per SH 8, active_cu_number 14
Apr 12 16:10:44 abrar-desktop kernel: fbcon: amdgpudrmfb (fb0) is primary device
Apr 12 16:10:44 abrar-desktop kernel: amdgpu 0000:01:00.0: [drm] fb0: amdgpudrmfb frame buffer device
Apr 12 16:10:44 abrar-desktop kernel: amdgpu 0000:01:00.0: amdgpu: Using BACO for runtime pm
Apr 12 16:10:44 abrar-desktop kernel: [drm] Initialized amdgpu 3.40.0 20150101 for 0000:01:00.0 on minor 1
Apr 12 16:11:02 abrar-desktop gnome-shell[1091]: Disabling DMA buffer screen sharing for driver 'amdgpu'.
Apr 12 16:11:42 abrar-desktop gnome-shell[1891]: Disabling DMA buffer screen sharing for driver 'amdgpu'.
Apr 12 16:19:54 abrar-desktop kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
Apr 12 16:19:54 abrar-desktop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=131336, emitted seq=131338
Apr 12 16:19:54 abrar-desktop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process JustCause3.exe pid 5401 thread JustCause3:cs0 pid 5418
Apr 12 16:19:54 abrar-desktop kernel: amdgpu 0000:01:00.0: amdgpu: GPU reset begin!
Apr 12 16:19:54 abrar-desktop kernel: amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Apr 12 16:19:54 abrar-desktop kernel: [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
Apr 12 16:19:55 abrar-desktop kernel: amdgpu: cp is busy, skip halt cp
Apr 12 16:19:55 abrar-desktop kernel: amdgpu: rlc is busy, skip halt rlc
Apr 12 16:19:55 abrar-desktop kernel: amdgpu 0000:01:00.0: amdgpu: BACO reset
Apr 12 16:19:55 abrar-desktop kernel: amdgpu 0000:01:00.0: amdgpu: GPU reset succeeded, trying to resume
Apr 12 16:19:56 abrar-desktop kernel: amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring gfx test failed (-110)
Apr 12 16:19:56 abrar-desktop kernel: [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v8_0> failed -110
Apr 12 16:19:56 abrar-desktop kernel: amdgpu 0000:01:00.0: amdgpu: GPU reset(2) failed
Apr 12 16:19:56 abrar-desktop kernel: amdgpu 0000:01:00.0: amdgpu: GPU reset end with ret = -110
Apr 12 16:20:06 abrar-desktop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
Logs taken after it crashed while playing Just Cause 3.
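Both logs name the process whose job timed out in the "Process information" line. To see at a glance which games trigger the hang across several boots, those names can be counted. This is a rough sketch, assuming the line format shown in the logs above and process names without spaces; hung_processes is a hypothetical helper name.

```shell
# Reads journalctl-style text on stdin and prints how many ring
# timeouts were attributed to each process name.
# hung_processes is a hypothetical helper name (an assumption for this sketch).
hung_processes() {
    sed -n 's/.*Process information: process \([^ ]*\) pid .*/\1/p' |
        sort | uniq -c
}
```

For example, sudo journalctl -k | hung_processes would summarize every boot still kept in the journal.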