AMD GPU Driver Error - System Unusable - Reset Button Push Needed

Hello,

After the latest two kernel updates for LTS 5.10, I’ve been getting an AMD GPU driver error that makes the system unusable until I push the reset button. The issue happens randomly; it has happened with my desktop idle and also during light usage such as browsing with Vivaldi. It’s intriguing because I have never gotten the error while playing games. I was able to find the error in KSystemLog, as you can see in the image below at 07:57.

The issue locks the desktop, forces a logout, blurs the screen, and then the system freezes.

I think I will need to be patient and wait for a future kernel update, until someone reports the bug and it gets fixed. Unfortunately, I can’t do it myself due to limited knowledge. Maybe I will switch to the 5.4 LTS kernel; I don’t know yet.

To get a picture of the frozen system I used my cellphone. The picture isn’t from today’s occurrence, but it always happens the same way, so it illustrates the issue well.

Do you have a high refresh rate monitor (above 60 Hz)? If yes, see AMDGPU - ArchWiki.

Nope, mine is a bit old. It’s the LG LCD 23 W2353V, and the manual indicates:

Horizontal Freq.
30 - 83 kHz (Automatic)
Vertical Freq.
Analog,Digital : 56 - 75 Hz (Automatic)
HDMI : 56 - 61 Hz (Automatic)

When I did a fresh install of Manjaro 20.1 it was working pretty well. The issue started with the latest two kernel updates; I think they were 5.10.18 and 5.10.23. If I remember correctly, the fresh installation was using 5.11 and I changed to 5.10.11. I didn’t record the versions, so I can’t say precisely, sorry. Maybe there is a history log somewhere on my PC.
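On the question of a history log: pacman records every package upgrade in /var/log/pacman.log, so the exact kernel versions can usually be recovered from there. Below is a minimal sketch, assuming the log is at its default location and that the kernel packages follow Manjaro’s `linuxNNN` naming (for example `linux510`). The function name `kernel_upgrades` is made up for this example.

```python
import re
from typing import List, Tuple

# pacman writes upgrade entries such as:
# [2021-03-20T10:15:32-0300] [ALPM] upgraded linux510 (5.10.18-1 -> 5.10.23-1)
# (illustrative line, not taken from the poster's machine)
UPGRADE_RE = re.compile(
    r"^\[(?P<when>[^\]]+)\] \[ALPM\] upgraded (?P<pkg>linux\d+) "
    r"\((?P<old>\S+) -> (?P<new>\S+)\)$"
)


def kernel_upgrades(log_text: str) -> List[Tuple[str, str, str, str]]:
    """Return (timestamp, package, old_version, new_version) per kernel upgrade."""
    results = []
    for line in log_text.splitlines():
        match = UPGRADE_RE.match(line.strip())
        if match:
            results.append(
                (match["when"], match["pkg"], match["old"], match["new"])
            )
    return results


if __name__ == "__main__":
    # Assumption: default pacman log path on Arch-based systems.
    try:
        with open("/var/log/pacman.log", encoding="utf-8", errors="replace") as fh:
            for when, pkg, old, new in kernel_upgrades(fh.read()):
                print(f"{when}  {pkg}: {old} -> {new}")
    except OSError as exc:
        print(f"could not read pacman log: {exc}")
```

The pattern deliberately skips packages like `linux-firmware` or `linux510-headers`, so only the kernel images themselves are listed.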

Below is my system info in case it’s needed:

System:
  Kernel: 5.10.26-1-MANJARO x86_64 bits: 64 compiler: gcc v: 10.2.0 
  parameters: BOOT_IMAGE=/boot/vmlinuz-5.10-x86_64 
  root=UUID=dfa33105-8d9e-4555-9174-609a71216485 ro quiet apparmor=1 
  security=apparmor resume=UUID=493ccde5-7f80-4bf5-b22c-98ce796aade6 
  udev.log_priority=3 
  Desktop: KDE Plasma 5.21.3 tk: Qt 5.15.2 wm: kwin_x11 vt: 1 dm: SDDM 
  Distro: Manjaro Linux base: Arch Linux 
Machine:
  Type: Desktop Mobo: Gigabyte model: GA-870A-UD3 v: x.x serial: <filter> 
  BIOS: Award v: F5 date: 08/01/2011 
Memory:
  RAM: total: 7.77 GiB used: 2.55 GiB (32.8%) 
  RAM Report: permissions: Unable to run dmidecode. Root privileges required. 
CPU:
  Info: Quad Core model: AMD Phenom II X4 965 bits: 64 type: MCP arch: K10 
  family: 10 (16) model-id: 4 stepping: 3 microcode: 10000C8 cache: L2: 2 MiB 
  bogomips: 27334 
  Speed: 800 MHz min/max: 800/3400 MHz Core speeds (MHz): 1: 800 2: 3400 
  3: 800 4: 800 
  Flags: 3dnow 3dnowext 3dnowprefetch abm apic clflush cmov cmp_legacy 
  constant_tsc cpuid cr8_legacy cx16 cx8 de extapic extd_apicid fpu fxsr 
  fxsr_opt ht hw_pstate ibs lahf_lm lbrv lm mca mce misalignsse mmx mmxext 
  monitor msr mtrr nonstop_tsc nopl npt nrip_save nx osvw pae pat pdpe1gb pge 
  pni popcnt pse pse36 rdtscp rep_good sep skinit sse sse2 sse4a svm svm_lock 
  syscall tsc vme vmmcall wdt 
  Vulnerabilities: Type: itlb_multihit status: Not affected 
  Type: l1tf status: Not affected 
  Type: mds status: Not affected 
  Type: meltdown status: Not affected 
  Type: spec_store_bypass status: Not affected 
  Type: spectre_v1 
  mitigation: usercopy/swapgs barriers and __user pointer sanitization 
  Type: spectre_v2 
  mitigation: Full AMD retpoline, STIBP: disabled, RSB filling 
  Type: srbds status: Not affected 
  Type: tsx_async_abort status: Not affected 
Graphics:
  Device-1: AMD Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] 
  vendor: Micro-Star MSI driver: amdgpu v: kernel bus-ID: 01:00.0 
  chip-ID: 1002:67df class-ID: 0300 
  Display: x11 server: X.Org 1.20.10 compositor: kwin_x11 driver: 
  loaded: amdgpu,ati unloaded: modesetting,radeon alternate: fbdev,vesa 
  display-ID: :0 screens: 1 
  Screen-1: 0 s-res: 1920x1080 s-dpi: 96 s-size: 508x285mm (20.0x11.2") 
  s-diag: 582mm (22.9") 
  Monitor-1: DVI-D-0 res: 1920x1080 hz: 60 dpi: 96 
  size: 510x290mm (20.1x11.4") diag: 587mm (23.1") 
  OpenGL: renderer: Radeon RX 570 Series (POLARIS10 DRM 3.40.0 
  5.10.26-1-MANJARO LLVM 11.1.0) 
  v: 4.6 Mesa 21.0.1 direct render: Yes 
Audio:
  Device-1: AMD SBx00 Azalia vendor: Gigabyte GA-880GMA-USB3 
  driver: snd_hda_intel v: kernel bus-ID: 00:14.2 chip-ID: 1002:4383 
  class-ID: 0403 
  Device-2: AMD Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] 
  vendor: Micro-Star MSI driver: snd_hda_intel v: kernel bus-ID: 01:00.1 
  chip-ID: 1002:aaf0 class-ID: 0403 
  Sound Server-1: ALSA v: k5.10.26-1-MANJARO running: yes 
  Sound Server-2: JACK v: 0.125.0 running: no 
  Sound Server-3: PulseAudio v: 14.2 running: yes 
  Sound Server-4: PipeWire v: 0.3.24 running: no 
Network:
  Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet 
  vendor: Gigabyte driver: r8169 v: kernel port: 9e00 bus-ID: 06:00.0 
  chip-ID: 10ec:8168 class-ID: 0200 
  IF: enp6s0 state: up speed: 100 Mbps duplex: full mac: <filter> 
  IP v4: <filter> type: dynamic noprefixroute scope: global 
  broadcast: <filter> 
  IP v6: <filter> type: noprefixroute scope: link 
  WAN IP: <filter> 
Bluetooth:
  Message: No Bluetooth data was found. 
Logical:
  Message: No LVM data was found. 
RAID:
  Message: No RAID data was found. 
Drives:
  Local Storage: total: 2.48 TiB used: 1.85 TiB (74.6%) 
  SMART Message: Unable to run smartctl. Root privileges required. 
  ID-1: /dev/sda maj-min: 8:0 vendor: SanDisk model: SSD PLUS 240GB 
  size: 223.58 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s 
  rotation: SSD serial: <filter> rev: 00RL scheme: MBR 
  ID-2: /dev/sdb maj-min: 8:16 vendor: Corsair model: Corsair Force GS 
  size: 119.24 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s 
  rotation: SSD serial: <filter> rev: 5.07 scheme: MBR 
  ID-3: /dev/sdc maj-min: 8:32 vendor: Patriot model: Burst size: 223.57 GiB 
  block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s rotation: SSD 
  serial: <filter> rev: KB.3 scheme: MBR 
  ID-4: /dev/sdd maj-min: 8:48 vendor: Crucial model: CT120BX100SSD1 
  size: 111.79 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s 
  rotation: SSD serial: <filter> rev: MU01 scheme: GPT 
  ID-5: /dev/sde maj-min: 8:64 vendor: Western Digital model: WD20PURZ-85GU6Y0 
  size: 1.82 TiB block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s 
  rotation: 5400 rpm serial: <filter> rev: 0A80 scheme: MBR 
  Optical-1: /dev/sr0 vendor: HL-DT-ST model: DVDRAM GH22NS50 rev: TN02 
  dev-links: cdrom 
  Features: speed: 48 multisession: yes audio: yes dvd: yes 
  rw: cd-r,cd-rw,dvd-r,dvd-ram state: running 
Partition:
  ID-1: / raw-size: 214.77 GiB size: 210.4 GiB (97.96%) 
  used: 140.15 GiB (66.6%) fs: ext4 dev: /dev/sdc1 maj-min: 8:33 
  label: LINUX-SSD uuid: dfa33105-8d9e-4555-9174-609a71216485 
  ID-2: /home/<filter>/Games raw-size: 111.79 GiB size: 109.47 GiB (97.93%) 
  used: 58.97 GiB (53.9%) fs: ext4 dev: /dev/sdd1 maj-min: 8:49 
  label: LINUX-GAMES uuid: ef7d7ba9-f8da-4419-b9a3-ca9e2232da86 
  ID-3: /home/<filter>/HDD-DATA-2T raw-size: 1.82 TiB size: 1.82 TiB (100.00%) 
  used: 1.66 TiB (91.1%) fs: ntfs dev: /dev/sde1 maj-min: 8:65 label: DATA_2T 
  uuid: 28A6726FA6723CFE 
Swap:
  Kernel: swappiness: 60 (default) cache-pressure: 100 (default) 
  ID-1: swap-1 type: partition size: 8.8 GiB used: 14 MiB (0.2%) priority: -2 
  dev: /dev/sdc2 maj-min: 8:34 label: N/A 
  uuid: 493ccde5-7f80-4bf5-b22c-98ce796aade6 
Unmounted:
  ID-1: /dev/sda1 maj-min: 8:1 size: 100 MiB fs: ntfs label: System Reserved 
  uuid: E680DA2C5F90FD24 
  ID-2: /dev/sda2 maj-min: 8:2 size: 223.48 GiB fs: ntfs label: WIN-SSD 
  uuid: 07492E03A516286B 
  ID-3: /dev/sdb1 maj-min: 8:17 size: 119.24 GiB fs: ntfs label: WIN-GAMES 
  uuid: EAE6EE7DE6EE4A01 
USB:
  Hub-1: 1-0:1 info: Full speed (or root) Hub ports: 5 rev: 2.0 
  speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900 
  Hub-2: 2-0:1 info: Full speed (or root) Hub ports: 5 rev: 2.0 
  speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900 
  Hub-3: 3-0:1 info: Full speed (or root) Hub ports: 4 rev: 2.0 
  speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900 
  Hub-4: 4-0:1 info: Full speed (or root) Hub ports: 5 rev: 1.1 speed: 12 Mb/s 
  chip-ID: 1d6b:0001 class-ID: 0900 
  Hub-5: 5-0:1 info: Full speed (or root) Hub ports: 5 rev: 1.1 speed: 12 Mb/s 
  chip-ID: 1d6b:0001 class-ID: 0900 
  Hub-6: 6-0:1 info: Full speed (or root) Hub ports: 2 rev: 1.1 speed: 12 Mb/s 
  chip-ID: 1d6b:0001 class-ID: 0900 
  Hub-7: 7-0:1 info: Full speed (or root) Hub ports: 4 rev: 1.1 speed: 12 Mb/s 
  chip-ID: 1d6b:0001 class-ID: 0900 
  Device-1: 7-3:2 info: SINO WEALTH USB KEYBOARD type: Keyboard,HID 
  driver: hid-generic,usbhid interfaces: 2 rev: 1.1 speed: 1.5 Mb/s 
  power: 100mA chip-ID: 258a:0001 class-ID: 0300 
  Device-2: 7-4:3 info: Pixart Imaging Optical Mouse type: Mouse 
  driver: hid-generic,usbhid interfaces: 1 rev: 1.1 speed: 1.5 Mb/s 
  power: 100mA chip-ID: 093a:2521 class-ID: 0301 
  Hub-8: 8-0:1 info: Full speed (or root) Hub ports: 2 rev: 2.0 
  speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900 
  Hub-9: 9-0:1 info: Full speed (or root) Hub ports: 2 rev: 3.0 speed: 5 Gb/s 
  chip-ID: 1d6b:0003 class-ID: 0900 
Sensors:
  System Temperatures: cpu: 36.1 C mobo: N/A gpu: amdgpu temp: 42.0 C 
  Fan Speeds (RPM): N/A gpu: amdgpu fan: 925 
Info:
  Processes: 237 Uptime: 59m wakeups: 0 Init: systemd v: 247 tool: systemctl 
  Compilers: gcc: 10.2.0 Packages: pacman: 1364 lib: 432 flatpak: 0 
  Shell: Bash v: 5.1.0 running-in: yakuake inxi: 3.3.03 

Have you tried kernel 5.11 or 5.12?

Not yet. I will give 5.11 a try once it’s stable, but 5.12 isn’t stable right now.

The reason I preferred to stay with 5.10 is that it’s LTS, so I assumed it has fewer bugs.

Just because a kernel is LTS does NOT mean it has fewer bugs, in my experience. It just means that specific kernel will be supported for a longer period of time. But some stuff from newer kernels also gets backported to the LTS kernels.

Also, AMD GPUs typically get better with newer kernels (unless there was a regression). One of the reasons why I use Manjaro is to be able to easily use newer kernels for my AMD GPUs.

Thank you for the feedback

I’m going to install 5.11 and use it for a week, then I will post here whether this particular issue was fixed.

The error still persists with kernel 5.11

amdgpu_dm_atomic_commit_tail ERROR waiting for fences timed out
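To see how often this message shows up, the kernel log of the previous boot can be dumped with `journalctl -k -b -1` and filtered. The following is only a rough sketch of such a filter: it keeps lines that mention both `amdgpu` and `timed out`, which is an assumption about what the relevant entries look like. The helper name `amdgpu_timeouts` and the second sample line are made up for illustration.

```python
from typing import Iterable, List

# Both markers must appear in a line (case-insensitive) for it to count.
MARKERS = ("amdgpu", "timed out")


def amdgpu_timeouts(lines: Iterable[str]) -> List[str]:
    """Return the log lines that look like amdgpu timeout errors."""
    hits = []
    for line in lines:
        lowered = line.lower()
        if all(marker in lowered for marker in MARKERS):
            hits.append(line.rstrip("\n"))
    return hits


# Small demo: the first line is the error quoted above,
# the second is an unrelated, invented log line.
sample = [
    "amdgpu_dm_atomic_commit_tail ERROR waiting for fences timed out",
    "usb 7-3: new low-speed USB device",
]
for hit in amdgpu_timeouts(sample):
    print(hit)
```

Feeding the function the lines of a saved journal dump instead of `sample` would give a quick count of occurrences per boot, which may help when comparing kernels.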

The issue still remains after the latest updates, and also with kernel 5.4.

Unfortunately DrKonqi does not detect this error, which reinforces the idea that it’s related to the AMD driver.

I never had an AMD GPU driver issue while using Windows 7, so I will keep assuming that my hardware is OK. Maybe it will take more time for the community to detect the issue and fix it.

you can try testing