System frequently crashing after GPU drivers update

Which linux-firmware are you referring to?
How can I check it on my system, and how do I lock it?

I tried many different Linux commands to list system info, but linux-firmware did not show up in any of the results.

I second this. I recently purchased a refurbished Lenovo ThinkPad E595 with a Radeon RX Vega 10 (Picasso architecture) integrated GPU and experienced those random KWin reset moments on it. Since I downgraded linux-firmware on the laptop last week, I haven’t run into the graphics reset bug.

Meanwhile, my desktop with a Radeon RX 570 (Polaris) has never had that same graphics freeze and reset issue, so I have no incentive to lock linux-firmware on that system.

I downgraded that package to 20210511.7685cf4 according to this reply. So far I haven’t had the freeze-and-reset behavior come back since installing that version.

Steps from hans12:

Here are your steps:

  1. sudo pacman -S yay #(or install a similar AUR tool if not already installed)
  2. optional: yay -Syu #(will update your system)
  3. yay -S downgrade #(install downgrade)
  4. sudo downgrade --ala-only linux-firmware
  5. select an option from March (36) or earlier
  6. optional: [Y] put linux-firmware on the ignore list to prevent future updates
  7. optional: reboot to apply the changes immediately
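
Step 6 relies on pacman's IgnorePkg setting. To confirm afterwards that the package really is pinned, here is a minimal sketch. It assumes the default config path /etc/pacman.conf and does not follow Include directives. The function name is_ignored is made up for illustration.

```shell
# is_ignored PKG [CONF]: succeed if PKG is listed on an IgnorePkg line of CONF.
# CONF defaults to /etc/pacman.conf (assumed location; adjust if yours differs).
is_ignored() {
    pkg=$1
    conf=${2:-/etc/pacman.conf}
    # Keep only active (uncommented) IgnorePkg lines, drop the "IgnorePkg ="
    # part, split the space-separated package list into one name per line,
    # then look for an exact match.
    grep -E '^[[:space:]]*IgnorePkg[[:space:]]*=' "$conf" 2>/dev/null |
        sed 's/^[^=]*=//' |
        tr -s '[:space:]' '\n' |
        grep -Fxq -- "$pkg"
}

# Usage:
#   is_ignored linux-firmware && echo "linux-firmware is pinned"
```

Glob patterns in IgnorePkg are not expanded by this check. It only matches literal package names.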

I downgraded to linux-firmware-20210208

Sorry to ask more questions, but I would appreciate it if someone could help me understand this better. Thanks in advance.

In general, when I read a reference to firmware, I think of some kind of embedded software loaded onto specific hardware, such as the firmware for a GPU, a printer, or a router.
Given that the Linux kernel has the AMDGPU driver built in, and that this driver shares some functionality with Mesa, what is this linux-firmware package? What does it do? How does it relate to the kernel, the GPU driver, and Mesa?

In addition, the command below doesn’t show linux-firmware as installed. The only firmware packages it lists are:

alsa-firmware
manjaro-firmware

$ pacman -Qqe > pkglist.txt

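A version of linux-firmware should be installed. Note that pacman -Qqe lists only explicitly installed packages, so a package that was pulled in as a dependency will not appear in that list. Please try this (sudo is not needed for a query):

pacman -Q --info linux-firmware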

The package contains GPU-specific firmware files that are used by AMDGPU. The amdgpu kernel driver loads them onto the GPU when it initializes the device.
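
To see which of those files belong to a given chip, the package's file list can be filtered. A minimal sketch, assuming the usual "package path" output format of pacman -Ql. The function name amdgpu_blobs and the chip names passed to it are illustrative.

```shell
# amdgpu_blobs CHIP: read `pacman -Ql linux-firmware` output on stdin
# (lines of the form "linux-firmware /path/to/file") and print the paths
# of the AMDGPU firmware files whose name starts with CHIP_.
amdgpu_blobs() {
    awk -v chip="$1" '$2 ~ ("/amdgpu/" chip "_") { print $2 }'
}

# Usage on a pacman-based system:
#   pacman -Ql linux-firmware | amdgpu_blobs picasso
```

Matching on "CHIP_" keeps raven and raven2 apart, since their file names share a prefix.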

Interesting finding from the linux-firmware repo: on Aug. 11th there was a commit with the following message:

amdgpu: revert back to older picasso sdma firmware

Newer firmware seems to cause random stability issues that
are hard to reproduce reliably to root cause. Revert
back to the older SDMA firmware until the issue is fixed.

Maybe this is the problem we are seeing with the newer firmwares.

Edit: The Raven and Raven2 firmware files were reverted to older versions as well.

I’ve been using the combination below for a week without issues. I won’t celebrate yet, because in the past the problem came back right after I thought it was fixed, so I need to wait a little longer.

Kernel 5.14rc6
MESA 21.1.6
linux-firmware 20210719.r1990.168452e-1

It’s basically the latest Manjaro Stable, except for the kernel.

Well, others here (me included) had no luck with that firmware package in the long run.

Based on the latest commits by AMD, there seems to be an issue with the newer firmware that is not present in the older versions. This matches the observations reported in this forum.

If you still have stability issues, I would recommend switching back to linux-firmware version 20210315.3568f96 for now.

OP here! I went a month or so without freezes, but then they started happening daily or every other day, and I don’t know why. The commit that @B007C0DE found seems really interesting; I hope some fix comes with it. I’ll try downgrading to the firmware version mentioned.

Thanks for keeping this post alive, guys! Unfortunately, I still haven’t found the reliable solution I was hoping for when I first wrote this. Hopefully the old, pre-issue firmware version proves helpful here. I don’t really know; I’m just used to living with these bugs now lol.

Hello, I’m using the following combination without a problem for quite a while now.

Kernel: 5.13.11-1
MESA: 21.1.6-1
Linux-Firmware: 20210315.3568f96-2

I was too afraid to try a more recent firmware version, but maybe doing so would help narrow down the problem.

The linux-firmware package just got upgraded to 20210818-c46b8c3 in the testing branch, which hopefully includes the commits you mentioned. I’ll install that version on my laptop as soon as possible, and we’ll see whether the crashes return or stay away on my end. Fingers crossed.

In other news, it’s been a week since I downgraded linux-firmware to 20210511 on my laptop, and there have been no crash-and-reset episodes so far.

Edit: @philm has confirmed that the latest version of linux-firmware contains the aforementioned commits.

I dug a bit into the linux-firmware repo:

amdgpu/picasso_sdma.bin was authored on 2018-12-13 and committed on 2019-01-14.

It remained unchanged for more than two years until a new version was committed on 2021-03-22. After another update on 2021-06-29, it was finally reverted to the original version from 2018/2019 on 2021-08-11.

The March date correlates with the start of my stability problems. I know the difference between correlation and causation, but fingers crossed that this firmware is the reason for the issues.
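
The history above can be reproduced with git, and the dates can be turned into a rough check on a package version. A minimal sketch under two assumptions: blob_history is run inside a local clone of the upstream linux-firmware repository, and the leading YYYYMMDD of a package version is a usable stand-in for the snapshot date. All three function names are made up for illustration.

```shell
# blob_history FILE: print "author-date commit-date short-hash subject"
# for every commit that touched FILE, newest first.
blob_history() {
    git log --follow --date=short --format='%ad %cd %h %s' -- "$1"
}

# fw_snapshot_date VERSION: print the leading digits of a package version,
# e.g. "20210315.3568f96-2" gives "20210315".
fw_snapshot_date() {
    printf '%s\n' "${1%%[!0-9]*}"
}

# in_sdma_window VERSION: succeed if the snapshot date lies between the
# 2021-03-22 update and the 2021-08-11 revert quoted in the post above.
in_sdma_window() {
    d=$(fw_snapshot_date "$1")
    [ -n "$d" ] && [ "$d" -ge 20210322 ] && [ "$d" -lt 20210811 ]
}

# Usage:
#   blob_history amdgpu/picasso_sdma.bin
#   in_sdma_window 20210719.r1990.168452e-1 && echo "inside the window"
```

Treat the result as a hint only. By this check 20210511.7685cf4 falls inside the window, even though one poster above reports it as stable so far, and a snapshot date does not prove which commits a distribution actually packaged.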

Update: It’s been a little over a week since the latest linux-firmware update was pushed in testing, and so far my ThinkPad hasn’t had those crash-and-reset episodes. Now let’s see if it can survive an entire month without running into that bug.

I am on the latest stable update (linux-firmware 20210818.c46b8c3-1) and right now it looks good, with no issues so far. I hope it will last.

linux-firmware 20210818.c46b8c3-1: stable, but my Vega GPU (Raven Ridge) doesn’t downclock anymore. It stays at 1000-1000 MHz, with higher temperatures and louder fans. Reverting once again to the firmware from 15 March 2021 :c

On my notebook and my colleague’s, linux-firmware 20210818.c46b8c3-1 has run without any incidents on Raven Ridge so far.

However, my desktop at home with Polaris still suffers from GPU crashes. I wasn’t even able to get into a TTY and just turned it off. This was the second crash since the update and the first since I updated to Kernel 5.13.

Here is the journalctl output:

-- Journal begins at Thu 2021-02-04 08:38:19 CET, ends at Fri 2021-09-10 20:58:57 CEST. --
Sep 10 20:46:32 ManjaroGamingPC kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
Sep 10 20:46:32 ManjaroGamingPC kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: GPU fault detected: 146 0x0352a004 for process kwin_x11 pid 1178 thread kwin_x11:cs0 pid 1197
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0010366A
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x090A0004
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: VM fault (0x04, vmid 4, pasid 32771) at page 1062506, write from 'CB4' (0x43423400) (160)
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: GPU fault detected: 146 0x03122004 for process kwin_x11 pid 1178 thread kwin_x11:cs0 pid 1197
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x090A0004
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: VM fault (0x04, vmid 4, pasid 32771) at page 0, write from 'CB4' (0x43423400) (160)
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: GPU fault detected: 146 0x03529004 for process kwin_x11 pid 1178 thread kwin_x11:cs0 pid 1197
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00104061
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x090D0010
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: VM fault (0x10, vmid 4, pasid 32771) at page 1065057, write from 'CB7' (0x43423700) (208)
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: GPU fault detected: 146 0x03129004 for process kwin_x11 pid 1178 thread kwin_x11:cs0 pid 1197
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00103640
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x09020010
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: VM fault (0x10, vmid 4, pasid 32771) at page 1062464, write from 'CB2' (0x43423200) (32)
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: GPU fault detected: 146 0x03522004 for process kwin_x11 pid 1178 thread kwin_x11:cs0 pid 1197
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0010378C
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x09010010
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: VM fault (0x10, vmid 4, pasid 32771) at page 1062796, write from 'CB3' (0x43423300) (16)
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: GPU fault detected: 146 0x035a6004 for process kwin_x11 pid 1178 thread kwin_x11:cs0 pid 1197
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00104026
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x09020010
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: VM fault (0x10, vmid 4, pasid 32771) at page 1064998, write from 'CB2' (0x43423200) (32)
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: GPU fault detected: 146 0x031a6004 for process kwin_x11 pid 1178 thread kwin_x11:cs0 pid 1197
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00104051
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x090E0010
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: VM fault (0x10, vmid 4, pasid 32771) at page 1065041, write from 'CB6' (0x43423600) (224)
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: GPU fault detected: 146 0x031a5004 for process kwin_x11 pid 1178 thread kwin_x11:cs0 pid 1197
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00104183
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x090D0010
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: VM fault (0x10, vmid 4, pasid 32771) at page 1065347, write from 'CB7' (0x43423700) (208)
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: GPU fault detected: 146 0x031a2004 for process kwin_x11 pid 1178 thread kwin_x11:cs0 pid 1197
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00103605
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x090A0010
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: VM fault (0x10, vmid 4, pasid 32771) at page 1062405, write from 'CB4' (0x43423400) (160)
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: GPU fault detected: 146 0x035a2004 for process kwin_x11 pid 1178 thread kwin_x11:cs0 pid 1197
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00103728
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x090D0010
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: VM fault (0x10, vmid 4, pasid 32771) at page 1062696, write from 'CB7' (0x43423700) (208)
Sep 10 20:46:32 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: IH ring buffer overflow (0x0008E2E0, 0x00000F00, 0x0000E2F0)
Sep 10 20:46:42 ManjaroGamingPC kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
Sep 10 20:46:42 ManjaroGamingPC kernel: gmc_v8_0_process_interrupt: 158 callbacks suppressed
Sep 10 20:46:42 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: GPU fault detected: 146 0x0a20480c for process firefox pid 1903 thread firefox:cs0 pid 1959
Sep 10 20:46:42 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00102944
Sep 10 20:46:42 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0604800C
Sep 10 20:46:42 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: VM fault (0x0c, vmid 3, pasid 32775) at page 1059140, read from 'TC4' (0x54433400) (72)
Sep 10 20:46:42 ManjaroGamingPC kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
Sep 10 20:46:52 ManjaroGamingPC kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
Sep 10 20:46:52 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: GPU fault detected: 147 0x0ff28802 for process firefox pid 1903 thread firefox:cs0 pid 1959
Sep 10 20:46:52 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x000013FE
Sep 10 20:46:52 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x07088002
Sep 10 20:46:52 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: VM fault (0x02, vmid 3, pasid 32775) at page 5118, write from 'TC6' (0x54433600) (136)
Sep 10 20:46:52 ManjaroGamingPC kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
Sep 10 20:47:02 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: GPU fault detected: 147 0x08a24802 for process Xorg pid 989 thread Xorg:cs0 pid 1017
Sep 10 20:47:02 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00036714
Sep 10 20:47:02 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x03048002
Sep 10 20:47:02 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: VM fault (0x02, vmid 1, pasid 32769) at page 222996, write from 'TC4' (0x54433400) (72)
Sep 10 20:47:02 ManjaroGamingPC kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
Sep 10 20:47:12 ManjaroGamingPC kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=856042, emitted seq=856045
Sep 10 20:47:12 ManjaroGamingPC kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 989 thread Xorg:cs0 pid 1017
Sep 10 20:47:12 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: GPU reset begin!
Sep 10 20:47:16 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: failed to suspend display audio
Sep 10 20:47:17 ManjaroGamingPC kernel: amdgpu: cp is busy, skip halt cp
Sep 10 20:47:17 ManjaroGamingPC kernel: amdgpu: rlc is busy, skip halt rlc
Sep 10 20:47:17 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: BACO reset
Sep 10 20:47:17 ManjaroGamingPC kernel: amdgpu 0000:26:00.0: amdgpu: GPU reset succeeded, trying to resume
Sep 10 20:47:17 ManjaroGamingPC kernel: [drm] PCIE GART of 256M enabled (table at 0x000000F400300000).
Sep 10 20:47:17 ManjaroGamingPC kernel: [drm] VRAM is lost due to GPU reset!
Sep 10 20:47:19 ManjaroGamingPC kernel: [drm:uvd_v6_0_start [amdgpu]] *ERROR* UVD not responding, trying to reset the VCPU!!!
Sep 10 20:47:20 ManjaroGamingPC kernel: [drm:uvd_v6_0_start [amdgpu]] *ERROR* UVD not responding, trying to reset the VCPU!!!
Sep 10 20:47:21 ManjaroGamingPC kernel: [drm:uvd_v6_0_start [amdgpu]] *ERROR* UVD not responding, trying to reset the VCPU!!!
Sep 10 20:47:22 ManjaroGamingPC kernel: [drm:uvd_v6_0_start [amdgpu]] *ERROR* UVD not responding, trying to reset the VCPU!!!
Sep 10 20:47:23 ManjaroGamingPC kernel: [drm:uvd_v6_0_start [amdgpu]] *ERROR* UVD not responding, trying to reset the VCPU!!!
Sep 10 20:47:24 ManjaroGamingPC kernel: [drm:uvd_v6_0_start [amdgpu]] *ERROR* UVD not responding, trying to reset the VCPU!!!
Sep 10 20:47:25 ManjaroGamingPC kernel: [drm:uvd_v6_0_start [amdgpu]] *ERROR* UVD not responding, trying to reset the VCPU!!!
Sep 10 20:47:26 ManjaroGamingPC kernel: [drm:uvd_v6_0_start [amdgpu]] *ERROR* UVD not responding, trying to reset the VCPU!!!
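
With a log this long, it can help to count which processes triggered the GPU faults. A minimal sketch, assuming journal text in the format shown above arrives on standard input. The function name gpu_fault_summary is made up for illustration.

```shell
# gpu_fault_summary: read journal lines on stdin and print, for each process
# named in a "GPU fault detected ... for process NAME" line, the number of
# such lines, highest count first.
gpu_fault_summary() {
    awk '/GPU fault detected/ {
             # The process name is the field right after the word "process".
             for (i = 1; i < NF; i++)
                 if ($i == "process") { count[$(i + 1)]++; break }
         }
         END { for (p in count) print count[p], p }' | sort -rn
}

# Usage (kernel messages from the previous boot):
#   journalctl -k -b -1 --no-pager | gpu_fault_summary
```

The counts only cover lines that made it into the journal. The "callbacks suppressed" message in the log above shows that the kernel can drop some of them.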

I have a very similar issue, though my PC just freezes and the cursor is the only thing that remains responsive.

Kernel: 5.14
GPU: AMD R9 380

It happened 3 times today…

journalctl:

-- Journal begins at Sun 2021-08-22 21:13:27 CEST, ends at Fri 2021-09-10 21:12:09 CEST. --
Sep 10 20:36:48 username kwin_x11[900]: kwin_core: XCB error: 3 (BadWindow), sequence: 23819, resource id: 11152053, major code: 129 (SHAPE), minor code: 8 (GetRectangles)
Sep 10 20:36:53 username kwin_x11[900]: kwin_core: XCB error: 3 (BadWindow), sequence: 38180, resource id: 11152612, major code: 129 (SHAPE), minor code: 8 (GetRectangles)
Sep 10 20:37:11 username kwin_x11[900]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 56110, resource id: 11144036, major code: 3 (GetWindowAttributes), minor code: 0
Sep 10 20:37:11 username kwin_x11[900]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 56111, resource id: 11144036, major code: 14 (GetGeometry), minor code: 0
Sep 10 20:37:23 username kwin_x11[900]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 63981, resource id: 11152002, major code: 3 (GetWindowAttributes), minor code: 0
Sep 10 20:37:23 username kwin_x11[900]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 63982, resource id: 11152002, major code: 14 (GetGeometry), minor code: 0
Sep 10 20:37:23 username kwin_x11[900]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 64800, resource id: 11154247, major code: 3 (GetWindowAttributes), minor code: 0
Sep 10 20:37:23 username kwin_x11[900]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 64801, resource id: 11154247, major code: 14 (GetGeometry), minor code: 0
Sep 10 20:37:24 username kwin_x11[900]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 2384, resource id: 11154865, major code: 3 (GetWindowAttributes), minor code: 0
Sep 10 20:37:24 username kwin_x11[900]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 2385, resource id: 11154865, major code: 14 (GetGeometry), minor code: 0
Sep 10 20:37:36 username kwin_x11[900]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 10168, resource id: 11154972, major code: 3 (GetWindowAttributes), minor code: 0
Sep 10 20:37:36 username kwin_x11[900]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 10169, resource id: 11154972, major code: 14 (GetGeometry), minor code: 0
Sep 10 20:37:39 username kwin_x11[900]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 14113, resource id: 11155537, major code: 3 (GetWindowAttributes), minor code: 0
Sep 10 20:37:39 username kwin_x11[900]: qt.qpa.xcb: QXcbConnection: XCB error: 9 (BadDrawable), sequence: 14114, resource id: 11155537, major code: 14 (GetGeometry), minor code: 0
Sep 10 20:38:46 username kwin_x11[900]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 9635, resource id: 90177605, major code: 18 (ChangeProperty), minor code: 0
Sep 10 20:39:00 username kwin_x11[900]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 34080, resource id: 23070235, major code: 15 (QueryTree), minor code: 0
Sep 10 20:39:20 username kernel: [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
Sep 10 20:39:20 username kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:57:crtc-2] commit wait timed out
Sep 10 20:39:30 username kernel: [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
Sep 10 20:39:30 username kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [PLANE:46:plane-3] commit wait timed out
Sep 10 20:39:30 username kernel: ------------[ cut here ]------------
Sep 10 20:39:30 username kernel: WARNING: CPU: 1 PID: 810 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:8614 amdgpu_dm_atomic_commit_tail+0x25ac/0x2620 [amdgpu]
Sep 10 20:39:30 username kernel: Modules linked in: veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c br_netfilter bridge stp llc overlay squashfs loop joydev mousedev amdgpu snd_hda_codec_realtek rtl8192ce snd_hda_codec_generic gpu_sched rtl_pci i2c_algo_bit rtl8192c_common drm_ttm_helper ttm ledtrig_audio rtlwifi snd_hda_codec_hdmi snd_hda_intel drm_kms_helper mac80211 snd_intel_dspcfg cec agpgart syscopyarea cfg80211 sysfillrect snd_intel_sdw_acpi sysimgblt fb_sys_fops snd_hda_codec snd_hda_core rfkill libarc4 r8169 intel_spi_platform realtek mdio_devres intel_spi snd_hwdep libphy iTCO_wdt spi_nor snd_pcm snd_timer intel_pmc_bxt mtd mei_hdcp iTCO_vendor_support at24 snd intel_rapl_msr video soundcore i2c_i801 intel_rapl_common i2c_smbus mei_me mei mac_hid lpc_ich x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm intel_smartconnect pcspkr irqbypass crct10dif_pclmul crc32_pclmul
Sep 10 20:39:30 username kernel:  ghash_clmulni_intel aesni_intel crypto_simd cryptd rapl intel_cstate intel_uncore uinput sg crypto_user drm fuse ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 uas usb_storage usbhid crc32c_intel xhci_pci
Sep 10 20:39:30 username kernel: CPU: 1 PID: 810 Comm: Xorg Not tainted 5.14.0-0-MANJARO #1
Sep 10 20:39:30 username kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B85 Anniversary, BIOS P1.40 07/27/2015
Sep 10 20:39:30 username kernel: RIP: 0010:amdgpu_dm_atomic_commit_tail+0x25ac/0x2620 [amdgpu]
Sep 10 20:39:30 username kernel: Code: 00 00 e9 64 e5 ff ff 48 8b 8d 54 fd ff ff 31 c0 49 39 8c 24 f8 04 00 00 0f 85 02 fd ff ff e9 02 fd ff ff 0f 0b e9 45 f9 ff ff <0f> 0b e9 b6 f9 ff ff e8 88 a2 00 00 e9 9c f0 ff ff 0f 0b 0f 0b e9
Sep 10 20:39:30 username kernel: RSP: 0018:ffffaeecc0deb918 EFLAGS: 00010002
Sep 10 20:39:30 username kernel: RAX: 0000000000000002 RBX: 00000000000ee9fa RCX: ffff970bcaa29918
Sep 10 20:39:30 username kernel: RDX: 0000000000000001 RSI: 0000000000000297 RDI: ffff970bcaf00178
Sep 10 20:39:30 username kernel: RBP: ffffaeecc0debcb8 R08: 0000000000000005 R09: ffffaeecc0deb87c
Sep 10 20:39:30 username kernel: R10: ffffaeecc0deb880 R11: 0000000000000000 R12: 0000000000000286
Sep 10 20:39:30 username kernel: R13: ffff970bcaa29918 R14: ffff970ce1395400 R15: ffff970bcaa29800
Sep 10 20:39:30 username kernel: FS:  00007fca7ff7af40(0000) GS:ffff970ecec80000(0000) knlGS:0000000000000000
Sep 10 20:39:30 username kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 10 20:39:30 username kernel: CR2: 000035df097c8000 CR3: 000000010dbc4003 CR4: 00000000001706e0
Sep 10 20:39:30 username kernel: Call Trace:
Sep 10 20:39:30 username kernel:  commit_tail+0x94/0x120 [drm_kms_helper]
Sep 10 20:39:30 username kernel:  drm_atomic_helper_commit+0x113/0x140 [drm_kms_helper]
Sep 10 20:39:30 username kernel:  drm_mode_obj_set_property_ioctl+0x156/0x3d0 [drm]
Sep 10 20:39:30 username kernel:  ? drm_property_create_blob.part.0+0xda/0x120 [drm]
Sep 10 20:39:30 username kernel:  ? drm_mode_obj_find_prop_id+0x40/0x40 [drm]
Sep 10 20:39:30 username kernel:  drm_ioctl_kernel+0xb2/0x100 [drm]
Sep 10 20:39:30 username kernel:  drm_ioctl+0x22a/0x3d0 [drm]
Sep 10 20:39:30 username kernel:  ? drm_mode_obj_find_prop_id+0x40/0x40 [drm]
Sep 10 20:39:30 username kernel:  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
Sep 10 20:39:30 username kernel:  __x64_sys_ioctl+0x82/0xb0
Sep 10 20:39:30 username kernel:  do_syscall_64+0x3b/0x90
Sep 10 20:39:30 username kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
Sep 10 20:39:30 username kernel: RIP: 0033:0x7fca80a0159b
Sep 10 20:39:30 username kernel: Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a5 a8 0c 00 f7 d8 64 89 01 48
Sep 10 20:39:30 username kernel: RSP: 002b:00007ffe4cf42d88 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Sep 10 20:39:30 username kernel: RAX: ffffffffffffffda RBX: 00007ffe4cf42dc0 RCX: 00007fca80a0159b
Sep 10 20:39:30 username kernel: RDX: 00007ffe4cf42dc0 RSI: 00000000c01864ba RDI: 000000000000000d
Sep 10 20:39:30 username kernel: RBP: 00000000c01864ba R08: 0000000000000063 R09: 00000000cccccccc
Sep 10 20:39:30 username kernel: R10: 0000000000000fff R11: 0000000000000246 R12: 00005555793222b0
Sep 10 20:39:30 username kernel: R13: 000000000000000d R14: 0000000000000000 R15: 0000000000000003
Sep 10 20:39:30 username kernel: ---[ end trace 5f1937d5299f4684 ]---
Sep 10 20:39:30 username kernel: ------------[ cut here ]------------
Sep 10 20:39:30 username kernel: WARNING: CPU: 1 PID: 810 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:8201 amdgpu_dm_atomic_commit_tail+0x25bf/0x2620 [amdgpu]
Sep 10 20:39:30 username kernel: Modules linked in: veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c br_netfilter bridge stp llc overlay squashfs loop joydev mousedev amdgpu snd_hda_codec_realtek rtl8192ce snd_hda_codec_generic gpu_sched rtl_pci i2c_algo_bit rtl8192c_common drm_ttm_helper ttm ledtrig_audio rtlwifi snd_hda_codec_hdmi snd_hda_intel drm_kms_helper mac80211 snd_intel_dspcfg cec agpgart syscopyarea cfg80211 sysfillrect snd_intel_sdw_acpi sysimgblt fb_sys_fops snd_hda_codec snd_hda_core rfkill libarc4 r8169 intel_spi_platform realtek mdio_devres intel_spi snd_hwdep libphy iTCO_wdt spi_nor snd_pcm snd_timer intel_pmc_bxt mtd mei_hdcp iTCO_vendor_support at24 snd intel_rapl_msr video soundcore i2c_i801 intel_rapl_common i2c_smbus mei_me mei mac_hid lpc_ich x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm intel_smartconnect pcspkr irqbypass crct10dif_pclmul crc32_pclmul
Sep 10 20:39:30 username kernel:  ghash_clmulni_intel aesni_intel crypto_simd cryptd rapl intel_cstate intel_uncore uinput sg crypto_user drm fuse ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 uas usb_storage usbhid crc32c_intel xhci_pci
Sep 10 20:39:30 username kernel: CPU: 1 PID: 810 Comm: Xorg Tainted: G        W         5.14.0-0-MANJARO #1
Sep 10 20:39:30 username kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B85 Anniversary, BIOS P1.40 07/27/2015
Sep 10 20:39:30 username kernel: RIP: 0010:amdgpu_dm_atomic_commit_tail+0x25bf/0x2620 [amdgpu]
Sep 10 20:39:30 username kernel: Code: 24 f8 04 00 00 0f 85 02 fd ff ff e9 02 fd ff ff 0f 0b e9 45 f9 ff ff 0f 0b e9 b6 f9 ff ff e8 88 a2 00 00 e9 9c f0 ff ff 0f 0b <0f> 0b e9 c2 f9 ff ff 49 8b 06 41 0f b6 8e 2d 01 00 00 48 c7 c6 68
Sep 10 20:39:30 username kernel: RSP: 0018:ffffaeecc0deb918 EFLAGS: 00010086
Sep 10 20:39:30 username kernel: RAX: 0000000000000001 RBX: 00000000000ee9fa RCX: ffff970bcaa29918
Sep 10 20:39:30 username kernel: RDX: 0000000000000001 RSI: 0000000000000297 RDI: ffff970bcaf00178
Sep 10 20:39:30 username kernel: RBP: ffffaeecc0debcb8 R08: 0000000000000005 R09: ffffaeecc0deb87c
Sep 10 20:39:30 username kernel: R10: ffffaeecc0deb880 R11: 0000000000000000 R12: 0000000000000286
Sep 10 20:39:30 username kernel: R13: ffff970bcaa29918 R14: ffff970ce1395400 R15: ffff970bcaa29800
Sep 10 20:39:30 username kernel: FS:  00007fca7ff7af40(0000) GS:ffff970ecec80000(0000) knlGS:0000000000000000
Sep 10 20:39:30 username kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 10 20:39:30 username kernel: CR2: 000035df097c8000 CR3: 000000010dbc4003 CR4: 00000000001706e0
Sep 10 20:39:30 username kernel: Call Trace:
Sep 10 20:39:30 username kernel:  commit_tail+0x94/0x120 [drm_kms_helper]
Sep 10 20:39:30 username kernel:  drm_atomic_helper_commit+0x113/0x140 [drm_kms_helper]
Sep 10 20:39:30 username kernel:  drm_mode_obj_set_property_ioctl+0x156/0x3d0 [drm]
Sep 10 20:39:30 username kernel:  ? drm_property_create_blob.part.0+0xda/0x120 [drm]
Sep 10 20:39:30 username kernel:  ? drm_mode_obj_find_prop_id+0x40/0x40 [drm]
Sep 10 20:39:30 username kernel:  drm_ioctl_kernel+0xb2/0x100 [drm]
Sep 10 20:39:30 username kernel:  drm_ioctl+0x22a/0x3d0 [drm]
Sep 10 20:39:30 username kernel:  ? drm_mode_obj_find_prop_id+0x40/0x40 [drm]
Sep 10 20:39:30 username kernel:  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
Sep 10 20:39:30 username kernel:  __x64_sys_ioctl+0x82/0xb0
Sep 10 20:39:30 username kernel:  do_syscall_64+0x3b/0x90
Sep 10 20:39:30 username kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
Sep 10 20:39:30 username kernel: RIP: 0033:0x7fca80a0159b
Sep 10 20:39:30 username kernel: Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a5 a8 0c 00 f7 d8 64 89 01 48
Sep 10 20:39:30 username kernel: RSP: 002b:00007ffe4cf42d88 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Sep 10 20:39:30 username kernel: RAX: ffffffffffffffda RBX: 00007ffe4cf42dc0 RCX: 00007fca80a0159b
Sep 10 20:39:30 username kernel: RDX: 00007ffe4cf42dc0 RSI: 00000000c01864ba RDI: 000000000000000d
Sep 10 20:39:30 username kernel: RBP: 00000000c01864ba R08: 0000000000000063 R09: 00000000cccccccc
Sep 10 20:39:30 username kernel: R10: 0000000000000fff R11: 0000000000000246 R12: 00005555793222b0
Sep 10 20:39:30 username kernel: R13: 000000000000000d R14: 0000000000000000 R15: 0000000000000003
Sep 10 20:39:30 username kernel: ---[ end trace 5f1937d5299f4685 ]---
Sep 10 20:39:30 username kscreenlocker_greet[29021]: Qt: Session management error: networkIdsList argument is NULL
Sep 10 20:39:31 username kscreenlocker_greet[29021]: qt.virtualkeyboard.hunspell: Hunspell dictionary is missing for "en_US" . Search paths ("/usr/share/qt/qtvirtualkeyboard/hunspell", "/usr/share/hunspell", "/usr/share/myspell/dicts")
Sep 10 20:39:31 username kscreenlocker_greet[29021]: qt.virtualkeyboard.hunspell: Hunspell dictionary is missing for "en_US" . Search paths ("/usr/share/qt/qtvirtualkeyboard/hunspell", "/usr/share/hunspell", "/usr/share/myspell/dicts")
Sep 10 20:39:31 username kscreenlocker_greet[29021]: qt.virtualkeyboard.hunspell: Hunspell dictionary is missing for "en_US" . Search paths ("/usr/share/qt/qtvirtualkeyboard/hunspell", "/usr/share/hunspell", "/usr/share/myspell/dicts")
Sep 10 20:40:36 username kscreenlocker_greet[29021]: qrc:/QtQuick/VirtualKeyboard/content/components/Keyboard.qml:807:9: QML QQuickItem: Binding loop detected for property "width"
Sep 10 20:41:05 username kernel: kauditd_printk_skb: 5 callbacks suppressed
Sep 10 20:41:06 username upowerd[1053]: treating change event as add on /sys/devices/pci0000:00/0000:00:14.0/usb3/3-10/3-10.1
Sep 10 20:41:48 username upowerd[1053]: treating change event as add on /sys/devices/pci0000:00/0000:00:14.0/usb3/3-10/3-10.1
Sep 10 20:42:33 username org_kde_powerdevil[942]: org.kde.powerdevil: The profile  "AC" tried to activate "DimDisplay" a non-existent action. This is usually due to an installation problem, a configuration problem, or because the action is not supported
Sep 10 20:42:35 username upowerd[1053]: treating change event as add on /sys/devices/pci0000:00/0000:00:14.0/usb3/3-10/3-10.1
Sep 10 20:42:37 username upowerd[1053]: treating change event as add on /sys/devices/pci0000:00/0000:00:14.0/usb3/3-10/3-10.1
Sep 10 20:42:38 username upowerd[1053]: treating change event as add on /sys/devices/pci0000:00/0000:00:14.0/usb3/3-10/3-10.1
Sep 10 20:42:40 username upowerd[1053]: treating change event as add on /sys/devices/pci0000:00/0000:00:14.0/usb3/3-10/3-10.1
Sep 10 20:42:42 username upowerd[1053]: treating change event as add on /sys/devices/pci0000:00/0000:00:14.0/usb3/3-10/3-10.1
Sep 10 20:42:44 username upowerd[1053]: treating change event as add on /sys/devices/pci0000:00/0000:00:14.0/usb3/3-10/3-10.1
Sep 10 20:42:50 username upowerd[1053]: treating change event as add on /sys/devices/pci0000:00/0000:00:14.0/usb3/3-10
Sep 10 20:42:51 username upowerd[1053]: treating change event as add on /sys/devices/pci0000:00/0000:00:14.0/usb3/3-10/3-10.1
Sep 10 20:42:51 username upowerd[1053]: treating change event as add on /sys/devices/pci0000:00/0000:00:14.0/usb3/3-10/3-10.2
Sep 10 20:42:52 username upowerd[1053]: treating change event as add on /sys/devices/pci0000:00/0000:00:14.0/usb3/3-10/3-10.1
Sep 10 20:42:52 username upowerd[1053]: treating change event as add on /sys/devices/pci0000:00/0000:00:14.0/usb3/3-10/3-10.1
Sep 10 20:42:54 username upowerd[1053]: treating change event as add on /sys/devices/pci0000:00/0000:00:14.0/usb3/3-10/3-10.1
Sep 10 20:42:56 username upowerd[1053]: treating change event as add on /sys/devices/pci0000:00/0000:00:14.0/usb3/3-10/3-10.1
Sep 10 20:42:58 username upowerd[1053]: treating change event as add on /sys/devices/pci0000:00/0000:00:14.0/usb3/3-10/3-10.1
Sep 10 20:42:59 username upowerd[1053]: treating change event as add on /sys/devices/pci0000:00/0000:00:14.0/usb3/3-10/3-10.1
-- Boot 71eb6baf1cb84e48b64551962472e0d7 --

My logs contain some kind of stack trace, so I’m not sure whether this is the same issue.

I think it’s been fixed in the linux-firmware 20210818 update:
https://bugzilla.kernel.org/show_bug.cgi?id=213391

I have this version installed, and it’s been a while since it last died for me.

I have a good feeling that this will hold true for a while. I still haven’t had any of those crash-and-reset episodes since installing that version of linux-firmware.

Well, that’s the exact version I’m running right now, and the issue is still present.

$ pacman -Q | grep linux-firmware
linux-firmware 20210818.c46b8c3-1