System frequently crashing after GPU drivers update

Same problem with an AMD CPU and GPU; the 5.12 rc kernel did not help.

Ohhh, I’m sorry to hear that. I’d have thought it was some disk issue, but if your scan showed no faulty devices then that may not be it. Are you running the experimental kernel?

Same problem here, also on an AMD 3400G. Since an update a few days ago the system has been unstable and will page fault and crash. Sometimes, if I hit Ctrl-Alt-F1, it resets to the login screen, but sometimes it’s just a total hard freeze.
journalctl shows a long list of errors in red:

Apr 27 21:27:04 desktop kernel: amdgpu 0000:09:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:4 pasid:32778, for process brave pid 25334 thread brave:cs0 pid 25361)
Apr 27 21:27:04 desktop kernel: amdgpu 0000:09:00.0: amdgpu:   in page starting at address 0x0000800000579000 from client 27
Apr 27 21:27:04 desktop kernel: amdgpu 0000:09:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00400C31
Apr 27 21:27:04 desktop kernel: amdgpu 0000:09:00.0: amdgpu:          Faulty UTCL2 client ID: CPG (0x6)
Apr 27 21:27:04 desktop kernel: amdgpu 0000:09:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 27 21:27:04 desktop kernel: amdgpu 0000:09:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 27 21:27:04 desktop kernel: amdgpu 0000:09:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 27 21:27:04 desktop kernel: amdgpu 0000:09:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 27 21:27:04 desktop kernel: amdgpu 0000:09:00.0: amdgpu:          RW: 0x0

etc
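For anyone who wants to track how often this happens and which programs trigger it, here is a minimal Python sketch that counts the "retry page fault" lines per process in a saved kernel log. It only assumes the line format shown in the excerpt above; the script name `fault_summary.py` and the file name `kernel.log` are made up for illustration.

```python
import re
import sys
from collections import Counter

# Matches the first line of each fault report, e.g.
# "... retry page fault (src_id:0 ... for process brave pid 25334 thread brave:cs0 pid 25361)"
FAULT_RE = re.compile(r"retry page fault \(.*?for process (.+?) pid (\d+) thread ")


def count_faults_by_process(lines):
    """Return a Counter mapping process name -> number of retry page faults."""
    counts = Counter()
    for line in lines:
        match = FAULT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Only runs when a log file path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as log_file:
        for process, total in count_faults_by_process(log_file).most_common():
            print(f"{process}: {total}")
```

One way to use it would be to save the kernel messages of the previous boot with `journalctl -k -b -1 > kernel.log` (this needs a persistent journal) and then run `python fault_summary.py kernel.log`.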

I am not a power user or anything and I don’t know how to downgrade stuff; I just update everything every couple of days.

This may help, maybe, not sure; my system has still been normal for almost 48 hours :innocent:

kernel-5.10 with AMD Ryzen 5 3550H with Radeon Vega Mobile Gfx (8)

Still crashing :broken_heart:

I will try Pop!_OS!

My mistake, really: my post is only somewhat related. I had been experiencing random system freezes since the last update and, as I have an AMD GPU, I thought your situation was perhaps connected. However, I saw no error in the logs connected to my graphics, only a disk error. Sometimes you fix one error and then another is revealed, but my system is now stable, with no serious issues visible in any logs. My kernel is the latest but not experimental. Thanks for taking the time to reply, and I hope you find a solution soon.

Ohh, sorry to hear that you got affected by this too. It seems there’s a big bunch of us suffering this already :cry:

Have you tried running the 5.12 experimental kernel? I wouldn’t call it a definitive solution, but it has improved the experience of many users here (mine as well), since it introduced a fix related to GPU memory overflows (to put it simply; a better and more detailed explanation can be found a few comments above, with a reference to the commit for that fix).

You can simply do so by opening the Kernel application (if you’re running KDE) or by running sudo pacman -S linux512 at your terminal. Both will install the latest kernel, and then you’ll have to reboot your machine to start it (GRUB should have it set as the default kernel after installing it). You can verify what kernel you’re running after logging in by typing uname -r in the terminal.
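If you would rather script the check than read the `uname -r` output by eye, something like the following Python sketch could do it. It assumes the release string starts with `major.minor`; the example string `5.12.0-1-MANJARO` in the docstring is only an illustration, not output taken from this thread.

```python
import platform
import re


def parse_kernel_release(release):
    """Extract (major, minor) from a release string such as '5.12.0-1-MANJARO'.

    Returns None when the string does not start with 'major.minor'.
    """
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_at_least(release, minimum=(5, 12)):
    """True if the release is at least the given (major, minor) version."""
    version = parse_kernel_release(release)
    return version is not None and version >= minimum


if __name__ == "__main__":
    running = platform.release()  # on Linux, the same string that `uname -r` prints
    verdict = "is" if is_at_least(running) else "is not"
    print(f"Running kernel {running} {verdict} 5.12 or newer")
```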

Nice! I hadn’t thought of or looked into those principles as a possible cause for this. I’ll give it a deeper read later and probably get in touch with that comment’s author to see where our findings overlap :thinking:

Quick update from my system:

I’m still using kernel 5.12-rc7 but added the following boot parameter:

rcu_nocbs=0-7

For a few days now I have had no GPU-related crashes or freezes (just one instance of a USB driver crash… but that’s another problem).

I’ll definitely try that! Did you add it to your GRUB setup or are you adding it manually before booting?

I’ve been experiencing fewer crashes than before with the newest kernel, but they still happen sometimes. I’ve just had a system freeze right now after some screen tearing effect I had never seen before (maybe it’s related to the GPU power issue that @happyxhw mentioned), but I was able to shut the system down cleanly by switching to a TTY and running shutdown now.

UPDATE: I hope that kernel parameter is really a solution, but per its documentation I’d bet it’s a different thing. Here it says that it’s for removing certain CPU threads from the candidates list for RCU callbacks (Read-Copy-Update); maybe it has some influence on GPU processes :thinking:

I’ve just read the announcement post of the latest stable update at [Stable Update] 2021-04-28 - Kernels, Wine, Ruby, JDK, KDE-Dev, Mesa 21.0.3, KDE Apps 21.04, Python, Haskell, Mate 1.24.2, Virtualbox, Thunderbird, and it claims that

Mesa got fixed for issues reported on AMD graphics cards

Hopefully this is the case. I’ll try running a system upgrade.

Awesome, I’ve upgraded too, and to the 5.12 kernel, so we’ll see if I have the problem again.

So installing the 5.12 kernel did help, but I still have some issues, which seem less frequent (that’s a feeling, not really fact-based). The error shifted from amdgpu to Wayland, but I’m not sure if it’s progress or an accident :wink:
I have a Ryzen 3500U and Vega 8 Graphics.

Apr 29 19:08:32 kernel: audit: type=1130 audit(1619716112.115:273): pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=systemd-coredump@0-8682-0 comm="systemd" exe="/usr/lib/systemd/>
Apr 29 19:08:33 systemd-coredump[8683]: Process 1532 (Xwayland) of user 1000 dumped core.
   
   Stack trace of thread 1532:
   #0  0x00007fcb1bb3fef5 raise (libc.so.6 + 0x3cef5)
   #1  0x00007fcb1bb29862 abort (libc.so.6 + 0x26862)
   #2  0x0000562cb5272fdb n/a (Xwayland + 0x15dfdb)
   #3  0x0000562cb527cc3d n/a (Xwayland + 0x167c3d)
   #4  0x0000562cb52714b5 n/a (Xwayland + 0x15c4b5)
   #5  0x00007fcb1bb3ff80 __restore_rt (libc.so.6 + 0x3cf80)
   #6  0x00007fcb1a25ef8d n/a (radeonsi_dri.so + 0x940f8d)
   #7  0x00007fcb1a25863d n/a (radeonsi_dri.so + 0x93a63d)
   #8  0x00007fcb1a19232d n/a (radeonsi_dri.so + 0x87432d)
   #9  0x00007fcb1a19475c n/a (radeonsi_dri.so + 0x87675c)
   #10 0x00007fcb1a2591dd n/a (radeonsi_dri.so + 0x93b1dd)
   #11 0x00007fcb1a22683c n/a (radeonsi_dri.so + 0x90883c)
   #12 0x00007fcb1a23dd09 n/a (radeonsi_dri.so + 0x91fd09)
   #13 0x00007fcb1a23df49 n/a (radeonsi_dri.so + 0x91ff49)
   #14 0x00007fcb1a50e48b n/a (radeonsi_dri.so + 0xbf048b)
   #15 0x00007fcb1a240717 n/a (radeonsi_dri.so + 0x922717)
   #16 0x00007fcb1a240da4 n/a (radeonsi_dri.so + 0x922da4)
   #17 0x00007fcb1a4f54ff n/a (radeonsi_dri.so + 0xbd74ff)
   #18 0x00007fcb1a4f624b n/a (radeonsi_dri.so + 0xbd824b)
   #19 0x00007fcb1a82a271 n/a (radeonsi_dri.so + 0xf0c271)
   #20 0x00007fcb1a507818 n/a (radeonsi_dri.so + 0xbe9818)
   #21 0x00007fcb19ae019a n/a (radeonsi_dri.so + 0x1c219a)
   #22 0x0000562cb515dc2a n/a (Xwayland + 0x48c2a)
   #23 0x0000562cb51b19f5 n/a (Xwayland + 0x9c9f5)
   #24 0x0000562cb5143cc4 n/a (Xwayland + 0x2ecc4)
   #25 0x00007fcb1bb2ab25 __libc_start_main (libc.so.6 + 0x27b25)
   #26 0x0000562cb514500e n/a (Xwayland + 0x3000e)
   
   Stack trace of thread 1790:
   #0  0x00007fcb1b9f19ba __futex_abstimed_wait_common64 (libpthread.so.0 + 0x159ba)
   #1  0x00007fcb1b9eb260 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xf260)
   #2  0x00007fcb19a709ac n/a (radeonsi_dri.so + 0x1529ac)
   #3  0x00007fcb19a6a5f8 n/a (radeonsi_dri.so + 0x14c5f8)
   #4  0x00007fcb1b9e5299 start_thread (libpthread.so.0 + 0x9299)
   #5  0x00007fcb1bc02053 __clone (libc.so.6 + 0xff053)
   
   Stack trace of thread 1788:
   #0  0x00007fcb1b9f19ba __futex_abstimed_wait_common64 (libpthread.so.0 + 0x159ba)
   #1  0x00007fcb1b9eb260 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xf260)
   #2  0x00007fcb19a709ac n/a (radeonsi_dri.so + 0x1529ac)
   #3  0x00007fcb19a6a5f8 n/a (radeonsi_dri.so + 0x14c5f8)
   #4  0x00007fcb1b9e5299 start_thread (libpthread.so.0 + 0x9299)
   #5  0x00007fcb1bc02053 __clone (libc.so.6 + 0xff053)
   
   Stack trace of thread 1781:
   #0  0x00007fcb1b9f19ba __futex_abstimed_wait_common64 (libpthread.so.0 + 0x159ba)
   #1  0x00007fcb1b9eb260 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xf260)
   #2  0x00007fcb19a709ac n/a (radeonsi_dri.so + 0x1529ac)
   #3  0x00007fcb19a6a5f8 n/a (radeonsi_dri.so + 0x14c5f8)
   #4  0x00007fcb1b9e5299 start_thread (libpthread.so.0 + 0x9299)
   #5  0x00007fcb1bc02053 __clone (libc.so.6 + 0xff053)
   
   Stack trace of thread 1782:
   #0  0x00007fcb1b9f19ba __futex_abstimed_wait_common64 (libpthread.so.0 + 0x159ba)
   #1  0x00007fcb1b9eb260 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xf260)
   #2  0x00007fcb19a709ac n/a (radeonsi_dri.so + 0x1529ac)
   #3  0x00007fcb19a6a5f8 n/a (radeonsi_dri.so + 0x14c5f8)
   #4  0x00007fcb1b9e5299 start_thread (libpthread.so.0 + 0x9299)
   #5  0x00007fcb1bc02053 __clone (libc.so.6 + 0xff053)
   
   Stack trace of thread 1783:
   #0  0x00007fcb1b9f19ba __futex_abstimed_wait_common64 (libpthread.so.0 + 0x159ba)
   #1  0x00007fcb1b9eb260 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xf260)
   #2  0x00007fcb19a709ac n/a (radeonsi_dri.so + 0x1529ac)
   #3  0x00007fcb19a6a5f8 n/a (radeonsi_dri.so + 0x14c5f8)
   #4  0x00007fcb1b9e5299 start_thread (libpthread.so.0 + 0x9299)
   #5  0x00007fcb1bc02053 __clone (libc.so.6 + 0xff053)
   
   Stack trace of thread 1784:
   #0  0x00007fcb1b9f19ba __futex_abstimed_wait_common64 (libpthread.so.0 + 0x159ba)
   #1  0x00007fcb1b9eb260 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xf260)
   #2  0x00007fcb19a709ac n/a (radeonsi_dri.so + 0x1529ac)
   #3  0x00007fcb19a6a5f8 n/a (radeonsi_dri.so + 0x14c5f8)
   #4  0x00007fcb1b9e5299 start_thread (libpthread.so.0 + 0x9299)
   #5  0x00007fcb1bc02053 __clone (libc.so.6 + 0xff053)
   
   Stack trace of thread 1791:
   #0  0x00007fcb1b9f19ba __futex_abstimed_wait_common64 (libpthread.so.0 + 0x159ba)
   #1  0x00007fcb1b9eb260 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xf260)
   #2  0x00007fcb19a709ac n/a (radeonsi_dri.so + 0x1529ac)
   #3  0x00007fcb19a6a5f8 n/a (radeonsi_dri.so + 0x14c5f8)
   #4  0x00007fcb1b9e5299 start_thread (libpthread.so.0 + 0x9299)
   #5  0x00007fcb1bc02053 __clone (libc.so.6 + 0xff053)
   
   Stack trace of thread 1785:
   #0  0x00007fcb1b9f19ba __futex_abstimed_wait_common64 (libpthread.so.0 + 0x159ba)
   #1  0x00007fcb1b9eb260 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xf260)
   #2  0x00007fcb19a709ac n/a (radeonsi_dri.so + 0x1529ac)
   #3  0x00007fcb19a6a5f8 n/a (radeonsi_dri.so + 0x14c5f8)
   #4  0x00007fcb1b9e5299 start_thread (libpthread.so.0 + 0x9299)
   #5  0x00007fcb1bc02053 __clone (libc.so.6 + 0xff053)
   
   Stack trace of thread 1794:
   #0  0x00007fcb1b9f19ba __futex_abstimed_wait_common64 (libpthread.so.0 + 0x159ba)
   #1  0x00007fcb1b9eb260 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xf260)
   #2  0x00007fcb19a709ac n/a (radeonsi_dri.so + 0x1529ac)
   #3  0x00007fcb19a6a5f8 n/a (radeonsi_dri.so + 0x14c5f8)
   #4  0x00007fcb1b9e5299 start_thread (libpthread.so.0 + 0x9299)
   #5  0x00007fcb1bc02053 __clone (libc.so.6 + 0xff053)
   
   Stack trace of thread 1792:
   #0  0x00007fcb1b9f19ba __futex_abstimed_wait_common64 (libpthread.so.0 + 0x159ba)
   #1  0x00007fcb1b9eb260 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xf260)
   #2  0x00007fcb19a709ac n/a (radeonsi_dri.so + 0x1529ac)
   #3  0x00007fcb19a6a5f8 n/a (radeonsi_dri.so + 0x14c5f8)
   #4  0x00007fcb1b9e5299 start_thread (libpthread.so.0 + 0x9299)
   #5  0x00007fcb1bc02053 __clone (libc.so.6 + 0xff053)
   
   Stack trace of thread 1793:
   #0  0x00007fcb1b9f19ba __futex_abstimed_wait_common64 (libpthread.so.0 + 0x159ba)
   #1  0x00007fcb1b9eb260 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xf260)
   #2  0x00007fcb19a709ac n/a (radeonsi_dri.so + 0x1529ac)
   #3  0x00007fcb19a6a5f8 n/a (radeonsi_dri.so + 0x14c5f8)
   #4  0x00007fcb1b9e5299 start_thread (libpthread.so.0 + 0x9299)
   #5  0x00007fcb1bc02053 __clone (libc.so.6 + 0xff053)
   
   Stack trace of thread 1796:
   #0  0x00007fcb1b9f19ba __futex_abstimed_wait_common64 (libpthread.so.0 + 0x159ba)
   #1  0x00007fcb1b9eb260 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xf260)
   #2  0x00007fcb19a709ac n/a (radeonsi_dri.so + 0x1529ac)
   #3  0x00007fcb19a6a5f8 n/a (radeonsi_dri.so + 0x14c5f8)
   #4  0x00007fcb1b9e5299 start_thread (libpthread.so.0 + 0x9299)
   #5  0x00007fcb1bc02053 __clone (libc.so.6 + 0xff053)
   
   Stack trace of thread 1805:
   #0  0x00007fcb1b9f19ba __futex_abstimed_wait_common64 (libpthread.so.0 + 0x159ba)
   #1  0x00007fcb1b9eb260 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xf260)
   #2  0x00007fcb19a709ac n/a (radeonsi_dri.so + 0x1529ac)
   #3  0x00007fcb19a6a5f8 n/a (radeonsi_dri.so + 0x14c5f8)
   #4  0x00007fcb1b9e5299 start_thread (libpthread.so.0 + 0x9299)
   #5  0x00007fcb1bc02053 __clone (libc.so.6 + 0xff053)
   
   Stack trace of thread 1789:
   #0  0x00007fcb1b9f19ba __futex_abstimed_wait_common64 (libpthread.so.0 + 0x159ba)
   #1  0x00007fcb1b9eb260 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xf260)
   #2  0x00007fcb19a709ac n/a (radeonsi_dri.so + 0x1529ac)
   #3  0x00007fcb19a6a5f8 n/a (radeonsi_dri.so + 0x14c5f8)
   #4  0x00007fcb1b9e5299 start_thread (libpthread.so.0 + 0x9299)
   #5  0x00007fcb1bc02053 __clone (libc.so.6 + 0xff053)
   
   Stack trace of thread 1786:
   #0  0x00007fcb1b9f19ba __futex_abstimed_wait_common64 (libpthread.so.0 + 0x159ba)
   #1  0x00007fcb1b9eb260 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xf260)
   #2  0x00007fcb19a709ac n/a (radeonsi_dri.so + 0x1529ac)
   #3  0x00007fcb19a6a5f8 n/a (radeonsi_dri.so + 0x14c5f8)
   #4  0x00007fcb1b9e5299 start_thread (libpthread.so.0 + 0x9299)
   #5  0x00007fcb1bc02053 __clone (libc.so.6 + 0xff053)
   
   Stack trace of thread 1787:
   #0  0x00007fcb1b9f19ba __futex_abstimed_wait_common64 (libpthread.so.0 + 0x159ba)
   #1  0x00007fcb1b9eb260 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xf260)
   #2  0x00007fcb19a709ac n/a (radeonsi_dri.so + 0x1529ac)
   #3  0x00007fcb19a6a5f8 n/a (radeonsi_dri.so + 0x14c5f8)
   #4  0x00007fcb1b9e5299 start_thread (libpthread.so.0 + 0x9299)
   #5  0x00007fcb1bc02053 __clone (libc.so.6 + 0xff053)

Apr 29 19:08:33 systemd[1]: systemd-coredump@0-8682-0.service: Succeeded.
Apr 29 19:08:33 systemd[1]: systemd-coredump@0-8682-0.service: Succeeded.
Apr 29 19:08:33 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=systemd-coredump@0-8682-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? te>
Apr 29 19:08:33 kernel: audit: type=1131 audit(1619716113.888:274): pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=systemd-coredump@0-8682-0 comm="systemd" exe="/usr/lib/systemd/>
Apr 29 19:08:33 Ashes of the Singularity Escalation.desktop[5896]: [0429/190833.968540:ERROR:x11_util.cc(109)] X IO error received (X server probably went away)
Apr 29 19:08:33 firefox.desktop[2824]: Gdk-Message: 19:08:33.968: /usr/lib/firefox/firefox: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Apr 29 19:08:33 chromium.desktop[8333]: [8333:8333:0429/190833.968464:ERROR:connection.cc(66)] X connection error received.
Apr 29 19:08:33 chromium.desktop[8333]: [8333:8355:0429/190833.968464:ERROR:connection.cc(66)] X connection error received.
Apr 29 19:08:33 firefox.desktop[2943]: Gdk-Message: 19:08:33.968: /usr/lib/firefox/firefox: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Apr 29 19:08:33 firefox.desktop[2732]: Gdk-Message: 19:08:33.970: /usr/lib/firefox/firefox: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Apr 29 19:08:33 firefox.desktop[2651]: Gdk-Message: 19:08:33.971: /usr/lib/firefox/firefox: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Apr 29 19:08:33 firefox[2576]: firefox: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Apr 29 19:08:33 firefox.desktop[2704]: Gdk-Message: 19:08:33.971: /usr/lib/firefox/firefox: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Apr 29 19:08:33 firefox.desktop[2712]: Gdk-Message: 19:08:33.970: /usr/lib/firefox/firefox: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Apr 29 19:08:33 firefox.desktop[2777]: Gdk-Message: 19:08:33.970: /usr/lib/firefox/firefox: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Apr 29 19:08:33 firefox.desktop[2886]: Gdk-Message: 19:08:33.968: /usr/lib/firefox/firefox: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Apr 29 19:08:34 audit: BPF prog-id=46 op=UNLOAD
Apr 29 19:08:34 kernel: audit: type=1334 audit(1619716114.065:275): prog-id=46 op=UNLOAD
Apr 29 19:08:34 kernel: audit: type=1334 audit(1619716114.065:276): prog-id=45 op=UNLOAD
Apr 29 19:08:34 audit: BPF prog-id=45 op=UNLOAD
Apr 29 19:08:34 systemd[1]: tmp-.mount_eiskalW8kzoR.mount: Succeeded.
Apr 29 19:08:34 systemd[1409]: tmp-.mount_eiskalW8kzoR.mount: Succeeded.
Apr 29 19:08:34 gsd-xsettings[1937]: gsd-xsettings: Fatal IO error 11 (Resource temporarily unavailable) on X server :1.
Apr 29 19:08:34 ibus-daemon[1934]: GChildWatchSource: Exit status of a child process was requested but ECHILD was received by waitpid(). See the documentation of g_child_watch_source_new() for possible>
Apr 29 19:08:34 systemd[1409]: app-gnome-chromium-7875.scope: Succeeded.
Apr 29 19:08:34 systemd[1409]: org.gnome.SettingsDaemon.XSettings.service: Main process exited, code=exited, status=1/FAILURE
Apr 29 19:08:34 systemd[1409]: org.gnome.SettingsDaemon.XSettings.service: Failed with result 'exit-code'.
Apr 29 19:08:34 systemd[1409]: app-gnome-com.nextcloud.desktopclient.nextcloud-1849.scope: Succeeded.
Apr 29 19:08:34 firefox.desktop[2802]: Exiting due to channel error.
Apr 29 19:08:34 firefox.desktop[2757]: Exiting due to channel error.
Apr 29 19:08:34 systemd[1409]: app-gnome-firefox-2576.scope: Succeeded.
Apr 29 19:08:35 gsd-color[1729]: could not find device: property match 'XRANDR_name'='HDMI-1' does not exist
Apr 29 19:08:35 gsd-color[1729]: could not find device: property match 'XRANDR_name'='HDMI-1' does not exist
Apr 29 19:08:36 gsd-color[1729]: could not find device: property match 'XRANDR_name'='HDMI-1' does not exist
Apr 29 19:08:36 gsd-color[1729]: could not find device: property match 'XRANDR_name'='HDMI-1' does not exist
Apr 29 19:08:37 gsd-color[1729]: could not find device: property match 'XRANDR_name'='HDMI-1' does not exist
Apr 29 19:08:38 gsd-color[1729]: could not find device: property match 'XRANDR_name'='HDMI-1' does not exist
Apr 29 19:08:45 wpa_supplicant[692]: nl80211: send_and_recv->nl_recvmsgs failed: -33
Apr 29 19:09:35 gsd-color[1729]: could not find device: property match 'XRANDR_name'='HDMI-1' does not exist
Apr 29 19:09:35 gsd-color[1729]: could not find device: property match 'XRANDR_name'='HDMI-1' does not exist
Apr 29 19:09:36 gsd-color[1729]: could not find device: property match 'XRANDR_name'='HDMI-1' does not exist
Apr 29 19:09:36 gsd-color[1729]: could not find device: property match 'XRANDR_name'='HDMI-1' does not exist
Apr 29 19:09:36 gsd-color[1729]: could not find device: property match 'XRANDR_name'='HDMI-1' does not exist
Apr 29 19:09:38 gsd-color[1729]: could not find device: property match 'XRANDR_name'='HDMI-1' does not exist
Apr 29 19:10:35 gsd-color[1729]: could not find device: property match 'XRANDR_name'='HDMI-1' does not exist
Apr 29 19:10:35 gsd-color[1729]: could not find device: property match 'XRANDR_name'='HDMI-1' does not exist
Apr 29 19:10:36 gsd-color[1729]: could not find device: property match 'XRANDR_name'='HDMI-1' does not exist
Apr 29 19:10:36 gsd-color[1729]: could not find device: property match 'XRANDR_name'='HDMI-1' does not exist
Apr 29 19:10:36 gsd-color[1729]: could not find device: property match 'XRANDR_name'='HDMI-1' does not exist
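A note on reading a trace like this: the top frames (raise, abort, Xwayland) look like the X server's own crash handler, and `__restore_rt` is the signal trampoline, so the frame directly below it is usually where the fault actually happened. In the dump above that frame is in radeonsi_dri.so, Mesa's driver, rather than in Xwayland itself. The Python sketch below automates that reading. The regular expression is an assumption based only on the frame format in this dump, and the "first frame below `__restore_rt`" rule is a heuristic, not a guaranteed diagnosis.

```python
import re

# Frame lines look like: "#6  0x00007fcb1a25ef8d n/a (radeonsi_dri.so + 0x940f8d)"
FRAME_RE = re.compile(r"#(\d+)\s+0x[0-9a-f]+\s+(\S+)\s+\((\S+) \+ 0x[0-9a-f]+\)")


def parse_frames(lines):
    """Return (index, symbol, module) tuples for the frames of the first thread."""
    frames = []
    for line in lines:
        match = FRAME_RE.search(line)
        if match:
            index = int(match.group(1))
            if index == 0 and frames:
                break  # frame #0 again: the next thread's trace starts here
            frames.append((index, match.group(2), match.group(3)))
    return frames


def faulting_module(frames):
    """Return the module of the frame right below the signal trampoline, if any."""
    for position, (_, symbol, _) in enumerate(frames):
        if symbol == "__restore_rt" and position + 1 < len(frames):
            return frames[position + 1][2]
    return None
```

Calling `faulting_module(parse_frames(text.splitlines()))` on the pasted dump text is one way to apply it.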

Do you also use X or are you (already) Wayland only?

/// On the main topic
I get the vibe that only integrated graphics are affected by this problem, aren’t they? (I’m using an AMD 3400G myself, so it got me too…)

Do you know what exactly this kernel parameter changes, e.g. how it interacts with the GPU side of the drivers?

On Pop!_OS now :joy:

Waiting for a fix, then I’ll come back.

This kernel parameter shifts the processing of Read-Copy-Update (RCU) callbacks from softirq context to kernel threads for the (logical) cores listed. In general it lowers the interrupt load for the specified CPU cores and might therefore have a positive effect on any timing-related issues.

I found it mentioned in many older threads dealing with page fault errors on older Ryzen CPUs with integrated graphics. As it sounded like a good idea anyway, I thought I could give it a try.
And, at least for me, it improved the situation; that’s why I thought it would be worth sharing.
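To check after a reboot that the parameter actually reached the kernel, and which logical CPUs it covers, a small Python sketch like this one can help. It reads /proc/cmdline and expands a list such as 0-7 into individual CPU numbers. It assumes the plain number and range syntax used in this thread; any other value the kernel may accept is not handled here.

```python
def expand_cpu_list(spec):
    """Expand a CPU list such as '0-7' or '0,2-3' into a sorted list of CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def offloaded_cpus(cmdline):
    """Return the CPUs named by rcu_nocbs= in a kernel command line, or None if absent."""
    for token in cmdline.split():
        if token.startswith("rcu_nocbs="):
            return expand_cpu_list(token.split("=", 1)[1])
    return None


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as cmdline_file:
            print(offloaded_cpus(cmdline_file.read()))
    except (OSError, ValueError) as error:
        # ValueError covers values this sketch does not understand
        print(f"could not inspect the kernel command line: {error}")
```

If it prints None, the parameter was not on the command line the kernel booted with, for example because the bootloader configuration was not regenerated.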

I use Xorg, but another user I’ve been talking to about this same issue told me that switching from Wayland to Xorg didn’t definitively solve the problem; it only reduced the crash frequency.

Yep, that makes some sense since our GPUs are integrated into our CPUs; maybe they’re affected as a side effect of this parameter. I can neither confirm nor deny it :sweat_smile:

Neither could I :innocent:
Fingers crossed that this parameter together with the new Mesa version solves the problem.

I’ve still got the newest 5.12 Kernel up and running.

But sadly, today, for the first time since I switched kernel versions 11 days ago, the same amdgpu error occurred again:

[31060.588655] amdgpu 0000:06:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 773 thread Xorg:cs0 pid 787)
[31060.588671] amdgpu 0000:06:00.0: amdgpu:   in page starting at address 0x80011e600000 from client 27
[31060.588682] amdgpu 0000:06:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
[31060.588687] amdgpu 0000:06:00.0: amdgpu: 	 Faulty UTCL2 client ID: TCP (0x8)
[31060.588691] amdgpu 0000:06:00.0: amdgpu: 	 MORE_FAULTS: 0x1
[31060.588695] amdgpu 0000:06:00.0: amdgpu: 	 WALKER_ERROR: 0x0
[31060.588698] amdgpu 0000:06:00.0: amdgpu: 	 PERMISSION_FAULTS: 0x3
[31060.588702] amdgpu 0000:06:00.0: amdgpu: 	 MAPPING_ERROR: 0x0
[31060.588705] amdgpu 0000:06:00.0: amdgpu: 	 RW: 0x0
[31060.588713] amdgpu 0000:06:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 773 thread Xorg:cs0 pid 787)
[31060.588720] amdgpu 0000:06:00.0: amdgpu:   in page starting at address 0x80011e601000 from client 27

So as many others have noted, this problem may occur less frequently by updating the kernel, but it’s not yet fixed.
