System frequently crashing after GPU drivers update

Mine crashes no matter what: VM running or not, KDE Connect or not. Firefox and Thunderbird were the only two pieces of software that were consistently running. It would freeze graphically, showing me title bars but no content.

@HoneyBear52 I’m so sorry to hear that :frowning: I hope we can get some help or advice from the experts who wander around these forums, but I guess they just haven’t seen this.

In my case, I ran some more tests and came to a few conclusions:

  • This is not a BIOS firmware-GPU driver combination issue, since I updated both to the latest versions and still experienced freezes.
  • VLC Media Player, one of the applications where I experienced the most crashes, wasn’t the culprit. At first, I thought its latest version (3.0.12.3) was faulty or incompatible, but the crashes continued even after downgrading it. The problem must be the mesa driver, possibly combined with something else, since I only experienced VLC crashes on the latest version of mesa (and not only in VLC, but in other apps as well).
  • Using sudo downgrade mesa, which was my means of installing an older version of it, is highly unstable, since it literally only downgrades that one package and none of its dependencies. So yesterday I took a look at my Pacman update log and downgraded all the mesa-related packages (or at least all I could think of) to their previous versions, using sudo downgrade as well; namely mesa, lib32-mesa, lib32-libva-mesa-driver, lib32-mesa-vdpau, libva-mesa-driver, mesa-vdpau.
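
For anyone who would rather not scroll through the Pacman log by hand, here is a minimal Python sketch that lists the upgrade records whose package name contains "mesa", together with the previous version of each. It is not what was actually run above, just one way to do the same lookup. It assumes pacman’s default log location (/var/log/pacman.log) and its usual "upgraded" record layout; the script name and the keyword match are hypothetical choices for illustration.

```python
import re
import sys

# pacman writes upgrade records in this form:
#   [timestamp] [ALPM] upgraded <package> (<old version> -> <new version>)
UPGRADED = re.compile(r"^\[([^\]]+)\] \[ALPM\] upgraded (\S+) \((\S+) -> (\S+)\)")


def mesa_upgrades(lines, keyword="mesa"):
    """Return (timestamp, package, old, new) for upgrades whose name contains keyword."""
    found = []
    for line in lines:
        match = UPGRADED.match(line)
        if match and keyword in match.group(2):
            found.append(match.groups())
    return found


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name):
    #   python3 mesa_upgrades.py /var/log/pacman.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for stamp, pkg, old, new in mesa_upgrades(log):
            print(f"{stamp}  {pkg}: {old} -> {new}")
```

The "old" column is the version you would pick when downgrading each package. A plain substring match can also catch packages that merely have "mesa" in their name, so the list still needs a quick look before acting on it.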

So I’ll be running with all of those packages downgraded and the 5.4 kernel, non-fallback version. I’ll let you know if anything else comes up.

Last but not least, I’ve thought of trying this solution as well. It’s kinda risky, yeah, but I’d just give it a try if nothing else works, at least to restore my packages to what they were like when everything was working: https://ostechnix.com/downgrade-packages-specific-date-arch-linux/. Maybe using the mirror-date-edition option with the date of my previous yay -Syyu. We’ll get through this :muscle:

Downgrading mesa did not work for me. After the first reboot following the downgrade it ran for longer, but it eventually froze again.

Well, mine froze anyways earlier. I’m re-upgrading mesa just to see what happens.

I’ve found another post on this forum that describes a situation veeery similar to ours: https://forum.manjaro.org/t/xfce-and-all-apps-terminated-drm-amdgpu-job-timedout-amdgpu-vlc-gpu-reset Even though it’s on an XFCE-based system, I’d bet the culprit is the same as in our cases. I’ll keep an eye out for any updates there.

Today’s update: I experienced a crash very quickly when running on the non-fallback kernel, but then I rebooted and have been running for almost 11 hours with no crash at all :eyes: Still not trusting anything.

In case somebody finds this helpful, some people in the post “Issue with mesa after [Stable Update] 2021-04-09” (#2 by AlbertoSalviaNovella) claim that there’s an environment variable that, when set properly, solved many of their GPU issues (while playing games, tho). We may give it a try.

I also came across this (kwin keymap issue after upgrading to plasma 5.21 / Applications & Desktop Environments / Arch Linux Forums), which I applied, and then it failed. After a reboot it has stayed up. Keep in mind that mine also stayed up for a long time the first time I switched to fallback.

And just like that, it crashed again, with minimal programs running.

@HoneyBear52 Are you by chance running a KDE Plasmoid called “Thermal Monitor”? I ran it at first, but then found some comments claiming it caused several crash issues, so I replaced it with this other one. It improved my performance, yes, but I still had random crashes.

Yesterday I removed it from my taskbar widgets, and I have been running smoothly since (touching wood while writing this :sweat_smile:). Maybe it has something to do with the GPU driver issues, since it constantly reads the GPU’s temperature (if you explicitly ask for it, which I used to do)? If you’re using it, try disabling it; otherwise, ignore this.

Nope, I have stock whatever Manjaro gave me.

It froze again. After looking through a ton of different threads, I wanted to try restarting plasmashell, but killing it doesn’t do anything. So from a TTY, I killed /usr/bin/startplasma-x and it restarted with no problem. So that’s progress: at least I now know what to restart to fix it.

@HoneyBear52 yeah, I thought I had gotten past this, but it seems not. I literally didn’t run into any issues yesterday (booting from the normal 5.4 kernel, with downgraded mesa drivers and no thermal monitor plasmoid) and it ran like a charm through the whole day.

But just a few minutes ago, I was using Slack and the DE got dumb again: windows froze, and I could only move the pointer but not click anything at all. I didn’t need to reboot my notebook, but I got automatically logged out. I could log back in after a while, but had no chance to get into a TTY session.

Something worth noting about this last issue is that, at first, neither my keyboard (not an external one, the notebook’s own) nor the pointer would respond (I couldn’t toggle Caps Lock, switch to a TTY, or anything). Then, around 5 seconds later, they both started working again and I could log back in; but this is weeeird. Checking journalctl’s output, I noted the following errors:

  • A lot of page fault errors, starting with [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 838 thread Xorg:cs0 pid 910).
  • Then a [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out! error amidst the page faults.
  • A [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered error, which was the last of the GPU-related ones.
  • Some applications dumping core: Xorg, kglobalaccel5, klauncher, drkonqi, kglobalaccel5, kwin_x11.
  • Also, a couple of errors like This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem. Available platform plugins are: eglfs, linuxfb, minimal, minimalegl, offscreen, vnc, wayland-egl, wayland, wayland-xcomposite-egl, wayland-xcomposite-glx, xcb.
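
To get a quick count of the GPU errors listed above instead of scrolling through the journal, a rough Python sketch like the following might help. It only recognizes the three amdgpu messages quoted in this post (retry page fault, fences timed out, ring timeout) and ignores everything else. It assumes the kernel messages were first saved to a text file, for example with journalctl -k --no-pager > kernel.log; the file and script names are hypothetical.

```python
import re
import sys
from collections import Counter

# Patterns are based only on the messages quoted in this thread.
PAGE_FAULT = re.compile(r"retry page fault .*for process (\S+) pid (\d+)")
FENCE_TIMEOUT = re.compile(r"Waiting for fences timed out")
RING_TIMEOUT = re.compile(r"ring (\S+) timeout")


def summarize(lines):
    """Count amdgpu error types and the processes blamed for page faults."""
    counts = Counter()
    faulting = Counter()
    for line in lines:
        match = PAGE_FAULT.search(line)
        if match:
            counts["page_fault"] += 1
            faulting[match.group(1)] += 1
            continue
        if FENCE_TIMEOUT.search(line):
            counts["fence_timeout"] += 1
        elif RING_TIMEOUT.search(line):
            counts["ring_timeout"] += 1
    return counts, faulting


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical names): python3 amdgpu_summary.py kernel.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        counts, faulting = summarize(log)
    for kind, n in counts.most_common():
        print(f"{kind}: {n}")
    for proc, n in faulting.most_common():
        print(f"page faults attributed to {proc}: {n}")
```

The kernel rate-limits these messages (note the "callbacks suppressed" lines it prints), so the page fault count is a lower bound rather than an exact figure.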

Getting really confused here. I might try that environment variable I posted a few replies above. Will keep you updated!

I saw the same [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout when mine failed. The only useful error I see that we both have is This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem. Available platform plugins are: eglfs, linuxfb, minimal, minimalegl, offscreen, vnc, wayland-egl, wayland, wayland-xcomposite-egl, wayland-xcomposite-glx, xcb.

But what am I reinstalling? I see available options, but I don’t see what’s failing that needs to be reinstalled. Unless it means all the KDE ones/Xorg, as listed in the point above on yours.

Yep, I don’t know which of all those we should reinstall. I don’t even know which application is the one that failed to start because of the Qt plugin. Might that Qt plugin be the app itself?

Hi,
I have run into this exact same problem over the past two days. Here’s my config (the kernel was updated from 5.10.30-1-MANJARO an hour ago):

System:    Kernel: 5.11.14-1-MANJARO x86_64 bits: 64 compiler: gcc v: 10.2.0 
           parameters: BOOT_IMAGE=/boot/vmlinuz-5.11-x86_64 root=UUID=e01fea91-c7bf-49d3-a38b-60c2d9a35d7a rw quiet 
           resume=UUID=9e06eccd-8d79-4717-b02b-51f413526ed6 iommu=soft ivrs_ioapic[32]=00:14.0 ivrs_ioapic[33]=00:00.1 
           Desktop: KDE Plasma 5.21.4 tk: Qt 5.15.2 wm: kwin_x11 vt: 1 dm: SDDM Distro: Manjaro Linux base: Arch Linux 
Machine:   Type: Laptop System: LENOVO product: 20KUA001CD v: ThinkPad E485 serial: <filter> Chassis: type: 10 
           serial: <filter> 
           Mobo: LENOVO model: 20KUA001CD v: SDK0L77769 WIN serial: <filter> UEFI: LENOVO v: R0UET65W (1.45 ) date: 09/29/2018 
Battery:   ID-1: BAT0 charge: 42.1 Wh (99.5%) condition: 42.3/45.0 Wh (94.1%) volts: 12.2 min: 11.1 model: LGC 01AV445 
           type: Li-poly serial: <filter> status: Unknown cycles: 93 
Memory:    RAM: total: 7.4 GiB used: 2.31 GiB (31.2%) 
           RAM Report: permissions: Unable to run dmidecode. Root privileges required. 
CPU:       Info: Quad Core model: AMD Ryzen 5 2500U with Radeon Vega Mobile Gfx bits: 64 type: MT MCP arch: Zen 
           family: 17 (23) model-id: 11 (17) stepping: N/A microcode: 810100B cache: L2: 2 MiB bogomips: 31958 
           Speed: 1368 MHz min/max: 1600/2000 MHz boost: enabled Core speeds (MHz): 1: 1368 2: 1542 3: 1368 4: 1368 5: 1533 
           6: 1465 7: 1371 8: 1368 
           Flags: 3dnowprefetch abm adx aes aperfmperf apic arat avic avx avx2 bmi1 bmi2 bpext clflush clflushopt clzero cmov 
           cmp_legacy constant_tsc cpb cpuid cr8_legacy cx16 cx8 de decodeassists extapic extd_apicid f16c flushbyasid fma fpu 
           fsgsbase fxsr fxsr_opt ht hw_pstate ibpb irperf lahf_lm lbrv lm mca mce misalignsse mmx mmxext monitor movbe msr 
           mtrr mwaitx nonstop_tsc nopl npt nrip_save nx osvw overflow_recov pae pat pausefilter pclmulqdq pdpe1gb 
           perfctr_core perfctr_llc perfctr_nb pfthreshold pge pni popcnt pse pse36 rdrand rdseed rdtscp rep_good sep sev 
           sev_es sha_ni skinit smap smca sme smep ssbd sse sse2 sse4_1 sse4_2 sse4a ssse3 succor svm svm_lock syscall tce 
           topoext tsc tsc_scale v_vmsave_vmload vgif vmcb_clean vme vmmcall wdt xgetbv1 xsave xsavec xsaveerptr xsaveopt 
           xsaves 
           Vulnerabilities: Type: itlb_multihit status: Not affected 
           Type: l1tf status: Not affected 
           Type: mds status: Not affected 
           Type: meltdown status: Not affected 
           Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via prctl and seccomp 
           Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer sanitization 
           Type: spectre_v2 mitigation: Full AMD retpoline, IBPB: conditional, STIBP: disabled, RSB filling 
           Type: srbds status: Not affected 
           Type: tsx_async_abort status: Not affected 
Graphics:  Device-1: AMD Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series] vendor: Lenovo driver: amdgpu v: kernel 
           bus-ID: 05:00.0 chip-ID: 1002:15dd class-ID: 0300 
           Device-2: IMC Networks Integrated Camera type: USB driver: uvcvideo bus-ID: 3-2:3 chip-ID: 13d3:56a6 class-ID: 0e02 
           serial: <filter> 
           Display: x11 server: X.org 1.20.11 compositor: kwin_x11 driver: loaded: amdgpu,ati unloaded: modesetting 
           alternate: fbdev,vesa resolution: <missing: xdpyinfo> 
           OpenGL: renderer: AMD Radeon Vega 8 Graphics (RAVEN DRM 3.40.0 5.11.14-1-MANJARO LLVM 11.1.0) v: 4.6 Mesa 20.3.4 
           direct render: Yes 
Audio:     Device-1: Advanced Micro Devices [AMD/ATI] Raven/Raven2/Fenghuang HDMI/DP Audio vendor: Lenovo 
           driver: snd_hda_intel v: kernel bus-ID: 05:00.1 chip-ID: 1002:15de class-ID: 0403 
           Device-2: Advanced Micro Devices [AMD] Family 17h HD Audio vendor: Lenovo driver: snd_hda_intel v: kernel 
           bus-ID: 05:00.6 chip-ID: 1022:15e3 class-ID: 0403 
           Sound Server-1: ALSA v: k5.11.14-1-MANJARO running: yes 
           Sound Server-2: JACK v: 1.9.17 running: no 
           Sound Server-3: PulseAudio v: 14.2 running: yes 
           Sound Server-4: PipeWire v: 0.3.25 running: no 
Network:   Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet vendor: Lenovo driver: r8168 v: 8.048.03-NAPI 
           modules: r8169 port: 2000 bus-ID: 02:00.0 chip-ID: 10ec:8168 class-ID: 0200 
           IF: enp2s0 state: down mac: <filter> 
           Device-2: Qualcomm Atheros QCA9377 802.11ac Wireless Network Adapter vendor: Lenovo driver: ath10k_pci v: kernel 
           port: 2000 bus-ID: 04:00.0 chip-ID: 168c:0042 class-ID: 0280 
           IF: wlp4s0 state: up mac: <filter> 
           IP v4: <filter> type: dynamic noprefixroute scope: global broadcast: <filter> 
           IP v6: <filter> type: dynamic noprefixroute scope: global 
           IP v6: <filter> type: noprefixroute scope: link 
           IF-ID-1: enp5s0f3u3 state: unknown speed: N/A duplex: N/A mac: <filter> 
           IP v4: <filter> type: dynamic noprefixroute scope: global broadcast: <filter> 
           IP v6: <filter> type: noprefixroute scope: link 
           IF-ID-2: outline-tun0 state: down mac: N/A 
           IP v4: <filter> scope: global 
           WAN IP: <filter> 
Bluetooth: Device-1: Qualcomm Atheros type: USB driver: btusb v: 0.8 bus-ID: 3-1:2 chip-ID: 0cf3:e500 class-ID: e001 
           Report: This feature requires one of these tools: hciconfig/bt-adapter 
Logical:   Message: No LVM data was found. 
RAID:      Message: No RAID data was found. 
Drives:    Local Storage: total: 238.47 GiB used: 119.36 GiB (50.1%) 
           SMART Message: Unable to run smartctl. Root privileges required. 
           ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Toshiba model: N/A size: 238.47 GiB block-size: physical: 512 B 
           logical: 512 B speed: 31.6 Gb/s lanes: 4 rotation: SSD serial: <filter> rev: 5108AALA temp: 31.9 C scheme: GPT 
           Message: No Optical or Floppy data was found. 
Partition: ID-1: / raw-size: 229.37 GiB size: 224.77 GiB (97.99%) used: 119.36 GiB (53.1%) fs: ext4 dev: /dev/nvme0n1p2 
           maj-min: 259:2 label: N/A uuid: e01fea91-c7bf-49d3-a38b-60c2d9a35d7a 
           ID-2: /boot/efi raw-size: 300 MiB size: 299.4 MiB (99.80%) used: 264 KiB (0.1%) fs: vfat dev: /dev/nvme0n1p1 
           maj-min: 259:1 label: N/A uuid: 6B8C-6662 
Swap:      Kernel: swappiness: 60 (default) cache-pressure: 100 (default) 
           ID-1: swap-1 type: partition size: 8.8 GiB used: 0 KiB (0.0%) priority: -2 dev: /dev/nvme0n1p3 maj-min: 259:3 
           label: N/A uuid: 9e06eccd-8d79-4717-b02b-51f413526ed6 
Unmounted: Message: No Unmounted partitions found. 
USB:       Hub-1: 1-0:1 info: Full speed (or root) Hub ports: 4 rev: 2.0 speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900 
           Device-1: 1-3:4 info: Xiaomi Mi/Redmi series (RNDIS) type: CDC-Data driver: rndis_host interfaces: 2 rev: 2.0 
           speed: 480 Mb/s power: 500mA chip-ID: 2717:ff80 class-ID: 0a00 serial: <filter> 
           Device-2: 1-4:3 info: Lenovo Laser Wireless Mouse type: Mouse driver: hid-generic,usbhid interfaces: 1 rev: 2.0 
           speed: 12 Mb/s power: 100mA chip-ID: 17ef:6039 class-ID: 0301 
           Hub-2: 2-0:1 info: Full speed (or root) Hub ports: 4 rev: 3.1 speed: 10 Gb/s chip-ID: 1d6b:0003 class-ID: 0900 
           Hub-3: 3-0:1 info: Full speed (or root) Hub ports: 2 rev: 2.0 speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900 
           Device-1: 3-1:2 info: Qualcomm Atheros type: Bluetooth driver: btusb interfaces: 2 rev: 2.0 speed: 12 Mb/s 
           power: 100mA chip-ID: 0cf3:e500 class-ID: e001 
           Device-2: 3-2:3 info: IMC Networks Integrated Camera type: Video driver: uvcvideo interfaces: 2 rev: 2.0 
           speed: 480 Mb/s power: 500mA chip-ID: 13d3:56a6 class-ID: 0e02 serial: <filter> 
           Hub-4: 4-0:1 info: Full speed (or root) Hub ports: 1 rev: 3.1 speed: 10 Gb/s chip-ID: 1d6b:0003 class-ID: 0900 
Sensors:   System Temperatures: cpu: 57.6 C mobo: 0.0 C gpu: amdgpu temp: 57.0 C 
           Fan Speeds (RPM): cpu: 0 
Info:      Processes: 231 Uptime: 20m wakeups: 3 Init: systemd v: 247 tool: systemctl Compilers: gcc: 10.2.0 clang: 11.1.0 
           Packages: pacman: 1596 lib: 418 Shell: Zsh v: 5.8 running-in: yakuake inxi: 3.3.03 

It began yesterday with VLC hanging (Very Large Cone, more like Very Likely to Crash, amirite), followed by a frozen system in which I couldn’t even use Ctrl+Alt+F2 to switch to another TTY. Today, I experimented with the symptoms and their potential causes. journalctl -e dump (excerpt):

Apr 17 13:40:24 fakefred plasmashell[1122]: file:///usr/share/plasma/plasmoids/org.kde.plasma.notifications/contents/ui/CompactRepresentation.qml:145:13: QML PropertyChanges: Cannot assign to non-existent property "visible"
Apr 17 13:40:24 fakefred plasmashell[1122]: file:///usr/share/plasma/plasmoids/org.kde.plasma.notifications/contents/ui/NotificationPopup.qml:116:15: QML QQuickItem: Binding loop detected for property "height"
Apr 17 13:40:24 fakefred plasmashell[1122]: file:///usr/share/plasma/plasmoids/org.kde.plasma.notifications/contents/ui/NotificationPopup.qml:116:15: QML QQuickItem: Binding loop detected for property "height"
Apr 17 13:40:30 fakefred plasmashell[1122]: file:///usr/share/plasma/plasmoids/org.kde.plasma.notifications/contents/ui/CompactRepresentation.qml:145:13: QML PropertyChanges: Cannot assign to non-existent property "visible"
Apr 17 13:40:30 fakefred plasmashell[1122]: file:///usr/share/plasma/plasmoids/org.kde.plasma.notifications/contents/ui/NotificationPopup.qml:116:15: QML QQuickItem: Binding loop detected for property "height"
Apr 17 13:40:30 fakefred plasmashell[1122]: file:///usr/share/plasma/plasmoids/org.kde.plasma.notifications/contents/ui/NotificationPopup.qml:116:15: QML QQuickItem: Binding loop detected for property "height"
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105600000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105601000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105602000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105603000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105604000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105605000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105606000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105607000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105608000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105609000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: gmc_v9_0_process_interrupt: 47350 callbacks suppressed
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105600000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105605000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105601000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105607000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105604000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105602000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105606000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105603000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105600000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105605000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: gmc_v9_0_process_interrupt: 47218 callbacks suppressed
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105605000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105603000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105606000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105607000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105604000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105602000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105601000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105600000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105605000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 866 thread Xorg:cs0 pid 867)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105603000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MORE_FAULTS: 0x1
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          WALKER_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          MAPPING_ERROR: 0x0
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:          RW: 0x0
Apr 17 13:40:45 fakefred kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
Apr 17 13:40:49 fakefred kcminit[5627]: Initializing  "kcm_mouse" :  "kcminit_mouse"
Apr 17 13:40:49 fakefred kwin_x11[1048]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 39067, resource id: 4194309, major code: 18 (ChangeProperty), minor code: 0
Apr 17 13:40:49 fakefred kwin_x11[1048]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 39071, resource id: 39847493, major code: 19 (DeleteProperty), minor code: 0
Apr 17 13:40:49 fakefred kwin_x11[1048]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 39074, resource id: 39847493, major code: 19 (DeleteProperty), minor code: 0
Apr 17 13:40:49 fakefred kwin_x11[1048]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 39075, resource id: 39847493, major code: 18 (ChangeProperty), minor code: 0
Apr 17 13:40:49 fakefred kwin_x11[1048]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 39076, resource id: 39847493, major code: 19 (DeleteProperty), minor code: 0
Apr 17 13:40:49 fakefred kwin_x11[1048]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 39077, resource id: 39847493, major code: 19 (DeleteProperty), minor code: 0
Apr 17 13:40:49 fakefred kwin_x11[1048]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 39078, resource id: 39847493, major code: 19 (DeleteProperty), minor code: 0
Apr 17 13:40:49 fakefred kwin_x11[1048]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 39079, resource id: 39847493, major code: 7 (ReparentWindow), minor code: 0
Apr 17 13:40:49 fakefred kwin_x11[1048]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 39080, resource id: 39847493, major code: 6 (ChangeSaveSet), minor code: 0
Apr 17 13:40:49 fakefred kwin_x11[1048]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 39081, resource id: 39847493, major code: 2 (ChangeWindowAttributes), minor code: 0
Apr 17 13:40:49 fakefred kwin_x11[1048]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 39082, resource id: 39847493, major code: 10 (UnmapWindow), minor code: 0
Apr 17 13:40:51 fakefred kcminit[5644]: Initializing  "kcm_mouse" :  "kcminit_mouse"
Apr 17 13:40:51 fakefred kwin_x11[1048]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 39277, resource id: 4194309, major code: 18 (ChangeProperty), minor code: 0
Apr 17 13:40:51 fakefred kcminit[5651]: Initializing  "kcm_mouse" :  "kcminit_mouse"
Apr 17 13:40:51 fakefred kwin_x11[1048]: qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 39338, resource id: 4194309, major code: 18 (ChangeProperty), minor code: 0
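
For anyone trying to read the dump above: the indented lines under each VM_L2_PROTECTION_FAULT_STATUS look like bit fields unpacked from that one register value. Below is a rough sketch that unpacks 0x00101031 the same way and tallies which pages faulted. The bit positions are an assumption taken from my reading of the GMC v9 register headers, not something stated in this thread, and the function names are made up. The decoded values do match what the kernel printed (MORE_FAULTS 0x1, PERMISSION_FAULTS 0x3, client ID 0x8, RW 0x0).

```python
import re
from collections import Counter

# (shift, width) of each field inside VM_L2_PROTECTION_FAULT_STATUS.
# ASSUMPTION: layout taken from the GMC v9 register headers, not from the thread.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),  # printed by the kernel as "Faulty UTCL2 client ID"
    "RW": (18, 1),
    "VMID": (20, 4),
}


def decode_fault_status(status):
    """Split the raw register value into its named bit fields."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


PAGE_RE = re.compile(r"in page starting at address (0x[0-9a-fA-F]+) from client")


def count_fault_pages(journal_text):
    """Count how many fault reports mention each page address."""
    return Counter(PAGE_RE.findall(journal_text))


# Three lines copied from the journal excerpt above.
sample = """\
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105607000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105604000 from client 27
Apr 17 13:40:45 fakefred kernel: amdgpu 0000:05:00.0: amdgpu:   in page starting at address 0x0000800105607000 from client 27
"""

for name, value in decode_fault_status(0x00101031).items():
    print(f"{name}: {value:#x}")

for address, hits in count_fault_pages(sample).most_common():
    print(address, hits)
```

Run against the full excerpt, the tally shows the faults cycling through a handful of adjacent 4 KiB pages, all from the Xorg process.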

Measures I have taken:

  • removed all plasmoids on the desktop (did not work)
  • downgraded mesa and its dependencies (did not work on its own at least)
  • booted into initramfs (problem persisted. I’m no longer in initramfs)
  • booted with kernel param amdgpu.noretry=0 (once; param is now removed)
  • updated the kernel from 5.10.30-1 to 5.11.14-1
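
To double-check that a parameter such as amdgpu.noretry=0 really is gone (or present) on the running system, the kernel command line can be read from /proc/cmdline. A small sketch; the helper names and the example command line are made up for illustration.

```python
def amdgpu_params(cmdline):
    """Return every amdgpu.* option found on a kernel command line."""
    params = {}
    for token in cmdline.split():
        if token.startswith("amdgpu."):
            key, _, value = token.partition("=")
            params[key] = value
    return params


def running_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with (Linux only)."""
    with open(path) as handle:
        return handle.read()


# Hypothetical command line, only to show the parsing.
example = "BOOT_IMAGE=/boot/vmlinuz-hypothetical root=UUID=hypothetical rw quiet amdgpu.noretry=0"
print(amdgpu_params(example))
```

On a live system, amdgpu_params(running_cmdline()) returns an empty dict once the parameter has been removed and the machine rebooted.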

I updated the kernel because of this commit:
linux kernel commit 1b0b6e939f112949089e32ec89fd27796677263a

Apparently it fixed some size overflow in amdgpu. Does this help avoid page faults in the journal? I don’t really know, I’m not a kernel person.

Thank you all for your investigation, which really took a load off me. uptime says 44 min. I don’t want to claim victory prematurely, but so far running VLC hasn’t crashed anything.

Edit: a few more changes I noticed: there are no animations when switching between virtual desktops or invoking the dropdown terminal, and the shutdown dialog shows a black background where my wallpaper should be.

Edit: uptime is at 6.5 hrs now and still no sign of crashing. Been running OpenGL and multimedia stuff (incl. VLC) for hours. Will follow this thread to see what other solutions there are.

7 Likes

How did you upgrade the kernel? I’m not being offered another update, but I’m on stable. I’ve resigned myself to dropping to a tty and killing the /usr/bin process, because it works.

2 Likes

@fkfd Thanks for answering and posting your research! Your journal logs are reeeally similar to ours, so I guess we’re all having the same problem :frowning: .

That kernel update you mentioned looks really interesting! I’m no kernel expert either, but many of our GPU error logs seem related to page faults and memory overflows, which is what that commit sounds like it fixes (mostly on 64-bit machines I guess, since it casts unsigned 32-bit integers to 64-bit ones; hopefully that prevents a good number of overflows, if not all of them). I’ll keep my fingers crossed :crossed_fingers: hoping that it helps us. Still, I don’t want to be a pessimist, but this issue is really tricky, so I don’t know whether 6.5 hours or even a day is enough (the other day I had no issues at all, and the next day I got like 3 freezes in a row lol).

Could you tell me which version of the mesa drivers you are using with that kernel version? And which dependencies you downgraded, and to which version? That way we can try it ourselves.

I’ll keep y’all updated on any findings. Let’s keep in touch until we find that miraculous solution we’re looking for :muscle:
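
To make the 32-bit versus 64-bit point concrete, here is a generic illustration of how a size computed in 32-bit arithmetic can wrap around, and how widening first avoids it. This is not the code from that commit (which isn't quoted in this thread); the function names and the 4 KiB page size are assumptions for the example.

```python
PAGE_SHIFT = 12  # ASSUMPTION: 4 KiB pages
U32_MASK = 0xFFFFFFFF
U64_MASK = 0xFFFFFFFFFFFFFFFF


def size_bytes_u32(num_pages):
    """Shift performed in 32-bit arithmetic: bits above bit 31 are lost."""
    return (num_pages << PAGE_SHIFT) & U32_MASK


def size_bytes_u64(num_pages):
    """Value widened to 64 bits before the shift: the high bits survive."""
    return (num_pages << PAGE_SHIFT) & U64_MASK


pages = 0x100000  # 1,048,576 pages of 4 KiB each = 4 GiB
print(hex(size_bytes_u32(pages)))  # 0x0
print(hex(size_bytes_u64(pages)))  # 0x100000000
```

A 4 GiB size silently becoming 0 is the kind of wrap-around that could lead to a buffer being mapped smaller than intended. Whether that is what produced our page faults is still a guess.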

2 Likes

@fkfd Could you also tell me how you installed that kernel version? The latest 5.11 version I’m getting is 5.11.10-1, not the one you mentioned :confused:

1 Like

I downgraded mesa with the command I found in this thread:

pacman -U mesa-20.3.4-3-x86_64.pkg.tar.zst lib32-mesa-20.3.4-3-x86_64.pkg.tar.zst lib32-libva-mesa-driver-20.3.4-3-x86_64.pkg.tar.zst lib32-mesa-vdpau-20.3.4-3-x86_64.pkg.tar.zst libva-mesa-driver-20.3.4-3-x86_64.pkg.tar.zst mesa-vdpau-20.3.4-3-x86_64.pkg.tar.zst

So the versions are 20.3.4.
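
Since all six file names follow the same name-version-arch pattern, the command can be generated rather than typed out. A minimal sketch, assuming the package files already sit in the current directory; the helper name is made up, and nothing here runs pacman, it only builds the argument list.

```python
MESA_PACKAGES = [
    "mesa",
    "lib32-mesa",
    "lib32-libva-mesa-driver",
    "lib32-mesa-vdpau",
    "libva-mesa-driver",
    "mesa-vdpau",
]


def downgrade_command(version="20.3.4-3", arch="x86_64", ext=".pkg.tar.zst"):
    """Build the pacman -U argument list for one version of every mesa package."""
    files = [f"{name}-{version}-{arch}{ext}" for name in MESA_PACKAGES]
    return ["pacman", "-U", *files]


print(" ".join(downgrade_command()))
```

Passing all the files to a single pacman -U call keeps the whole set on the same version, instead of downgrading one package at a time.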

As for how I installed the kernel, it might be because I’m on the testing branch. The kernel was installed with the mhwd frontend.

2 Likes

That makes a lot of sense, I’m on the stable branch. Guess I’ll have to wait for an updated kernel version to be released to the stable branch :sweat_smile:.

Did you experience any crashes or freezes while running that kernel? Did you get a good, stable uptime?

1 Like