System Freezes randomly

If you’re still wondering how to fix this: I got the same error. Using the experimental 5.12 kernel fixed it for me. I’ve described it here (add dots to the URL, since the Manjaro forum doesn’t let me post links for whatever reason):

forum manjaro org/t/system-frequently-crashing-after-gpu-drivers-update/62139/51

From post above.

Will give that a go. I’m getting freezes as well.

Didn’t stop my system from freezing. I’m thinking it’s something to do with nvidia.

On mine, it freezes, then after a few seconds the mouse starts moving again, but it will not ‘click’ on anything, left or right.

same problem after last big update

  1. CPU: AMD Ryzen 5 3550H
  2. GPU: Radeon Vega Mobile Gfx (8)
  3. DE: gnome3
  4. Kernel: 5.10
  5. Driver: video-linux

Kernel bug or video-linux bug?

journal log:

4月 26 14:51:26 happyxhw kernel: amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:4 pasid:32774, for process gnome-shell pid 39473 thread gnome-shel:cs0 pid 39512)
4月 26 14:51:26 happyxhw kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800100a00000 from client 27
4月 26 14:51:26 happyxhw kernel: amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00441051
4月 26 14:51:26 happyxhw kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
4月 26 14:51:26 happyxhw kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
4月 26 14:51:26 happyxhw kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
4月 26 14:51:26 happyxhw kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x5
4月 26 14:51:26 happyxhw kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
4月 26 14:51:26 happyxhw kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x1
4月 26 14:51:26 happyxhw kernel: amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:4 pasid:32774, for process gnome-shell pid 39473 thread gnome-shel:cs0 pid 39512)
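
To check whether a machine is hitting the same fault, here is a minimal sketch that counts the `retry page fault` lines per process. The helper name `faults_by_process` is made up for illustration, and it assumes the kernel log uses the same line format as the excerpt above.

```shell
# Count amdgpu "retry page fault" lines per faulting process.
# Reads kernel log text on stdin; faults_by_process is a hypothetical helper name.
faults_by_process() {
    grep 'retry page fault' \
        | sed -n 's/.*for process \([^ ]*\) pid.*/\1/p' \
        | sort | uniq -c
}

# Usage (kernel messages of the previous boot, e.g. after a forced reboot):
#   journalctl -k -b -1 | faults_by_process
```

Reading the previous boot with `-b -1` matters here because a hard freeze usually ends with a reset, so the interesting lines are not in the current boot's log.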

I’ll try kernel 5.12rc and hope for luck.

Same with kernel 5.12rc.

I am frustrated!

I need help!

I never want to update Manjaro again!

Are you sure that you’re using kernel 5.12? Because I am, and it stopped the freezing issues. You have to install it through this GUI and then choose it on GRUB (Advanced settings for Manjaro → use kernel 5.12)
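
A quick way to confirm which kernel is actually running is `uname -r`. The sketch below reduces its output to the major.minor series; `kernel_series` is a hypothetical helper name. The commented `mhwd-kernel` lines are, as far as I know, the command-line counterpart to the GUI on Manjaro, but treat them as an assumption rather than the method the poster used.

```shell
# Print the major.minor series of a kernel release string.
# With no argument, the running kernel (uname -r) is used.
kernel_series() {
    release="${1:-$(uname -r)}"
    printf '%s\n' "$release" | cut -d. -f1,2
}

kernel_series    # prints the running kernel's series

# Assumed Manjaro CLI equivalents of the GUI steps (not run here):
#   mhwd-kernel -li                  # list installed kernels
#   sudo mhwd-kernel -i linux512     # install the 5.12 kernel
```

If this still shows the old series after installing, the new kernel was most likely not selected in GRUB at boot.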

Seems it did work. After installing the newer kernel and booting, the laptop froze after a few minutes. That’s why I thought it was a driver issue. Since rebooting again, it has been running perfectly. :slight_smile: :+1:

Update from my system:

I have been running kernel 5.12 for some time now, with the latest mesa version installed.
I had another round of the retry page fault error yesterday.

So, for me the problem occurs less frequently but is still there.

Although I might be talking to myself, here is another update:

My system has now been running stable for 9 days without any freeze.

The changes I made:

I removed the following boot parameters:

  • amd_iommu=on
  • iommu=pt

I had set them because of severe problems with horizontal glitches and flickering lines more than six months ago.
Those problems went away after setting the boot parameters, so I left them in until now. Since the problems did not reappear after removing them, the underlying problem seems to be solved.

I added the following boot parameter:

  • amdgpu.noretry=0

Source: https://www.phoronix.com/scan.php?page=news_item&px=AMDGPU-APU-noretry
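
To verify which of the parameters mentioned above are active on the running system, a minimal sketch follows. `has_boot_param` is a hypothetical helper name. The note about `/etc/default/grub` and `update-grub` assumes a standard GRUB-based Manjaro install, which this post does not confirm.

```shell
# Return success if a parameter appears on a kernel command line.
# $1 = parameter, $2 = command line (defaults to the running kernel's).
has_boot_param() {
    cmdline="${2:-$(cat /proc/cmdline 2>/dev/null)}"
    case " $cmdline " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

for p in amd_iommu=on iommu=pt amdgpu.noretry=0; do
    if has_boot_param "$p"; then
        echo "$p: set"
    else
        echo "$p: not set"
    fi
done

# Assumed place to change them on a GRUB-based Manjaro install (not run here):
#   edit GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, then: sudo update-grub
```

Matching on whole space-separated words keeps `iommu=pt` from being confused with a longer parameter such as `amd_iommu=pt`.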

In addition, I reset the BIOS to default settings and then explicitly disabled IOMMU (switched from AUTO to DISABLED).

Not sure if the BIOS reset is required.

Fingers crossed that the system remains stable.

Same problem in Manjaro GNOME and KDE with the 5.10 kernel. The 5.12 kernel fixed the problem. I’m using a Ryzen 5 3400G.

The same problem.

CPU: AMD R5 3400G
Kernel: 5.4 & 5.12

log:

6月 03 14:47:27 r-lc kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
6月 03 14:47:26 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          RW: 0x0
6月 03 14:47:26 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          MAPPING_ERROR: 0x0
6月 03 14:47:26 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
6月 03 14:47:26 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          WALKER_ERROR: 0x0
6月 03 14:47:26 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          MORE_FAULTS: 0x1
6月 03 14:47:26 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
6月 03 14:47:26 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00501031
6月 03 14:47:26 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:   in page starting at address 0x800126199000 from client 27
6月 03 14:47:26 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:5 pasid:32772, for process chrome pid 15031 thread chrome:cs0 pid 15082)
6月 04 11:44:02 r-lc kernel: [drm] Skip scheduling IBs!
6月 04 11:44:02 r-lc kernel: [drm] Skip scheduling IBs!
6月 04 11:44:02 r-lc kernel: [drm] Skip scheduling IBs!
6月 04 11:44:02 r-lc kernel: [drm] Skip scheduling IBs!
6月 04 11:44:02 r-lc kernel: [drm] Skip scheduling IBs!
6月 04 11:44:02 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset(4) succeeded!
6月 04 11:44:02 r-lc kernel: [drm] Skip scheduling IBs!
6月 04 11:44:02 r-lc kernel: [drm] Skip scheduling IBs!
6月 04 11:44:02 r-lc kernel: [drm] Skip scheduling IBs!
6月 04 11:44:02 r-lc kernel: [drm] Skip scheduling IBs!
6月 04 11:44:02 r-lc kernel: [drm] Skip scheduling IBs!
6月 04 11:44:02 r-lc kernel: [drm] Skip scheduling IBs!
6月 04 11:44:02 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: recover vram bo from shadow done
6月 04 11:44:02 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: recover vram bo from shadow start
6月 04 11:44:02 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: ring jpeg_dec uses VM inv eng 6 on hub 1
6月 04 11:44:02 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: ring vcn_enc1 uses VM inv eng 5 on hub 1
6月 04 11:44:02 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: ring vcn_enc0 uses VM inv eng 4 on hub 1
6月 04 11:44:02 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: ring vcn_dec uses VM inv eng 1 on hub 1
6月 04 11:44:02 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: ring sdma0 uses VM inv eng 0 on hub 1
6月 04 11:44:02 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
6月 04 11:44:02 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
6月 04 11:44:02 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
6月 04 11:44:02 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
6月 04 11:44:02 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
6月 04 11:44:02 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
6月 04 11:44:02 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
6月 04 11:44:02 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
6月 04 11:44:02 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
6月 04 11:44:02 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: ring gfx uses VM inv eng 0 on hub 0
6月 04 11:44:02 r-lc kernel: [drm] VCN decode and encode initialized successfully(under SPG Mode).
6月 04 11:44:02 r-lc kernel: [drm] kiq ring mec 2 pipe 1 q 0
6月 04 11:44:02 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: RAP: optional rap ta ucode is not available
6月 04 11:44:02 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: RAS: optional ras ta ucode is not available
6月 04 11:44:01 r-lc kernel: [drm] reserve 0x400000 from 0xf47fc00000 for PSP TMR
6月 04 11:44:01 r-lc kernel: [drm] PSP is resuming...
6月 04 11:44:01 r-lc kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000F400900000).
6月 04 11:44:01 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset succeeded, trying to resume
6月 04 11:44:01 r-lc kernel: [Hardware Error]: cache level: L3/GEN, mem/io: IO, mem-tx: IRD, part-proc: SRC (no timeout)
6月 04 11:44:01 r-lc kernel: [Hardware Error]: Coherent Slave Ext. Error Code: 1, Address Violation.
6月 04 11:44:01 r-lc kernel: [Hardware Error]: IPID: 0x0000002e00000000, Syndrome: 0x000000005b240203
6月 04 11:44:01 r-lc kernel: [Hardware Error]: Error Addr: 0x00007ffcffffff00
6月 04 11:44:01 r-lc kernel: [Hardware Error]: CPU:0 (17:18:1) MC20_STATUS[-|-|MiscV|AddrV|-|-|SyndV|UECC|Deferred|-|-]: 0x9c2030000001085b
6月 04 11:44:01 r-lc kernel: [Hardware Error]: Deferred error, no action required.
6月 04 11:44:01 r-lc kernel: mce: [Hardware Error]: Machine check events logged
6月 04 11:44:01 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: MODE2 reset
6月 04 11:44:01 r-lc kernel: [drm] free PSP TMR buffer
6月 04 11:44:01 r-lc kernel: [drm] psp command (0x2) failed and response status is (0x117)
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x11b80c280 flags=0x0070]
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x11b80c260 flags=0x0070]
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x11b80c240 flags=0x0070]
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x11b80c220 flags=0x0070]
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x11b80c200 flags=0x0070]
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x11b80c1e0 flags=0x0070]
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x11b80c1c0 flags=0x0070]
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x11b80c1a0 flags=0x0070]
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x11b80c180 flags=0x0070]
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x11b80c160 flags=0x0070]
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset begin!
6月 04 11:44:00 r-lc kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 754 thread Xorg:cs0 pid 782
6月 04 11:44:00 r-lc kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=2203937, emitted seq=2203939
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          RW: 0x1
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          MAPPING_ERROR: 0x0
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          PERMISSION_FAULTS: 0x7
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          WALKER_ERROR: 0x0
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          MORE_FAULTS: 0x1
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          Faulty UTCL2 client ID: CB (0x0)
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x001C0071
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:   in page starting at address 0x000080010360a000 from client 27
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 754 thread Xorg:cs0 pid 782)
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          RW: 0x1
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          MAPPING_ERROR: 0x0
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          PERMISSION_FAULTS: 0x7
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          WALKER_ERROR: 0x0
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          MORE_FAULTS: 0x1
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          Faulty UTCL2 client ID: CB (0x0)
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x001C0071
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:   in page starting at address 0x0000800103608000 from client 27
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 754 thread Xorg:cs0 pid 782)
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          RW: 0x1
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          MAPPING_ERROR: 0x0
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          PERMISSION_FAULTS: 0x7
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          WALKER_ERROR: 0x0
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          MORE_FAULTS: 0x1
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          Faulty UTCL2 client ID: CB (0x0)
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x001C0071
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:   in page starting at address 0x0000800103600000 from client 27
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 754 thread Xorg:cs0 pid 782)
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          RW: 0x1
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          MAPPING_ERROR: 0x0
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          PERMISSION_FAULTS: 0x7
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          WALKER_ERROR: 0x0
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          MORE_FAULTS: 0x1
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          Faulty UTCL2 client ID: CB (0x0)
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x001C0071
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:   in page starting at address 0x000080010361a000 from client 27
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 754 thread Xorg:cs0 pid 782)
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          RW: 0x1
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          MAPPING_ERROR: 0x0
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          PERMISSION_FAULTS: 0x7
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          WALKER_ERROR: 0x0
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          MORE_FAULTS: 0x1
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          Faulty UTCL2 client ID: CB (0x0)
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x001C0071
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:   in page starting at address 0x0000800103610000 from client 27
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 754 thread Xorg:cs0 pid 782)
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          RW: 0x1
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          MAPPING_ERROR: 0x0
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          PERMISSION_FAULTS: 0x7
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          WALKER_ERROR: 0x0
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          MORE_FAULTS: 0x1
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          Faulty UTCL2 client ID: CB (0x0)
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x001C0071
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:   in page starting at address 0x0000800103612000 from client 27
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 754 thread Xorg:cs0 pid 782)
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          RW: 0x1
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          MAPPING_ERROR: 0x0
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          PERMISSION_FAULTS: 0x7
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          WALKER_ERROR: 0x0
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          MORE_FAULTS: 0x1
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          Faulty UTCL2 client ID: CB (0x0)
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x001C0071
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:   in page starting at address 0x0000800103602000 from client 27
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 754 thread Xorg:cs0 pid 782)
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          RW: 0x1
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          MAPPING_ERROR: 0x0
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          PERMISSION_FAULTS: 0x7
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          WALKER_ERROR: 0x0
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          MORE_FAULTS: 0x1
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          Faulty UTCL2 client ID: CB (0x0)
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x001C0071
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:   in page starting at address 0x0000800103618000 from client 27
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 754 thread Xorg:cs0 pid 782)
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          RW: 0x1
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          MAPPING_ERROR: 0x0
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          PERMISSION_FAULTS: 0x7
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          WALKER_ERROR: 0x0
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          MORE_FAULTS: 0x1
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          Faulty UTCL2 client ID: CB (0x0)
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x001C0071
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:   in page starting at address 0x000080010360a000 from client 27
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 754 thread Xorg:cs0 pid 782)
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          RW: 0x1
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          MAPPING_ERROR: 0x0
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          PERMISSION_FAULTS: 0x7
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          WALKER_ERROR: 0x0
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          MORE_FAULTS: 0x1
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:          Faulty UTCL2 client ID: CB (0x0)
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x001C0071
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu:   in page starting at address 0x0000800103608000 from client 27
6月 04 11:44:00 r-lc kernel: amdgpu 0000:04:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 754 thread Xorg:cs0 pid 782)

Same problem here. Kernel 5.12, GNOME with Wayland and amdgpu.

Having the same issue. Everything freezes at random and the screen goes dark. Only a hard reset helps.

Jun 10 14:37:28 T495 kernel: amdgpu 0000:06:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:6 pasid:32773, for process firefox pid 54437 thread firefox:cs0 pid 54491)
Jun 10 14:37:28 T495 kernel: amdgpu 0000:06:00.0: amdgpu:   in page starting at address     0x0000800109201000 from client 27
Jun 10 14:37:28 T495 kernel: amdgpu 0000:06:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00641051
Jun 10 14:37:28 T495 kernel: amdgpu 0000:06:00.0: amdgpu:          Faulty UTCL2 client ID: TCP (0x8)
Jun 10 14:37:28 T495 kernel: amdgpu 0000:06:00.0: amdgpu:          MORE_FAULTS: 0x1
Jun 10 14:37:28 T495 kernel: amdgpu 0000:06:00.0: amdgpu:          WALKER_ERROR: 0x0
Jun 10 14:37:28 T495 kernel: amdgpu 0000:06:00.0: amdgpu:          PERMISSION_FAULTS: 0x5
Jun 10 14:37:28 T495 kernel: amdgpu 0000:06:00.0: amdgpu:          MAPPING_ERROR: 0x0
Jun 10 14:37:28 T495 kernel: amdgpu 0000:06:00.0: amdgpu:          RW: 0x1

There is also a post in Arch forums.

No solution.

After a lot of tinkering, I have reached a stable state without any crash for more than a week now.

I removed all boot parameters I added over the last month.

I followed this description and downgraded linux-firmware to an older version from March:
System frequently crashing after GPU drivers update - #149 by Hans12

I’m now running Kernel 5.12 with linux-firmware 20210315.3568f96-3 and latest mesa.

Might be worth a shot if you still have problems.

The problem seems to be in the amdgpu firmware, which is shipped in linux-firmware. Don’t forget to add linux-firmware to the pacman ignore list if you don’t want it to be updated to the latest version again.
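
A minimal sketch for checking whether linux-firmware is already on the ignore list. `is_ignored` is a hypothetical helper name, and the default path assumes the stock `/etc/pacman.conf` location. It only handles plain package names on `IgnorePkg` lines, not glob patterns.

```shell
# Return success if a package is listed on an IgnorePkg line of pacman.conf.
# $1 = package name, $2 = config path (defaults to /etc/pacman.conf).
is_ignored() {
    awk -v pkg="$1" '
        /^[[:space:]]*IgnorePkg[[:space:]]*=/ {
            sub(/^[^=]*=/, "")              # drop "IgnorePkg ="
            for (i = 1; i <= NF; i++) if ($i == pkg) found = 1
        }
        END { exit !found }
    ' "${2:-/etc/pacman.conf}" 2>/dev/null
}

if is_ignored linux-firmware; then
    echo "linux-firmware is held back"
else
    echo "linux-firmware is not in IgnorePkg"
fi

# The line to add under [options] in /etc/pacman.conf would look like:
#   IgnorePkg = linux-firmware
```

Commented-out `#IgnorePkg` lines, as found in the default config, are deliberately not counted.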

The new mesa version 21.1.4 is available. Huge thanks to the developers.
Here is a short tutorial for testing it:

How is it? Did you upgrade the firmware to the newest version again?

Judging by the Arch forum, it seems not to be solved…

I had this issue a month or so back, upgraded to 5.11 and it was fixed. After an update yesterday this error is back in full effect, now on Kernel 5.12. The error is the exact same as always:

Downgrading Mesa to 21.1.4 did NOT fix the problem as I initially thought, and the crashing persists after the downgrade too. I don’t know what to do to solve this issue currently, and it appears the error is closely linked to Steam’s usage of mesa. Any ideas?

Edit: The other thread on this issue recommends downgrading linux-firmware. linux-firmware is already downgraded from the last time I attempted to fix this and is excluded from updates, but I’ll try fiddling with it and see if the crashes cease.
