GPU crashes when I play games or when I use the system

Are you using the repository MESA or the “non-free” MESA?

//EDIT: Apparently the nonfree.eu MESA no longer exists, but I know I had issues with it in the past.

Yes, I’m using the repository MESA.

It fixed the instant crash in some specific games, and I can now play for a while. I played for over an hour and everything was fine until I had a really hard crash: the colors were messed up, and so on. So it might be time for me to check temps again, since it isn’t crashing right away anymore.

Edit: it’s not the temps; they never go above 60 °C.

Aug 21 01:53:16 rotip309 kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.10-rt-x86_64 root=UUID=31e79a47-7939-4b02-aa22-ba5d901298ba rw amdgpu.ppfeaturemask=0xfff7bffb quiet splash udev.log_priority=3
Aug 21 01:53:16 rotip309 kernel: Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.10-rt-x86_64 root=UUID=31e79a47-7939-4b02-aa22-ba5d901298ba rw amdgpu.ppfeaturemask=0xfff7bffb quiet splash udev.log_priority=3
Aug 21 01:53:16 rotip309 kernel: [drm] amdgpu kernel modesetting enabled.
Aug 21 01:53:16 rotip309 kernel: amdgpu: Virtual CRAT table created for CPU
Aug 21 01:53:16 rotip309 kernel: amdgpu: Topology: Add CPU node
Aug 21 01:53:16 rotip309 kernel: amdgpu 0000:01:00.0: No more image in the PCI ROM
Aug 21 01:53:16 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from ROM BAR
Aug 21 01:53:16 rotip309 kernel: amdgpu: ATOM BIOS: xxx-xxx-xxx
Aug 21 01:53:16 rotip309 kernel: amdgpu 0000:01:00.0: vgaarb: deactivate vga console
Aug 21 01:53:16 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
Aug 21 01:53:16 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: VRAM: 4096M 0x000000F400000000 - 0x000000F4FFFFFFFF (4096M used)
Aug 21 01:53:16 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
Aug 21 01:53:16 rotip309 kernel: [drm] amdgpu: 4096M of VRAM memory ready
Aug 21 01:53:16 rotip309 kernel: [drm] amdgpu: 4944M of GTT memory ready.
Aug 21 01:53:16 rotip309 kernel: amdgpu: hwmgr_sw_init smu backed is polaris10_smu
Aug 21 01:53:16 rotip309 kernel: kfd kfd: amdgpu: Allocated 3969056 bytes on gart
Aug 21 01:53:16 rotip309 kernel: kfd kfd: amdgpu: Total number of KFD nodes to be created: 1
Aug 21 01:53:16 rotip309 kernel: amdgpu: Virtual CRAT table created for GPU
Aug 21 01:53:16 rotip309 kernel: amdgpu: Topology: Add dGPU node [0x67ef:0x1002]
Aug 21 01:53:16 rotip309 kernel: kfd kfd: amdgpu: added device 1002:67ef
Aug 21 01:53:16 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: SE 2, SH per SE 1, CU per SH 8, active_cu_number 14
Aug 21 01:53:16 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: Using BACO for runtime pm
Aug 21 01:53:16 rotip309 kernel: [drm] Initialized amdgpu 3.57.0 20150101 for 0000:01:00.0 on minor 1
Aug 21 01:53:16 rotip309 kernel: fbcon: amdgpudrmfb (fb0) is primary device
Aug 21 01:53:16 rotip309 kernel: amdgpu 0000:01:00.0: [drm] fb0: amdgpudrmfb frame buffer device
Aug 21 01:53:17 rotip309 kernel: snd_hda_intel 0000:01:00.1: bound 0000:01:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
Aug 21 01:55:28 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: Disabling VM faults because of PRT request!
Aug 21 11:20:10 rotip309 kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.10-rt-x86_64 root=UUID=31e79a47-7939-4b02-aa22-ba5d901298ba rw amdgpu.ppfeaturemask=0xfff7bffb quiet splash udev.log_priority=3
Aug 21 11:20:10 rotip309 kernel: Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.10-rt-x86_64 root=UUID=31e79a47-7939-4b02-aa22-ba5d901298ba rw amdgpu.ppfeaturemask=0xfff7bffb quiet splash udev.log_priority=3
Aug 21 11:20:10 rotip309 kernel: [drm] amdgpu kernel modesetting enabled.
Aug 21 11:20:10 rotip309 kernel: amdgpu: Virtual CRAT table created for CPU
Aug 21 11:20:10 rotip309 kernel: amdgpu: Topology: Add CPU node
Aug 21 11:20:10 rotip309 kernel: amdgpu 0000:01:00.0: No more image in the PCI ROM
Aug 21 11:20:10 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from ROM BAR
Aug 21 11:20:10 rotip309 kernel: amdgpu: ATOM BIOS: xxx-xxx-xxx
Aug 21 11:20:10 rotip309 kernel: amdgpu 0000:01:00.0: vgaarb: deactivate vga console
Aug 21 11:20:10 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
Aug 21 11:20:10 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: VRAM: 4096M 0x000000F400000000 - 0x000000F4FFFFFFFF (4096M used)
Aug 21 11:20:10 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
Aug 21 11:20:10 rotip309 kernel: [drm] amdgpu: 4096M of VRAM memory ready
Aug 21 11:20:10 rotip309 kernel: [drm] amdgpu: 4944M of GTT memory ready.
Aug 21 11:20:10 rotip309 kernel: amdgpu: hwmgr_sw_init smu backed is polaris10_smu
Aug 21 11:20:10 rotip309 kernel: kfd kfd: amdgpu: Allocated 3969056 bytes on gart
Aug 21 11:20:10 rotip309 kernel: kfd kfd: amdgpu: Total number of KFD nodes to be created: 1
Aug 21 11:20:10 rotip309 kernel: amdgpu: Virtual CRAT table created for GPU
Aug 21 11:20:10 rotip309 kernel: amdgpu: Topology: Add dGPU node [0x67ef:0x1002]
Aug 21 11:20:10 rotip309 kernel: kfd kfd: amdgpu: added device 1002:67ef
Aug 21 11:20:10 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: SE 2, SH per SE 1, CU per SH 8, active_cu_number 14
Aug 21 11:20:10 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: Using BACO for runtime pm
Aug 21 11:20:10 rotip309 kernel: [drm] Initialized amdgpu 3.57.0 20150101 for 0000:01:00.0 on minor 1
Aug 21 11:20:10 rotip309 kernel: fbcon: amdgpudrmfb (fb0) is primary device
Aug 21 11:20:10 rotip309 kernel: amdgpu 0000:01:00.0: [drm] fb0: amdgpudrmfb frame buffer device
Aug 21 11:20:11 rotip309 kernel: snd_hda_intel 0000:01:00.1: bound 0000:01:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
Aug 21 17:51:06 rotip309 kernel: amdgpu 0000:01:00.0: [drm] enabling link 0 failed: 15
Aug 21 20:33:03 rotip309 kernel:  amdgpu_ttm_tt_populate+0x7c/0xc0 [amdgpu e0763ab4404125a49af42b819560f63816d42ca6]
Aug 21 20:33:03 rotip309 kernel:  amdgpu_device_prepare+0x56/0xf0 [amdgpu e0763ab4404125a49af42b819560f63816d42ca6]
Aug 21 20:33:03 rotip309 kernel: amdgpu 0000:01:00.0: PM: device_prepare(): pci_pm_prepare returns -12
Aug 21 20:33:03 rotip309 kernel: amdgpu 0000:01:00.0: PM: not prepared for power transition: code -12
Aug 21 20:33:05 rotip309 kernel: amdgpu 0000:01:00.0: PM: device_prepare(): pci_pm_prepare returns -12
Aug 21 20:33:05 rotip309 kernel: amdgpu 0000:01:00.0: PM: not prepared for power transition: code -12
Aug 22 11:56:16 rotip309 kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.10-rt-x86_64 root=UUID=31e79a47-7939-4b02-aa22-ba5d901298ba rw amdgpu.ppfeaturemask=0xfff7bffb quiet splash udev.log_priority=3
Aug 22 11:56:16 rotip309 kernel: Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.10-rt-x86_64 root=UUID=31e79a47-7939-4b02-aa22-ba5d901298ba rw amdgpu.ppfeaturemask=0xfff7bffb quiet splash udev.log_priority=3
Aug 22 11:56:16 rotip309 kernel: [drm] amdgpu kernel modesetting enabled.
Aug 22 11:56:16 rotip309 kernel: amdgpu: Virtual CRAT table created for CPU
Aug 22 11:56:16 rotip309 kernel: amdgpu: Topology: Add CPU node
Aug 22 11:56:16 rotip309 kernel: amdgpu 0000:01:00.0: No more image in the PCI ROM
Aug 22 11:56:16 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from ROM BAR
Aug 22 11:56:16 rotip309 kernel: amdgpu: ATOM BIOS: xxx-xxx-xxx
Aug 22 11:56:16 rotip309 kernel: amdgpu 0000:01:00.0: vgaarb: deactivate vga console
Aug 22 11:56:16 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
Aug 22 11:56:16 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: VRAM: 4096M 0x000000F400000000 - 0x000000F4FFFFFFFF (4096M used)
Aug 22 11:56:16 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
Aug 22 11:56:16 rotip309 kernel: [drm] amdgpu: 4096M of VRAM memory ready
Aug 22 11:56:16 rotip309 kernel: [drm] amdgpu: 4944M of GTT memory ready.
Aug 22 11:56:16 rotip309 kernel: amdgpu: hwmgr_sw_init smu backed is polaris10_smu
Aug 22 11:56:16 rotip309 kernel: kfd kfd: amdgpu: Allocated 3969056 bytes on gart
Aug 22 11:56:16 rotip309 kernel: kfd kfd: amdgpu: Total number of KFD nodes to be created: 1
Aug 22 11:56:16 rotip309 kernel: amdgpu: Virtual CRAT table created for GPU
Aug 22 11:56:16 rotip309 kernel: amdgpu: Topology: Add dGPU node [0x67ef:0x1002]
Aug 22 11:56:16 rotip309 kernel: kfd kfd: amdgpu: added device 1002:67ef
Aug 22 11:56:16 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: SE 2, SH per SE 1, CU per SH 8, active_cu_number 14
Aug 22 11:56:16 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: Using BACO for runtime pm
Aug 22 11:56:16 rotip309 kernel: [drm] Initialized amdgpu 3.57.0 20150101 for 0000:01:00.0 on minor 1
Aug 22 11:56:16 rotip309 kernel: fbcon: amdgpudrmfb (fb0) is primary device
Aug 22 11:56:16 rotip309 kernel: amdgpu 0000:01:00.0: [drm] fb0: amdgpudrmfb frame buffer device
Aug 22 11:56:17 rotip309 kernel: snd_hda_intel 0000:01:00.1: bound 0000:01:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
Aug 22 12:15:25 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: Disabling VM faults because of PRT request!
Aug 22 14:33:02 rotip309 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=4682901, emitted seq=4682903
Aug 22 14:33:02 rotip309 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Gw2-64.exe pid 7167 thread dxvk-submit pid 7519
Aug 22 14:33:02 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: GPU reset begin!
Aug 22 14:33:03 rotip309 kernel: amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_0.2.1.0 test failed (-110)
Aug 22 14:33:03 rotip309 kernel: [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
Aug 22 14:33:03 rotip309 kernel: amdgpu: cp is busy, skip halt cp
Aug 22 14:33:03 rotip309 kernel: amdgpu: rlc is busy, skip halt rlc
Aug 22 14:33:03 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: BACO reset
Aug 22 14:33:04 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: GPU reset succeeded, trying to resume
Aug 22 14:33:04 rotip309 kernel: amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring comp_1.0.1 test failed (-110)
Aug 22 14:33:04 rotip309 kernel: amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring comp_1.0.4 test failed (-110)
Aug 22 14:33:04 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: recover vram bo from shadow start
Aug 22 14:33:04 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: recover vram bo from shadow done
Aug 22 14:33:04 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: GPU reset(2) succeeded!
Aug 22 14:33:04 rotip309 kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Aug 22 14:33:04 rotip309 steam[4492]: radv/amdgpu: The CS has been cancelled because the context is lost. This context is innocent.
                                                  #16 0x000077b62ef9c529 n/a (amdgpu_drv.so + 0x18529)
                                                  #17 0x000077b62ef97b31 n/a (amdgpu_drv.so + 0x13b31)
                                                  #6  0x000077b62ef8d931 n/a (amdgpu_drv.so + 0x9931)
Aug 22 22:00:03 rotip309 kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.10-rt-x86_64 root=UUID=31e79a47-7939-4b02-aa22-ba5d901298ba rw amdgpu.ppfeaturemask=0xfff7bffb quiet splash udev.log_priority=3
Aug 22 22:00:03 rotip309 kernel: Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.10-rt-x86_64 root=UUID=31e79a47-7939-4b02-aa22-ba5d901298ba rw amdgpu.ppfeaturemask=0xfff7bffb quiet splash udev.log_priority=3
Aug 22 22:00:03 rotip309 kernel: [drm] amdgpu kernel modesetting enabled.
Aug 22 22:00:03 rotip309 kernel: amdgpu: Virtual CRAT table created for CPU
Aug 22 22:00:03 rotip309 kernel: amdgpu: Topology: Add CPU node
Aug 22 22:00:03 rotip309 kernel: amdgpu 0000:01:00.0: No more image in the PCI ROM
Aug 22 22:00:03 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from ROM BAR
Aug 22 22:00:03 rotip309 kernel: amdgpu: ATOM BIOS: xxx-xxx-xxx
Aug 22 22:00:03 rotip309 kernel: amdgpu 0000:01:00.0: vgaarb: deactivate vga console
Aug 22 22:00:03 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
Aug 22 22:00:03 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: VRAM: 4096M 0x000000F400000000 - 0x000000F4FFFFFFFF (4096M used)
Aug 22 22:00:03 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
Aug 22 22:00:03 rotip309 kernel: [drm] amdgpu: 4096M of VRAM memory ready
Aug 22 22:00:03 rotip309 kernel: [drm] amdgpu: 4944M of GTT memory ready.
Aug 22 22:00:03 rotip309 kernel: amdgpu: hwmgr_sw_init smu backed is polaris10_smu
Aug 22 22:00:03 rotip309 kernel: kfd kfd: amdgpu: Allocated 3969056 bytes on gart
Aug 22 22:00:03 rotip309 kernel: kfd kfd: amdgpu: Total number of KFD nodes to be created: 1
Aug 22 22:00:03 rotip309 kernel: amdgpu: Virtual CRAT table created for GPU
Aug 22 22:00:03 rotip309 kernel: amdgpu: Topology: Add dGPU node [0x67ef:0x1002]
Aug 22 22:00:03 rotip309 kernel: kfd kfd: amdgpu: added device 1002:67ef
Aug 22 22:00:03 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: SE 2, SH per SE 1, CU per SH 8, active_cu_number 14
Aug 22 22:00:03 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: Using BACO for runtime pm
Aug 22 22:00:03 rotip309 kernel: [drm] Initialized amdgpu 3.57.0 20150101 for 0000:01:00.0 on minor 1
Aug 22 22:00:03 rotip309 kernel: fbcon: amdgpudrmfb (fb0) is primary device
Aug 22 22:00:03 rotip309 kernel: amdgpu 0000:01:00.0: [drm] fb0: amdgpudrmfb frame buffer device
Aug 22 22:00:05 rotip309 kernel: snd_hda_intel 0000:01:00.1: bound 0000:01:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
Aug 22 23:01:22 rotip309 kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.10-rt-x86_64 root=UUID=31e79a47-7939-4b02-aa22-ba5d901298ba rw amdgpu.ppfeaturemask=0xfff7bffb quiet splash udev.log_priority=3
Aug 22 23:01:22 rotip309 kernel: Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.10-rt-x86_64 root=UUID=31e79a47-7939-4b02-aa22-ba5d901298ba rw amdgpu.ppfeaturemask=0xfff7bffb quiet splash udev.log_priority=3
Aug 22 23:01:22 rotip309 kernel: [drm] amdgpu kernel modesetting enabled.
Aug 22 23:01:22 rotip309 kernel: amdgpu: Virtual CRAT table created for CPU
Aug 22 23:01:22 rotip309 kernel: amdgpu: Topology: Add CPU node
Aug 22 23:01:22 rotip309 kernel: amdgpu 0000:01:00.0: No more image in the PCI ROM
Aug 22 23:01:22 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from ROM BAR
Aug 22 23:01:22 rotip309 kernel: amdgpu: ATOM BIOS: xxx-xxx-xxx
Aug 22 23:01:22 rotip309 kernel: amdgpu 0000:01:00.0: vgaarb: deactivate vga console
Aug 22 23:01:22 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
Aug 22 23:01:22 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: VRAM: 4096M 0x000000F400000000 - 0x000000F4FFFFFFFF (4096M used)
Aug 22 23:01:22 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
Aug 22 23:01:22 rotip309 kernel: [drm] amdgpu: 4096M of VRAM memory ready
Aug 22 23:01:22 rotip309 kernel: [drm] amdgpu: 4944M of GTT memory ready.
Aug 22 23:01:22 rotip309 kernel: amdgpu: hwmgr_sw_init smu backed is polaris10_smu
Aug 22 23:01:22 rotip309 kernel: kfd kfd: amdgpu: Allocated 3969056 bytes on gart
Aug 22 23:01:22 rotip309 kernel: kfd kfd: amdgpu: Total number of KFD nodes to be created: 1
Aug 22 23:01:22 rotip309 kernel: amdgpu: Virtual CRAT table created for GPU
Aug 22 23:01:22 rotip309 kernel: amdgpu: Topology: Add dGPU node [0x67ef:0x1002]
Aug 22 23:01:22 rotip309 kernel: kfd kfd: amdgpu: added device 1002:67ef
Aug 22 23:01:22 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: SE 2, SH per SE 1, CU per SH 8, active_cu_number 14
Aug 22 23:01:22 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: Using BACO for runtime pm
Aug 22 23:01:22 rotip309 kernel: [drm] Initialized amdgpu 3.57.0 20150101 for 0000:01:00.0 on minor 1
Aug 22 23:01:22 rotip309 kernel: fbcon: amdgpudrmfb (fb0) is primary device
Aug 22 23:01:22 rotip309 kernel: amdgpu 0000:01:00.0: [drm] fb0: amdgpudrmfb frame buffer device
Aug 22 23:01:23 rotip309 kernel: snd_hda_intel 0000:01:00.1: bound 0000:01:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
Aug 23 01:15:53 rotip309 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=3873135, emitted seq=3873137
Aug 23 01:15:53 rotip309 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process firefox pid 9738 thread firefox:cs0 pid 9791
Aug 23 01:15:53 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: GPU reset begin!
Aug 23 01:15:53 rotip309 kernel: amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_0.2.1.0 test failed (-110)
Aug 23 01:15:53 rotip309 kernel: [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
Aug 23 01:15:54 rotip309 kernel: amdgpu: cp is busy, skip halt cp
Aug 23 01:15:54 rotip309 kernel: amdgpu: rlc is busy, skip halt rlc
Aug 23 01:15:54 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: BACO reset
Aug 23 01:15:54 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: GPU reset succeeded, trying to resume
Aug 23 01:15:55 rotip309 kernel: amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring comp_1.0.1 test failed (-110)
Aug 23 01:15:55 rotip309 kernel: amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring comp_1.0.5 test failed (-110)
Aug 23 01:15:55 rotip309 kernel: amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring comp_1.0.7 test failed (-110)
Aug 23 01:15:55 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: recover vram bo from shadow start
Aug 23 01:15:55 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: recover vram bo from shadow done
Aug 23 01:15:55 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: GPU reset(2) succeeded!
Aug 23 01:15:55 rotip309 kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
                                                  #16 0x0000781c913bc529 n/a (amdgpu_drv.so + 0x18529)
                                                  #17 0x0000781c913b7b31 n/a (amdgpu_drv.so + 0x13b31)
                                                  #6  0x0000781c913ad931 n/a (amdgpu_drv.so + 0x9931)

Another crash.
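For anyone reading a dump like the one above, most of it is normal boot output. A small sketch for pulling out only the lines that point at a hang or reset might look like this. The function names are made up for illustration, and the pattern is an assumption based on the messages visible in this log, not a complete list of amdgpu errors.

```shell
# Sketch: keep only the amdgpu lines that indicate an error or a GPU reset.
# Reads journal text on stdin. The pattern only covers messages seen in the
# log above (assumption), so other failure modes may need extra alternatives.
filter_gpu_errors() {
    grep -E 'amdgpu.*(\*ERROR\*|GPU reset)'
}

# Count how many GPU resets were started in the journal text on stdin.
count_gpu_resets() {
    grep -c 'GPU reset begin!'
}

# Example (kernel messages of the previous boot only):
#   journalctl -k -b -1 --no-pager | filter_gpu_errors
#   journalctl -k -b -1 --no-pager | count_gpu_resets
```

Posting only the filtered lines from the boot that crashed usually makes the thread much easier to follow than the full driver initialisation from every boot.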

What if you use an LTS kernel, and NOT a real-time kernel? I know you have been advised to switch to 6.10, but for troubleshooting, why add more layers of issues?

Go with the latest LTS non-RT kernel and do more tests with this stable, supported kernel.
I would even try an older LTS kernel, as you don’t have the “latest and greatest” hardware but “old” hardware.
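A rough sketch of how that switch could be checked and done on Manjaro. The helper name is hypothetical, the "-rt" naming test is an assumption taken from the BOOT_IMAGE string in the log above, and linux66 is only an example package name; use whichever LTS series is currently offered.

```shell
# Succeeds if the given kernel release or image name looks like an RT build.
# Assumption: RT builds carry "-rt" in the name, as in the BOOT_IMAGE
# "vmlinuz-6.10-rt-x86_64" from the log above.
is_rt_kernel() {
    case "$1" in
        *-rt*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Example on a Manjaro system:
#   is_rt_kernel "$(uname -r)" && echo "currently running an RT kernel"
#   mhwd-kernel -li               # list the installed kernels
#   sudo mhwd-kernel -i linux66   # install an LTS kernel next to the current one
```

After installing, the other kernel can be picked from the GRUB menu at boot, so the 6.10 RT kernel stays available as a fallback.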

Indeed. I tried an RT kernel and found I had issues with VirtualBox, so I guess this could affect other software too.

Do you have a VRAM temperature sensor? Can you see your GPU hotspot?

You already said your GPU crashed on Windows too.

If so, this can’t be solved in Linux… except maybe by downclocking your GPU core/VRAM to the minimum.

Replacing your GPU’s thermal paste is the cheapest option; otherwise buy another GPU (second-hand or new…).
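If you want to test the downclocking idea without touching the BIOS, one possible sketch is to pin the card to its lowest power state through the amdgpu sysfs file power_dpm_force_performance_level. The helper name is made up, and card1 in the example is an assumption (the log says "on minor 1"); check which cardN is actually the RX 560 before writing anything.

```shell
# Write a performance level to an amdgpu device directory.
# Only the levels used in this sketch are accepted; the driver knows more.
set_gpu_perf_level() {
    local dev="$1" level="$2"
    case "$level" in
        auto|low|high|manual) ;;
        *) echo "unsupported level: $level" >&2; return 1 ;;
    esac
    echo "$level" > "$dev/power_dpm_force_performance_level"
}

# Example (run from a root shell; card1 is an assumption, see above):
#   set_gpu_perf_level /sys/class/drm/card1/device low    # pin the lowest clocks
#   set_gpu_perf_level /sys/class/drm/card1/device auto   # back to the default
```

The setting does not survive a reboot, which makes it reasonably safe for a test session: if the crashes stop at "low", that points towards clocks, voltage or the card itself rather than Mesa.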

I already have some GRUB config that worked; I probably just need to tinker with it more. I also installed some other GPU packages, since I got a

[amdgpu]] *ERROR* KCQ disable failed

So I installed amdvlk, hoping that would fix it; if not, I’m going to downclock the GPU.

I also don’t know how to see the hotspot in Manjaro via System Monitor, but the VRAM temps are normal too.

sensors might provide this information. You might need to install lm_sensors:

sudo pacman -Syu lm_sensors

… I also have these packages installed, but you can probably ignore the qt5 listing and the last one, if not all the others:

$ pamac search sensors | grep -i installed
plasma-systemmonitor  6.0.5-1 [Installed]                                  extra
i2c-tools  4.3-6 [Installed]                                               extra
qt6-sensors  6.7.2-1 [Installed]                                           extra
qt5-sensors  5.15.14-1 [Installed]                                         extra
lm_sensors  1:3.6.0.r41.g31d1f125-3 [Installed]                            extra
lib32-lm_sensors  1:3.6.0.r41.g31d1f125-2 [Installed]                   multilib
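The same values that sensors prints can also be read straight from the hwmon files in sysfs, which shows exactly which temperature sensors the driver exposes. A minimal sketch, with a hypothetical helper name. As far as I know, Polaris cards like the RX 560 only expose an edge temperature, so a hotspot (junction) reading may simply not exist on this card; treat that as an assumption and check what the loop actually prints.

```shell
# Print every temperature a hwmon directory exposes, in whole degrees Celsius.
# temp*_input holds millidegrees; temp*_label (if present) names the sensor.
print_hwmon_temps() {
    local dir="$1" f label
    for f in "$dir"/temp*_input; do
        [ -r "$f" ] || continue
        label="${f%_input}_label"
        if [ -r "$label" ]; then
            label="$(cat "$label")"
        else
            label="$(basename "${f%_input}")"
        fi
        echo "$label: $(( $(cat "$f") / 1000 )) C"
    done
}

# Example: walk all hwmon nodes and report the ones owned by amdgpu.
#   for d in /sys/class/hwmon/hwmon*; do
#       [ "$(cat "$d/name")" = "amdgpu" ] && print_hwmon_temps "$d"
#   done
```

Running the loop while a game is active gives a better picture than an idle reading.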

Alright, I just installed those. Temps seem to be normal, but I still have crashes. I even installed amdvlk for the

[amdgpu]] *ERROR* KCQ disable failed

but still got a crash. I probably have to tinker around more, or maybe edit the GPU BIOS, which I don’t really know how to do since it’s a Biostar RX 560.

Here is another log, the latest one:

Aug 26 12:56:32 rotip309 kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-6.10-rt-x86_64 root=UUID=31e79a47-7939-4b02-aa22-ba5d901298ba rw amdgpu.ppfeaturemask=0xfff7bffb quiet splash udev.log_priority=3
Aug 26 12:56:32 rotip309 kernel: Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.10-rt-x86_64 root=UUID=31e79a47-7939-4b02-aa22-ba5d901298ba rw amdgpu.ppfeaturemask=0xfff7bffb quiet splash udev.log_priority=3
Aug 26 12:56:32 rotip309 kernel: [drm] amdgpu kernel modesetting enabled.
Aug 26 12:56:32 rotip309 kernel: amdgpu: Virtual CRAT table created for CPU
Aug 26 12:56:32 rotip309 kernel: amdgpu: Topology: Add CPU node
Aug 26 12:56:32 rotip309 kernel: amdgpu 0000:01:00.0: No more image in the PCI ROM
Aug 26 12:56:32 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from ROM BAR
Aug 26 12:56:32 rotip309 kernel: amdgpu: ATOM BIOS: xxx-xxx-xxx
Aug 26 12:56:32 rotip309 kernel: amdgpu 0000:01:00.0: vgaarb: deactivate vga console
Aug 26 12:56:32 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
Aug 26 12:56:32 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: VRAM: 4096M 0x000000F400000000 - 0x000000F4FFFFFFFF (4096M used)
Aug 26 12:56:32 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
Aug 26 12:56:32 rotip309 kernel: [drm] amdgpu: 4096M of VRAM memory ready
Aug 26 12:56:32 rotip309 kernel: [drm] amdgpu: 4944M of GTT memory ready.
Aug 26 12:56:32 rotip309 kernel: amdgpu: hwmgr_sw_init smu backed is polaris10_smu
Aug 26 12:56:32 rotip309 kernel: kfd kfd: amdgpu: Allocated 3969056 bytes on gart
Aug 26 12:56:32 rotip309 kernel: kfd kfd: amdgpu: Total number of KFD nodes to be created: 1
Aug 26 12:56:32 rotip309 kernel: amdgpu: Virtual CRAT table created for GPU
Aug 26 12:56:32 rotip309 kernel: amdgpu: Topology: Add dGPU node [0x67ef:0x1002]
Aug 26 12:56:32 rotip309 kernel: kfd kfd: amdgpu: added device 1002:67ef
Aug 26 12:56:32 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: SE 2, SH per SE 1, CU per SH 8, active_cu_number 14
Aug 26 12:56:32 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: Using BACO for runtime pm
Aug 26 12:56:32 rotip309 kernel: [drm] Initialized amdgpu 3.57.0 20150101 for 0000:01:00.0 on minor 1
Aug 26 12:56:32 rotip309 kernel: fbcon: amdgpudrmfb (fb0) is primary device
Aug 26 12:56:32 rotip309 kernel: amdgpu 0000:01:00.0: [drm] fb0: amdgpudrmfb frame buffer device
Aug 26 12:56:33 rotip309 kernel: snd_hda_intel 0000:01:00.1: bound 0000:01:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
Aug 26 14:09:52 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: Disabling VM faults because of PRT request!
Aug 26 16:30:19 rotip309 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=4106271, emitted seq=4106273
Aug 26 16:30:19 rotip309 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Gw2-64.exe pid 18587 thread dxvk-submit pid 18891
Aug 26 16:30:19 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: GPU reset begin!
Aug 26 16:30:19 rotip309 kernel: amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_0.2.1.0 test failed (-110)
Aug 26 16:30:19 rotip309 kernel: [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
Aug 26 16:30:19 rotip309 kernel: amdgpu: cp is busy, skip halt cp
Aug 26 16:30:19 rotip309 kernel: amdgpu: rlc is busy, skip halt rlc
Aug 26 16:30:19 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: BACO reset
Aug 26 16:30:20 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: GPU reset succeeded, trying to resume
Aug 26 16:30:20 rotip309 kernel: amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring comp_1.0.1 test failed (-110)
Aug 26 16:30:20 rotip309 kernel: amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring comp_1.0.3 test failed (-110)
Aug 26 16:30:21 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: recover vram bo from shadow start
Aug 26 16:30:21 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: recover vram bo from shadow done
Aug 26 16:30:21 rotip309 kernel: amdgpu 0000:01:00.0: amdgpu: GPU reset(2) succeeded!
                                                  #16 0x00007c23bcbbb529 n/a (amdgpu_drv.so + 0x18529)
                                                  #17 0x00007c23bcbb6b31 n/a (amdgpu_drv.so + 0x13b31)
                                                  #6  0x00007c23bcbac931 n/a (amdgpu_drv.so + 0x9931)

I don’t think you want amdvlk.

But maybe also try the

iommu=pt

boot option.

You may also try more aggressive feature mask options (replacing the original), such as those seen here:
https://bugzilla.kernel.org/show_bug.cgi?id=206903

I guess iommu=pt goes in the grub.cfg file?

No, same as previously.
Add it next to the mask option in /etc/default/grub and update grub afterwards.

GRUB_CMDLINE_LINUX="iommu=pt amdgpu.ppfeaturemask=0xfff7bffb"
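To confirm the option really reached the kernel after updating GRUB and rebooting, the running command line can be checked. A small sketch; the helper name is hypothetical, and the second argument exists only so the check can be pointed at a file other than /proc/cmdline.

```shell
# Succeeds if the parameter appears as a whole word in a kernel command line.
# The second argument defaults to /proc/cmdline (the running kernel's line).
cmdline_has_param() {
    local param="$1" file="${2:-/proc/cmdline}"
    tr ' ' '\n' < "$file" | grep -qxF -- "$param"
}

# Example after editing /etc/default/grub:
#   sudo update-grub        # regenerate grub.cfg
#   (reboot, then:)
#   cmdline_has_param iommu=pt && echo "iommu=pt is active"
#   cmdline_has_param amdgpu.ppfeaturemask=0xfff7bffb && echo "mask is active"
```

If a parameter is missing after the reboot, the usual cause is that grub.cfg was not regenerated, or that a different boot entry was selected.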

So that didn’t fix it, and I tried to flash the BIOS with amdvbflash. It flashed, but the card shows no video anymore. I made a backup of the BIOS and I want to flash it back to what it was, but right now I’m using another system and I’m really scared of accidentally flashing a GTX 1650 Super with amdvbflash,

since it shows this:

AMDVBFLASH version 4.71, Copyright (c) 2020 Advanced Micro Devices, Inc.


adapter seg  bn dn dID       asic           flash      romsize test    bios p/n    
======= ==== == == ==== =============== ============== ======= ==== ================
   0    0000 25 00 67FF Polaris11       GD25Q41B         80000 pass       -