System crashes and reboots caused by either the CPU or the GPU

I’m experiencing occasional crashes and reboots that seem to be caused by either the CPU or the GPU.
The sudden reboots sometimes happen while gaming: the screen first turns completely green and then goes dark as the system reboots. This left the following error in journalctl:

Okt 25 13:28:13 Desktop kernel: mce: [Hardware Error]: CPU 7: Machine Check: 0 Bank 5: bea0000000000108
Okt 25 13:28:13 Desktop kernel: mce: [Hardware Error]: TSC 0 ADDR 7fd79d72e6e6 MISC d012000100000000 SYND 4d000000 IPID 500b000000000  
Okt 25 13:28:13 Desktop kernel: mce: [Hardware Error]: PROCESSOR 2:870f10 TIME 1603628885 SOCKET 0 APIC 3 microcode 8701021
Okt 25 13:28:13 Desktop kernel: mce: [Hardware Error]: CPU 9: Machine Check: 0 Bank 5: bea0000000000108
Okt 25 13:28:13 Desktop kernel: mce: [Hardware Error]: TSC 0 ADDR 7fd79d20fe90 MISC d012000100000000 SYND 4d000000 IPID 500b000000000  
Okt 25 13:28:13 Desktop kernel: mce: [Hardware Error]: PROCESSOR 2:870f10 TIME 1603628885 SOCKET 0 APIC 9 microcode 8701021
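
To help read the "Bank 5" status value in the machine check lines above, here is a minimal Python sketch that splits it into flag bits and the low 16-bit error code. It assumes the usual x86 MCi_STATUS layout for bits 57 to 63 and does not interpret any AMD-specific fields or the bank itself. The function name decode_mci_status is only illustrative.

```python
# Flag bits in the upper part of an x86 MCi_STATUS value.
# Assumption: only the commonly documented bits 57..63 are covered here.
STATUS_FLAGS = {
    63: "VAL (register holds a valid error)",
    62: "OVER (an earlier error was overwritten)",
    61: "UC (uncorrected error)",
    60: "EN (error reporting enabled)",
    59: "MISCV (MISC register valid)",
    58: "ADDRV (ADDR register valid)",
    57: "PCC (processor context corrupt)",
}


def decode_mci_status(status_hex):
    """Return the set flag bits and the low 16-bit error code of a status value."""
    status = int(status_hex, 16)
    flags = [name for bit, name in STATUS_FLAGS.items() if (status >> bit) & 1]
    return {"flags": flags, "error_code": status & 0xFFFF}


if __name__ == "__main__":
    # Status value copied from the "Machine Check: 0 Bank 5" lines.
    decoded = decode_mci_status("bea0000000000108")
    for flag in decoded["flags"]:
        print(flag)
    print(f"error code: {decoded['error_code']:#06x}")
```

If this reading is right, the value marks an uncorrected error with the processor context flagged as corrupt, which would fit a sudden reboot, but it does not say on its own whether the CPU, RAM, or power delivery is at fault.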

I can’t recall exactly what happened before this crash. Separately, the system sometimes wakes up from hibernation without any video output, and my display keeps switching between input and power-saving mode. This may be unrelated, but I saved the following log, which suggests a problem with the GPU:

Okt 09 18:46:22 Desktop kernel: amdgpu: [powerplay] failed send message:     RunBtc (58)         param: 0x00000000 response 0xffffffc2
Okt 09 18:46:22 Desktop kernel: amdgpu: [powerplay] RunBtc failed!
Okt 09 18:46:22 Desktop kernel: [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <smu> failed -62
Okt 09 18:46:22 Desktop kernel: [drm:amdgpu_device_resume [amdgpu]] *ERROR* amdgpu_device_ip_resume failed (-62).
Okt 09 18:46:22 Desktop kernel: PM: dpm_run_callback(): pci_pm_resume+0x0/0xe0 returns -62
Okt 09 18:46:22 Desktop kernel: PM: Device 0000:28:00.0 failed to resume async: error -62
Okt 09 18:46:22 Desktop kernel: Move buffer fallback to memcpy unavailable
Okt 09 18:46:22 Desktop kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the buffer list -19!
Okt 09 18:46:23 Desktop kernel: Move buffer fallback to memcpy unavailable
Okt 09 18:46:23 Desktop kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the buffer list -19!
Okt 09 18:46:23 Desktop kernel: [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-2)
Okt 09 18:46:23 Desktop kernel: BUG: kernel NULL pointer dereference, address: 0000000000000008
Okt 09 18:46:23 Desktop kernel: #PF: supervisor read access in kernel mode
Okt 09 18:46:23 Desktop kernel: #PF: error_code(0x0000) - not-present page
Okt 09 18:46:23 Desktop kernel: Move buffer fallback to memcpy unavailable
Okt 09 18:46:23 Desktop kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the buffer list -19!
Okt 09 18:46:23 Desktop kernel: Move buffer fallback to memcpy unavailable
Okt 09 18:46:23 Desktop kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the buffer list -19!
Okt 09 18:46:23 Desktop kernel: [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-2)
Okt 09 18:46:23 Desktop kernel: BUG: kernel NULL pointer dereference, address: 0000000000000008
Okt 09 18:46:23 Desktop kernel: #PF: supervisor read access in kernel mode
Okt 09 18:46:23 Desktop kernel: #PF: error_code(0x0000) - not-present page
Okt 09 18:46:25 Desktop kernel: amdgpu: [powerplay] Msg issuing pre-check failed and SMU may be not in the right state!
Okt 09 18:46:27 Desktop kernel: amdgpu: [powerplay] Msg issuing pre-check failed and SMU may be not in the right state!
Okt 09 18:46:28 Desktop kernel: ata14: softreset failed (device not ready)
Okt 09 18:46:38 Desktop kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:62:crtc-0] flip_done timed out
Okt 09 18:46:38 Desktop kernel: [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:62:crtc-0] flip_done timed out
Okt 09 18:46:38 Desktop kernel: [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-2)
Okt 09 18:46:38 Desktop kernel: [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-2)
Okt 09 18:46:38 Desktop kernel: BUG: kernel NULL pointer dereference, address: 0000000000000008
Okt 09 18:46:38 Desktop kernel: #PF: supervisor read access in kernel mode
Okt 09 18:46:38 Desktop kernel: #PF: error_code(0x0000) - not-present page
Okt 09 18:46:38 Desktop kernel: BUG: kernel NULL pointer dereference, address: 0000000000000008
Okt 09 18:46:38 Desktop kernel: #PF: supervisor read access in kernel mode
Okt 09 18:46:38 Desktop kernel: #PF: error_code(0x0000) - not-present page
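
The negative numbers in the log above (-62, -19, -2) look like the kernel's usual negative errno return values. Assuming that convention holds here, this small Python sketch pulls them out of the log lines and names them. The lookup table only covers the three values seen above, using their Linux numbering, and the helper name count_errno_codes is made up for this example.

```python
import re
from collections import Counter

# Linux errno numbers for the codes that appear in the log above.
# Assumption: the drivers return plain negative errno values.
LINUX_ERRNO = {
    2: "ENOENT (no such file or directory)",
    19: "ENODEV (no such device)",
    62: "ETIME (timer expired)",
}

# A minus sign followed by digits, not glued to a word such as "crtc-0".
NEG_CODE = re.compile(r"(?<![\w-])-(\d+)\b")


def count_errno_codes(lines):
    """Count how often each negative return code occurs in the given log lines."""
    counts = Counter()
    for line in lines:
        for match in NEG_CODE.finditer(line):
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "[drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <smu> failed -62",
        "[drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the buffer list -19!",
        "[drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-2)",
        "[CRTC:62:crtc-0] flip_done timed out",
    ]
    for code, count in sorted(count_errno_codes(sample).items()):
        print(f"-{code} x{count}: {LINUX_ERRNO.get(code, 'unknown')}")
```

Read this way, the first failure is a timeout while resuming the SMU block, and the later -19 and -2 errors would be follow-up failures once the GPU is no longer usable.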

I know these could, and probably should, be treated as separate issues, but I’m not sure which component is causing which crashes or reboots, and I have indicators pointing at both the CPU and the GPU. So forgive me for putting everything in one thread. I also want to note that the GPU has the notorious reset bug, which might be what prevents the system from waking up correctly from hibernation.

And finally, here’s my inxi output:

inxi -Fazy
System:
  Kernel: 5.6.14-arch1-1-fsync x86_64 bits: 64 compiler: gcc v: 10.2.0 
  parameters: BOOT_IMAGE=/boot/vmlinuz-linux-fsync 
  root=UUID=e5ae83f7-b83c-4ba1-9165-c526522f90bc rw quiet 
  cryptdevice=UUID=14ab5635-177f-4285-a1bc-67a452dca803:luks-14ab5635-177f-4285-a1bc-67a452dca803 
  root=/dev/mapper/luks-14ab5635-177f-4285-a1bc-67a452dca803 apparmor=1 
  security=apparmor 
  resume=/dev/mapper/luks-17da867d-acdb-4759-9a6b-6435d17897a8 
  udev.log_priority=3 
  Desktop: KDE Plasma 5.20.1 tk: Qt 5.15.1 wm: kwin_x11 dm: SDDM 
  Distro: Manjaro Linux 
Machine:
  Type: Desktop Mobo: Micro-Star model: B450 TOMAHAWK MAX (MS-7C02) v: 1.0 
  serial: <filter> UEFI: American Megatrends v: 3.70 date: 06/09/2020 
CPU:
  Info: 6-Core model: AMD Ryzen 5 3600 bits: 64 type: MT MCP arch: Zen 2 
  family: 17 (23) model-id: 71 (113) stepping: N/A microcode: 8701021 
  L2 cache: 3072 KiB 
  flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm 
  bogomips: 86424 
  Speed: 3597 MHz min/max: 2200/3600 MHz boost: enabled Core speeds (MHz): 
  1: 3597 2: 2055 3: 2200 4: 2200 5: 2229 6: 2200 7: 2200 8: 2200 9: 3515 
  10: 2057 11: 2397 12: 2200 
  Vulnerabilities: Type: itlb_multihit status: Not affected 
  Type: l1tf status: Not affected 
  Type: mds status: Not affected 
  Type: meltdown status: Not affected 
  Type: spec_store_bypass 
  mitigation: Speculative Store Bypass disabled via prctl and seccomp 
  Type: spectre_v1 
  mitigation: usercopy/swapgs barriers and __user pointer sanitization 
  Type: spectre_v2 mitigation: Full AMD retpoline, IBPB: conditional, STIBP: 
  conditional, RSB filling 
  Type: tsx_async_abort status: Not affected 
Graphics:
  Device-1: AMD Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT] 
  vendor: Micro-Star MSI driver: amdgpu v: kernel bus ID: 28:00.0 
  chip ID: 1002:731f 
  Display: x11 server: X.Org 1.20.9 compositor: kwin_x11 driver: amdgpu 
  display ID: :0 screens: 1 
  Screen-1: 0 s-res: 2560x1440 s-dpi: 96 s-size: 677x381mm (26.7x15.0") 
  s-diag: 777mm (30.6") 
  Monitor-1: HDMI-A-0 res: 2560x1440 hz: 60 dpi: 118 
  size: 553x311mm (21.8x12.2") diag: 634mm (25") 
  OpenGL: renderer: AMD Radeon RX 5700 (NAVI10 DRM 3.36.0 5.6.14-arch1-1-fsync 
  LLVM 10.0.1) 
  v: 4.6 Mesa 20.2.1 direct render: Yes 
Audio:
  Device-1: AMD Navi 10 HDMI Audio driver: snd_hda_intel v: kernel 
  bus ID: 28:00.1 chip ID: 1002:ab38 
  Device-2: AMD Starship/Matisse HD Audio vendor: Micro-Star MSI 
  driver: snd_hda_intel v: kernel bus ID: 2a:00.4 chip ID: 1022:1487 
  Device-3: Yamaha type: USB driver: snd-usb-audio bus ID: 3-2.1:3 
  chip ID: 0499:170f 
  Sound Server: ALSA v: k5.6.14-arch1-1-fsync 
Network:
  Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet 
  vendor: Micro-Star MSI driver: r8169 v: kernel port: f000 bus ID: 22:00.0 
  chip ID: 10ec:8168 
  IF: enp34s0 state: up speed: 100 Mbps duplex: full mac: <filter> 
  Device-2: Microsoft Xbox 360 Wireless Adapter type: USB driver: usbfs 
  bus ID: 1-8:4 chip ID: 045e:0719 serial: <filter> 
Drives:
  Local Storage: total: 9.46 TiB used: 6.22 TiB (65.7%) 
  SMART Message: Unable to run smartctl. Root privileges required. 
  ID-1: /dev/sda vendor: Samsung model: SSD 850 EVO 250GB size: 232.89 GiB 
  block size: physical: 512 B logical: 512 B speed: 6.0 Gb/s serial: <filter> 
  rev: 2B6Q scheme: GPT 
  ID-2: /dev/sdb vendor: Samsung model: SSD 840 EVO 120GB size: 111.79 GiB 
  block size: physical: 512 B logical: 512 B speed: 6.0 Gb/s serial: <filter> 
  rev: DB6Q scheme: GPT 
  ID-3: /dev/sdc vendor: Seagate model: ST4000DM004-2CV104 size: 3.64 TiB 
  block size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s 
  rotation: 5425 rpm serial: <filter> rev: 0001 scheme: GPT 
  ID-4: /dev/sdd vendor: Western Digital model: WD30EZRX-00D8PB0 
  size: 2.73 TiB block size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s 
  rotation: 5400 rpm serial: <filter> rev: 0A80 scheme: GPT 
  ID-5: /dev/sde vendor: Western Digital model: WD10EADS-00M2B0 
  size: 931.51 GiB block size: physical: 512 B logical: 512 B speed: 3.0 Gb/s 
  serial: <filter> rev: 0A01 scheme: GPT 
  ID-6: /dev/sdf vendor: Seagate model: ST2000DM008-2FR102 size: 1.82 TiB 
  block size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s 
  rotation: 7200 rpm serial: <filter> rev: 0001 scheme: GPT 
  ID-7: /dev/sdg type: USB vendor: SanDisk model: Cruzer Slice size: 29.82 GiB 
  block size: physical: 512 B logical: 512 B serial: <filter> rev: 1.20 
  scheme: MBR 
  SMART Message: Unknown USB bridge. Flash drive/Unsupported enclosure? 
Partition:
  ID-1: / raw size: 215.37 GiB size: 210.99 GiB (97.97%) 
  used: 65.08 GiB (30.8%) fs: ext4 dev: /dev/dm-0 
Swap:
  Kernel: swappiness: 60 (default) cache pressure: 100 (default) 
  ID-1: swap-1 type: partition size: 17.21 GiB used: 0 KiB (0.0%) priority: -2 
  dev: /dev/dm-1 
Sensors:
  System Temperatures: cpu: 42.8 C mobo: N/A gpu: amdgpu temp: 50.0 C 
  mem: 50.0 C 
  Fan Speeds (RPM): N/A gpu: amdgpu fan: 0 
Info:
  Processes: 334 Uptime: 16m Memory: 15.65 GiB used: 3.32 GiB (21.2%) 
  Init: systemd v: 246 Compilers: gcc: 10.2.0 Packages: pacman: 1392 lib: 381 
  flatpak: 0 Shell: Bash v: 5.0.18 running in: konsole inxi: 3.1.08