Intermittent MCE errors on Ryzen 5 3600X

Not Manjaro-specific, but this seems better suited to this forum than Reddit. Roughly once every six months I get an MCE (machine check exception) during CPU- and GPU-intensive tasks such as gaming:

kernel: mce: [Hardware Error]: CPU 3: Machine Check: 0 Bank 5: bea0000000000108
kernel: mce: [Hardware Error]: TSC 0 ADDR 7f1fd3fcd85e MISC d012000100000000 SYND 4d000000 IPID 500b000000000
kernel: mce: [Hardware Error]: PROCESSOR 2:870f10 TIME 1630685104 SOCKET 0 APIC 8 microcode 8701013

My system uses a Ryzen 5 3600X and an RX 5600 XT. The kernel version is 5.13, amd-ucode is installed, and the system passes mprime tests just fine. Any insights into this problem?
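
For anyone reading the log above, here is a minimal sketch of how the status word after "Bank 5:" can be unpacked. It assumes the generic x86 machine-check layout, where bits 63 to 57 are flag bits and bits 15 to 0 hold the error code. The function name `decode_mca_status` and the flag table are made up for illustration, and the AMD-specific bits are deliberately left undecoded.

```python
# Flag bits in the upper part of an x86 machine-check status word.
# Only the generic architectural flags are listed; vendor-specific bits
# are intentionally ignored in this sketch.
STATUS_FLAGS = {
    63: "VAL (register holds a valid error)",
    62: "OVER (an earlier error was overwritten)",
    61: "UC (uncorrected error)",
    60: "EN (error reporting enabled)",
    59: "MISCV (MISC register valid)",
    58: "ADDRV (ADDR register valid)",
    57: "PCC (processor context corrupt)",
}


def decode_mca_status(status_hex):
    """Return the set flags and the low 16-bit error code of a status word."""
    status = int(status_hex, 16)
    flags = [name for bit, name in STATUS_FLAGS.items() if (status >> bit) & 1]
    return {"flags": flags, "error_code": status & 0xFFFF}


# Status value copied from the first log line.
decoded = decode_mca_status("bea0000000000108")
for flag in decoded["flags"]:
    print(flag)
print(f"error code: {decoded['error_code']:#06x}")
```

Under that assumed layout, UC and PCC are both set, so the hardware reported an uncorrected error with the processor context flagged as corrupt. That means this was not a silently corrected event, although it does not say which component is at fault.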

While looking into this, I came across this discussion: https://bugzilla.kernel.org/show_bug.cgi?id=206903
It may be worth reading. Some ideas from there:

  • update the BIOS
  • check whether the PSU is powerful enough for your rig

I also came across this thread: "Sudden reboots under load and kernel error mce: [Hardware Error]?" (Kernel & Hardware, Arch Linux Forums)

I’ve never actually considered an underpowered PSU as the culprit… Mine is 450 W, which should be enough for a 3600X plus a 5600 XT. Also, I game quite frequently and the MCE appears only very rarely, so I’m not sure this is the cause.
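
A rough power budget, as a sketch only. Apart from the 450 W PSU rating, every wattage below is an assumed nominal figure rather than a measurement of this system, and should be checked against the actual spec sheets. The `headroom` helper is made up for illustration.

```python
# All figures are in watts. Everything except PSU_RATED_W is an assumption.
CPU_PACKAGE_LIMIT_W = 128  # assumed package power limit of a 95 W TDP AM4 CPU
GPU_BOARD_POWER_W = 150    # assumed typical board power of an RX 5600 XT
REST_OF_SYSTEM_W = 75      # rough guess: motherboard, RAM, drives, fans
PSU_RATED_W = 450          # PSU rating mentioned in the post


def headroom(psu_w, *loads_w):
    """Return (total load, spare watts, load as a fraction of the PSU rating)."""
    total = sum(loads_w)
    return total, psu_w - total, total / psu_w


total, spare, fraction = headroom(
    PSU_RATED_W, CPU_PACKAGE_LIMIT_W, GPU_BOARD_POWER_W, REST_OF_SYSTEM_W
)
print(f"estimated sustained load: {total} W")
print(f"spare capacity: {spare} W ({fraction:.0%} of the rating in use)")
```

With these assumed numbers the sustained load fits within the rating. A steady-state sum like this cannot capture short transient spikes from the GPU or an aging PSU, so it does not fully rule the PSU out.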
I thought updating the BIOS was largely pointless if CPU microcode is installed and configured? Most of the discussions around this issue (including the Arch forum one you pointed to) tail off without a consensus on the root cause…