Manjaro crashing randomly without warning

Hello, I recently updated the BIOS on my ASUS Prime X570-P because of the rdrand bug with Ryzen 3000, but since then I get random crashes. The system just shuts down without any warning.

The following error may be the culprit:

Dez 27 22:24:13 zirnit-pc kernel: [Hardware Error]: Corrected error, no action required.
Dez 27 22:24:13 zirnit-pc kernel: [Hardware Error]: CPU:0 (17:71:0) MC25_STATUS[-|CE|MiscV|-|-|-|-|CECC|-|-|-]: 0x98004000003e0000
Dez 27 22:24:13 zirnit-pc kernel: [Hardware Error]: IPID: 0x000100ff03830400
Dez 27 22:24:13 zirnit-pc kernel: [Hardware Error]: Platform Security Processor Ext. Error Code: 62
Dez 27 22:24:13 zirnit-pc kernel: [Hardware Error]: cache level: RESV, tx: INSN
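
For anyone reading along, the raw MC25_STATUS value can be unpacked by hand to see where the flags and the "Ext. Error Code: 62" come from. Below is a minimal sketch in Python, assuming the AMD MCA_STATUS bit positions as used by the Linux kernel's MCE decoder. The names `FLAG_BITS` and `decode_status` are made up for this example, and the bit table is deliberately incomplete, so treat it as an illustration rather than a reference.

```python
# Unpack the MCA status value from the log line above.
# ASSUMPTION: bit positions follow the AMD MCA_STATUS layout as decoded by
# the Linux kernel; check the AMD documentation for your CPU before relying on it.

STATUS = 0x98004000003E0000

# (bit position, flag name), highest bit first. Not a complete list.
FLAG_BITS = [
    (63, "Val"),    # register holds a valid error
    (62, "Over"),   # an earlier error was overwritten
    (61, "UC"),     # uncorrected error (unset means corrected)
    (60, "En"),     # error reporting was enabled
    (59, "MiscV"),  # MISC register holds valid data
    (58, "AddrV"),  # ADDR register holds valid data
    (57, "PCC"),    # processor context corrupt
    (46, "CECC"),   # corrected by ECC
    (45, "UECC"),   # uncorrectable ECC error
]


def decode_status(status: int) -> dict:
    """Return the set flags plus the two error-code fields."""
    flags = [name for bit, name in FLAG_BITS if (status >> bit) & 1]
    return {
        "flags": flags,
        "ext_error_code": (status >> 16) & 0x3F,  # bits 21:16
        "error_code": status & 0xFFFF,            # bits 15:0
    }


if __name__ == "__main__":
    print(decode_status(STATUS))
```

For the value in the log this gives the flags Val, En, MiscV and CECC with UC unset, and an extended error code of 62, which matches what the kernel printed.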

inxi --admin --verbosity=7 --filter --width returns this:

  Kernel: 5.13.19-2-MANJARO x86_64 bits: 64 compiler: gcc v: 11.1.0
    parameters: BOOT_IMAGE=/boot/vmlinuz-5.13-x86_64
    root=UUID=0bc33de2-959d-433a-8788-244c33b2697f rw quiet apparmor=1
    security=apparmor udev.log_priority=3
  Desktop: Xfce 4.16.0 tk: Gtk 3.24.29 info: xfce4-panel wm: xfwm 4.16.1
    vt: 7 dm: LightDM 1.30.0 Distro: Manjaro Linux base: Arch Linux
Machine:
  Type: Desktop Mobo: ASUSTeK model: PRIME X570-P v: Rev X.0x
    serial: <superuser required> UEFI: American Megatrends v: 4021
    date: 08/09/2021
Battery:
  Message: No system battery data found. Is one present?
Memory:
  RAM: total: 15.54 GiB used: 2.21 GiB (14.2%)
  RAM Report:
    permissions: Unable to run dmidecode. Root privileges required.
CPU:
  Info: model: AMD Ryzen 5 3600 bits: 64 type: MT MCP arch: Zen 2
    family: 0x17 (23) model-id: 0x71 (113) stepping: 0 microcode: 0x8701021
  Topology: cpus: 1x cores: 6 tpc: 2 threads: 12 smt: enabled cache:
    L1: 384 KiB desc: d-6x32 KiB; i-6x32 KiB L2: 3 MiB desc: 6x512 KiB
    L3: 32 MiB desc: 2x16 MiB
  Speed (MHz): avg: 2467 high: 3717 min/max: 2200/4208 boost: enabled
    scaling: driver: acpi-cpufreq governor: schedutil cores: 1: 2911 2: 2057
    3: 2054 4: 2199 5: 2200 6: 2199 7: 3613 8: 3717 9: 2057 10: 2202 11: 2199
    12: 2200 bogomips: 86440
  Flags: 3dnowprefetch abm adx aes aperfmperf apic arat avic avx avx2 bmi1
    bmi2 bpext cat_l3 cdp_l3 clflush clflushopt clwb clzero cmov cmp_legacy
    constant_tsc cpb cpuid cqm cqm_llc cqm_mbm_local cqm_mbm_total
    cqm_occup_llc cr8_legacy cx16 cx8 de decodeassists extapic extd_apicid
    f16c flushbyasid fma fpu fsgsbase fxsr fxsr_opt ht hw_pstate ibpb ibs
    irperf lahf_lm lbrv lm mba mca mce misalignsse mmx mmxext monitor movbe
    msr mtrr mwaitx nonstop_tsc nopl npt nrip_save nx osvw overflow_recov pae
    pat pausefilter pclmulqdq pdpe1gb perfctr_core perfctr_llc perfctr_nb
    pfthreshold pge pni popcnt pse pse36 rdpid rdpru rdrand rdseed rdt_a
    rdtscp rep_good sep sev sev_es sha_ni skinit smap smca sme smep ssbd sse
    sse2 sse4_1 sse4_2 sse4a ssse3 stibp succor svm svm_lock syscall tce
    topoext tsc tsc_scale umip v_spec_ctrl v_vmsave_vmload vgif vmcb_clean vme
    vmmcall wbnoinvd wdt xgetbv1 xsave xsavec xsaveerptr xsaveopt xsaves
  Vulnerabilities:
  Type: itlb_multihit status: Not affected
  Type: l1tf status: Not affected
  Type: mds status: Not affected
  Type: meltdown status: Not affected
  Type: spec_store_bypass
    mitigation: Speculative Store Bypass disabled via prctl and seccomp
  Type: spectre_v1
    mitigation: usercopy/swapgs barriers and __user pointer sanitization
  Type: spectre_v2 mitigation: Full AMD retpoline, IBPB: conditional, STIBP:
    conditional, RSB filling
  Type: srbds status: Not affected
  Type: tsx_async_abort status: Not affected
Graphics:
  Device-1: AMD Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT]
    vendor: Sapphire Limited driver: amdgpu v: kernel bus-ID: 0a:00.0
    chip-ID: 1002:731f class-ID: 0300
  Display: x11 server: X.Org 1.21.1.2 compositor: xfwm4 v: 4.16.1 driver:
    loaded: amdgpu display-ID: :0.0 screens: 1
  Screen-1: 0 s-res: 2560x1440 s-dpi: 96 s-size: 677x381mm (26.7x15.0")
    s-diag: 777mm (30.6")
  Monitor-1: DisplayPort-0 res: 2560x1440 dpi: 109
    size: 597x336mm (23.5x13.2") diag: 685mm (27")
  OpenGL: renderer: AMD Radeon RX 5700 XT (NAVI10 DRM 3.41.0
    5.13.19-2-MANJARO LLVM 13.0.0)
    v: 4.6 Mesa 21.2.5 direct render: Yes
Audio:
  Device-1: AMD Navi 10 HDMI Audio driver: snd_hda_intel v: kernel
    bus-ID: 0a:00.1 chip-ID: 1002:ab38 class-ID: 0403
  Device-2: AMD Starship/Matisse HD Audio vendor: ASUSTeK
    driver: snd_hda_intel v: kernel bus-ID: 0c:00.4 chip-ID: 1022:1487
    class-ID: 0403
  Sound Server-1: ALSA v: k5.13.19-2-MANJARO running: yes
  Sound Server-2: JACK v: 1.9.19 running: no
  Sound Server-3: PulseAudio v: 15.0 running: yes
  Sound Server-4: PipeWire v: 0.3.40 running: no
Network:
  Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
    vendor: ASUSTeK PRIME B450M-A driver: r8169 v: kernel port: f000
    bus-ID: 04:00.0 chip-ID: 10ec:8168 class-ID: 0200
  IF: enp4s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
  IP v4: <filter> type: dynamic noprefixroute scope: global
    broadcast: <filter>
  IP v6: <filter> type: dynamic noprefixroute scope: global
  IP v6: <filter> type: noprefixroute scope: link
  WAN IP: <filter>
Bluetooth:
  Message: No bluetooth data found.
Logical:
  Message: No logical block device data found.
RAID:
  Message: No RAID data found.
Drives:
  Local Storage: total: 3.18 TiB used: 56.07 GiB (1.7%)
  SMART Message: Required tool smartctl not installed. Check --recommends
  ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Samsung model: SSD 970 EVO 500GB
    size: 465.76 GiB block-size: physical: 512 B logical: 512 B speed: 31.6 Gb/s
    lanes: 4 type: SSD serial: <filter> rev: 2B2QEXE7 temp: 30.9 C scheme: GPT
  ID-2: /dev/sda maj-min: 8:0 vendor: Western Digital
    model: WD20EZRZ-00Z5HB0 size: 1.82 TiB block-size: physical: 4096 B
    logical: 512 B speed: 6.0 Gb/s type: HDD rpm: 5400 serial: <filter>
    rev: 0A80 scheme: GPT
  ID-3: /dev/sdb maj-min: 8:16 vendor: Samsung model: SSD 860 QVO 1TB
    size: 931.51 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
    type: SSD serial: <filter> rev: 2B6Q scheme: GPT
  Message: No optical or floppy data found.
Partition:
  ID-1: / raw-size: 465.47 GiB size: 457.09 GiB (98.20%)
    used: 49.16 GiB (10.8%) fs: ext4 dev: /dev/nvme0n1p2 maj-min: 259:2
    label: N/A uuid: 0bc33de2-959d-433a-8788-244c33b2697f
  ID-2: /boot/efi raw-size: 300 MiB size: 299.4 MiB (99.80%)
    used: 25.5 MiB (8.5%) fs: vfat dev: /dev/nvme0n1p1 maj-min: 259:1
    label: NO_LABEL uuid: 6D5F-3BC0
  ID-3: /mnt/Fast Library raw-size: 915.89 GiB size: 900.51 GiB (98.32%)
    used: 6.88 GiB (0.8%) fs: ext4 dev: /dev/sdb1 maj-min: 8:17
    label: Fast Library uuid: 6a9f3a32-0b2d-47db-a750-49ff7315f9f0
  ID-4: /mnt/Slow Archive raw-size: 1.82 TiB size: 1.79 TiB (98.37%)
    used: 28 KiB (0.0%) fs: ext4 dev: /dev/sda1 maj-min: 8:1 label: Slow Archive
    uuid: f3e28752-bf8b-44b3-9ec9-5befd5a00509
Swap:
  Alert: No swap data was found.
Unmounted:
  ID-1: /dev/sdb2 maj-min: 8:18 size: 15.62 GiB fs: swap label: N/A
    uuid: 9a48c4a1-7a4a-4f3a-82b0-888ec862a47d
USB:
  Hub-1: 1-0:1 info: Hi-speed hub with single TT ports: 6 rev: 2.0
    speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900
  Hub-2: 2-0:1 info: Super-speed hub ports: 4 rev: 3.1 speed: 10 Gb/s
    chip-ID: 1d6b:0003 class-ID: 0900
  Hub-3: 3-0:1 info: Hi-speed hub with single TT ports: 6 rev: 2.0
    speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900
  Device-1: 3-3:2 info: tshort Dactyl-Manuform (5x6) type: Keyboard,HID
    driver: hid-generic,usbhid interfaces: 2 rev: 2.0 speed: 12 Mb/s
    power: 500mA chip-ID: 444d:3536 class-ID: 0300
  Device-2: 3-4:3 info: Logitech G502 Proteus Spectrum Optical Mouse
    type: Mouse,HID driver: hid-generic,usbhid interfaces: 2 rev: 2.0
    speed: 12 Mb/s power: 300mA chip-ID: 046d:c332 class-ID: 0300
    serial: <filter>
  Device-3: 3-6:4 info: ASUSTek AURA LED Controller type: HID
    driver: hid-generic,usbhid interfaces: 2 rev: 2.0 speed: 12 Mb/s power: 16mA
    chip-ID: 0b05:18f3 class-ID: 0300 serial: <filter>
  Hub-4: 4-0:1 info: Super-speed hub ports: 4 rev: 3.1 speed: 10 Gb/s
    chip-ID: 1d6b:0003 class-ID: 0900
  Hub-5: 5-0:1 info: Hi-speed hub with single TT ports: 4 rev: 2.0
    speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900
  Device-1: 5-2:2 info: Prusa Original Prusa i3 MK3
    type: Abstract (modem),CDC-Data driver: cdc_acm interfaces: 2 rev: 2.0
    speed: 12 Mb/s power: 100mA chip-ID: 2c99:0002 class-ID: 0a00
    serial: <filter>
  Hub-6: 6-0:1 info: Super-speed hub ports: 4 rev: 3.1 speed: 10 Gb/s
    chip-ID: 1d6b:0003 class-ID: 0900
Sensors:
  System Temperatures: cpu: 37.2 C mobo: N/A gpu: amdgpu temp: 42.0 C
    mem: 40.0 C
  Fan Speeds (RPM): N/A gpu: amdgpu fan: 0
Info:
  Processes: 298 Uptime: 27m wakeups: 0 Init: systemd v: 249 tool: systemctl
  Compilers: gcc: 11.1.0 Packages: 1263 pacman: 1254 lib: 394 flatpak: 5
  snap: 4 Shell: Bash v: 5.1.12 running-in: xfce4-terminal inxi: 3.3.11

Here are all the errors from sudo dmesg --level emerg,alert,crit,err,warn:

[    0.945245] ata3.00: supports DRM functions and may not be fully accessible
[    0.947317] ata3.00: supports DRM functions and may not be fully accessible
[    1.600900] acpi PNP0C14:01: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:00)
[    1.600957] acpi PNP0C14:02: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:00)
[    1.600987] acpi PNP0C14:03: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:00)
[    1.601080] acpi PNP0C14:04: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:00)
[    2.160449] usb 3-6: config 1 has an invalid interface number: 2 but max is 1
[    2.160453] usb 3-6: config 1 has no interface number 1
[    8.829631] kauditd_printk_skb: 71 callbacks suppressed
[   13.833541] kauditd_printk_skb: 11 callbacks suppressed
[  313.883912] [Hardware Error]: Corrected error, no action required.
[  313.883916] [Hardware Error]: CPU:0 (17:71:0) MC25_STATUS[-|CE|MiscV|-|-|-|-|CECC|-|-|-]: 0x98004000003e0000
[  313.883922] [Hardware Error]: IPID: 0x000100ff03830400
[  313.883924] [Hardware Error]: Platform Security Processor Ext. Error Code: 62
[  313.883925] [Hardware Error]: cache level: RESV, tx: INSN
[  627.910402] [Hardware Error]: Corrected error, no action required.
[  627.910406] [Hardware Error]: CPU:0 (17:71:0) MC25_STATUS[-|CE|MiscV|-|-|-|-|CECC|-|-|-]: 0x98004000003e0000
[  627.910411] [Hardware Error]: IPID: 0x000100ff03830400
[  627.910412] [Hardware Error]: Platform Security Processor Ext. Error Code: 62
[  627.910413] [Hardware Error]: cache level: RESV, tx: INSN
[  941.936679] [Hardware Error]: Corrected error, no action required.
[  941.936682] [Hardware Error]: CPU:0 (17:71:0) MC25_STATUS[-|CE|MiscV|-|-|-|-|CECC|-|-|-]: 0x98004000003e0000
[  941.936687] [Hardware Error]: IPID: 0x000100ff03830400
[  941.936689] [Hardware Error]: Platform Security Processor Ext. Error Code: 62
[  941.936689] [Hardware Error]: cache level: RESV, tx: INSN
[ 1255.963285] [Hardware Error]: Corrected error, no action required.
[ 1255.963288] [Hardware Error]: CPU:0 (17:71:0) MC25_STATUS[-|CE|MiscV|-|-|-|-|CECC|-|-|-]: 0x98004000003e0000
[ 1255.963293] [Hardware Error]: IPID: 0x000100ff03830400
[ 1255.963295] [Hardware Error]: Platform Security Processor Ext. Error Code: 62
[ 1255.963296] [Hardware Error]: cache level: RESV, tx: INSN
[ 1569.989891] [Hardware Error]: Corrected error, no action required.
[ 1569.989893] [Hardware Error]: CPU:0 (17:71:0) MC25_STATUS[-|CE|MiscV|-|-|-|-|CECC|-|-|-]: 0x98004000003e0000
[ 1569.989898] [Hardware Error]: IPID: 0x000100ff03830400
[ 1569.989899] [Hardware Error]: Platform Security Processor Ext. Error Code: 62
[ 1569.989900] [Hardware Error]: cache level: RESV, tx: INSN
[zirnit@zirnit-pc ~]$ 
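
One thing that stands out in the dmesg output is how regular the hardware errors are. Here is a small Python sketch that computes the gaps between the five "Corrected error" timestamps copied from the log above (the `intervals` helper is just a name picked for this example):

```python
# Seconds-since-boot timestamps of the "Corrected error" lines in the dmesg output.
timestamps = [313.883912, 627.910402, 941.936679, 1255.963285, 1569.989891]


def intervals(ts):
    """Return the gap in seconds between each pair of consecutive timestamps."""
    return [round(later - earlier, 3) for earlier, later in zip(ts, ts[1:])]


if __name__ == "__main__":
    gaps = intervals(timestamps)
    print(gaps)
    print("spread:", round(max(gaps) - min(gaps), 3), "s")
```

Every gap comes out at roughly 314 seconds, a bit over five minutes. A fixed period like this would be consistent with the error being picked up by some periodic check rather than being triggered by a particular workload, but that interpretation is an assumption on my part and not something the log proves.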

journalctl --catalog --priority=3 --boot=-2

Dez 27 21:58:05 zirnit-pc kernel: kvm: disabled by bios
Dez 27 21:58:05 zirnit-pc kernel: kvm: disabled by bios
Dez 27 21:58:05 zirnit-pc kernel: kvm: disabled by bios
Dez 27 21:58:05 zirnit-pc kernel: kvm: disabled by bios
Dez 27 21:58:05 zirnit-pc kernel: kvm: disabled by bios
Dez 27 21:58:05 zirnit-pc kernel: kvm: disabled by bios
Dez 27 21:58:05 zirnit-pc kernel: kvm: disabled by bios
Dez 27 21:58:05 zirnit-pc kernel: kvm: disabled by bios
Dez 27 21:58:05 zirnit-pc kernel: kvm: disabled by bios
Dez 27 21:58:06 zirnit-pc kernel: kvm: disabled by bios
Dez 27 21:58:12 zirnit-pc lightdm[1138]: gkr-pam: unable to locate daemon contr>
Dez 27 21:58:38 zirnit-pc pulseaudio[1235]: GetManagedObjects() failed: org.fre>
Dez 27 22:03:16 zirnit-pc kernel: [Hardware Error]: Corrected error, no action >
Dez 27 22:03:16 zirnit-pc kernel: [Hardware Error]: CPU:0 (17:71:0) MC25_STATUS>
Dez 27 22:03:16 zirnit-pc kernel: [Hardware Error]: IPID: 0x000100ff03830400
Dez 27 22:03:16 zirnit-pc kernel: [Hardware Error]: Platform Security Processor>
Dez 27 22:03:16 zirnit-pc kernel: [Hardware Error]: cache level: RESV, tx: INSN
Dez 27 22:08:30 zirnit-pc kernel: [Hardware Error]: Corrected error, no action >
Dez 27 22:08:30 zirnit-pc kernel: [Hardware Error]: CPU:0 (17:71:0) MC25_STATUS>
Dez 27 22:08:30 zirnit-pc kernel: [Hardware Error]: IPID: 0x000100ff03830400
Dez 27 22:08:30 zirnit-pc kernel: [Hardware Error]: Platform Security Processor>
Dez 27 22:08:30 zirnit-pc kernel: [Hardware Error]: cache level: RESV, tx: INSN
Dez 27 22:24:13 zirnit-pc kernel: [Hardware Error]: Corrected error, no action >
Dez 27 22:24:13 zirnit-pc kernel: [Hardware Error]: CPU:0 (17:71:0) MC25_STATUS>
Dez 27 22:24:13 zirnit-pc kernel: [Hardware Error]: IPID: 0x000100ff03830400
Dez 27 22:24:13 zirnit-pc kernel: [Hardware Error]: Platform Security Processor>
Dez 27 22:24:13 zirnit-pc kernel: [Hardware Error]: cache level: RESV, tx: INSN
Dez 27 22:29:27 zirnit-pc kernel: [Hardware Error]: Corrected error, no action >
Dez 27 22:29:27 zirnit-pc kernel: [Hardware Error]: CPU:0 (17:71:0) MC25_STATUS>
Dez 27 22:29:27 zirnit-pc kernel: [Hardware Error]: IPID: 0x000100ff03830400
Dez 27 22:29:27 zirnit-pc kernel: [Hardware Error]: Platform Security Processor>
Dez 27 22:29:27 zirnit-pc kernel: [Hardware Error]: cache level: RESV, tx: INSN
Dez 27 22:34:41 zirnit-pc kernel: [Hardware Error]: Corrected error, no action >
Dez 27 22:34:41 zirnit-pc kernel: [Hardware Error]: CPU:0 (17:71:0) MC25_STATUS>
Dez 27 22:34:41 zirnit-pc kernel: [Hardware Error]: IPID: 0x000100ff03830400
Dez 27 22:34:41 zirnit-pc kernel: [Hardware Error]: Platform Security Processor>
Dez 27 22:34:41 zirnit-pc kernel: [Hardware Error]: cache level: RESV, tx: INSN

Hello Zirnit,
Please see this post here about providing good info.

Please note the part about “3 backticks `”, as this will let you post logs without them looking like a wall of text.


Ahh, thank you, I was using the wrong ones.

I have added the logs now. Thank you for telling me how to do it.

I don't know if it will fix your issue, but the first thing I would do is switch to a kernel that is not end of life.
5.10 is the latest LTS version, but 5.15 is also working well (for me; I have both on my system in case I need a fallback).


Okay, I will do so and switch to 5.15.7-1.
I also enabled virtualisation for good measure.
I'll write back if I still run into crashes, but it might take some time since I can't reproduce them on demand.

Okay, I just had another crash.

journalctl --catalog --priority=3 --boot=-1 gives me:

Dez 29 15:52:48 zirnit-pc kernel: [Hardware Error]: Corrected error, no action required.
Dez 29 15:52:48 zirnit-pc kernel: [Hardware Error]: CPU:0 (17:71:0) MC25_STATUS[-|CE|MiscV|-|-|-|-|CECC|-|-|-]: 0x98004000003e0000
Dez 29 15:52:48 zirnit-pc kernel: [Hardware Error]: IPID: 0x000100ff03830400
Dez 29 15:52:48 zirnit-pc kernel: [Hardware Error]: Platform Security Processor Ext. Error Code: 62
Dez 29 15:52:49 zirnit-pc kernel: [Hardware Error]: cache level: RESV, tx: INSN
Dez 29 15:58:03 zirnit-pc kernel: [Hardware Error]: Corrected error, no action required.
Dez 29 15:58:03 zirnit-pc kernel: [Hardware Error]: CPU:0 (17:71:0) MC25_STATUS[-|CE|MiscV|-|-|-|-|CECC|-|-|-]: 0x98004000003e0000
Dez 29 15:58:03 zirnit-pc kernel: [Hardware Error]: IPID: 0x000100ff03830400
Dez 29 15:58:03 zirnit-pc kernel: [Hardware Error]: Platform Security Processor Ext. Error Code: 62
Dez 29 15:58:03 zirnit-pc kernel: [Hardware Error]: cache level: RESV, tx: INSN

So I assume this must be the bug.
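
The IPID value in these lines identifies which hardware block the bank belongs to, and it can be split into its fields the same way as the status value. This is a rough Python sketch, assuming the field layout the Linux kernel uses for AMD's MCA_IPID register (McaType in bits 63:48, HardwareID in bits 43:32, InstanceId in bits 31:0). `decode_ipid` is a hypothetical helper for illustration only.

```python
# Split the IPID value from the log into its fields.
# ASSUMPTION: field layout follows what the Linux kernel uses for AMD MCA_IPID;
# verify against the AMD documentation for your CPU.

IPID = 0x000100FF03830400


def decode_ipid(ipid: int) -> dict:
    """Return the MCA type, hardware ID and instance ID fields of an IPID value."""
    return {
        "mca_type": (ipid >> 48) & 0xFFFF,     # bits 63:48
        "hardware_id": (ipid >> 32) & 0xFFF,   # bits 43:32
        "instance_id": ipid & 0xFFFFFFFF,      # bits 31:0
    }


if __name__ == "__main__":
    for name, value in decode_ipid(IPID).items():
        print(f"{name}: {value:#x}")
```

For this log the hardware ID is 0xff and the MCA type is 0x1. The kernel labels this bank "Platform Security Processor" in the next log line, so the error is reported by that block and not by one of the CPU cores' cache or load/store banks.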

Okay, I've read through a few forums: it's either a hardware error or a problem with cooling. I doubt it's cooling, since I have an Alpenfoehn Brocken 3 (a really big tower cooler), but I will reapply some thermal paste just to make sure. I guess I might have to get a new CPU.

Did you have this issue before the BIOS update?

I had occasional crashes when I was still running Windows, but otherwise no.

Looks like this bug:
https://wiki.gentoo.org/wiki/Ryzen#Random_reboots_with_mce_events
Also, enabling AMD IBS in the BIOS helped with my previous CPU, a Ryzen 1600X.

Sadly it's not that, since the error also appears when the system is not idle. The error code is also different.

I couldn't find IBS in my BIOS, though.

Hello, I have somehow fixed the crashes. It seems I had some setting in my BIOS that broke something. After resetting the BIOS I have had no more crashes and no more errors in my logs. I will tag this as solved now, but I might reply again if I find out more when I test which settings caused the crashes.

