Hardware Error Crash on Fresh Install

I’ve been using Manjaro on 2 laptops for years with no issues. Today, I finally got a 2nd SSD and used it to dual boot Manjaro on my desktop PC (along with Windows for gaming). However, I keep getting seemingly random crashes while in Manjaro, and I’m seeing references to a hardware error in the error logs. Please see the logs and system info below.

dmesg errors:

[    2.945876] [Hardware Error]: System Fatal error.
[    2.945883] [Hardware Error]: CPU:3 (17:71:0) MC5_STATUS[-|UE|MiscV|AddrV|PCC|TCC|SyndV|-|-|-]: 0xbea0000000000108
[    2.945895] [Hardware Error]: Error Addr: 0x0001ffffc0d2a01c
[    2.945899] [Hardware Error]: IPID: 0x000500b000000000, Syndrome: 0x000000004d000000
[    2.945906] [Hardware Error]: Execution Unit Ext. Error Code: 0
[    2.945907] [Hardware Error]: cache level: RESV, tx: GEN, mem-tx: GEN
[    3.765139] usb: port power management may be unreliable
[    4.252142] EXT4-fs error (device nvme0n1p3): ext4_orphan_get:1424: comm mount: bad orphan inode 13112487
[    4.252148] ext4_test_bit(bit=5286, block=52428816) = 0
[    4.252204] EXT4-fs error (device nvme0n1p3): ext4_orphan_get:1424: comm mount: bad orphan inode 12752655
[    4.252207] ext4_test_bit(bit=5902, block=50855956) = 0
[    4.252208] EXT4-fs error (device nvme0n1p3): ext4_orphan_get:1424: comm mount: bad orphan inode 12752494
[    4.252210] ext4_test_bit(bit=5741, block=50855956) = 0
[    4.889237] systemd-journald[402]: File /var/log/journal/198a26225de74e1eb1977cf0f3bfb971/system.journal corrupted or uncleanly shut down, renaming and replacing.
[    5.108760] ACPI Warning: SystemIO range 0x0000000000000B00-0x0000000000000B08 conflicts with OpRegion 0x0000000000000B00-0x0000000000000B0F (\GSA1.SMBI) (20240322/utaddress-204)
[    5.540031] nvidia: loading out-of-tree module taints kernel.
[    5.540041] nvidia: module license 'NVIDIA' taints kernel.
[    5.540043] Disabling lock debugging due to kernel taint
[    5.540047] nvidia: module license taints kernel.
[    5.904455] EXT4-fs error (device nvme0n1p4): ext4_orphan_get:1424: comm mount: bad orphan inode 17957815
[    5.904464] ext4_test_bit(bit=950, block=71827472) = 0

[    6.124987] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  550.120  Fri Sep 13 10:10:01 UTC 2024
[    6.225601] nvidia-gpu 0000:09:00.3: i2c timeout error e0000000
[    6.225607] ucsi_ccg 0-0008: i2c_transfer failed -110
[    6.225611] ucsi_ccg 0-0008: ucsi_ccg_init failed - -110
[    6.225615] ucsi_ccg 0-0008: probe with driver ucsi_ccg failed with error -110
[    7.614004] nvidia_uvm: module uses symbols nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting taint.
[   48.582509] systemd-journald[402]: File /var/log/journal/198a26225de74e1eb1977cf0f3bfb971/user-1000.journal corrupted or uncleanly shut down, renaming and replacing.
[   51.758079] [drm:drm_new_set_master] *ERROR* [nvidia-drm] [GPU ID 0x00000900] Failed to grab modeset ownership
[   55.638407] warning: `kdeconnectd' uses wireless extensions which will stop working for Wi-Fi 7 hardware; use nl80211
[  132.685773] [drm:drm_new_set_master] *ERROR* [nvidia-drm] [GPU ID 0x00000900] Failed to grab modeset ownership
[  132.686820] [drm:drm_new_set_master] *ERROR* [nvidia-drm] [GPU ID 0x00000900] Failed to grab modeset ownership
[  132.687226] [drm:drm_new_set_master] *ERROR* [nvidia-drm] [GPU ID 0x00000900] Failed to grab modeset ownership
[  132.750265] [drm:drm_new_set_master] *ERROR* [nvidia-drm] [GPU ID 0x00000900] Failed to grab modeset ownership
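
A note for readers decoding the MC5_STATUS line above: the bracketed flags are a rendering of individual bits in the 64-bit status value. Below is a minimal shell sketch, assuming the bit positions used by the Linux kernel's MCI_STATUS_* definitions (Val 63, Over 62, UC 61, En 60, MiscV 59, AddrV 58, PCC 57, TCC 55, SyndV 53). The function name mca_flags is hypothetical and not part of any tool. The kernel prints the UC (uncorrected) bit as "UE".

```shell
# Decode selected flag bits of an AMD MCA status value, e.g. 0xbea0000000000108.
# Only the upper 32 bits are inspected, so the arithmetic stays well inside
# the shell's signed 64-bit integer range.
# Assumption: bit positions follow the kernel's MCI_STATUS_* definitions.
mca_flags() (
    hex=${1#0x}
    # Left-pad to 16 hex digits so the upper half is always digits 1-8.
    while [ "${#hex}" -lt 16 ]; do hex=0$hex; done
    hi_hex=$(printf '%s' "$hex" | cut -c1-8)
    hi=$((0x$hi_hex))                      # bits 63..32 of the status value
    out=
    for pair in Val:63 Over:62 UC:61 En:60 MiscV:59 AddrV:58 PCC:57 TCC:55 SyndV:53; do
        name=${pair%%:*}
        bit=${pair##*:}
        # Shift the wanted bit down to position 0 and test it.
        if [ $(( ($hi >> ($bit - 32)) & 1 )) -eq 1 ]; then
            out="$out $name"
        fi
    done
    # Print the names of all set bits, space separated.
    printf '%s\n' "${out# }"
)

mca_flags 0xbea0000000000108
```

For the value in this log, the set bits include UC, PCC and TCC, which lines up with the UE, PCC and TCC flags the kernel printed.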

inxi:

System:
  Kernel: 6.10.13-3-MANJARO arch: x86_64 bits: 64 compiler: gcc v: 14.2.1
    clocksource: tsc avail: hpet,acpi_pm
    parameters: BOOT_IMAGE=/boot/vmlinuz-6.10-x86_64
    root=UUID=290b2c84-98af-42be-9374-0d2775ac678c rw quiet splash
    resume=UUID=e0ddd7a7-45be-45a4-90e1-4bb5fd987b56 udev.log_priority=3
  Desktop: KDE Plasma v: 6.1.5 tk: Qt v: N/A info: frameworks v: 6.6.0
    wm: kwin_x11 vt: 2 dm: SDDM Distro: Manjaro base: Arch Linux
Machine:
  Type: Desktop Mobo: Gigabyte model: X570 UD serial: <superuser required>
    uuid: <superuser required> BIOS: American Megatrends LLC. v: F40d
    date: 09/02/2024
Battery:
  Message: No system battery data found. Is one present?
Memory:
  System RAM: total: 16 GiB available: 15.54 GiB used: 2.02 GiB (13.0%)
  Message: For most reliable report, use superuser + dmidecode.
  Array-1: capacity: 128 GiB slots: 4 modules: 2 EC: None
    max-module-size: 32 GiB note: est.
  Device-1: Channel-A DIMM 0 type: no module installed
  Device-2: Channel-A DIMM 1 type: DDR4 detail: synchronous unbuffered
    (unregistered) size: 8 GiB speed: 2400 MT/s volts: note: check curr: 1
    min: 1 max: 1 width (bits): data: 64 total: 64 manufacturer: TeamGroup
    part-no: UD4-3200 serial: <filter>
  Device-3: Channel-B DIMM 0 type: no module installed
  Device-4: Channel-B DIMM 1 type: DDR4 detail: synchronous unbuffered
    (unregistered) size: 8 GiB speed: 2400 MT/s volts: note: check curr: 1
    min: 1 max: 1 width (bits): data: 64 total: 64 manufacturer: TeamGroup
    part-no: UD4-3200 serial: <filter>
CPU:
  Info: model: AMD Ryzen 5 3600X bits: 64 type: MT MCP arch: Zen 2 gen: 2
    level: v3 note: check built: 2020-22 process: TSMC n7 (7nm) family: 0x17 (23)
    model-id: 0x71 (113) stepping: 0 microcode: 0x8701034
  Topology: cpus: 1x dies: 1 clusters: 1 cores: 6 threads: 12 tpc: 2
    smt: enabled cache: L1: 384 KiB desc: d-6x32 KiB; i-6x32 KiB L2: 3 MiB
    desc: 6x512 KiB L3: 32 MiB desc: 2x16 MiB
  Speed (MHz): avg: 3800 min/max: 2200/4409 boost: enabled scaling:
    driver: acpi-cpufreq governor: schedutil cores: 1: 3800 2: 3800 3: 3800
    4: 3800 5: 3800 6: 3800 7: 3800 8: 3800 9: 3800 10: 3800 11: 3800 12: 3800
    bogomips: 91067
  Flags: 3dnowprefetch abm adx aes aperfmperf apic arat avic avx avx2 bmi1
    bmi2 bpext cat_l3 cdp_l3 clflush clflushopt clwb clzero cmov cmp_legacy
    constant_tsc cpb cpuid cqm cqm_llc cqm_mbm_local cqm_mbm_total
    cqm_occup_llc cr8_legacy cx16 cx8 de decodeassists extapic extd_apicid
    f16c flushbyasid fma fpu fsgsbase fxsr fxsr_opt ht hw_pstate ibpb ibs
    irperf lahf_lm lbrv lm mba mca mce misalignsse mmx mmxext monitor movbe
    msr mtrr mwaitx nonstop_tsc nopl npt nrip_save nx osvw overflow_recov pae
    pat pausefilter pclmulqdq pdpe1gb perfctr_core perfctr_llc perfctr_nb
    pfthreshold pge pni popcnt pse pse36 rapl rdpid rdpru rdrand rdseed rdt_a
    rdtscp rep_good sep sev sev_es sha_ni skinit smap smca smep ssbd sse sse2
    sse4_1 sse4_2 sse4a ssse3 stibp succor svm_lock syscall tce topoext tsc
    tsc_scale umip v_spec_ctrl v_vmsave_vmload vgif vmcb_clean vme vmmcall
    wbnoinvd wdt x2apic xgetbv1 xsave xsavec xsaveerptr xsaveopt xtopology
  Vulnerabilities:
  Type: gather_data_sampling status: Not affected
  Type: itlb_multihit status: Not affected
  Type: l1tf status: Not affected
  Type: mds status: Not affected
  Type: meltdown status: Not affected
  Type: mmio_stale_data status: Not affected
  Type: reg_file_data_sampling status: Not affected
  Type: retbleed mitigation: untrained return thunk; SMT enabled with STIBP
    protection
  Type: spec_rstack_overflow mitigation: Safe RET
  Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via
    prctl
  Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer
    sanitization
  Type: spectre_v2 mitigation: Retpolines; IBPB: conditional; STIBP:
    always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected
  Type: srbds status: Not affected
  Type: tsx_async_abort status: Not affected
Graphics:
  Device-1: NVIDIA TU106 [GeForce RTX 2070] vendor: Micro-Star MSI
    driver: nvidia v: 550.120 alternate: nouveau,nvidia_drm non-free: 550.xx+
    status: current (as of 2024-09; EOL~2026-12-xx) arch: Turing code: TUxxx
    process: TSMC 12nm FF built: 2018-2022 pcie: gen: 3 speed: 8 GT/s lanes: 16
    ports: active: none off: DP-1,DP-2 empty: DP-3,HDMI-A-1 bus-ID: 09:00.0
    chip-ID: 10de:1f02 class-ID: 0300
  Device-2: 2M UVC CAMERA NexiGo N60 FHD Webcam
    driver: snd-usb-audio,uvcvideo type: USB rev: 2.0 speed: 480 Mb/s lanes: 1
    mode: 2.0 bus-ID: 7-3.4:5 chip-ID: 1d6c:0103 class-ID: 0102
    serial: <filter>
  Display: x11 server: X.Org v: 21.1.14 with: Xwayland v: 24.1.4
    compositor: kwin_x11 driver: X: loaded: N/A failed: nvidia
    gpu: nvidia,nvidia-nvswitch display-ID: :0 screens: 1
  Screen-1: 0 s-res: 3840x1080 s-dpi: 101 s-size: 966x272mm (38.03x10.71")
    s-diag: 1004mm (39.51")
  Monitor-1: DP-0 pos: left res: 1920x1080 hz: 60 dpi: 102
    size: 476x267mm (18.74x10.51") diag: 546mm (21.49") modes: N/A
  Monitor-2: DP-2 pos: primary,right res: 1920x1080 hz: 60 dpi: 102
    size: 476x267mm (18.74x10.51") diag: 546mm (21.49") modes: N/A
  API: EGL v: 1.5 hw: drv: nvidia platforms: device: 0 drv: nvidia device: 2
    drv: swrast surfaceless: drv: nvidia x11: drv: nvidia
    inactive: gbm,wayland,device-1
  API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: nvidia mesa v: 550.120
    glx-v: 1.4 direct-render: yes renderer: NVIDIA GeForce RTX 2070/PCIe/SSE2
    memory: 7.81 GiB
  API: Vulkan v: 1.3.295 layers: 5 device: 0 type: discrete-gpu
    name: NVIDIA GeForce RTX 2070 driver: nvidia v: 550.120 device-ID: 10de:1f02
    surfaces: xcb,xlib
Audio:
  Device-1: NVIDIA TU106 High Definition Audio vendor: Micro-Star MSI
    driver: snd_hda_intel v: kernel pcie: gen: 3 speed: 8 GT/s lanes: 16
    bus-ID: 09:00.1 chip-ID: 10de:10f9 class-ID: 0403
  Device-2: Advanced Micro Devices [AMD] Starship/Matisse HD Audio
    vendor: Gigabyte driver: snd_hda_intel v: kernel pcie: gen: 4 speed: 16 GT/s
    lanes: 16 bus-ID: 0b:00.4 chip-ID: 1022:1487 class-ID: 0403
  Device-3: Razer USA USB Sound Card driver: hid-generic,snd-usb-audio,usbhid
    type: USB rev: 2.0 speed: 12 Mb/s lanes: 1 mode: 1.1 bus-ID: 1-1:2
    chip-ID: 1532:0529 class-ID: 0300 serial: <filter>
  Device-4: 2M UVC CAMERA NexiGo N60 FHD Webcam
    driver: snd-usb-audio,uvcvideo type: USB rev: 2.0 speed: 480 Mb/s lanes: 1
    mode: 2.0 bus-ID: 7-3.4:5 chip-ID: 1d6c:0103 class-ID: 0102
    serial: <filter>
  API: ALSA v: k6.10.13-3-MANJARO status: kernel-api with: aoss
    type: oss-emulator tools: alsactl,alsamixer,amixer
  Server-1: JACK v: 1.9.22 status: off tools: N/A
  Server-2: PipeWire v: 1.2.5 status: active with: 1: pipewire-pulse
    status: active 2: wireplumber status: active 3: pipewire-alsa type: plugin
    tools: pactl,pw-cat,pw-cli,wpctl
Network:
  Device-1: Realtek RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet
    vendor: Gigabyte driver: r8169 v: kernel pcie: gen: 1 speed: 2.5 GT/s
    lanes: 1 port: e000 bus-ID: 04:00.0 chip-ID: 10ec:8168 class-ID: 0200
  IF: enp4s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
  IP v4: <filter> type: dynamic noprefixroute scope: global
    broadcast: <filter>
  IP v6: <filter> type: dynamic noprefixroute scope: global
  IP v6: <filter> type: noprefixroute scope: link
  Device-2: Realtek RTL8812AE 802.11ac PCIe Wireless Network Adapter
    driver: rtl8821ae v: kernel pcie: gen: 1 speed: 2.5 GT/s lanes: 1 port: d000
    bus-ID: 05:00.0 chip-ID: 10ec:8812 class-ID: 0280
  IF: wlp5s0 state: down mac: <filter>
  Device-3: Realtek 8812AU/8821AU 802.11ac WLAN Adapter [USB Wireless
    Dual-Band 2.4/5Ghz] driver: N/A type: USB rev: 2.1 speed: 480 Mb/s lanes: 1
    mode: 2.0 bus-ID: 1-3:3 chip-ID: 0bda:0811 class-ID: 0000 serial: <filter>
  Info: services: NetworkManager, systemd-timesyncd, wpa_supplicant
  WAN IP: <filter>
Bluetooth:
  Device-1: Realtek Bluetooth Radio driver: btusb v: 0.8 type: USB rev: 1.1
    speed: 12 Mb/s lanes: 1 mode: 1.1 bus-ID: 7-2:2 chip-ID: 6655:8771
    class-ID: e001 serial: <filter>
  Report: rfkill ID: hci0 rfk-id: 0 state: up address: see --recommends
Logical:
  Message: No logical block device data found.
RAID:
  Message: No RAID data found.
Drives:
  Local Storage: total: 5.68 TiB used: 16.9 GiB (0.3%)
  SMART Message: Required tool smartctl not installed. Check --recommends
  ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Samsung model: SSD 980 PRO 2TB
    size: 1.82 TiB block-size: physical: 512 B logical: 512 B speed: 63.2 Gb/s
    lanes: 4 tech: SSD serial: <filter> fw-rev: 5B2QGXA7 temp: 35.9 C
    scheme: GPT
  ID-2: /dev/sda maj-min: 8:0 vendor: A-Data model: SU650 size: 223.57 GiB
    block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s tech: SSD
    serial: <filter> fw-rev: 1C0 scheme: MBR
  ID-3: /dev/sdb maj-min: 8:16 vendor: Toshiba model: DT01ACA200
    size: 1.82 TiB block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s
    tech: HDD rpm: 7200 serial: <filter> fw-rev: ABB0 scheme: GPT
  ID-4: /dev/sdc maj-min: 8:32 vendor: Seagate model: ST2000DM008-2FR102
    size: 1.82 TiB block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s
    tech: HDD rpm: 7200 serial: <filter> fw-rev: 0001 scheme: GPT
  Message: No optical or floppy data found.
Partition:
  ID-1: / raw-size: 1 TiB size: 1007.26 GiB (98.33%) used: 10.1 GiB (1.0%)
    fs: ext4 dev: /dev/nvme0n1p3 maj-min: 259:3 label: root
    uuid: 290b2c84-98af-42be-9374-0d2775ac678c
  ID-2: /boot/efi raw-size: 1024 MiB size: 1022 MiB (99.80%)
    used: 4 KiB (0.0%) fs: vfat dev: /dev/nvme0n1p1 maj-min: 259:1 label: BOOT
    uuid: C0C8-76F2
  ID-3: /home raw-size: 833.59 GiB size: 819.43 GiB (98.30%)
    used: 6.79 GiB (0.8%) fs: ext4 dev: /dev/nvme0n1p4 maj-min: 259:4 label: home
    uuid: e5eeab9a-05bf-4eaa-b453-7fee605c23be
Swap:
  Kernel: swappiness: 60 (default) cache-pressure: 100 (default) zswap: yes
    compressor: zstd max-pool: 20%
  ID-1: swap-1 type: partition size: 4 GiB used: 0 KiB (0.0%) priority: -2
    dev: /dev/nvme0n1p2 maj-min: 259:2 label: swap
    uuid: e0ddd7a7-45be-45a4-90e1-4bb5fd987b56
Unmounted:
  ID-1: /dev/nvme0n1p5 maj-min: 259:5 size: 8.6 MiB fs: <superuser required>
    label: N/A uuid: N/A
  ID-2: /dev/sda1 maj-min: 8:1 size: 50 MiB fs: ntfs label: System Reserved
    uuid: 66B0C5ADB0C583D1
  ID-3: /dev/sda2 maj-min: 8:2 size: 222.98 GiB fs: ntfs label: N/A
    uuid: CA10E61110E60473
  ID-4: /dev/sda3 maj-min: 8:3 size: 546 MiB fs: ntfs label: N/A
    uuid: 284CAB1B4CAAE2B4
  ID-5: /dev/sdb1 maj-min: 8:17 size: 1.82 TiB fs: vfat label: N/A
    uuid: 7112-3EBB
  ID-6: /dev/sdc1 maj-min: 8:33 size: 529 MiB fs: ntfs label: Recovery
    uuid: 586EA6D96EA6AEE6
  ID-7: /dev/sdc2 maj-min: 8:34 size: 99 MiB fs: vfat label: N/A
    uuid: FEA7-B07E
  ID-8: /dev/sdc3 maj-min: 8:35 size: 16 MiB fs: <superuser required>
    label: N/A uuid: N/A
  ID-9: /dev/sdc4 maj-min: 8:36 size: 1.82 TiB fs: ntfs label: N/A
    uuid: C01AE34C1AE33E52
USB:
  Hub-1: 1-0:1 info: hi-speed hub with single TT ports: 6 rev: 2.0
    speed: 480 Mb/s (57.2 MiB/s) lanes: 1 mode: 2.0 chip-ID: 1d6b:0002
    class-ID: 0900
  Device-1: 1-1:2 info: Razer USA USB Sound Card type: audio,HID
    driver: hid-generic,snd-usb-audio,usbhid interfaces: 4 rev: 2.0
    speed: 12 Mb/s (1.4 MiB/s) lanes: 1 mode: 1.1 power: 100mA
    chip-ID: 1532:0529 class-ID: 0300 serial: <filter>
  Device-2: 1-3:3 info: Realtek 8812AU/8821AU 802.11ac WLAN Adapter [USB
    Wireless Dual-Band 2.4/5Ghz] type: WiFi driver: N/A interfaces: 1 rev: 2.1
    speed: 480 Mb/s (57.2 MiB/s) lanes: 1 mode: 2.0 power: 500mA
    chip-ID: 0bda:0811 class-ID: 0000 serial: <filter>
  Hub-2: 2-0:1 info: super-speed hub ports: 4 rev: 3.1
    speed: 10 Gb/s (1.16 GiB/s) lanes: 1 mode: 3.2 gen-2x1 chip-ID: 1d6b:0003
    class-ID: 0900
  Hub-3: 3-0:1 info: hi-speed hub with single TT ports: 6 rev: 2.0
    speed: 480 Mb/s (57.2 MiB/s) lanes: 1 mode: 2.0 chip-ID: 1d6b:0002
    class-ID: 0900
  Hub-4: 4-0:1 info: super-speed hub ports: 4 rev: 3.1
    speed: 10 Gb/s (1.16 GiB/s) lanes: 1 mode: 3.2 gen-2x1 chip-ID: 1d6b:0003
    class-ID: 0900
  Hub-5: 5-0:1 info: hi-speed hub with single TT ports: 2 rev: 2.0
    speed: 480 Mb/s (57.2 MiB/s) lanes: 1 mode: 2.0 chip-ID: 1d6b:0002
    class-ID: 0900
  Hub-6: 6-0:1 info: super-speed hub ports: 4 rev: 3.1
    speed: 10 Gb/s (1.16 GiB/s) lanes: 1 mode: 3.2 gen-2x1 chip-ID: 1d6b:0003
    class-ID: 0900
  Hub-7: 7-0:1 info: hi-speed hub with single TT ports: 4 rev: 2.0
    speed: 480 Mb/s (57.2 MiB/s) lanes: 1 mode: 2.0 chip-ID: 1d6b:0002
    class-ID: 0900
  Device-1: 7-2:2 info: Realtek Bluetooth Radio type: bluetooth driver: btusb
    interfaces: 2 rev: 1.1 speed: 12 Mb/s (1.4 MiB/s) lanes: 1 mode: 1.1
    power: 500mA chip-ID: 6655:8771 class-ID: e001 serial: <filter>
  Hub-8: 7-3:3 info: Genesys Logic Hub ports: 4 rev: 2.0
    speed: 480 Mb/s (57.2 MiB/s) lanes: 1 mode: 2.0 power: 100mA
    chip-ID: 05e3:0608 class-ID: 0900
  Hub-9: 7-3.1:4 info: ASIX AX68002 KVM Switch SoC ports: 7 rev: 1.0
    speed: 12 Mb/s (1.4 MiB/s) lanes: 1 mode: 1.1 power: 100mA chip-ID: 0b95:6802
    class-ID: 0900
  Device-1: 7-3.1.1:6 info: Razer USA Naga Hex type: mouse,keyboard
    driver: hid-generic,usbhid interfaces: 2 rev: 2.0 speed: 12 Mb/s (1.4 MiB/s)
    lanes: 1 mode: 1.1 power: 100mA chip-ID: 1532:0041 class-ID: 0301
  Device-2: 7-3.1.2:7 info: Corsair K95 RGB Platinum Keyboard [RGP0056]
    type: keyboard,HID driver: hid-generic,usbhid interfaces: 2 rev: 2.0
    speed: 12 Mb/s (1.4 MiB/s) lanes: 1 mode: 1.1 power: 100mA
    chip-ID: 1b1c:1b2d class-ID: 0300 serial: <filter>
  Device-3: 7-3.4:5 info: 2M UVC CAMERA NexiGo N60 FHD Webcam
    type: video,audio driver: snd-usb-audio,uvcvideo interfaces: 4 rev: 2.0
    speed: 480 Mb/s (57.2 MiB/s) lanes: 1 mode: 2.0 power: 500mA
    chip-ID: 1d6c:0103 class-ID: 0102 serial: <filter>
  Hub-10: 8-0:1 info: super-speed hub ports: 4 rev: 3.1
    speed: 10 Gb/s (1.16 GiB/s) lanes: 1 mode: 3.2 gen-2x1 chip-ID: 1d6b:0003
    class-ID: 0900
Sensors:
  System Temperatures: cpu: 41.9 C mobo: 29.0 C gpu: nvidia temp: 36 C
  Fan Speeds (rpm): N/A gpu: nvidia fan: 31%
Info:
  Processes: 329 Power: uptime: 2m states: freeze,mem,disk suspend: deep
    avail: s2idle wakeups: 0 hibernate: platform avail: shutdown, reboot,
    suspend, test_resume image: 6.19 GiB services: org_kde_powerdevil,
    power-profiles-daemon, upowerd Init: systemd v: 256 default: graphical
    tool: systemctl
  Packages: pm: pacman pkgs: 1225 libs: 355 tools: pamac,yay pm: flatpak
    pkgs: 0 Compilers: N/A Shell: Bash v: 5.2.37 running-in: konsole inxi: 3.3.36

journalctl errors:

Nov 29 22:54:45 linux kernel: [Hardware Error]: System Fatal error.
Nov 29 22:54:45 linux kernel: [Hardware Error]: CPU:4 (17:71:0) MC5_STATUS[-|UE|MiscV|AddrV|PCC|TCC|SyndV|-|-|-]: 0xbea0000000000108
Nov 29 22:54:45 linux kernel: [Hardware Error]: Error Addr: 0x0001ffffc10dd07c
Nov 29 22:54:45 linux kernel: [Hardware Error]: IPID: 0x000500b000000000, Syndrome: 0x000000004d000000
Nov 29 22:54:45 linux kernel: [Hardware Error]: Execution Unit Ext. Error Code: 0
Nov 29 22:54:45 linux kernel: [Hardware Error]: cache level: RESV, tx: GEN, mem-tx: GEN
Nov 29 22:54:46 linux kernel: 
Nov 29 22:54:46 linux kernel: nvidia-gpu 0000:09:00.3: i2c timeout error e0000000
Nov 29 22:54:46 linux kernel: ucsi_ccg 0-0008: i2c_transfer failed -110
Nov 29 22:54:46 linux kernel: ucsi_ccg 0-0008: ucsi_ccg_init failed - -110
Nov 29 22:54:46 linux kernel: ucsi_ccg 0-0008: probe with driver ucsi_ccg failed with error -110
Nov 29 22:55:11 linux kernel: [drm:drm_new_set_master] *ERROR* [nvidia-drm] [GPU ID 0x00000900] Failed to grab modeset ownership
Nov 29 23:12:52 linux kernel: [drm:drm_new_set_master] *ERROR* [nvidia-drm] [GPU ID 0x00000900] Failed to grab modeset ownership
Nov 29 23:12:52 linux kernel: [drm:drm_new_set_master] *ERROR* [nvidia-drm] [GPU ID 0x00000900] Failed to grab modeset ownership
Nov 29 23:12:52 linux kernel: [drm:drm_new_set_master] *ERROR* [nvidia-drm] [GPU ID 0x00000900] Failed to grab modeset ownership
Nov 29 23:12:52 linux kernel: [drm:drm_new_set_master] *ERROR* [nvidia-drm] [GPU ID 0x00000900] Failed to grab modeset ownership

Update your kernel.

Kernel 6.10 has been removed from the repos. Kernel 6.12 is the new LTS kernel; you might try that, otherwise kernel 6.11.

Take care of this first, reboot and see what the status quo is then.
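
To check which kernel series is actually running before and after the change, something like the following may help. It is only a sketch that assumes Manjaro's linuxXY package naming (for example linux612 for the 6.12 series); kernel_pkg is a hypothetical helper name.

```shell
# Map a kernel release string (as printed by `uname -r`) to the Manjaro
# kernel package name for that series.
# Assumption: packages are named "linux" + major + minor, e.g. linux612.
kernel_pkg() (
    release=$1
    major=${release%%.*}          # text before the first dot
    rest=${release#*.}            # text after the first dot
    minor=${rest%%[!0-9]*}        # leading digits of the remainder
    printf 'linux%s%s\n' "$major" "$minor"
)

# Package name for the currently running kernel.
kernel_pkg "$(uname -r)"
```

The result can be compared against the output of mhwd-kernel -li, which lists the installed kernels.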

Regards.

Wow, I didn’t even realize the installer didn’t use the latest kernel.

I just used pacman to update all packages, updated the kernel using sudo mhwd-kernel -i linux612, and rebooted. At first it seemed to fix the issue, but then it crashed again, and now it’s crashing during boot.

Wrong order. You should have first installed a fresher kernel version. Then reboot - and THEN update your system.

Please chroot into your Manjaro installation from a live ISO boot, then install the kernel inside the chroot environment:

pacman -Syu linux612
grub-mkconfig -o /boot/grub/grub.cfg
mkinitcpio -P
exit
exit

and reboot.

Maybe something went wrong hardware-wise when you installed the drive.
Check the connections, remove the drive, and see if it still happens.
Maybe your CPU is undervolted; check these posts:

and:
https://bbs.archlinux.org/viewtopic.php?id=293963

Strange, when I first tried to do it in the other order I got a warning message from mhwd-kernel. I used the live USB and can boot into the system again.

I don’t think it’s an issue with the drive; it’s an M.2 NVMe drive, so it was pretty clearly plugged in all the way.

I’m not familiar with overclocking, but I did find a setting that seemed to match the acronym PBO from those answers. I couldn’t figure out what PBO+4 means, but I changed the setting from Disabled to Auto and will see if that works.

Update: it’s still crashing.

My motherboard doesn’t seem to support the PBO Curve Optimizer, so I tried going into the advanced voltage settings and setting the CPU Vcore to 1.4V. It was noticeably more stable, but during web browsing I did see several CPU cores spike to 100% before the system crashed again.

New output from journalctl --catalog --priority=3 --boot=-1

Dec 01 13:04:30 linux kernel: [Hardware Error]: System Fatal error.
Dec 01 13:04:30 linux kernel: [Hardware Error]: CPU:5 (17:71:0) MC5_STATUS[-|UE|MiscV|AddrV|PCC|TCC|SyndV|-|-|-]: 0xbea0000000000108
Dec 01 13:04:30 linux kernel: [Hardware Error]: Error Addr: 0x0001ffffb04b9e22
Dec 01 13:04:30 linux kernel: [Hardware Error]: IPID: 0x000500b000000000, Syndrome: 0x000000004d000000
Dec 01 13:04:30 linux kernel: [Hardware Error]: Execution Unit Ext. Error Code: 0
Dec 01 13:04:30 linux kernel: [Hardware Error]: cache level: RESV, tx: GEN, mem-tx: GEN
Dec 01 13:04:30 linux kernel: EXT4-fs error (device nvme0n1p3): ext4_orphan_get:1414: comm mount: bad orphan inode 12738164
Dec 01 13:04:30 linux kernel: ext4_test_bit(bit=7795, block=50855954) = 0
Dec 01 13:04:31 linux kernel: 
Dec 01 13:04:31 linux kernel: nvidia-gpu 0000:09:00.3: i2c timeout error e0000000
Dec 01 13:04:31 linux kernel: ucsi_ccg 0-0008: i2c_transfer failed -110
Dec 01 13:04:31 linux kernel: ucsi_ccg 0-0008: ucsi_ccg_init failed - -110
Dec 01 13:04:31 linux kernel: ucsi_ccg 0-0008: probe with driver ucsi_ccg failed with error -110
Dec 01 13:06:10 linux kernel: [drm:drm_new_set_master] *ERROR* [nvidia-drm] [GPU ID 0x00000900] Failed to grab modeset ownership

And the new output from dmesg --level=err+

[    2.957846] [Hardware Error]: System Fatal error.
[    2.957853] [Hardware Error]: CPU:3 (17:71:0) MC5_STATUS[-|UE|MiscV|AddrV|PCC|TCC|SyndV|-|-|-]: 0xbea0000000000108
[    2.957866] [Hardware Error]: Error Addr: 0x0001ffffc0af301c
[    2.957870] [Hardware Error]: IPID: 0x000500b000000000, Syndrome: 0x000000004d000000
[    2.957878] [Hardware Error]: Execution Unit Ext. Error Code: 0
[    2.957879] [Hardware Error]: cache level: RESV, tx: GEN, mem-tx: GEN
[    7.098624] EXT4-fs error (device nvme0n1p4): ext4_orphan_get:1414: comm mount: bad orphan inode 17957144
[    7.098631] ext4_test_bit(bit=279, block=71827472) = 0

[    7.476931] nvidia-gpu 0000:09:00.3: i2c timeout error e0000000
[    7.476936] ucsi_ccg 0-0008: i2c_transfer failed -110
[    7.476939] ucsi_ccg 0-0008: ucsi_ccg_init failed - -110
[    7.476943] ucsi_ccg 0-0008: probe with driver ucsi_ccg failed with error -110
[   21.846139] [drm:drm_new_set_master] *ERROR* [nvidia-drm] [GPU ID 0x00000900] Failed to grab modeset ownership
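
Every report in this thread shows the same bank (MC5) and the same status value, but on different CPUs (3, 4 and 5). A small sketch like the following could tally that from a saved or piped log. Here mce_summary is a hypothetical helper name, and the sed pattern assumes the exact line format shown in the logs above.

```shell
# Read kernel log text on stdin and print one "cpu=N bank=N status=0x..."
# line for every [Hardware Error] MCx_STATUS line found.
# Assumption: lines look like the MC5_STATUS lines quoted in this thread.
mce_summary() {
    sed -n 's/.*\[Hardware Error\]: CPU:\([0-9][0-9]*\) .* MC\([0-9][0-9]*\)_STATUS\[[^]]*]: \(0x[0-9a-fA-F]*\).*/cpu=\1 bank=\2 status=\3/p'
}

# Example with one line copied from the log above.
printf '%s\n' \
    '[    2.957853] [Hardware Error]: CPU:3 (17:71:0) MC5_STATUS[-|UE|MiscV|AddrV|PCC|TCC|SyndV|-|-|-]: 0xbea0000000000108' \
    | mce_summary
```

On a live system the input could come from journalctl -k -b -1 (kernel messages from the previous boot), optionally followed by sort and uniq -c to count repeats.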

While this is not helpful to anyone else who comes across this in the future, I seem to have fixed the issue by taking advantage of Black Friday deals to run down to Micro Center and pick up a mobo/CPU/RAM bundle. It seems like @brahma was correct and it was an issue with the CPU voltage, but my old mobo was too old to do the recommended fix.

Marking this as solved to close the thread.
