Random Reboot - Unknown Cause?

System:
  Kernel: 5.10.34-1-MANJARO x86_64 bits: 64 compiler: gcc v: 10.2.0 
  parameters: BOOT_IMAGE=/boot/vmlinuz-5.10-x86_64 
  root=UUID=ac1bab6b-e3af-4dd2-b618-7da545d55fde rw quiet amd_iommu=on 
  apparmor=1 security=apparmor 
  resume=UUID=770793eb-7acc-45f4-a2d3-507b493689dd udev.log_priority=3 
  Desktop: Xfce 4.16.0 tk: Gtk 3.24.24 info: xfce4-panel, plank wm: xfwm4 
  vt: 7 dm: LightDM 1.30.0 Distro: Manjaro Linux base: Arch Linux 
Machine:
  Type: Desktop System: Micro-Star product: MS-7C34 v: 1.0 serial: <filter> 
  Mobo: Micro-Star model: MEG X570 GODLIKE (MS-7C34) v: 1.0 serial: <filter> 
  UEFI: American Megatrends LLC. v: 1.D2 date: 03/23/2021 
Memory:
  RAM: total: 31.34 GiB used: 4.28 GiB (13.7%) 
  RAM Report: permissions: Unable to run dmidecode. Root privileges required. 
CPU:
  Info: 16-Core (2-Die) model: AMD Ryzen 9 3950X bits: 64 type: MT MCP MCM 
  arch: Zen 2 family: 17 (23) model-id: 71 (113) stepping: 0 
  microcode: 8701021 cache: L2: 8 MiB bogomips: 224000 
  Speed: 2199 MHz min/max: 2200/3500 MHz boost: enabled Core speeds (MHz): 
  1: 2199 2: 2200 3: 3630 4: 2100 5: 2200 6: 2201 7: 2198 8: 2199 9: 2200 
  10: 2200 11: 2199 12: 2199 13: 2200 14: 2199 15: 2200 16: 2199 17: 2200 
  18: 2140 19: 4159 20: 3716 21: 2059 22: 2200 23: 2200 24: 2200 25: 2199 
  26: 2200 27: 2199 28: 2200 29: 2200 30: 2200 31: 2109 32: 3128 
  Flags: 3dnowprefetch abm adx aes aperfmperf apic arat avic avx avx2 bmi1 
  bmi2 bpext cat_l3 cdp_l3 clflush clflushopt clwb clzero cmov cmp_legacy 
  constant_tsc cpb cpuid cqm cqm_llc cqm_mbm_local cqm_mbm_total cqm_occup_llc 
  cr8_legacy cx16 cx8 de decodeassists extapic extd_apicid f16c flushbyasid 
  fma fpu fsgsbase fxsr fxsr_opt ht hw_pstate ibpb ibs irperf lahf_lm lbrv lm 
  mba mca mce misalignsse mmx mmxext monitor movbe msr mtrr mwaitx nonstop_tsc 
  nopl npt nrip_save nx osvw overflow_recov pae pat pausefilter pclmulqdq 
  pdpe1gb perfctr_core perfctr_llc perfctr_nb pfthreshold pge pni popcnt pse 
  pse36 rdpid rdpru rdrand rdseed rdt_a rdtscp rep_good sep sev sev_es sha_ni 
  skinit smap smca sme smep ssbd sse sse2 sse4_1 sse4_2 sse4a ssse3 stibp 
  succor svm svm_lock syscall tce topoext tsc tsc_scale umip v_vmsave_vmload 
  vgif vmcb_clean vme vmmcall wbnoinvd wdt xgetbv1 xsave xsavec xsaveerptr 
  xsaveopt xsaves 
  Vulnerabilities: Type: itlb_multihit status: Not affected 
  Type: l1tf status: Not affected 
  Type: mds status: Not affected 
  Type: meltdown status: Not affected 
  Type: spec_store_bypass 
  mitigation: Speculative Store Bypass disabled via prctl and seccomp 
  Type: spectre_v1 
  mitigation: usercopy/swapgs barriers and __user pointer sanitization 
  Type: spectre_v2 mitigation: Full AMD retpoline, IBPB: conditional, STIBP: 
  conditional, RSB filling 
  Type: srbds status: Not affected 
  Type: tsx_async_abort status: Not affected 
Graphics:
  Device-1: NVIDIA TU102 [GeForce RTX 2080 Ti Rev. A] vendor: eVga.com. 
  driver: nvidia v: 465.27 alternate: nouveau,nvidia_drm bus-ID: 2f:00.0 
  chip-ID: 10de:1e07 class-ID: 0300 
  Device-2: NVIDIA TU102 [GeForce RTX 2080 Ti Rev. A] vendor: eVga.com. 
  driver: vfio-pci v: 0.2 alternate: nouveau,nvidia_drm,nvidia bus-ID: 30:00.0 
  chip-ID: 10de:1e07 class-ID: 0300 
  Display: x11 server: X.Org 1.20.11 driver: loaded: modesetting,nvidia 
  display-ID: :0.0 screens: 1 
  Screen-1: 0 s-res: 7680x1440 s-dpi: 92 s-size: 2120x393mm (83.5x15.5") 
  s-diag: 2156mm (84.9") 
  Monitor-1: DP-0 res: 2560x1440 hz: 60 dpi: 93 size: 698x393mm (27.5x15.5") 
  diag: 801mm (31.5") 
  Monitor-2: DP-2 res: 2560x1440 hz: 60 dpi: 93 size: 698x393mm (27.5x15.5") 
  diag: 801mm (31.5") 
  Monitor-3: DP-4 res: 2560x1440 hz: 60 dpi: 93 size: 698x393mm (27.5x15.5") 
  diag: 801mm (31.5") 
  OpenGL: renderer: NVIDIA GeForce RTX 2080 Ti/PCIe/SSE2 
  v: 4.6.0 NVIDIA 465.27 direct render: Yes 
Audio:
  Device-1: NVIDIA TU102 High Definition Audio vendor: eVga.com. 
  driver: snd_hda_intel v: kernel bus-ID: 2f:00.1 chip-ID: 10de:10f7 
  class-ID: 0403 
  Device-2: NVIDIA TU102 High Definition Audio vendor: eVga.com. 
  driver: vfio-pci v: 0.2 alternate: snd_hda_intel bus-ID: 30:00.1 
  chip-ID: 10de:10f7 class-ID: 0403 
  Device-3: AMD Starship/Matisse HD Audio vendor: Micro-Star MSI 
  driver: snd_hda_intel v: kernel bus-ID: 32:00.4 chip-ID: 1022:1487 
  class-ID: 0403 
  Device-4: SteelSeries ApS SteelSeries Arctis 7 type: USB 
  driver: hid-generic,snd-usb-audio,usbhid bus-ID: 3-2:3 chip-ID: 1038:12ad 
  class-ID: 0300 
  Sound Server-1: ALSA v: k5.10.34-1-MANJARO running: yes 
  Sound Server-2: JACK v: 0.125.0 running: no 
  Sound Server-3: PulseAudio v: 14.2 running: yes 
  Sound Server-4: PipeWire v: 0.3.26 running: no 
Network:
  Device-1: Realtek vendor: Micro-Star MSI driver: r8169 v: kernel port: c000 
  bus-ID: 29:00.0 chip-ID: 10ec:2600 class-ID: 0200 
  IF: enp41s0 state: down mac: <filter> 
  Device-2: Realtek vendor: Micro-Star MSI driver: r8169 v: kernel port: b000 
  bus-ID: 2a:00.0 chip-ID: 10ec:3000 class-ID: 0200 
  IF: enp42s0 state: down mac: <filter> 
  Device-3: Intel Wi-Fi 6 AX200 vendor: Rivet Networks driver: iwlwifi 
  v: kernel port: b000 bus-ID: 2b:00.0 chip-ID: 8086:2723 class-ID: 0280 
  IF: wlo1 state: up mac: <filter> 
  IP v4: <filter> type: dynamic noprefixroute scope: global 
  broadcast: <filter> 
  IP v6: <filter> type: noprefixroute scope: link 
  IF-ID-1: docker0 state: up speed: 10000 Mbps duplex: unknown mac: <filter> 
  IP v4: <filter> scope: global broadcast: <filter> 
  IP v6: <filter> scope: link 
  IF-ID-2: outline-tun0 state: down mac: N/A 
  IP v4: <filter> scope: global 
  IF-ID-3: veth0ec5d77 state: up speed: 10000 Mbps duplex: full mac: <filter> 
  IF-ID-4: virbr0 state: down mac: <filter> 
  IP v4: <filter> scope: global broadcast: <filter> 
  WAN IP: <filter> 
Bluetooth:
  Device-1: Intel AX200 Bluetooth type: USB driver: btusb v: 0.8 bus-ID: 3-4:4 
  chip-ID: 8087:0029 class-ID: e001 
  Report: rfkill ID: hci0 rfk-id: 1 state: down bt-service: enabled,running 
  rfk-block: hardware: no software: yes address: see --recommends 
Logical:
  Message: No logical block device data found. 
RAID:
  Message: No RAID data found. 
Drives:
  Local Storage: total: 1.82 TiB used: 241.54 GiB (13.0%) 
  SMART Message: Required tool smartctl not installed. Check --recommends 
  ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Seagate 
  model: FireCuda 520 SSD ZP2000GM30002 size: 1.82 TiB block-size: 
  physical: 512 B logical: 512 B speed: 63.2 Gb/s lanes: 4 rotation: SSD 
  serial: <filter> rev: STNSC014 temp: 35.9 C scheme: GPT 
  Message: No optical or floppy data found. 
Partition:
  ID-1: / raw-size: 1.81 TiB size: 1.78 TiB (98.37%) used: 241.54 GiB (13.2%) 
  fs: ext4 dev: /dev/nvme0n1p2 maj-min: 259:2 label: N/A 
  uuid: ac1bab6b-e3af-4dd2-b618-7da545d55fde 
  ID-2: /boot/efi raw-size: 300 MiB size: 299.4 MiB (99.80%) 
  used: 296 KiB (0.1%) fs: vfat dev: /dev/nvme0n1p1 maj-min: 259:1 
  label: NO_LABEL uuid: 0C0A-17FB 
Swap:
  Kernel: swappiness: 60 (default) cache-pressure: 100 (default) 
  ID-1: swap-1 type: partition size: 8.8 GiB used: 0 KiB (0.0%) priority: -2 
  dev: /dev/nvme0n1p3 maj-min: 259:3 label: N/A 
  uuid: 770793eb-7acc-45f4-a2d3-507b493689dd 
Unmounted:
  Message: No unmounted partitions found. 
USB:
  Hub-1: 1-0:1 info: Full speed (or root) Hub ports: 2 rev: 2.0 
  speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900 
  Device-1: 1-1:2 info: MCT Elektronikladen QUADRO type: HID 
  driver: hid-generic,usbhid interfaces: 2 rev: 2.1 speed: 12 Mb/s power: 2mA 
  chip-ID: 0c70:f00d class-ID: 0300 serial: <filter> 
  Device-2: 1-2:3 info: MCT Elektronikladen QUADRO type: HID 
  driver: hid-generic,usbhid interfaces: 2 rev: 2.1 speed: 12 Mb/s power: 2mA 
  chip-ID: 0c70:f00d class-ID: 0300 serial: <filter> 
  Hub-2: 2-0:1 info: Full speed (or root) Hub ports: 2 rev: 3.0 speed: 5 Gb/s 
  chip-ID: 1d6b:0003 class-ID: 0900 
  Hub-3: 3-0:1 info: Full speed (or root) Hub ports: 6 rev: 2.0 
  speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900 
  Device-1: 3-1:2 info: [Maxxter] Optical Gaming Mouse [Xtrem] 
  type: Mouse,Keyboard driver: hid-generic,usbhid interfaces: 2 rev: 1.1 
  speed: 1.5 Mb/s power: 100mA chip-ID: 18f8:0f97 class-ID: 0301 
  Device-2: 3-2:3 info: SteelSeries ApS SteelSeries Arctis 7 type: Audio,HID 
  driver: hid-generic,snd-usb-audio,usbhid interfaces: 6 rev: 1.1 
  speed: 12 Mb/s power: 100mA chip-ID: 1038:12ad class-ID: 0300 
  Device-3: 3-4:4 info: Intel AX200 Bluetooth type: Bluetooth driver: btusb 
  interfaces: 2 rev: 2.0 speed: 12 Mb/s power: 100mA chip-ID: 8087:0029 
  class-ID: e001 
  Device-4: 3-5:5 info: MCT Elektronikladen aquaero type: Keyboard,HID 
  driver: hid-generic,usbhid interfaces: 3 rev: 2.0 speed: 12 Mb/s 
  power: 100mA chip-ID: 0c70:f001 class-ID: 0300 serial: <filter> 
  Hub-4: 4-0:1 info: Full speed (or root) Hub ports: 4 rev: 3.1 speed: 10 Gb/s 
  chip-ID: 1d6b:0003 class-ID: 0900 
  Hub-5: 5-0:1 info: Full speed (or root) Hub ports: 6 rev: 2.0 
  speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900 
  Device-1: 5-5:2 info: Micro Star MYSTIC LIGHT type: HID 
  driver: hid-generic,usbhid interfaces: 1 rev: 1.1 speed: 12 Mb/s 
  power: 500mA chip-ID: 1462:7c34 class-ID: 0300 serial: <filter> 
  Hub-6: 6-0:1 info: Full speed (or root) Hub ports: 4 rev: 3.1 speed: 10 Gb/s 
  chip-ID: 1d6b:0003 class-ID: 0900 
  Hub-7: 7-0:1 info: Full speed (or root) Hub ports: 2 rev: 2.0 
  speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900 
  Hub-8: 8-0:1 info: Full speed (or root) Hub ports: 4 rev: 3.1 speed: 10 Gb/s 
  chip-ID: 1d6b:0003 class-ID: 0900 
  Hub-9: 9-0:1 info: Full speed (or root) Hub ports: 4 rev: 2.0 
  speed: 480 Mb/s chip-ID: 1d6b:0002 class-ID: 0900 
  Device-1: 9-1:2 info: Logitech G413 Gaming Keyboard type: Keyboard,HID 
  driver: hid-generic,usbhid interfaces: 2 rev: 2.0 speed: 12 Mb/s 
  power: 500mA chip-ID: 046d:c33a class-ID: 0300 serial: <filter> 
  Device-2: 9-2:3 info: SteelSeries ApS SteelSeries Arctis 7 Bootloader 
  type: HID driver: hid-generic,usbhid interfaces: 1 rev: 1.1 speed: 12 Mb/s 
  power: 500mA chip-ID: 1038:12ae class-ID: 0300 
  Device-3: 9-4:4 info: Cyber Power System PR1500LCDRT2U UPS type: HID 
  driver: hid-generic,usbhid interfaces: 1 rev: 2.0 speed: 12 Mb/s power: 2mA 
  chip-ID: 0764:0601 class-ID: 0300 serial: <filter> 
  Hub-10: 10-0:1 info: Full speed (or root) Hub ports: 4 rev: 3.1 
  speed: 10 Gb/s chip-ID: 1d6b:0003 class-ID: 0900 
Sensors:
  System Temperatures: cpu: 32.4 C mobo: 34.0 C gpu: nvidia temp: 27 C 
  Fan Speeds (RPM): fan-1: 0 fan-2: 0 fan-3: 0 fan-4: 0 fan-5: 0 fan-6: 0 
  fan-7: 0 gpu: nvidia fan: 0% 
Info:
  Processes: 510 Uptime: 30m wakeups: 1 Init: systemd v: 247 tool: systemctl 
  Compilers: gcc: 10.2.0 Packages: pacman: 1206 lib: 340 flatpak: 0 
  Shell: Bash v: 5.1.4 running-in: xfce4-terminal inxi: 3.3.04 

The system rebooted on its own, and I can’t find any indication of why in the logs: the last journalctl entry is from 30 minutes before the reboot.

  • I can rule out power failure. The system is connected to a battery backup UPS which immediately alarms on power failure.

  • I can rule out temperatures or overheating, as the system is cooled by a custom loop that is monitored by multiple temperature sensors, logged, and tracked with no abnormalities.

  • I can rule out button events, such as pushing the button, as I have those configured to require prompting before action, and I was in the same room as the PC when it power cycled.

This is the first time it has happened, and while I don’t have any idea of what potentially could have caused it, is there a way to figure that out or at least turn on some sort of verbose logging scheme that would give me hints in the future?

Edit: To clarify, I’m looking specifically for answers that tell me whether there are more verbose logging methods, or ways of setting up more informative logs/dumps, that would tell me exactly what happened or, if it happens again, what caused it.
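Before turning on anything more verbose, one starting point is to read the tail of the previous boot's journal, since that is where any last messages before the reboot would be. Below is a minimal Python sketch that builds and runs such a query. The function names are hypothetical, and it assumes the journal is stored persistently so that an earlier boot is still available. The flags used are journalctl's `-b -1` (previous boot), `-k` (kernel messages only), `-p` (priority filter), `-n` (number of lines) and `--no-pager`.

```python
import shutil
import subprocess


def previous_boot_command(priority=None, kernel_only=False, lines=50):
    """Build a journalctl command that shows the end of the previous boot."""
    cmd = ["journalctl", "--no-pager", "-b", "-1", "-n", str(lines)]
    if kernel_only:
        cmd.append("-k")          # restrict output to kernel messages
    if priority is not None:
        cmd += ["-p", priority]   # e.g. "err" or "warning"
    return cmd


def show_previous_boot_tail(**kwargs):
    """Run the query and return its output, or "" if journalctl is missing."""
    if shutil.which("journalctl") is None:
        return ""
    result = subprocess.run(
        previous_boot_command(**kwargs),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

Calling `show_previous_boot_tail(kernel_only=True)` returns the last kernel messages of the prior boot. If the journal really stopped well before the reboot, as described here, the output will simply end early, which at least confirms that nothing was logged at the moment of failure.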

Edit: Did some more digging. While there is a gap of about an hour in my journalctl output, last -x did give some more information.

tellik   tty7         :0               Sat May 15 22:58   still logged in
reboot   system boot  5.10.34-1-MANJAR Sat May 15 22:58   still running
tellik   tty7         :0               Sat May  8 20:01 - crash (7+02:57)
reboot   system boot  5.10.34-1-MANJAR Sat May  8 20:01   still running
shutdown system down  5.10.34-1-MANJAR Sat May  8 20:01 - 20:01  (00:00)
tellik   tty2                          Sat May  8 19:51 - down   (00:09)
tellik   tty7         :0               Sat May  8 19:38 - 19:52  (00:14)
reboot   system boot  5.10.34-1-MANJAR Sat May  8 19:38 - 20:01  (00:23)
shutdown system down  5.10.32-1-MANJAR Sat May  8 19:37 - 19:38  (00:00)
tellik   tty7         :0               Mon May  3 10:49 - 19:37 (5+08:47)

However, I’m not really sure exactly how to make sense of the output. Is the crash (7+02:57) an indication of something crashing and causing a system reboot? Is there a way to discover more information about what caused that if so? Also, dmidecode gives me some additional output:

Handle 0x0001, DMI type 1, 27 bytes
System Information
        Manufacturer: Micro-Star International Co., Ltd.
        Product Name: MS-7C34
        Version: 1.0
        Serial Number: To be filled by O.E.M.
        UUID: Not Present
        Wake-up Type: Power Switch
        SKU Number: To be filled by O.E.M.
        Family: To be filled by O.E.M.
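On the crash (7+02:57) question: as far as I know, `last` prints `crash` when it finds no logout or shutdown record for a session before the next boot record. That would make it a marker of an unclean stop rather than a sign that a particular program crashed. The Python sketch below is one way to pull such entries out of `last -x` output. The function names and the regular expression are my own, and they only assume the line layout shown in the listing above.

```python
import re

# Matches the end of a finished `last -x` entry, e.g. "- crash (7+02:57)"
# or "- 19:52  (00:14)". The optional "N+" prefix is a day count.
END_RE = re.compile(
    r"-\s+(?P<end>crash|down|\d{2}:\d{2})\s+"
    r"\((?:(?P<days>\d+)\+)?(?P<hours>\d{2}):(?P<minutes>\d{2})\)\s*$"
)


def parse_last_line(line):
    """Return (name, end_state, duration_minutes), or None for open entries."""
    match = END_RE.search(line)
    if match is None:
        return None  # "still logged in" / "still running"
    name = line.split()[0]
    end = match.group("end")
    state = end if end in ("crash", "down") else "ended"
    minutes = (
        int(match.group("days") or 0) * 24 * 60
        + int(match.group("hours")) * 60
        + int(match.group("minutes"))
    )
    return name, state, minutes


def crashed_sessions(lines):
    """Keep only the entries that `last` marked as ending in a crash."""
    parsed = (parse_last_line(line) for line in lines)
    return [entry for entry in parsed if entry and entry[1] == "crash"]
```

Fed the listing above, this flags only the tellik tty7 session that began on May 8 at 20:01, with a duration of 10257 minutes (7 days, 2 hours and 57 minutes). That matches the uptime before the unexpected reboot, but says nothing about its cause.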

Was it a sudden reboot, or more like a safe reboot that closed applications and started up again normally? Maybe you have a scheduled task?

Did you witness the reboot yourself? Was it instant, or like a normal reboot sequence? If the latter, journalctl should have written a note about it.

Maybe add the kernel option iommu=pt (pass-through).

  • no task ended abnormally before the reboot
  • maybe look into the USB devices and ports
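For context, iommu=pt is a kernel command-line parameter, like the amd_iommu=on already present in the parameters at the top of the inxi output. Whether it has anything to do with this reboot is not established in this thread. A small Python sketch for checking which IOMMU-related parameters a system was booted with is shown below. The helper names are hypothetical, and it reads /proc/cmdline, which is where Linux exposes the running kernel's command line.

```python
IOMMU_KEYS = ("iommu", "amd_iommu", "intel_iommu")


def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        # Split on the first "=" only, so "root=UUID=..." keeps its full value.
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def iommu_settings(params):
    """Return only the IOMMU-related parameters that are present."""
    return {key: params[key] for key in IOMMU_KEYS if key in params}


def read_running_cmdline(path="/proc/cmdline"):
    """Read the command line of the currently running kernel."""
    with open(path, encoding="utf-8") as handle:
        return handle.read().strip()
```

With the parameters listed in the inxi output, `iommu_settings` reports only amd_iommu=on, so iommu=pt is not currently set on this system.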

Basically, I heard a click, and the RGB on my system flicked off and then back on. Then the computer started back up and logged in as normal. Chrome complained that it was shut down suddenly and asked if I wanted to restore the tabs.

I shouldn’t have any scheduled tasks. This is a relatively new install, it had been running for about 8+ days without this happening, and this is the first time it has happened. The only time I reboot is when an update, like the kernel, asks me to.

I wasn’t at the computer actively using it when it happened, if that’s what you mean. With the way I have everything set up right now, my PC is in the room where I sleep, and I was lying down when it happened. But it was abnormal enough to make me jump up.

I did notice that on booting up, journal did the repair check on the disk like it normally does for unclean shutdown/reboots.

But yeah, like I said there was nothing that I could find in any logging or journalctl that mentioned it, or mentioned any event within the last 30 minutes before it happened.

Can you explain this a little more? Are you suggesting that IOMMU pass through might be a culprit? (Your response seems like it’s missing a little information there.)

Similar threads:

I can only suspect a hardware problem with the disk. The symptoms:

  • Journalctl didn’t (couldn’t?) write any log
  • Disk check for unclean shutdown on reboot

Maybe try checking the S.M.A.R.T. data on your disk to see whether anything has reached a critical value.

I appreciate the attempt, but saying they’re similar because they contain the word reboot doesn’t actually work. It actually makes me feel like you didn’t even bother reading my original post. Especially when one of the random ‘similar’ threads talks about it being a temperature issue, which I can already safely rule out.

While I would like to know what caused it, I’m mainly looking for where I could find out what caused it, or for a way to turn on more verbose logging in case it happens again.

This isn’t a constant problem, and has less to do with the rebooting aspect than it does with just trying to figure out what happened.

Effectively, it’s like someone came and pushed the reset button on the PC. And while I do have cats, the chance of them pushing and holding down either the reset or the power button is next to nil in this case.

My first thought is that the UPS battery backup the PC runs through had a hiccup and dropped power long enough to cause a reboot. I would say an issue with the UPS or the power supply is more likely than one with the disk itself.

If it is a problem with the disk, that shouldn’t cause a full system power cycle like it did, and instead I’d see other symptoms. However, running SMART tests reveals no issues and everything seems to be operating normally.