Sudden reboots

Hi there,
Been having a weird intermittent issue and I'm not sure where to start diagnosing this.

Once in a while my screen will go black while I'm in the middle of doing something (browsing the web, playing a game, doesn't seem to be task dependent) and a few seconds later I get my BIOS splash and my computer reboots.

Appreciate any help on this, and bear with me, bit of a noob on a lot of basic Linux diagnostics.

inxi -SxG
System: Host: ED11 Kernel: 5.2.11-1-MANJARO x86_64 bits: 64 compiler: gcc v: 9.1.0 Desktop: Gnome 3.32.2
Distro: Manjaro Linux
Graphics: Device-1: Advanced Micro Devices [AMD/ATI] Vega 10 XL/XT [Radeon RX Vega 56/64] vendor: Tul driver: amdgpu
v: kernel bus ID: 0b:00.0
Display: x11 server: X.org 1.20.5 driver: none resolution:
OpenGL: renderer: Radeon RX Vega (VEGA10 DRM 3.32.0 5.2.11-1-MANJARO LLVM 8.0.1) v: 4.5 Mesa 19.1.5
direct render: Yes

In my experience, a complete sudden reboot is usually hardware related rather than software related. You can test this by booting a live Linux distro (even Manjaro) from a CD or USB stick. If you get the same intermittent reboot, then you can be sure you have a hardware fault, not a software one.

(Perhaps your CPU is overheating??)

Post the full inxi.

inxi -Fxxxza --no-host
Please use the </> button with the pasted text so it'll be formatted properly.

With a thermal-limit safety shutdown enforced by EFI or legacy BIOS, the computer would not reboot itself immediately; it would stay off. A hardware fault is entirely possible, but that restarting behaviour rules out CPU overheating.

@cactus check for EFI/BIOS updates, and if your motherboard is not running the current release, update it.

inxi -Fxxxza --no-host
System:    Kernel: 5.2.11-1-MANJARO x86_64 bits: 64 compiler: gcc v: 9.1.0 
           parameters: BOOT_IMAGE=/boot/vmlinuz-5.2-x86_64 root=UUID=7ac8885e-f0a0-497b-96f4-7f2faf1ca394 rw quiet 
           Desktop: Gnome 3.32.2 wm: gnome-shell dm: GDM 3.32.0 Distro: Manjaro Linux 
Machine:   Type: Desktop Mobo: Gigabyte model: AX370-Gaming 5 serial: <filter> UEFI [Legacy]: American Megatrends v: F5 
           date: 04/07/2017 
CPU:       Topology: 8-Core model: AMD Ryzen 7 1700X bits: 64 type: MT MCP arch: Zen family: 17 (23) model-id: 1 stepping: 1 
           microcode: 800111C L2 cache: 4096 KiB 
           flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm bogomips: 108624 
           Speed: 2120 MHz min/max: 2200/3400 MHz boost: enabled Core speeds (MHz): 1: 2672 2: 2634 3: 2338 4: 2334 5: 1747 
           6: 1748 7: 1744 8: 1738 9: 2057 10: 1977 11: 2083 12: 2276 13: 1930 14: 1920 15: 2933 16: 2847 
           Vulnerabilities: Type: l1tf status: Not affected 
           Type: mds status: Not affected 
           Type: meltdown status: Not affected 
           Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via prctl and seccomp 
           Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer sanitization 
           Type: spectre_v2 mitigation: Full AMD retpoline, STIBP: disabled, RSB filling 
Graphics:  Device-1: Advanced Micro Devices [AMD/ATI] Vega 10 XL/XT [Radeon RX Vega 56/64] vendor: Tul driver: amdgpu 
           v: kernel bus ID: 0b:00.0 chip ID: 1002:687f 
           Display: x11 server: X.org 1.20.5 driver: none compositor: gnome-shell resolution: <xdpyinfo missing> 
           OpenGL: renderer: Radeon RX Vega (VEGA10 DRM 3.32.0 5.2.11-1-MANJARO LLVM 8.0.1) v: 4.5 Mesa 19.1.5 
           direct render: Yes 
Audio:     Device-1: Advanced Micro Devices [AMD/ATI] Vega 10 HDMI Audio [Radeon Vega 56/64] driver: snd_hda_intel v: kernel 
           bus ID: 0b:00.1 chip ID: 1002:aaf8 
           Device-2: Advanced Micro Devices [AMD] Family 17h HD Audio vendor: Gigabyte driver: snd_hda_intel v: kernel 
           bus ID: 12:00.3 chip ID: 1022:1457 
           Device-3: C-Media Blue Snowball type: USB driver: hid-generic,snd-usb-audio,usbhid bus ID: 5-1:2 chip ID: 0d8c:0005 
           serial: <filter> 
           Device-4: C-Media type: USB driver: hid-generic,snd-usb-audio,usbhid bus ID: 5-2:3 chip ID: 0d8c:0004 
           Sound Server: ALSA v: k5.2.11-1-MANJARO 
Network:   Device-1: Intel I211 Gigabit Network vendor: Gigabyte driver: igb v: 5.6.0-k port: e000 bus ID: 06:00.0 
           chip ID: 8086:1539 
           IF: ens1 state: down mac: <filter> 
           Device-2: Qualcomm Atheros Killer E2500 Gigabit Ethernet vendor: Gigabyte driver: alx v: kernel port: d000 
           bus ID: 07:00.0 chip ID: 1969:e0b1 
           IF: enp7s0 state: up speed: 1000 Mbps duplex: full mac: <filter> 
Drives:    Local Storage: total: 4.09 TiB used: 839.19 GiB (20.0%) 
           ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 960 EVO 500GB size: 465.76 GiB block size: physical: 512 B 
           logical: 512 B speed: 31.6 Gb/s lanes: 4 serial: <filter> rev: 2B7QCXE7 scheme: GPT 
           ID-2: /dev/sda vendor: Samsung model: SSD 860 QVO 1TB size: 931.51 GiB block size: physical: 512 B logical: 512 B 
           speed: 6.0 Gb/s serial: <filter> rev: 1B6Q scheme: MBR 
           ID-3: /dev/sdb vendor: Seagate model: ST1000DM003-1CH162 size: 931.51 GiB block size: physical: 4096 B 
           logical: 512 B speed: 6.0 Gb/s rotation: 7200 rpm serial: <filter> rev: CC49 scheme: MBR 
           ID-4: /dev/sdc vendor: Crucial model: CT1000MX500SSD1 size: 931.51 GiB block size: physical: 512 B logical: 512 B 
           speed: 6.0 Gb/s serial: <filter> rev: 020 scheme: GPT 
           ID-5: /dev/sdd vendor: Seagate model: ST1000DM003-9YN162 size: 931.51 GiB block size: physical: 4096 B 
           logical: 512 B speed: 6.0 Gb/s rotation: 7200 rpm serial: <filter> rev: CC4B scheme: MBR 
Partition: ID-1: / raw size: 232.60 GiB size: 227.95 GiB (98.00%) used: 21.31 GiB (9.3%) fs: ext4 dev: /dev/nvme0n1p5 
Sensors:   System Temperatures: cpu: 41.4 C mobo: N/A gpu: amdgpu temp: 28 C 
           Fan Speeds (RPM): N/A gpu: amdgpu fan: 914 
Info:      Processes: 517 Uptime: 3h 10m Memory: 15.67 GiB used: 3.67 GiB (23.4%) Init: systemd v: 242 Compilers: gcc: 9.1.0 
           Shell: bash v: 5.0.9 running in: guake inxi: 3.0.36

Update your BIOS. It's very old (version F5, dated 04/07/2017).

Just finished doing that, fingers crossed!

Also test alternate kernels, both newer and older.

To me, with 30 years of experience, this smells like physically damaged RAM. It might be something else, but I'd say there's an 85% chance that's what's wrong.

Unfortunately, this is still happening. I might test the RAM sticks next, but this Hardware Error seems to be a fundamental issue with Ryzen 7.

journalctl -p err -b
-- Logs begin at Fri 2019-08-23 01:26:23 CDT, end at Wed 2019-09-18 19:51:43 CDT. --
Sep 18 19:45:18 ED11 kernel: mce: [Hardware Error]: CPU 15: Machine Check: 0 Bank 5: bea0000000000108
Sep 18 19:45:18 ED11 kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffff86a6c12c MISC d012000100000000 SYND 4d000000 IPID 500b000000000 
Sep 18 19:45:18 ED11 kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1568853912 SOCKET 0 APIC f microcode 8001138
Sep 18 19:45:18 ED11 kernel: hid-generic 0003:0D8C:0005.0001: No inputs registered, leaving
Sep 18 19:45:18 ED11 kernel: usb 1-7.4.4: device descriptor read/64, error -32
Sep 18 19:45:18 ED11 kernel: usb 1-7.4.4: device descriptor read/64, error -32
Sep 18 19:45:18 ED11 kernel: usb 1-7.4.4: device descriptor read/64, error -32
Sep 18 19:45:18 ED11 kernel: usb 1-7.4.4: device descriptor read/64, error -32
Sep 18 19:45:18 ED11 kernel: usb 1-7.4.4: device not accepting address 7, error -71
Sep 18 19:45:18 ED11 kernel: usb 1-7.4.4: device not accepting address 8, error -71
Sep 18 19:45:18 ED11 kernel: usb 1-7.4-port4: unable to enumerate USB device
Sep 18 19:45:19 ED11 kernel: kvm: disabled by bios
Sep 18 19:45:19 ED11 kernel: kvm: disabled by bios
Sep 18 19:45:19 ED11 kernel: kvm: disabled by bios
Sep 18 19:45:19 ED11 kernel: kvm: disabled by bios
Sep 18 19:45:19 ED11 kernel: kvm: disabled by bios
Sep 18 19:45:19 ED11 kernel: kvm: disabled by bios
Sep 18 19:45:20 ED11 kernel: kvm: disabled by bios
Sep 18 19:45:20 ED11 kernel: kvm: disabled by bios
Sep 18 19:45:20 ED11 kernel: kvm: disabled by bios
Sep 18 19:45:20 ED11 kernel: kvm: disabled by bios
Sep 18 19:45:20 ED11 kernel: kvm: disabled by bios
Sep 18 19:45:20 ED11 kernel: kvm: disabled by bios
Sep 18 19:45:20 ED11 kernel: kvm: disabled by bios
Sep 18 19:45:20 ED11 kernel: kvm: disabled by bios
Sep 18 19:45:20 ED11 kernel: kvm: disabled by bios
Sep 18 19:45:21 ED11 kernel: kvm: disabled by bios
Sep 18 19:45:26 ED11 gdm-password][1193]: gkr-pam: unable to locate daemon control file
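
For anyone wanting to read the mce line in the log above, here is a minimal Python sketch that pulls out the CPU, bank and status word and names the high status bits. The bit names follow the generic x86 machine-check status layout as I understand it (flags in bits 63 to 57, error code in the low 16 bits). The function and dictionary names are made up for illustration, and AMD's own documentation is the authority on what a particular error code means.

```python
import re

# Assumed layout: generic x86 machine-check status flags (bits 63..57).
STATUS_FLAGS = {
    63: "VAL (register holds a valid error)",
    62: "OVER (an earlier error was overwritten)",
    61: "UC (uncorrected error)",
    60: "EN (error reporting enabled)",
    59: "MISCV (MISC register valid)",
    58: "ADDRV (ADDR register valid)",
    57: "PCC (processor context corrupt)",
}

# Matches the kernel's "CPU n: Machine Check: x Bank b: <16 hex digits>" text.
MCE_LINE = re.compile(
    r"CPU (\d+): Machine Check: \d+ Bank (\d+): ([0-9a-fA-F]{16})"
)


def decode_mce_line(line):
    """Hypothetical helper: return CPU, bank, set flags and error code, or None."""
    match = MCE_LINE.search(line)
    if match is None:
        return None
    status = int(match.group(3), 16)
    flags = [
        name
        for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]
    return {
        "cpu": int(match.group(1)),
        "bank": int(match.group(2)),
        "flags": flags,
        "error_code": status & 0xFFFF,  # low 16 bits of the status word
    }


line = ("Sep 18 19:45:18 ED11 kernel: mce: [Hardware Error]: "
        "CPU 15: Machine Check: 0 Bank 5: bea0000000000108")
result = decode_mce_line(line)
print(result)
```

On the status word from this log, UC and PCC come out set, which would fit an error the CPU could not recover from, and so a hard reset instead of just a logged warning.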

An update: I disabled C-State 6 and greater in my UEFI and I haven't had a sudden reboot in the last week. Fingers crossed!

I found a thread on the forum related to the issue, in case you can't disable it via UEFI:

I was also getting this same issue and initially thought it was a memory issue, but then noticed a processor hardware fault during boot. After running this fix I have not had the random reboots.

One thing to add to your fix: list the msr module in a .conf file under /etc/modules-load.d so it loads at boot before the service we created is run.

# List of modules to load at boot
msr
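
To confirm after a reboot that the module actually got loaded, a small Python sketch like the one below could be used. It parses text in the format of /proc/modules, where the first field on each line is the module name. The function name and the sample text are made up for illustration. One caveat: if msr is built into the kernel instead of being a loadable module, it would not show up in this list at all.

```python
def loaded_modules(proc_modules_text):
    """Hypothetical helper: return module names from /proc/modules-style text."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


# Illustrative sample only; on a real system read the file instead:
#   with open("/proc/modules") as f: text = f.read()
sample = (
    "msr 16384 0 - Live 0x0000000000000000\n"
    "kvm_amd 110592 0 - Live 0x0000000000000000\n"
)
print("msr" in loaded_modules(sample))
```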