System has randomly crashed 3 times in the past 24 hours

Hello, my system has randomly crashed 3 times in the past 24 hours. Each time I was away from the machine (once for less than 5 minutes). By “crash” I mean the mouse stops responding, I can’t switch to a TTY, and the system is completely locked up.

I’m looking for a way to diagnose this issue. If it crashes again, my next step is to fall back to the LTS kernel.

System:    Kernel: 5.10.7-3-MANJARO x86_64 bits: 64 compiler: gcc v: 10.2.1 Desktop: KDE Plasma 5.20.5 Distro: Manjaro Linux 
Machine:   Type: Desktop System: Dell product: Inspiron 5675 v: 1.3.7 serial: <filter> 
           Mobo: Dell model: 07PR60 v: A00 serial: <filter> UEFI: Dell v: 1.3.7 date: 03/14/2018 
CPU:       Info: 8-Core model: AMD Ryzen 7 1700X bits: 64 type: MT MCP arch: Zen rev: 1 L2 cache: 4 MiB 
           flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm bogomips: 108639 
           Speed: 3140 MHz min/max: 2200/3400 MHz boost: enabled Core speeds (MHz): 1: 3140 2: 2918 3: 1927 4: 1866 5: 1922 
           6: 1868 7: 2801 8: 2696 9: 2029 10: 2048 11: 1864 12: 1804 13: 2181 14: 2141 15: 2481 16: 2057 
Graphics:  Device-1: Advanced Micro Devices [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] vendor: Dell 
           driver: amdgpu v: kernel bus ID: 09:00.0 
           Display: x11 server: X.Org 1.20.10 driver: loaded: amdgpu,ati unloaded: modesetting resolution: 1: 1680x1050~60Hz 
           2: 2560x1440~60Hz 
           OpenGL: renderer: AMD Radeon RX 480 Graphics (POLARIS10 DRM 3.40.0 5.10.7-3-MANJARO LLVM 11.0.1) v: 4.6 Mesa 20.3.3 
           direct render: Yes 
Audio:     Device-1: AMD Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] vendor: Dell driver: snd_hda_intel v: kernel 
           bus ID: 09:00.1 
           Device-2: Advanced Micro Devices [AMD] Family 17h HD Audio vendor: Dell driver: snd_hda_intel v: kernel 
           bus ID: 0b:00.3 
           Device-3: Sunplus Innovation FHD Capture type: USB driver: snd-usb-audio,uvcvideo bus ID: 4-3:2 
           Device-4: HTC (High Tech ) Vive type: USB driver: snd-usb-audio,uvcvideo bus ID: 1-4.1.5:10 
           Device-5: HTC (High Tech ) Vive type: USB driver: hid-generic,usbhid 
           Sound Server: ALSA v: k5.10.7-3-MANJARO 
Network:   Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet vendor: Dell driver: r8169 v: kernel port: 3000 
           bus ID: 01:00.0 
           IF: enp1s0 state: up speed: 1000 Mbps duplex: full mac: <filter> 
           Device-2: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter vendor: Dell driver: ath10k_pci v: kernel 
           port: 3000 bus ID: 05:00.0 
           IF: wlp5s0 state: down mac: <filter> 
           Device-3: Qualcomm Atheros type: USB driver: btusb bus ID: 1-12:6 
Drives:    Local Storage: total: 6.14 TiB used: 349.68 GiB (5.6%) 
           ID-1: /dev/nvme0n1 vendor: Western Digital model: WDS250G2X0C-00L350 size: 232.89 GiB temp: 55.9 C 
           ID-2: /dev/sda vendor: Toshiba model: DT01ACA100 size: 931.51 GiB 
           ID-3: /dev/sdb vendor: Samsung model: SSD 860 EVO 500GB size: 465.76 GiB 
           ID-4: /dev/sdc type: USB vendor: Seagate model: BUP Portable size: 4.55 TiB 
Partition: ID-1: / size: 117.62 GiB used: 17.76 GiB (15.1%) fs: ext4 dev: /dev/sdb2 
           ID-2: /boot/efi size: 299.4 MiB used: 312 KiB (0.1%) fs: vfat dev: /dev/sdb1 
           ID-3: /home size: 330.15 GiB used: 134.2 GiB (40.6%) fs: ext4 dev: /dev/sdb3 
Swap:      ID-1: swap-1 type: partition size: 9 GiB used: 0 KiB (0.0%) dev: /dev/sdb4 
Sensors:   System Temperatures: cpu: 46.1 C mobo: N/A gpu: amdgpu temp: 54.0 C 
           Fan Speeds (RPM): fan-1: 858 fan-2: 1194 gpu: amdgpu fan: 800 
Info:      Processes: 387 Uptime: 11m Memory: 15.59 GiB used: 3.1 GiB (19.9%) Init: systemd Compilers: gcc: 10.2.0 
           Packages: 1480 Shell: Bash v: 5.1.0 inxi: 3.2.02 

I’m running KDE. Additionally, I usually have Firefox, Kate, Dolphin, and Element running, and all of them were running during every crash. I also run fcitx for my IME.

I try to keep a close eye on temps and memory usage at all times. I know the GPU temps are fine (I run my own fan software, my PC turns into a jet engine when it spikes, and it also has hardware throttling). CPU temps should be fine (but again, I was away from the PC when it died), and RAM usage during at least the last 2 crashes should have been very low.

I’ve had a similar issue in the past: if my NVMe drive wasn’t loaded correctly (nvme_core.default_ps_max_latency_us=5500) and something tried to scan it (lspci), I’d get similar lockups. I found that incredibly tricky to diagnose.
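To rule that old problem out, it may be worth confirming that the parameter is still on the kernel command line after recent updates. This is only a rough sketch: `has_kernel_param` is a helper name made up here, not a standard tool, and it assumes the parameter above is the workaround that is meant to be active.

```shell
# Hypothetical helper: succeed if a kernel parameter appears on a command line.
#   $1: parameter name
#   $2: command line to search (defaults to the running kernel's /proc/cmdline)
has_kernel_param() {
    local name="$1"
    local cmdline="${2:-$(cat /proc/cmdline)}"
    local word
    # Walk the whitespace-separated words and match "name" or "name=value".
    for word in $cmdline; do
        case "$word" in
            "$name"|"$name"=*) return 0 ;;
        esac
    done
    return 1
}
```

Calling `has_kernel_param nvme_core.default_ps_max_latency_us` returns success when the parameter is present on the running kernel, so it can be used directly in an `if` statement.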

Some people are having this problem with the 5.10 kernel: crashes after some time. Maybe try another kernel first, to see whether it is the same issue.
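Before switching, it might also help to look at what was logged just before the last lockup. The following is a minimal sketch, assuming the journal is persistent (otherwise there is no previous boot to read); the function names are made up for illustration, and a hard lockup may leave nothing useful in the log at all.

```shell
# Reduce a `uname -r` string such as 5.10.7-3-MANJARO to its series (5.10).
kernel_series() {
    echo "$1" | cut -d. -f1,2
}

# Show messages of priority "err" and worse from the previous boot.
prev_boot_errors() {
    journalctl -b -1 -p err --no-pager
}

# Show the last 50 kernel messages from the previous boot.
prev_boot_kernel_log() {
    journalctl -k -b -1 --no-pager | tail -n 50
}

echo "Running kernel series: $(kernel_series "$(uname -r)")"
```

After a crash and reboot, run `prev_boot_errors` and `prev_boot_kernel_log` and check the final lines for amdgpu or nvme messages. On Manjaro, `mhwd-kernel -li` lists the installed kernels, and something like `sudo mhwd-kernel -i linux54` should install an older series to boot into for comparison (the exact package name is an assumption here; use whichever kernel is currently offered).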

