Hi,
For months now, I have been getting a nasty freeze roughly every 2-5 days of uptime.
I’ve enabled REISUB for such cases, and it has already let me shut down the system in a “safer” way many times, but it doesn’t work for this freeze.
sudo journalctl -b -1 -e
Mar 07 23:19:30 eviu-PC kernel: watchdog: BUG: soft lockup - CPU#10 stuck for 23s! [khugepaged:122]
Mar 07 23:19:30 eviu-PC kernel: Modules linked in: vhost_net vhost vhost_iotlb tap tun ntfs3 snd_seq_dummy snd_seq_midi snd_hrtimer snd_seq_midi_event snd_seq xt_CH>
Mar 07 23:19:30 eviu-PC kernel: acpi_pad acpi_tad mac_hid nct6687(OE) dm_multipath uinput crypto_user fuse bpf_preload ip_tables x_tables ext4 crc32c_generic crc16>
Mar 07 23:19:30 eviu-PC kernel: CPU: 10 PID: 122 Comm: khugepaged Tainted: G OE 6.1.12-1-MANJARO #1 d419fb51ba9431ae2a4575820ea6b5b95f50a34f
Mar 07 23:19:30 eviu-PC kernel: Hardware name: Micro-Star International Co., Ltd. MS-7D25/PRO Z690-A DDR4(MS-7D25), BIOS 1.80 08/31/2022
Mar 07 23:19:30 eviu-PC kernel: RIP: 0010:smp_call_function_many_cond+0xee/0x310
Mar 07 23:19:30 eviu-PC kernel: Code: d0 48 89 df e8 93 47 40 00 3b 05 ed 94 ea 01 73 26 48 63 d0 49 8b 34 24 48 03 34 d5 a0 2a d6 ad 8b 56 08 83 e2 01 74 0a f3 90 >
Mar 07 23:19:30 eviu-PC kernel: RSP: 0018:ffffbf8440567c08 EFLAGS: 00000202
Mar 07 23:19:30 eviu-PC kernel: RAX: 0000000000000000 RBX: ffff9e7ba04b4108 RCX: 0000000000000001
Mar 07 23:19:30 eviu-PC kernel: RDX: 0000000000000001 RSI: ffffdf843f616a40 RDI: ffff9e7ba04b4108
Mar 07 23:19:30 eviu-PC kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: ffff9e7ba04a0ee0
Mar 07 23:19:30 eviu-PC kernel: R10: ffff9e6d561b5000 R11: 0000000000000000 R12: ffff9e7ba04b4100
Mar 07 23:19:30 eviu-PC kernel: R13: 0000000000000001 R14: 0000000000000010 R15: 000000000000000a
Mar 07 23:19:30 eviu-PC kernel: FS: 0000000000000000(0000) GS:ffff9e7ba0480000(0000) knlGS:0000000000000000
Mar 07 23:19:30 eviu-PC kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 07 23:19:30 eviu-PC kernel: CR2: 00007f303deec000 CR3: 00000006299ae002 CR4: 0000000000f72ee0
Mar 07 23:19:30 eviu-PC kernel: PKRU: 55555554
Mar 07 23:19:30 eviu-PC kernel: Call Trace:
Mar 07 23:19:30 eviu-PC kernel: <TASK>
Mar 07 23:19:30 eviu-PC kernel: ? mm_take_all_locks+0x210/0x210
Mar 07 23:19:30 eviu-PC kernel: smp_call_function+0x2c/0x50
Mar 07 23:19:30 eviu-PC kernel: collapse_huge_page+0x5ba/0x1430
Mar 07 23:19:30 eviu-PC kernel: ? schedule+0x5e/0xd0
Mar 07 23:19:30 eviu-PC kernel: hpage_collapse_scan_pmd+0x5b2/0x820
Mar 07 23:19:30 eviu-PC kernel: khugepaged+0x500/0x970
Mar 07 23:19:30 eviu-PC kernel: ? collapse_pte_mapped_thp+0x5d0/0x5d0
Mar 07 23:19:30 eviu-PC kernel: kthread+0xdb/0x110
Mar 07 23:19:30 eviu-PC kernel: ? kthread_complete_and_exit+0x20/0x20
Mar 07 23:19:30 eviu-PC kernel: ret_from_fork+0x1f/0x30
Mar 07 23:19:30 eviu-PC kernel: </TASK>
Mar 07 23:19:58 eviu-PC kernel: watchdog: BUG: soft lockup - CPU#10 stuck for 49s! [khugepaged:122]
Non-standard stuff about my system:
pipewire
tkg-wine with fsync and yabridge for running audio plugins
RME FireFace 800 audio interface via LSI FW643 PCIe FireWire controller; the LSI card only shows up and works after a cold boot, not after a restart
It also happened back on 5.19 kernels; I’m now on 6.1.12-1.
The strange thing is that if I have htop open while this happens, it continues to update, but the mouse and keyboard are locked up. Any video that is playing locks up too.
If any audio was playing, the FireFace repeats its last buffer continuously.
Please tell me if you need more information.
If we can’t make the issue go away, the second-best outcome would be to shut down the system safely to avoid file system corruption (I also have some NTFS and LUKS+ext4 HDDs mounted via fstab).
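On the REISUB side, here is a minimal sketch of how the kernel.sysrq value could be decoded to see which of the six REISUB keys the kernel currently permits. It is only an illustration, not output from this system. The bit values follow the kernel's SysRq documentation, and the function name reisub_allowed is a made-up helper.

```shell
#!/bin/sh
# Print which REISUB keys a given kernel.sysrq value permits.
# Bit values as listed in the kernel's SysRq documentation:
#   4   = keyboard control (R, unraw)
#   16  = sync (S)
#   32  = remount read-only (U)
#   64  = signalling processes (E = term, I = kill)
#   128 = reboot/poweroff (B)
# reisub_allowed is a hypothetical helper name used for illustration.
reisub_allowed() {
    value=$1
    # A value of exactly 1 means "all SysRq functions enabled".
    if [ "$value" -eq 1 ]; then
        echo "R E I S U B"
        return 0
    fi
    allowed=""
    for pair in R:4 E:64 I:64 S:16 U:32 B:128; do
        key=${pair%%:*}   # part before the colon
        bit=${pair##*:}   # part after the colon
        if [ $(( value & bit )) -ne 0 ]; then
            allowed="$allowed $key"
        fi
    done
    # Strip the leading space; prints an empty line if nothing is allowed.
    echo "${allowed# }"
}
```

On a running system it could be called as `reisub_allowed "$(cat /proc/sys/kernel/sysrq)"`. A value of 244 (4+16+32+64+128) would be the smallest bitmask that permits all six keys. This only shows what the kernel allows. It says nothing about whether the keyboard is still being serviced during the lockup.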