How to debug / fix / work around a system freeze?

I just switched my laptop from powersave to performance mode to make a build run faster. Sadly, a few seconds after that the laptop froze.

After rebooting, journalctl -b -1 ends with these lines:

Aug 06 09:16:45 LenovoP16v rtkit-daemon[979]: Supervising 4 threads of 4 processes of 1 users.
Aug 06 09:16:58 LenovoP16v kernel: BUG: unable to handle page fault for address: 00000000000175c4
Aug 06 09:16:58 LenovoP16v kernel: #PF: supervisor read access in kernel mode
Aug 06 09:16:58 LenovoP16v kernel: #PF: error_code(0x0000) - not-present page
Aug 06 09:16:58 LenovoP16v kernel: PGD 0 P4D 0 
Aug 06 09:16:58 LenovoP16v kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Aug 06 09:16:58 LenovoP16v kernel: CPU: 8 PID: 39059 Comm: electron Tainted: P           OE      6.6.44-1-MANJARO #1 8598aea1d868f10d66f5d5ae2b57b59e735cd775
Aug 06 09:16:58 LenovoP16v kernel: Hardware name: LENOVO 21FC000QGE/21FC000QGE, BIOS N3UET22W (1.09 ) 11/15/2023

Is that a kernel panic? Interestingly, unlike in the cases described in "Shutdown sometimes freezes laptop?", the Caps Lock indicator was not blinking.
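
For reference when reading the trace: the two `#PF:` lines are essentially a human-readable rendering of the x86 page-fault error code printed as `error_code(0x0000)`. Below is a minimal Python sketch of that decoding, assuming the usual meaning of the low five bits; the function name `decode_pf_error_code` is made up for illustration and this is a simplification of what the kernel actually prints.

```python
def decode_pf_error_code(code: int) -> list[str]:
    """Describe the low bits of an x86 page-fault error code (simplified)."""
    parts = [
        # bit 0: 0 = page not present, 1 = protection violation
        "protection violation" if code & 0x01 else "not-present page",
        # bit 1: 0 = read, 1 = write
        "write access" if code & 0x02 else "read access",
        # bit 2: 0 = supervisor (kernel) access, 1 = user access
        "user mode" if code & 0x04 else "kernel (supervisor) mode",
    ]
    if code & 0x08:  # bit 3: reserved bit set in a page-table entry
        parts.append("reserved bit set")
    if code & 0x10:  # bit 4: fault happened on an instruction fetch
        parts.append("instruction fetch")
    return parts


# The value from both traces in this thread:
print(decode_pf_error_code(0x0000))
```

So `0x0000` reads as a kernel-mode read of a page that was not mapped, which matches the text in the log.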

Hi @thomas85,

I don’t think it was a kernel panic. That would’ve been in the logs and you mentioned the Caps Lock light wasn’t blinking.

I don’t know how you performed the reboot you mentioned, but a hard poweroff or reset is a Bad Idea :tm: , I’ve been told. Instead, it’s better to use this: :point_down:

The system might still have been busy, or it could be a temperature problem.

Well, I did wait around for a while, but I wasn’t able to toggle the Num Lock, Caps Lock or mute indicators, and I also couldn’t switch to a TTY console with Ctrl+Alt+F*, so a forced hardware shutdown seemed like the only way to get it back to a working state.

It just happened again. Gonna try the “balanced” CPU profile now; it ran stably for the last 2 days with the powersave profile.

Journal of the second freeze:

Aug 06 09:52:51 LenovoP16v thunderbird[1435]: JavaScript error: resource:///modules/calendar/CalStorageDatabase.jsm, line 264: NS_ERROR_FAILURE: error executing async statement
Aug 06 09:52:51 LenovoP16v thunderbird[1435]: console.error: Calendar:
Aug 06 09:52:51 LenovoP16v thunderbird[1435]:   Message: [Exception... "error executing async statement"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: resource:///modules/calendar/CalStorageDatabase.jsm :: handleCompletion :: line 264"  data: no]
Aug 06 09:52:51 LenovoP16v thunderbird[1435]:   Stack:
Aug 06 09:52:51 LenovoP16v thunderbird[1435]:     handleCompletion@resource:///modules/calendar/CalStorageDatabase.jsm:264:33
Aug 06 09:52:52 LenovoP16v kwin_wayland[944]: This plugin does not support raise()
Aug 06 09:52:52 LenovoP16v kwin_wayland[944]: kwin_scene_opengl: 0x1: GL_INVALID_OPERATION in glDrawBuffers(unsupported buffer GL_BACK_LEFT)
Aug 06 09:52:52 LenovoP16v kwin_wayland[944]: kwin_scene_opengl: 0x1: GL_INVALID_OPERATION in glDrawBuffers(unsupported buffer GL_BACK_LEFT)
Aug 06 09:52:52 LenovoP16v kwin_wayland[944]: kwin_scene_opengl: 0x1: GL_INVALID_OPERATION in glDrawBuffers(unsupported buffer GL_BACK_LEFT)
Aug 06 09:52:52 LenovoP16v kwin_wayland[944]: kwin_scene_opengl: 0x1: GL_INVALID_OPERATION in glDrawBuffers(unsupported buffer GL_BACK_LEFT)
Aug 06 09:52:52 LenovoP16v kwin_wayland[944]: kwin_scene_opengl: 0x1: GL_INVALID_OPERATION in glDrawBuffers(unsupported buffer GL_BACK_LEFT)
Aug 06 09:52:52 LenovoP16v kwin_wayland[944]: kwin_scene_opengl: 0x1: GL_INVALID_OPERATION in glDrawBuffers(unsupported buffer GL_BACK_LEFT)
Aug 06 09:52:53 LenovoP16v kernel: BUG: unable to handle page fault for address: 000000000000158e
Aug 06 09:52:53 LenovoP16v kernel: #PF: supervisor read access in kernel mode
Aug 06 09:52:53 LenovoP16v kernel: #PF: error_code(0x0000) - not-present page
Aug 06 09:52:53 LenovoP16v kernel: PGD 0 P4D 0 
Aug 06 09:52:53 LenovoP16v kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Aug 06 09:52:53 LenovoP16v kernel: CPU: 6 PID: 28963 Comm: StreamTrans #22 Tainted: P           OE      6.6.44-1-MANJARO #1 8598aea1d868f10d66f5d5ae2b57b59e735cd775
Aug 06 09:52:53 LenovoP16v kernel: Hardware name: LENOVO 21FC000QGE/21FC000QGE, BIOS N3UET22W (1.09 ) 11/15/2023

The “P” in the “Tainted:” field points to some proprietary driver you are using, perhaps in conjunction with the kernel version.
I don’t know what OE means and didn’t look.

My “wisdom” comes from here.
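
The letters after `Tainted:` in the Oops line can be decoded one by one. Here is a minimal Python sketch, assuming the meanings from the kernel's "Tainted kernels" documentation; the table is deliberately partial and the names `TAINT_FLAGS` and `decode_taint` are made up for illustration.

```python
# Partial table of kernel taint letters; the kernel documents more than these.
TAINT_FLAGS = {
    "P": "proprietary module was loaded",
    "O": "externally built (out-of-tree) module was loaded",
    "E": "unsigned module was loaded",
    "D": "kernel died recently (an Oops or BUG happened)",
    "W": "kernel issued a warning earlier",
}


def decode_taint(log_line: str) -> list[str]:
    """Return a description for each taint letter found in an Oops 'CPU:' line."""
    _, found, rest = log_line.partition("Tainted:")
    if not found:
        return []
    letters = []
    for token in rest.split():
        # Taint letters are uppercase-only tokens; stop at the kernel version.
        if not (token.isalpha() and token.isupper()):
            break
        letters.extend(token)
    return [f"{c}: {TAINT_FLAGS.get(c, 'not in this partial table')}" for c in letters]


line = ("kernel: CPU: 8 PID: 39059 Comm: electron "
        "Tainted: P           OE      6.6.44-1-MANJARO #1")
for entry in decode_taint(line):
    print(entry)
```

For the line from this thread that gives P, O and E: a proprietary module, an out-of-tree module, and an unsigned module. That combination is what you would typically expect with a third-party driver installed, but it does not by itself say which module caused the fault.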

Might this be related? I do have a 13th Gen Intel® Core™ i7-13700H CPU.

Since switching my power profile from performance to balanced, it hasn’t happened again (for a few hours now).

It might. I don’t know.

:man_shrugging:

I’ve stuck to the “balanced” profile for the rest of the workday and didn’t run into any more system freezes, so it definitely only happens when using the performance profile, or sometimes also when I try to shut down the laptop.

Not yet, but close. :point_down:

Okay, so I did set up and test this REISUB thing @Mirdarthos linked (thanks!), and now I have time to try to make the system crash. Hopefully using that method makes the system journal hold on to more details about the crash that I can read after a reboot.
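
Since REISUB only works if the Magic SysRq functions are enabled, it can be worth double-checking the value in `/proc/sys/kernel/sysrq` before the next freeze. Below is a minimal Python sketch that decodes that value, assuming the bitmask layout from the kernel's SysRq documentation (1 enables everything, otherwise each function group has its own bit); the helper names are made up for illustration.

```python
from pathlib import Path

# Bit each REISUB key depends on, per the kernel's sysrq bitmask.
REISUB_BITS = {
    "R (unraw keyboard)": 4,
    "E (SIGTERM to all processes)": 64,
    "I (SIGKILL to all processes)": 64,
    "S (sync disks)": 16,
    "U (remount read-only)": 32,
    "B (reboot)": 128,
}


def reisub_status(value: int) -> dict[str, bool]:
    """Map each REISUB key to whether the given sysrq value allows it."""
    if value == 1:  # exactly 1 means "all functions enabled"
        return {key: True for key in REISUB_BITS}
    return {key: bool(value & bit) for key, bit in REISUB_BITS.items()}


def read_sysrq(path: str = "/proc/sys/kernel/sysrq") -> int:
    """Read the current sysrq setting (Linux only)."""
    return int(Path(path).read_text().strip())


if __name__ == "__main__":
    try:
        current = read_sysrq()
    except OSError as err:
        print(f"could not read sysrq setting: {err}")
    else:
        for key, allowed in reisub_status(current).items():
            print(f"{key}: {'enabled' if allowed else 'disabled'}")
```

If any of the six keys shows as disabled, the corresponding step of the sequence will be ignored by the kernel, so the full sequence needs either the value 1 or a mask containing all of the bits above.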

Gonna go look for a stress test method now to try and trigger it.

“Sadly”, I have to go do something offline now, and so far none of the stress / benchmark tools I’ve run with the laptop in “performance” mode have been able to reproduce those crashes from 2 days ago ^^

I’ll keep trying when I get back…