Random shutdowns with 6 series kernels on HP Victus Laptop

Hi, after updating from the LTS kernel I experience random shutdowns of my notebook.
After such a shutdown it won't boot on the first power-on; it just hangs, so I have to turn it off by holding the power button. After that I can turn it on and it boots normally.

Can someone help me debug this?
I cannot use the LTS kernel because my Wi-Fi card is supported only in 6-series kernels.

OS: Manjaro Linux x86_64 
Host: Victus by HP Laptop 16-e0xxx 
Kernel: 6.1.1-1-MANJARO 
Shell: bash 5.1.16 
Resolution: 2560x1440 
DE: Plasma 5.26.4 
WM: KWin 
Theme: [Plasma], Breeze [GTK2/3]
Icons: [Plasma], breeze [GTK2/3]
Terminal: konsole 
CPU: AMD Ryzen 7 5800H with Radeon Graphics (16) @ 3.200GHz 
GPU: NVIDIA GeForce RTX 3060 Mobile / Max-Q 
GPU: AMD ATI Radeon Vega Series / Radeon Vega Mobile Series 
Memory: 8142MiB / 63643MiB

I suspect that my problem is related to hybrid-graphics dynamic switching.
So I edited the /etc/udev/rules.d/90-mhwd-prime-powermanagement.rules file
and changed:

# Enable runtime PM for NVIDIA VGA/3D controller devices on driver bind
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", TEST=="power/control", ATTR{power/control}="auto"
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="auto"

to:

# Enable runtime PM for NVIDIA VGA/3D controller devices on driver bind
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", TEST=="power/control", ATTR{power/control}="on"
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="on"

Full file content:

# Remove NVIDIA USB xHCI Host Controller devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c0330", ATTR{remove}="1"

# Remove NVIDIA USB Type-C UCSI devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c8000", ATTR{remove}="1"

# Remove NVIDIA Audio devices, if present (enable it for kernels lower than 5.5)
#ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x040300", ATTR{remove}="1"

# Enable runtime PM for NVIDIA VGA/3D controller devices on driver bind
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", TEST=="power/control", ATTR{power/control}="on"
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="on"

# Disable runtime PM for NVIDIA VGA/3D controller devices on driver unbind
ACTION=="unbind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", TEST=="power/control", ATTR{power/control}="on"
ACTION=="unbind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="on"

I have no idea what's going on here or what I am doing.
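
For context on the rule change above: in the kernel's runtime PM interface, writing "auto" to power/control lets the device be suspended at runtime, while "on" keeps it powered at all times, so the edit effectively disables runtime PM for the NVIDIA GPU even though the comment in the file still says "Enable". A minimal sketch for checking what is currently applied follows. The function name show_nvidia_runtime_pm and the PCI_ROOT variable are made up for this example; the sysfs attributes (vendor, class, power/control, power/runtime_status) are the standard ones.

```shell
#!/bin/sh
# Sketch: print the runtime PM setting of NVIDIA VGA/3D controllers.
# PCI_ROOT is a hypothetical override so the function can be pointed at a test tree.
PCI_ROOT="${PCI_ROOT:-/sys/bus/pci/devices}"

show_nvidia_runtime_pm() {
    for dev in "$PCI_ROOT"/*; do
        [ -r "$dev/vendor" ] || continue
        # 0x10de is the NVIDIA vendor ID used in the udev rules above
        [ "$(cat "$dev/vendor")" = "0x10de" ] || continue
        # Keep only VGA (0x0300xx) and 3D (0x0302xx) controllers
        case "$(cat "$dev/class" 2>/dev/null)" in
            0x0300*|0x0302*) ;;
            *) continue ;;
        esac
        printf '%s control=%s status=%s\n' \
            "$(basename "$dev")" \
            "$(cat "$dev/power/control" 2>/dev/null)" \
            "$(cat "$dev/power/runtime_status" 2>/dev/null)"
    done
}
```

Calling show_nvidia_runtime_pm prints one line per matching device. Since these rules only fire on driver bind, the simplest way to apply an edited rules file is probably `sudo udevadm control --reload` followed by a reboot.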

OK. This changes nothing; I am still getting random reboots.

Maybe try alternate BIOS ACPI and power management settings. HP usually
messes a lot with the BIOS. Also check for a BIOS update.

There are almost no settings in the BIOS, so there is nothing to experiment with…

This is a bit complicated. Here are some ideas:

  1. Maybe have a look at the KSystemLog entries, and once you have the exact time of the
    “crash”, view the kernel, X11 … log entries.

  2. Maybe also check the GSmartControl readings for your hard drive.
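
The same log check can be done from a terminal. The sketch below is only a starting point: the two function names are invented for this example, and it assumes the systemd journal is persistent, since otherwise the previous boot is not kept.

```shell
#!/bin/sh
# Sketch: inspect the journal of the boot that ended in the shutdown.

previous_boot_errors() {
    # -b -1 selects the previous boot; -p err shows priority "err" and worse
    journalctl -b -1 -p err --no-pager
}

previous_boot_kernel_tail() {
    # Last N kernel messages of the previous boot (default 50),
    # i.e. what the kernel logged right before the power went out
    journalctl -b -1 -k --no-pager -n "${1:-50}"
}
```

For example, `previous_boot_kernel_tail 100` shows the last 100 kernel lines before the shutdown. If nothing unusual is logged at all, that in itself hints at a hard power-off rather than a kernel panic. GSmartControl is a front end for smartctl, so the drive check can also be run as `sudo smartctl -a <device>`, where the device path depends on the machine.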

I updated the post.
I suspect that my problem is related to hybrid-graphics dynamic switching.

It seems I found the solution.

The AMD P-State EPP driver uses the “performance” energy_performance_preference by default, which keeps the CPU and integrated GPU at a constant 1.46 V. This presumably causes overheating events in light-load scenarios and/or instability that leads to abrupt system shutdown.
Details here: AMD P-State EPP Scaling Driver on AMD Ryzen 7 5800H laptop

So far, not a single shutdown.
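
A rough sketch of how the preference can be inspected and changed through sysfs is given here. The function names and the CPUFREQ_ROOT variable are made up for this example, and the post does not say which value was chosen in the end, so balance_performance in the usage note is only an illustration. The values the running driver accepts are listed in each policy's energy_performance_available_preferences file.

```shell
#!/bin/sh
# Sketch: read and set the EPP hint on every cpufreq policy.
# CPUFREQ_ROOT is a hypothetical override so the functions can be tried on a copy.
CPUFREQ_ROOT="${CPUFREQ_ROOT:-/sys/devices/system/cpu/cpufreq}"

show_epp() {
    for f in "$CPUFREQ_ROOT"/policy*/energy_performance_preference; do
        [ -r "$f" ] || continue
        printf '%s: %s\n' "$(basename "$(dirname "$f")")" "$(cat "$f")"
    done
}

set_epp() {
    # $1: preference name, e.g. balance_performance (needs root on real sysfs)
    for f in "$CPUFREQ_ROOT"/policy*/energy_performance_preference; do
        [ -w "$f" ] || continue
        printf '%s\n' "$1" > "$f"
    done
}
```

Run show_epp first to confirm the current value, then, as root, something like `set_epp balance_performance`. A value written this way does not survive a reboot, so it would need to be reapplied at startup.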
