Laptop freezes occasionally while booting

My laptop is an HP Omen 15 dual-booting KDE Manjaro and Windows 11 (the output of inxi --admin --verbosity=7 --filter --no-host --width is below). I use Windows rarely, and only for some light gaming.

I have been using the free nouveau driver so far. I got an external monitor the other day and had to switch to the nvidia driver (I just used the "install proprietary driver" option in the settings app, see screenshot) because there was too much tearing on the external monitor, especially when playing video or dragging a window.

My laptop now randomly freezes from time to time when booting. When this happens the system becomes completely unresponsive (not even REISUB seems to work). Once this happens, it keeps freezing even after I reboot, and the only solution is to boot into Windows, then shut down and boot back into Linux. The freezing does not happen every time, only occasionally.

I think I also spotted a line saying "PPM init failed" in one boot log (from journalctl).

System:
  Kernel: 6.1.41-1-MANJARO arch: x86_64 bits: 64 compiler: gcc v: 13.1.1
    parameters: BOOT_IMAGE=/boot/vmlinuz-6.1-x86_64
    root=UUID=38fd7989-2606-41e1-b018-3ec8243f181d rw quiet splash
    udev.log_priority=3
  Desktop: KDE Plasma v: 5.27.6 tk: Qt v: 5.15.10 wm: kwin_x11 vt: 2 dm: SDDM
    Distro: Manjaro Linux base: Arch Linux
Machine:
  Type: Laptop System: HP product: OMEN by HP Laptop 15-dc1xxx v: N/A
    serial: <superuser required> Chassis: type: 10 serial: <superuser required>
  Mobo: HP model: 8575 v: 21.17 serial: <superuser required> UEFI: AMI
    v: F.25 date: 07/19/2022
Battery:
  ID-1: BAT0 charge: 1.1 Wh (50.0%) condition: 2.2/2.2 Wh (100.0%) volts: 12.8
    min: 11.6 model: HP Primary type: Li-ion serial: N/A status: not charging
Memory:
  System RAM: total: 16 GiB available: 15.43 GiB used: 4.03 GiB (26.1%)
  RAM Report: permissions: Unable to run dmidecode. Root privileges required.
CPU:
  Info: model: Intel Core i7-9750H bits: 64 type: MT MCP arch: Coffee Lake
    gen: core 9 level: v3 note: check built: 2018 process: Intel 14nm family: 6
    model-id: 0x9E (158) stepping: 0xA (10) microcode: 0xF2
  Topology: cpus: 1x cores: 6 tpc: 2 threads: 12 smt: enabled cache:
    L1: 384 KiB desc: d-6x32 KiB; i-6x32 KiB L2: 1.5 MiB desc: 6x256 KiB
    L3: 12 MiB desc: 1x12 MiB
  Speed (MHz): avg: 1450 high: 2600 min/max: 800/4500 scaling:
    driver: intel_pstate governor: powersave cores: 1: 2600 2: 2600 3: 888 4: 899
    5: 800 6: 2600 7: 900 8: 900 9: 900 10: 900 11: 2600 12: 822
    bogomips: 62431
  Flags: 3dnowprefetch abm acpi adx aes aperfmperf apic arat
    arch_capabilities arch_perfmon art avx avx2 bmi1 bmi2 bts clflush
    clflushopt cmov constant_tsc cpuid cpuid_fault cx16 cx8 de ds_cpl dtes64
    dtherm dts epb ept ept_ad erms est f16c flexpriority flush_l1d fma fpu
    fsgsbase fxsr ht hwp hwp_act_window hwp_epp hwp_notify ibpb ibrs ida
    intel_pt invpcid invpcid_single lahf_lm lm mca mce md_clear mmx monitor
    movbe mpx msr mtrr nonstop_tsc nopl nx pae pat pbe pcid pclmulqdq pdcm
    pdpe1gb pebs pge pln pni popcnt pse pse36 pti pts rdrand rdseed rdtscp
    rep_good sdbg sep smap smep ss ssbd sse sse2 sse4_1 sse4_2 ssse3 stibp
    syscall tm tm2 tpr_shadow tsc tsc_adjust tsc_deadline_timer vme vmx vnmi
    vpid x2apic xgetbv1 xsave xsavec xsaveopt xsaves xtopology xtpr
  Vulnerabilities:
  Type: itlb_multihit status: KVM: VMX disabled
  Type: l1tf mitigation: PTE Inversion; VMX: conditional cache flushes, SMT
    vulnerable
  Type: mds mitigation: Clear CPU buffers; SMT vulnerable
  Type: meltdown mitigation: PTI
  Type: mmio_stale_data mitigation: Clear CPU buffers; SMT vulnerable
  Type: retbleed mitigation: IBRS
  Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via
    prctl
  Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer
    sanitization
  Type: spectre_v2 mitigation: IBRS, IBPB: conditional, STIBP: conditional,
    RSB filling, PBRSB-eIBRS: Not affected
  Type: srbds mitigation: Microcode
  Type: tsx_async_abort status: Not affected
Graphics:
  Device-1: Intel CoffeeLake-H GT2 [UHD Graphics 630] vendor: Hewlett-Packard
    driver: i915 v: kernel arch: Gen-9.5 process: Intel 14nm built: 2016-20
    ports: active: eDP-1 empty: DP-1,HDMI-A-1 bus-ID: 00:02.0
    chip-ID: 8086:3e9b class-ID: 0300
  Device-2: NVIDIA TU117M [GeForce GTX 1650 Mobile / Max-Q]
    vendor: Hewlett-Packard driver: nvidia v: 535.86.05
    alternate: nouveau,nvidia_drm non-free: 535.xx+
    status: current (as of 2023-07) arch: Turing code: TUxxx
    process: TSMC 12nm FF built: 2018-22 pcie: gen: 1 speed: 2.5 GT/s lanes: 8
    link-max: gen: 3 speed: 8 GT/s lanes: 16 bus-ID: 01:00.0 chip-ID: 10de:1f91
    class-ID: 0300
  Device-3: Cheng Uei Precision Industry (Foxlink) HP Wide Vision HD
    Integrated Webcam driver: uvcvideo type: USB rev: 2.0 speed: 480 Mb/s
    lanes: 1 mode: 2.0 bus-ID: 1-6:4 chip-ID: 05c8:03bc class-ID: 0e02
  Display: x11 server: X.Org v: 21.1.8 compositor: kwin_x11 driver: X:
    loaded: modesetting,nvidia unloaded: nouveau alternate: fbdev,nv,vesa
    dri: iris gpu: i915 display-ID: :0 screens: 1
  Screen-1: 0 s-res: 3840x1080 s-dpi: 96 s-size: 1013x285mm (39.88x11.22")
    s-diag: 1052mm (41.43")
  Monitor-1: HDMI-1-0 pos: primary,right res: 1920x1080 hz: 60 dpi: 93
    size: 527x296mm (20.75x11.65") diag: 604mm (23.8") modes: N/A
  Monitor-2: eDP-1 pos: left res: 1920x1080 hz: 60 dpi: 142
    size: 344x193mm (13.54x7.6") diag: 394mm (15.53") modes: N/A
  API: OpenGL v: 4.6 Mesa 23.0.4 renderer: Mesa Intel UHD Graphics 630 (CFL
    GT2) direct-render: Yes
Audio:
  Device-1: Intel Cannon Lake PCH cAVS vendor: Hewlett-Packard
    driver: snd_hda_intel v: kernel alternate: snd_soc_skl,snd_sof_pci_intel_cnl
    bus-ID: 00:1f.3 chip-ID: 8086:a348 class-ID: 0403
  Device-2: NVIDIA vendor: Hewlett-Packard driver: snd_hda_intel v: kernel
    pcie: gen: 1 speed: 2.5 GT/s lanes: 8 link-max: gen: 3 speed: 8 GT/s
    lanes: 16 bus-ID: 01:00.1 chip-ID: 10de:10fa class-ID: 0403
  API: ALSA v: k6.1.41-1-MANJARO status: kernel-api with: aoss
    type: oss-emulator tools: alsactl,alsamixer,amixer
  Server-1: JACK v: 1.9.22 status: off tools: N/A
  Server-2: PipeWire v: 0.3.75 status: active with: 1: pipewire-pulse
    status: active 2: wireplumber status: active 3: pipewire-alsa type: plugin
    tools: pactl,pw-cat,pw-cli,wpctl
Network:
  Device-1: Intel Cannon Lake PCH CNVi WiFi driver: iwlwifi v: kernel
    bus-ID: 00:14.3 chip-ID: 8086:a370 class-ID: 0280
  IF: wlo1 state: up mac: <filter>
  IP v4: <filter> type: dynamic noprefixroute scope: global
    broadcast: <filter>
  IP v6: <filter> type: noprefixroute scope: link
  Device-2: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
    vendor: Hewlett-Packard driver: r8169 v: kernel pcie: gen: 1 speed: 2.5 GT/s
    lanes: 1 port: 3000 bus-ID: 03:00.0 chip-ID: 10ec:8168 class-ID: 0200
  IF: eno1 state: down mac: <filter>
  WAN IP: <filter>
Bluetooth:
  Device-1: Intel Bluetooth 9460/9560 Jefferson Peak (JfP) driver: btusb v: 0.8
    type: USB rev: 2.0 speed: 12 Mb/s lanes: 1 mode: 1.1 bus-ID: 1-14:5
    chip-ID: 8087:0aaa class-ID: e001
  Report: rfkill ID: hci0 rfk-id: 0 state: up address: see --recommends
Logical:
  Message: No logical block device data found.
RAID:
  Hardware-1: Intel 82801 Mobile SATA Controller [RAID mode] driver: ahci
    v: 3.0 port: 5060 bus-ID: 00:17.0 chip-ID: 8086:282a rev: N/A class-ID: 0104
Drives:
  Local Storage: total: 1.14 TiB used: 149.41 GiB (12.8%)
  SMART Message: Unable to run smartctl. Root privileges required.
  ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Samsung model: MZVLB256HAHQ-000H1
    size: 238.47 GiB block-size: physical: 512 B logical: 512 B speed: 31.6 Gb/s
    lanes: 4 tech: SSD serial: <filter> fw-rev: EXD70H1Q temp: 34.9 C
    scheme: GPT
  ID-2: /dev/sda maj-min: 8:0 vendor: Crucial model: CT1000BX500SSD1
    size: 931.51 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
    tech: SSD serial: <filter> fw-rev: 030 scheme: GPT
  Message: No optical or floppy data found.
Partition:
  ID-1: / raw-size: 238.17 GiB size: 233.38 GiB (97.99%)
    used: 122.98 GiB (52.7%) fs: ext4 dev: /dev/nvme0n1p2 maj-min: 259:2
    label: N/A uuid: 38fd7989-2606-41e1-b018-3ec8243f181d
  ID-2: /boot/efi raw-size: 300 MiB size: 299.4 MiB (99.80%)
    used: 316 KiB (0.1%) fs: vfat dev: /dev/nvme0n1p1 maj-min: 259:1 label: N/A
    uuid: A8B1-78C4
  ID-3: /home/<filter>/projects raw-size: 200 GiB size: 195.8 GiB (97.90%)
    used: 26.43 GiB (13.5%) fs: ext4 dev: /dev/sda5 maj-min: 8:5 label: N/A
    uuid: ba3b08ea-a90a-4bf9-8b87-cf3e1bdc133c
Swap:
  Kernel: swappiness: 60 (default) cache-pressure: 100 (default)
  ID-1: swap-1 type: file size: 512 MiB used: 26.5 MiB (5.2%) priority: -2
    file: /swapfile
Unmounted:
  ID-1: /dev/sda1 maj-min: 8:1 size: 100 MiB fs: vfat label: N/A
    uuid: AAEA-C433
  ID-2: /dev/sda2 maj-min: 8:2 size: 16 MiB fs: <superuser required>
    label: N/A uuid: N/A
  ID-3: /dev/sda3 maj-min: 8:3 size: 199.24 GiB fs: ntfs label: N/A
    uuid: 78EE7BD4EE7B88E0
  ID-4: /dev/sda4 maj-min: 8:4 size: 663 MiB fs: ntfs label: N/A
    uuid: 18C60466C604468A
  ID-5: /dev/sda6 maj-min: 8:6 size: 222.23 GiB fs: ntfs label: N/A
    uuid: E022816B2281478C
  ID-6: /dev/sda7 maj-min: 8:7 size: 309.29 GiB fs: ntfs label: last volume
    uuid: F4568E09568DCD34
USB:
  Hub-1: 1-0:1 info: hi-speed hub with single TT ports: 16 rev: 2.0
    speed: 480 Mb/s (57.2 MiB/s) lanes: 1 mode: 2.0 chip-ID: 1d6b:0002
    class-ID: 0900
  Device-1: 1-1:2 info: Microdia USB DEVICE type: keyboard,mouse
    driver: hid-generic,usbhid interfaces: 2 rev: 2.0 speed: 12 Mb/s (1.4 MiB/s)
    lanes: 1 mode: 1.1 power: 100mA chip-ID: 0c45:8513 class-ID: 0301
  Device-2: 1-2:3 info: Sunplus Innovation Gaming mouse [Philips SPK9304]
    type: mouse driver: hid-generic,usbhid interfaces: 1 rev: 2.0
    speed: 1.5 Mb/s (183 KiB/s) lanes: 1 mode: 1.0 power: 98mA
    chip-ID: 1bcf:08a0 class-ID: 0301
  Device-3: 1-6:4 info: Cheng Uei Precision Industry (Foxlink) HP Wide
    Vision HD Integrated Webcam type: video driver: uvcvideo interfaces: 2
    rev: 2.0 speed: 480 Mb/s (57.2 MiB/s) lanes: 1 mode: 2.0 power: 500mA
    chip-ID: 05c8:03bc class-ID: 0e02
  Device-4: 1-14:5 info: Intel Bluetooth 9460/9560 Jefferson Peak (JfP)
    type: bluetooth driver: btusb interfaces: 2 rev: 2.0
    speed: 12 Mb/s (1.4 MiB/s) lanes: 1 mode: 1.1 power: 100mA
    chip-ID: 8087:0aaa class-ID: e001
  Hub-2: 2-0:1 info: super-speed hub ports: 8 rev: 3.1
    speed: 10 Gb/s (1.16 GiB/s) lanes: 1 mode: 3.2 gen-2x1 chip-ID: 1d6b:0003
    class-ID: 0900
Sensors:
  System Temperatures: cpu: 51.0 C pch: 46.0 C mobo: N/A
  Fan Speeds (RPM): cpu: 2013 fan-2: 0
Info:
  Processes: 295 Uptime: 12m wakeups: 1 Init: systemd v: 253 default: graphical
  tool: systemctl Compilers: gcc: 13.1.1 clang: 15.0.7 Packages: 1443
  pm: pacman pkgs: 1437 libs: 377 tools: pamac,yay pm: flatpak pkgs: 6
  Shell: Zsh v: 5.9 default: Bash v: 5.1.16 running-in: konsole inxi: 3.3.28

Can you rule out automatic boot-time checking of your drive(s)?
(I have activated this in Manjaro, and sometimes the boot process takes longer.)

So does that mean that the issue is not related to the graphics drivers?
I was convinced that the graphics driver was the issue because the freezes appeared when I switched to the nvidia graphics driver.

These are the last few lines of the log from one boot where it froze. The system boots up to the splash screen and then freezes completely:

Aug 08 09:29:09 OMEN kernel: ucsi_acpi USBC000:00: PPM init failed (-110)
Aug 08 09:29:09 OMEN systemd[1]: systemd-rfkill.service: Deactivated successfully.
Aug 08 09:29:09 OMEN systemd[1]: Received SIGRTMIN+21 from PID 243 (plymouthd).
Aug 08 09:29:09 OMEN systemd[1]: Finished Hold until boot process finishes up.
Aug 08 09:29:09 OMEN systemd[1]: Finished Terminate Plymouth Boot Screen.
Aug 08 09:29:09 OMEN systemd[1]: Reached target Multi-User System.
Aug 08 09:29:09 OMEN systemd[1]: Started Simple Desktop Display Manager.
Aug 08 09:29:09 OMEN systemd[1]: Reached target Graphical Interface.
Aug 08 09:29:09 OMEN systemd[1]: Startup finished in 11.450s (firmware) + 4.915s (loader) + 1.718s (kernel) + 7.373s (userspace) = 25.458s.
Aug 08 09:29:09 OMEN sddm[711]: Initializing...
Aug 08 09:29:09 OMEN sddm[711]: Starting...
Aug 08 09:29:09 OMEN sddm[711]: Logind interface found
Aug 08 09:29:09 OMEN sddm[711]: Adding new display...
Aug 08 09:29:09 OMEN sddm[711]: Loaded empty theme configuration
Aug 08 09:29:09 OMEN sddm[711]: Xauthority path: "/run/sddm/xauth_XlXuJu"
Aug 08 09:29:09 OMEN sddm[711]: Using VT 2
Aug 08 09:29:09 OMEN sddm[711]: Display server starting...
Aug 08 09:29:09 OMEN sddm[711]: Writing cookie to "/run/sddm/xauth_XlXuJu"
Aug 08 09:29:09 OMEN sddm[711]: Running: /usr/bin/X -nolisten tcp -background none -seat seat0 vt2 -auth /run/sddm/xauth_XlXuJu -noreset -displayfd 16
Aug 08 09:29:10 OMEN wpa_supplicant[708]: wlo1: CTRL-EVENT-REGDOM-CHANGE init=DRIVER type=COUNTRY alpha2=US
Aug 08 09:29:12 OMEN NetworkManager[637]: <info>  [1691467152.5977] manager: startup complete
Aug 08 09:29:16 OMEN systemd[1]: NetworkManager-dispatcher.service: Deactivated successfully.
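
A minimal sketch of how an excerpt like the one above could be narrowed down to the relevant lines, assuming the journal is persistent so that earlier boots are still available. The function name filter_boot_errors and the list of patterns are illustrative choices, not something taken from this thread; journalctl -b -1 selects the previous boot.

```shell
# Hypothetical helper: keep only journal lines that mention the components
# discussed in this thread (NVIDIA driver, UCSI/USB-C, PCI power states, Plymouth).
filter_boot_errors() {
    grep -iE 'nvidia|nvrm|ucsi|d3cold|plymouth'
}

# Usage (not run here): show matching lines from the previous boot.
#   journalctl -b -1 --no-pager | filter_boot_errors
# List the boots the journal knows about, to pick an offset other than -1:
#   journalctl --list-boots
```

The filter only reads from standard input, so it can also be pointed at a saved log file.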

That is normal, and the time needed is always about the same.
Before I threw out all my NVIDIA graphics cards, it was like that…
sudo systemd-analyze blame and sudo systemd-analyze
will tell you whether that is so.
Yours (in the example above):

Startup finished in 11.450s (firmware) + 4.915s (loader) + 1.718s (kernel) + 7.373s (userspace) = 25.458s.

That's not bad; the times are similar to those of my system with an AMD graphics card.
BUT:
it looks like a Plymouth problem. How to get rid of Plymouth has been mentioned in the forum many times.
AND:
check whether you have configured automatic disk checking,
with sudo tune2fs -c 12 /dev/sdcX, for example?
Or in GRUB?

This command modifies my system, right? Could you kindly explain what it does?

I got the system freeze again today:

Aug 10 12:10:49 OMEN kernel: ucsi_acpi USBC000:00: PPM init failed (-110)
Aug 10 12:10:50 OMEN systemd[1]: Received SIGRTMIN+21 from PID 243 (plymouthd).

Does this confirm your guess that Plymouth is the problem? Or is there some other issue that is causing Plymouth to fail? Anyway, I have disabled Plymouth by removing quiet splash from the GRUB options. I'll see if that keeps things stable. I might also remove Plymouth altogether.

Thanks
Edit: This is becoming more frequent, happening on almost every other reboot.
This is costing me so much time now and almost making the system unusable. Each time the freeze happens, I have to reboot into Windows once or twice to be able to boot normally; otherwise Linux will just freeze.
This is what I usually get:

The following happened only once, and I got this message indicating some issue with the nvidia power state. This has happened before, a long time ago, when I tried the nvidia drivers. I was not able to get a fix for it even then, so I removed the nvidia drivers.


In the GRUB menu:
edit the current boot entry (press E), remove the word quiet, and
change the word plymouth to noplymouth (or delete the word, if Plymouth runs despite it),
then proceed (press the F10 key to boot).
This changes your boot parameters TEMPORARILY:
the screen shows what your machine is doing,
and Plymouth is disabled (provisionally; disabling it for good is more complicated).
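
To keep such a change across reboots, the usual route on Manjaro is to edit GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub and then regenerate the GRUB configuration. A rough sketch, assuming the default line contains the quiet splash parameters seen in the inxi output; strip_quiet_splash is a hypothetical helper name, and it only prints the edited line without touching any file:

```shell
# Hypothetical helper: drop the standalone words "quiet" and "splash" from a
# GRUB_CMDLINE_LINUX_DEFAULT="..." line read on stdin and print the result.
strip_quiet_splash() {
    sed -E \
        -e 's/(["[:space:]])quiet([[:space:]"])/\1\2/' \
        -e 's/(["[:space:]])splash([[:space:]"])/\1\2/' \
        -e 's/[[:space:]]+/ /g' \
        -e 's/=" /="/' \
        -e 's/ "$/"/'
}

# Preview the edited line without changing anything:
#   grep '^GRUB_CMDLINE_LINUX_DEFAULT=' /etc/default/grub | strip_quiet_splash
# After editing /etc/default/grub by hand, regenerate the menu (Manjaro):
#   sudo update-grub
```

Previewing first makes it easy to confirm that other parameters, such as udev.log_priority=3, are left alone.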

Then:

Maybe a loose cable connection (drive / graphics card), what is called a Wackelkontakt in German…

Yes, it modifies your filesystem, but it doesn't change any files.
Instead, the filesystem is set to be forcibly checked after it has been mounted 12 times (in this case),
regardless of whether any errors were detected (detected errors would also lead to a check and repair).
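
Before changing anything, the current setting can be inspected read-only. A small sketch, assuming an ext4 filesystem such as the root partition /dev/nvme0n1p2 from the inxi output; show_fsck_settings is a hypothetical helper name. tune2fs -l only lists the superblock fields, and a "Maximum mount count" of -1 means mount-count-based checking is switched off.

```shell
# Hypothetical helper: pick the check-related fields out of `tune2fs -l` output.
show_fsck_settings() {
    grep -E '^(Mount count|Maximum mount count|Last checked|Check interval):'
}

# Usage (read-only, not run here):
#   sudo tune2fs -l /dev/nvme0n1p2 | show_fsck_settings
# Force a check every 12 mounts, as suggested in the thread:
#   sudo tune2fs -c 12 /dev/nvme0n1p2
# Turn mount-count-based checking off again:
#   sudo tune2fs -c -1 /dev/nvme0n1p2
```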


That does not seem to be the case! This issue only pops up when I use the nvidia drivers, and even then I am able to boot straight into Windows without any problem. Also, this is a laptop, so most parts are soldered on except for the storage drives, which I've reseated just to be sure. If the soldering is the culprit, I guess I'm just unfortunate.

Today I got the same issue again but the message was different:


The PC booted up as usual after I forced a shutdown and powered it back on.

I will try changing the boot parameters and see if the issue goes away!
Thanks

Looks like:
https://askubuntu.com/questions/1225934/ucsi-acpi-ppm-init-failed-110
and/or a failing USB device?
If you want to test a boot parameter: acpi_enforce_resources=lax
may or may not help.

Does the failure to boot happen only when running on battery? I found this:

The 70Wh battery allows to use the laptop about 4 hours on battery. In particular, the CPU TDP is limited to 25W while on battery, compared to the 45W on AC. On Windows there is the possibility of increasing the TDP to 90W/107W (long/short), yet this is managed directly by the embedded controller (EC) and requires patching to allow this on Linux. HP Omen 15-ek005na - ArchWiki .

This could also explain the USB error: the more USB devices are active, the more likely the machine is pushed past the 25W limit when running on battery.

Can you describe what happened when you got the error in the last picture, the one with the message "unable to change power state from d3cold to d0 …"?
The change from D3cold to D0 is usually the moment when power is switched on:

The transition from D3cold to D0 occurs when the supply voltage
 is provided to the device (i.e. power is restored). In that case 
the device returns to D0 with a full power-on reset sequence 
and the power-on defaults are restored to the device by hardware 
just as at initial power up.

see:
https://docs.kernel.org/power/pci.html
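
A hedged sketch for looking at the runtime power-management state of the NVIDIA GPU from a booted system. It assumes the GPU sits at PCI address 0000:01:00.0 (the bus-ID 01:00.0 from the inxi output, with the common domain 0000 assumed); show_pci_power is a hypothetical helper name. power/control and power/runtime_status are standard sysfs attributes of PCI devices.

```shell
# Hypothetical helper: print the runtime power-management attributes of one
# PCI device. $1 is the device's sysfs directory.
show_pci_power() {
    dev_dir=$1
    for f in power/control power/runtime_status; do
        if [ -r "$dev_dir/$f" ]; then
            printf '%s: %s\n' "$f" "$(cat "$dev_dir/$f")"
        fi
    done
}

# Usage (not run here):
#   show_pci_power /sys/bus/pci/devices/0000:01:00.0
```

A runtime_status of suspended together with control set to auto would indicate that the kernel is allowed to power the GPU down, which is the situation in which a later wake-up to D0 has to succeed.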

It's also possible that the power transition (up to 90 watts) cannot be activated, and that this is what throws the error.
https://wiki.archlinux.org/title/HP_Omen_15-ek005na

There are certain issues and flaws described if you search for "hp omen linux".
None of the fixes described are easy, and some of them can damage your laptop hardware.
This laptop is a powerful and modern system, but it doesn't play well with Linux.
IMHO it is a bad product: HP always claims to support Linux systems, but this laptop is a clear example that they don't.

P.S.: You should check the HP support site for possible driver support, but HP focuses on RPM- and DEB-based distributions.

This issue happens when the PC is on AC power as well. I have it plugged in almost all the time since I am using the laptop as my WFH setup.

I have tried most of the steps given in the guide; some seemed to be Ubuntu-specific.
I will try to boot up with no USB devices attached and see if the issue pops up. I don't expect it to be the USB devices, because they have been the same for a long time now and the only thing that has changed is the driver.

The laptop just freezes after displaying this message, although I don't always get this message. Out of 5 failed boots, I get the "PPM init failed" message almost always, the "d3cold to d0" message maybe 2-3 times, sometimes no message at all (it just freezes with a black screen and a non-blinking cursor), and 1-2 times I get this different message.

Question:
are there any external USB devices connected while booting? If so, please post what devices they are.

Usually I have a USB keyboard and mouse plugged in. That's all. But the issue occurs even if there are no USB devices plugged in.

ACPI/firmware issues and an outdated BIOS?
Try updating it…

Next stop:
Options in BIOS.
Models like this sometimes have funky ACPI tables for their implementation of dual graphics. If such an option exists, disabling hybrid graphics can sometimes 'solve' the problem.

And oh yeah … have you tried setting early KMS?
https://wiki.archlinux.org/title/Kernel_mode_setting
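
As a rough illustration of what "early KMS" means for the proprietary NVIDIA driver on an Arch-based system: the nvidia modules are listed in the MODULES array of /etc/mkinitcpio.conf so they load from the initramfs, and the initramfs is then rebuilt. The helper name has_nvidia_early_kms is hypothetical, and whether this helps with the freezes described here is untested.

```shell
# Hypothetical helper: succeed if the MODULES=(...) line of the given
# mkinitcpio.conf already lists nvidia_drm.
has_nvidia_early_kms() {
    grep -Eq '^MODULES=\(.*nvidia_drm.*\)' "$1"
}

# Usage (not run here):
#   has_nvidia_early_kms /etc/mkinitcpio.conf && echo "already set" || echo "not set"
# A line enabling early loading of the NVIDIA modules would look like:
#   MODULES=(nvidia nvidia_modeset nvidia_uvm nvidia_drm)
# After editing the file, rebuild the initramfs images:
#   sudo mkinitcpio -P
```

On a hybrid laptop like this one, the Intel i915 module may also be relevant, since the internal panel is driven by the Intel GPU.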