Kernel 5.15.72-1 freezes at boot

Dear all,

Long-time Manjaro user here. The system listed below refuses to boot with the kernel named in the title. The 5.10 series and earlier kernels boot as expected. When I try to boot kernel 5.15.72-1, or any other 5.15 version, the boot process displays an initial message that reads something like “Starting version…” and then everything freezes. I cannot get a tty even after several minutes of waiting. I searched the web but could not find any similar problem reported. Any pointers will be greatly appreciated, and I am more than happy to provide additional information as needed.

$ inxi -GIS
System:
  Host: -redacted- Kernel: 5.10.147-1-MANJARO arch: x86_64 bits: 64
    Desktop: KDE Plasma v: 5.25.5 Distro: Manjaro Linux
Graphics:
  Device-1: NVIDIA GP104GL [Quadro P4000] driver: nvidia v: 515.76
  Device-2: ASPEED Graphics Family driver: ast v: kernel
  Display: server: X.Org v: 21.1.4 with: Xwayland v: 22.1.3 driver: X:
    loaded: nvidia gpu: ast resolution: 2560x1600~60Hz
  OpenGL: renderer: Quadro P4000/PCIe/SSE2 v: 4.6.0 NVIDIA 515.76
Info:
  Processes: 693 Uptime: 8m Memory: 94.28 GiB used: 16.14 GiB (17.1%)
  Shell: Bash inxi: 3.3.22

boot with the 5.15 kernel, then boot back into the 5.10 kernel and provide the logs from the failed 5.15 boot with:
journalctl -b-1 -p4 --no-pager
you can also try booting with the 5.19 kernel (not the rt one)
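
For reference, the number after -p is a syslog priority, and journalctl shows that level and everything more severe. A minimal sketch of the mapping follows; prio_name is a hypothetical helper written for this post, not part of journalctl:

```shell
# Map a journalctl -p number to its syslog level name (standard syslog levels).
# prio_name is a made-up helper for illustration only.
prio_name() {
    case "$1" in
        0) echo emerg ;;
        1) echo alert ;;
        2) echo crit ;;
        3) echo err ;;
        4) echo warning ;;
        5) echo notice ;;
        6) echo info ;;
        7) echo debug ;;
        *) echo unknown; return 1 ;;
    esac
}

# Typical sequence after a failed boot, run from the working kernel:
#   journalctl --list-boots             # find the offset of the failed boot
#   journalctl -b -1 -p 4 --no-pager    # warnings and worse from the previous boot
```

So -p4 returns warnings and worse, while -p5 also includes notices and is therefore noticeably longer.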

Thanks for the guidance. Please see below for the output.

Oct 14 18:27:52 hostname kernel:  #11 #12 #13 #14 #15 #16 #17 #18 #19
Oct 14 18:27:52 hostname kernel: MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.
Oct 14 18:27:52 hostname kernel: TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.
Oct 14 18:27:52 hostname kernel: MMIO Stale Data CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/processor_mmio_stale_data.html for more details.
Oct 14 18:27:52 hostname kernel:  #21 #22 #23 #24 #25 #26 #27 #28 #29
Oct 14 18:27:52 hostname kernel: ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
Oct 14 18:27:52 hostname kernel: pci_bus 0000:ff: Unknown NUMA node; performance will be reduced
Oct 14 18:27:52 hostname kernel: pci_bus 0000:7f: Unknown NUMA node; performance will be reduced
Oct 14 18:27:52 hostname kernel: pmd_set_huge: Cannot satisfy [mem 0xc1000000-0xc1200000] with a huge-page mapping due to MTRR override.
Oct 14 18:27:52 hostname kernel: ata4.00: supports DRM functions and may not be fully accessible
Oct 14 18:27:52 hostname kernel: ata4.00: supports DRM functions and may not be fully accessible
Oct 14 18:27:52 hostname kernel: ata7.00: ATA Identify Device Log not supported
Oct 14 18:27:52 hostname kernel: ata7.00: ATA Identify Device Log not supported
Oct 14 18:27:52 hostname kernel: i8042: PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp
Oct 14 18:27:52 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 14 18:27:52 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 14 18:27:52 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 14 18:27:52 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 14 18:27:52 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 14 18:27:52 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 14 18:27:52 hostname kernel: power_meter ACPI000D:00: Ignoring unsafe software power cap!
Oct 14 18:27:52 hostname kernel: power_meter ACPI000D:00: hwmon_device_register() is deprecated. Please convert the driver to use hwmon_device_register_with_info().
Oct 14 18:27:52 hostname kernel: i2c i2c-0: Systems with more than 4 memory slots not supported yet, not instantiating SPD
Oct 14 18:27:53 hostname systemd-udevd[713]: event4: Failed to call EVIOCSKEYCODE with scan code 0x7c, and key code 190: Invalid argument
Oct 14 18:27:53 hostname systemd-udevd[641]: controlC1: Process '/usr/bin/alsactl restore 1' failed with exit code 2.
Oct 14 18:27:53 hostname systemd-udevd[701]: controlC0: Process '/usr/bin/alsactl restore 0' failed with exit code 2.
Oct 14 18:27:53 hostname systemd-udevd[714]: could not read from '/sys/module/acpi_cpufreq/initstate': No such device
Oct 14 18:27:53 hostname kernel: nvidia: loading out-of-tree module taints kernel.
Oct 14 18:27:53 hostname kernel: nvidia: module license 'NVIDIA' taints kernel.
Oct 14 18:27:53 hostname kernel: Disabling lock debugging due to kernel taint
Oct 14 18:27:53 hostname kernel:
Oct 14 18:27:53 hostname kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  515.76  Mon Sep 12 19:21:56 UTC 2022
Oct 14 18:27:53 hostname systemd-udevd[670]: nvidia: Process '/usr/bin/bash -c '/usr/bin/mknod -Z -m 666 /dev/nvidiactl c $(grep nvidia-frontend /proc/devices | cut -d \  -f 1) 255'' failed with exit code 1.
Oct 14 18:27:53 hostname systemd-udevd[713]: nvidia: Process '/usr/bin/bash -c 'for i in $(cat /proc/driver/nvidia/gpus/*/information | grep Minor | cut -d \  -f 4); do /usr/bin/mknod -Z -m 666 /dev/nvidia${i} c $(grep nvidia-frontend /proc/devices | cut -d \  -f 1) ${i}; done'' failed with exit code 1.

I didn’t have a chance to try booting with the 5.19 kernel yet. Hoping to be able to do that by tomorrow.

Oct 14 18:27:53 hostname systemd-udevd[670]: nvidia: Process '/usr/bin/bash -c '/usr/bin/mknod -Z -m 666 /dev/nvidiactl c $(grep nvidia-frontend /proc/devices | cut -d \  -f 1) 255'' failed with exit code 1.
Oct 14 18:27:53 hostname systemd-udevd[713]: nvidia: Process '/usr/bin/bash -c 'for i in $(cat /proc/driver/nvidia/gpus/*/information | grep Minor | cut -d \  -f 4); do /usr/bin/mknod -Z -m 666 /dev/nvidia${i} c $(grep nvidia-frontend /proc/devices | cut -d \  -f 1) ${i}; done'' failed with exit code 1.

the last logs are related to nvidia… did you make any modifications to/with nvidia?
also provide system info:
inxi -Faz

We pushed new Kernels and drivers for Nvidia to testing and stable branches. Please see if that solves it.

Nope, no modifications at all.

Here you go:

System:
  Kernel: 5.10.148-1-MANJARO arch: x86_64 bits: 64 compiler: gcc v: 12.2.0
    parameters: BOOT_IMAGE=/boot/vmlinuz-5.10-x86_64
    root=UUID=92d022fa-7e5b-4c9e-aa2e-dfbbfd05f5c8 rw quiet intel_iommu=on
    resume=UUID=3b8de6df-ba38-447e-a7eb-cae4837b2942
  Desktop: KDE Plasma v: 5.25.5 tk: Qt v: 5.15.6 wm: kwin_x11 vt: 1
    dm: startx Distro: Manjaro Linux base: Arch Linux
Machine:
  Type: Desktop Mobo: ASUSTeK model: Z10PE-D8 WS v: Rev 1.xx
    serial: <superuser required> UEFI-[Legacy]: American Megatrends v: 3501
    date: 12/12/2017
CPU:
  Info: model: Intel Xeon E5-2630 v4 bits: 64 type: MT MCP SMP
    arch: Broadwell level: v3 note: check built: 2015-18 process: Intel 14nm
    family: 6 model-id: 0x4F (79) stepping: 1 microcode: 0xB000040
  Topology: cpus: 2x cores: 10 tpc: 2 threads: 20 smt: enabled cache: L1: 2x
    640 KiB (1.2 MiB) desc: d-10x32 KiB; i-10x32 KiB L2: 2x 2.5 MiB (5 MiB)
    desc: 10x256 KiB L3: 2x 25 MiB (50 MiB) desc: 1x25 MiB
  Speed (MHz): avg: 2391 high: 2397 min/max: 1200/3100 scaling:
    driver: intel_cpufreq governor: schedutil cores: 1: 2395 2: 2395 3: 2395
    4: 2395 5: 2395 6: 2395 7: 2395 8: 2395 9: 2395 10: 2395 11: 2395
    12: 2395 13: 2395 14: 2395 15: 2395 16: 2395 17: 2395 18: 2395 19: 2395
    20: 2395 21: 2395 22: 2395 23: 2395 24: 2395 25: 2396 26: 2395 27: 2395
    28: 2395 29: 2395 30: 2395 31: 2395 32: 2395 33: 2395 34: 2258 35: 2395
    36: 2395 37: 2395 38: 2397 39: 2395 40: 2395 bogomips: 175750
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
  Vulnerabilities:
  Type: itlb_multihit status: KVM: Split huge pages
  Type: l1tf mitigation: PTE Inversion; VMX: conditional cache flushes, SMT
    vulnerable
  Type: mds mitigation: Clear CPU buffers; SMT vulnerable
  Type: meltdown mitigation: PTI
  Type: mmio_stale_data mitigation: Clear CPU buffers; SMT vulnerable
  Type: retbleed status: Not affected
  Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via
    prctl and seccomp
  Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer
    sanitization
  Type: spectre_v2 mitigation: Retpolines, IBPB: conditional, IBRS_FW,
    STIBP: conditional, RSB filling, PBRSB-eIBRS: Not affected
  Type: srbds status: Not affected
  Type: tsx_async_abort mitigation: Clear CPU buffers; SMT vulnerable
Graphics:
  Device-1: NVIDIA GP104GL [Quadro P4000] driver: nvidia v: 520.56.06
    alternate: nouveau,nvidia_drm non-free: 515.xx+ status: current (as of
    2022-10) arch: Pascal code: GP10x process: TSMC 16nm built: 2016-21 pcie:
    gen: 3 speed: 8 GT/s lanes: 16 bus-ID: 03:00.0 chip-ID: 10de:1bb1
    class-ID: 0300
  Device-2: ASPEED Graphics Family vendor: ASUSTeK driver: ast v: kernel
    ports: active: VGA-1 empty: none bus-ID: 09:00.0 chip-ID: 1a03:2000
    class-ID: 0300
  Display: server: X.Org v: 21.1.4 with: Xwayland v: 22.1.3
    compositor: kwin_x11 driver: X: loaded: nvidia gpu: ast display-ID: :0
    screens: 1
  Screen-1: 0 s-res: 2560x1600 s-dpi: 101 s-size: 644x402mm (25.35x15.83")
    s-diag: 759mm (29.89")
  Monitor-1: VGA-1 mapped: DP-6 res: 2560x1600 hz: 60 dpi: 101
    size: 641x400mm (25.24x15.75") modes: max: 1024x768 min: 640x480
  OpenGL: renderer: Quadro P4000/PCIe/SSE2 v: 4.6.0 NVIDIA 520.56.06
    direct render: Yes
Audio:
  Device-1: Intel C610/X99 series HD Audio vendor: ASUSTeK
    driver: snd_hda_intel v: kernel bus-ID: 00:1b.0 chip-ID: 8086:8d20
    class-ID: 0403
  Device-2: NVIDIA GP104 High Definition Audio driver: snd_hda_intel
    v: kernel pcie: gen: 3 speed: 8 GT/s lanes: 16 bus-ID: 03:00.1
    chip-ID: 10de:10f0 class-ID: 0403
  Sound API: ALSA v: k5.10.148-1-MANJARO running: yes
  Sound Server-1: JACK v: 1.9.21 running: no
  Sound Server-2: PulseAudio v: 16.1 running: yes
  Sound Server-3: PipeWire v: 0.3.58 running: yes
Network:
  Device-1: Intel I210 Gigabit Network vendor: ASUSTeK driver: igb v: kernel
    pcie: gen: 1 speed: 2.5 GT/s lanes: 1 port: 5000 bus-ID: 05:00.0
    chip-ID: 8086:1533 class-ID: 0200
  IF: enp5s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
  Device-2: Intel I210 Gigabit Network vendor: ASUSTeK driver: igb
    v: kernel pcie: gen: 1 speed: 2.5 GT/s lanes: 1 port: 4000 bus-ID: 06:00.0
    chip-ID: 8086:1533 class-ID: 0200
  IF: enp6s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
  IF-ID-1: bond0 state: up speed: 2000 Mbps duplex: full mac: <filter>
  IF-ID-2: bonding_masters state: N/A speed: N/A duplex: N/A mac: N/A
  IF-ID-3: bridge0 state: up speed: 2000 Mbps duplex: unknown mac: <filter>
  IF-ID-4: vnet0 state: unknown speed: 10 Mbps duplex: full mac: <filter>
  IF-ID-5: vnet1 state: unknown speed: 10 Mbps duplex: full mac: <filter>
  IF-ID-6: vnet2 state: unknown speed: 10 Mbps duplex: full mac: <filter>
  IF-ID-7: vnet3 state: unknown speed: 10 Mbps duplex: full mac: <filter>
Drives:
  Local Storage: total: 10.92 TiB used: 4.19 TiB (38.4%)
  SMART Message: Unable to run smartctl. Root privileges required.
  ID-1: /dev/sda maj-min: 8:0 vendor: Western Digital
    model: WD40EFRX-68N32N0 size: 3.64 TiB block-size: physical: 4096 B
    logical: 512 B speed: 6.0 Gb/s type: HDD rpm: 5400 serial: <filter>
    rev: 0A82
  ID-2: /dev/sdb maj-min: 8:16 vendor: Seagate model: ST4000VN008-2DR166
    size: 3.64 TiB block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s
    type: HDD rpm: 5980 serial: <filter> rev: SC60 scheme: GPT
  ID-3: /dev/sdc maj-min: 8:32 vendor: Seagate model: ST1500DL003-9VT16L
    size: 1.36 TiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
    type: HDD rpm: 5900 serial: <filter> rev: CC32 scheme: MBR
  ID-4: /dev/sdd maj-min: 8:48 vendor: Samsung model: SSD 850 EVO M.2 500GB
    size: 465.76 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
    type: SSD serial: <filter> rev: 1B6Q scheme: GPT
  ID-5: /dev/sde maj-min: 8:64 vendor: Hitachi model: HUA723020ALA641
    size: 1.82 TiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
    type: HDD rpm: 7200 serial: <filter> rev: A840 scheme: MBR
Partition:
  ID-1: / raw-size: 431.82 GiB size: 423.97 GiB (98.18%) used: 198.45 GiB
    (46.8%) fs: ext4 dev: /dev/sdd2 maj-min: 8:50
  ID-2: /home raw-size: 706.89 GiB size: 695.6 GiB (98.40%) used: 393.93
    GiB (56.6%) fs: ext4 dev: /dev/sde1 maj-min: 8:65
  ID-3: /var raw-size: 1013.48 GiB size: 1013.48 GiB (100.00%) used: 87.66
    GiB (8.6%) fs: btrfs dev: /dev/sdb1 maj-min: 8:17
Swap:
  Kernel: swappiness: 60 (default) cache-pressure: 100 (default)
  ID-1: swap-1 type: partition size: 33.94 GiB used: 0 KiB (0.0%)
    priority: -2 dev: /dev/sdd1 maj-min: 8:49
Sensors:
  System Temperatures: cpu: 39.0 C mobo: N/A gpu: nvidia temp: 42 C
  Fan Speeds (RPM): N/A gpu: nvidia fan: 47%
Info:
  Processes: 740 Uptime: 3m wakeups: 0 Memory: 94.28 GiB used: 16.34 GiB
  (17.3%) Init: systemd v: 251 default: multi-user tool: systemctl
  Compilers: gcc: 12.2.0 alt: 10/11/8/9 clang: 14.0.6 Packages: pm: pacman
  pkgs: 2197 libs: 497 tools: octopi,pamac,yay pm: flatpak pkgs: 0
  Shell: Bash v: 5.1.16 running-in: konsole inxi: 3.3.22

Following is the message obtained after booting with kernel 6.0.2-2 fails:

Oct 18 19:13:08 hostname kernel:  #11 #12 #13 #14 #15 #16 #17 #18 #19
Oct 18 19:13:08 hostname kernel: MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.
Oct 18 19:13:08 hostname kernel: TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.
Oct 18 19:13:08 hostname kernel: MMIO Stale Data CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/processor_mmio_stale_data.html for more details.
Oct 18 19:13:08 hostname kernel:  #21 #22 #23 #24 #25 #26 #27 #28 #29
Oct 18 19:13:08 hostname kernel: ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
Oct 18 19:13:08 hostname kernel: pci_bus 0000:ff: Unknown NUMA node; performance will be reduced
Oct 18 19:13:08 hostname kernel: pci_bus 0000:7f: Unknown NUMA node; performance will be reduced
Oct 18 19:13:08 hostname kernel: pmd_set_huge: Cannot satisfy [mem 0xc1000000-0xc1200000] with a huge-page mapping due to MTRR override.
Oct 18 19:13:08 hostname kernel: ata4.00: supports DRM functions and may not be fully accessible
Oct 18 19:13:08 hostname kernel: ata4.00: supports DRM functions and may not be fully accessible
Oct 18 19:13:08 hostname kernel: i8042: PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp
Oct 18 19:13:08 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 18 19:13:08 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 18 19:13:08 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 18 19:13:08 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 18 19:13:08 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 18 19:13:08 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 18 19:13:08 hostname kernel: power_meter ACPI000D:00: Ignoring unsafe software power cap!
Oct 18 19:13:08 hostname kernel: power_meter ACPI000D:00: hwmon_device_register() is deprecated. Please convert the driver to use hwmon_device_register_with_info().
Oct 18 19:13:08 hostname kernel: i2c i2c-0: Systems with more than 4 memory slots not supported yet, not instantiating SPD
Oct 18 19:13:09 hostname kernel: asus_wmi: fan_curve_get_factory_default (0x00110024) failed: -61
Oct 18 19:13:09 hostname kernel: asus_wmi: fan_curve_get_factory_default (0x00110025) failed: -61
Oct 18 19:13:09 hostname systemd-udevd[681]: event11: Failed to call EVIOCSKEYCODE with scan code 0x7c, and key code 190: Invalid argument
Oct 18 19:13:09 hostname systemd-udevd[701]: controlC1: Process '/usr/bin/alsactl restore 1' failed with exit code 2.
Oct 18 19:13:09 hostname systemd-udevd[686]: controlC0: Process '/usr/bin/alsactl restore 0' failed with exit code 2.
Oct 18 19:13:09 hostname kernel: nvidia: loading out-of-tree module taints kernel.
Oct 18 19:13:09 hostname kernel: nvidia: module license 'NVIDIA' taints kernel.
Oct 18 19:13:09 hostname kernel: Disabling lock debugging due to kernel taint
Oct 18 19:13:10 hostname kernel:
Oct 18 19:13:10 hostname kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  520.56.06  Thu Oct  6 21:38:55 UTC 2022
Oct 18 19:13:10 hostname systemd-udevd[695]: nvidia: Process '/usr/bin/bash -c '/usr/bin/mknod -Z -m 666 /dev/nvidiactl c $(grep nvidia-frontend /proc/devices | cut -d \  -f 1) 255'' failed with exit code 1.
Oct 18 19:13:10 hostname systemd-udevd[655]: nvidia: Process '/usr/bin/bash -c 'for i in $(cat /proc/driver/nvidia/gpus/*/information | grep Minor | cut -d \  -f 4); do /usr/bin/mknod -Z -m 666 /dev/nvidia${i} c $(grep nvidia-frontend /proc/devices | cut -d \  -f 1) ${i}; done'' failed with exit code 1.

Looks very similar to the failure log with kernel 5.15.72-1 to me.
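
One way to check that impression, rather than comparing by eye, is to strip the timestamps, hostname and PIDs from two saved logs and diff what remains. A rough sketch, assuming each journalctl output was saved to a text file; normalize_log and the file names are made up for illustration:

```shell
# Remove the "Mon DD HH:MM:SS hostname " prefix and the "[pid]" suffix of the
# process name, so that two boots can be compared line by line.
# normalize_log is a hypothetical helper, not an existing tool.
normalize_log() {
    sed -E \
        -e 's/^[A-Z][a-z]{2} +[0-9]+ [0-9]{2}:[0-9]{2}:[0-9]{2} [^ ]+ //' \
        -e 's/\[[0-9]+\]:/:/'
}

# Example usage (file names are placeholders):
#   normalize_log < boot-5.15.log | sort -u > a.txt
#   normalize_log < boot-6.0.log  | sort -u > b.txt
#   diff a.txt b.txt
```

Whatever diff still prints afterwards is specific to one of the two kernels.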

And following is the journal log after booting with kernel 5.19.16-2 fails:

Oct 18 19:29:16 hostname kernel:  #11 #12 #13 #14 #15 #16 #17 #18 #19
Oct 18 19:29:16 hostname kernel: MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.
Oct 18 19:29:16 hostname kernel: TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.
Oct 18 19:29:16 hostname kernel: MMIO Stale Data CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/processor_mmio_stale_data.html for more details.
Oct 18 19:29:16 hostname kernel:  #21 #22 #23 #24 #25 #26 #27 #28 #29
Oct 18 19:29:16 hostname kernel: ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
Oct 18 19:29:16 hostname kernel: pci_bus 0000:ff: Unknown NUMA node; performance will be reduced
Oct 18 19:29:16 hostname kernel: pci_bus 0000:7f: Unknown NUMA node; performance will be reduced
Oct 18 19:29:16 hostname kernel: pmd_set_huge: Cannot satisfy [mem 0xc1000000-0xc1200000] with a huge-page mapping due to MTRR override.
Oct 18 19:29:16 hostname kernel: ata4.00: supports DRM functions and may not be fully accessible
Oct 18 19:29:16 hostname kernel: ata4.00: supports DRM functions and may not be fully accessible
Oct 18 19:29:16 hostname kernel: i8042: PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp
Oct 18 19:29:16 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 18 19:29:16 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 18 19:29:16 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 18 19:29:16 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 18 19:29:16 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 18 19:29:16 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 18 19:29:16 hostname kernel: power_meter ACPI000D:00: Ignoring unsafe software power cap!
Oct 18 19:29:16 hostname kernel: power_meter ACPI000D:00: hwmon_device_register() is deprecated. Please convert the driver to use hwmon_device_register_with_info().
Oct 18 19:29:16 hostname kernel: i2c i2c-0: Systems with more than 4 memory slots not supported yet, not instantiating SPD
Oct 18 19:29:17 hostname systemd-udevd[598]: controlC1: Process '/usr/bin/alsactl restore 1' failed with exit code 2.
Oct 18 19:29:17 hostname kernel: asus_wmi: fan_curve_get_factory_default (0x00110024) failed: -61
Oct 18 19:29:17 hostname kernel: asus_wmi: fan_curve_get_factory_default (0x00110025) failed: -61
Oct 18 19:29:17 hostname systemd-udevd[688]: event11: Failed to call EVIOCSKEYCODE with scan code 0x7c, and key code 190: Invalid argument
Oct 18 19:29:17 hostname systemd-udevd[575]: controlC0: Process '/usr/bin/alsactl restore 0' failed with exit code 2.
Oct 18 19:29:17 hostname kernel: nvidia: loading out-of-tree module taints kernel.
Oct 18 19:29:17 hostname kernel: nvidia: module license 'NVIDIA' taints kernel.
Oct 18 19:29:17 hostname kernel: Disabling lock debugging due to kernel taint
Oct 18 19:29:18 hostname kernel:

I am currently stuck on the 5.10 kernel series, which works well and as expected. That is not a problem right now, but it will become one once this series is retired. Thank you so much for helping me identify the cause.

why did you add this parameter:
do you need it?


also add another kernel parameter:
ibt=off
save grub, update grub, reboot and test if it helped
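
The edit described above can be sketched roughly as follows. It assumes the parameters live on the GRUB_CMDLINE_LINUX_DEFAULT line of /etc/default/grub; add_kernel_param is a hypothetical helper, and the parameter must not contain characters that are special to sed such as | or &:

```shell
# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT in the given file.
# add_kernel_param is a made-up helper for illustration; back up the file first.
add_kernel_param() {
    param=$1
    file=$2
    # Skip the edit if the parameter is already on the line
    if grep -Eq "^GRUB_CMDLINE_LINUX_DEFAULT=.*[\" ]${param}[\" ]" "$file"; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    # Insert " <param>" just before the closing quote of the value
    sed -E "s|^(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*)\"|\1 ${param}\"|" "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example usage (needs root for the real file):
#   sudo cp /etc/default/grub /etc/default/grub.bak
#   add_kernel_param ibt=off /etc/default/grub   # run as root
#   sudo update-grub                             # regenerate grub.cfg, then reboot
```

After the reboot, the kernel command line recorded at the top of the journal shows whether the parameter actually reached the kernel.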


if it doesn't work, provide logs from this:
journalctl -b-1 -p5 --no-pager


and where do you get stuck? on the dev… clean display?
can you switch to a tty on the stuck screen with ctrl+alt+f2 (or any of the f1-f6 keys), enter your username/password and type: startx? If it doesn't work, take a pic of the screen; startx also usually saves a log in your home folder: ~/.local/share/xorg
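
If startx does leave a log there, the lines tagged (EE) and (WW) are usually the quickest to read. A small sketch, assuming the usual log name Xorg.0.log for display :0; xorg_problems is a hypothetical helper:

```shell
# Print only error (EE) and warning (WW) lines from an Xorg log.
# xorg_problems is a made-up helper; the default path assumes display :0.
xorg_problems() {
    log=${1:-$HOME/.local/share/xorg/Xorg.0.log}
    grep -E '^\[ *[0-9]+\.[0-9]+\] \((EE|WW)\)' "$log"
}

# Example usage:
#   xorg_problems                    # default per-user log
#   xorg_problems /var/log/Xorg.0.log
```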


provide output also from:
mhwd -l && mhwd -li
ls /etc/modprobe.d
find /etc/X11/ -name "*.conf"
ls /etc/udev/rules.d/
ls /usr/lib/udev/rules.d/*nvidia*
pamac list -qm

I don’t need it anymore, so I removed it as recommended.


Unfortunately, it did not help.


Interestingly, this is what I got this time, and I wonder whether it has anything to do with what happened: after I issued the sudo reboot command, the system took a long time to reboot, so I manually power cycled it. Once the system recovered, I tried booting kernel 5.15, which failed again, then rebooted into 5.10 and issued the command above, with the following output:

Journal file /var/log/journal/e6439e661a9f456ca18c3f9edb0ca8c5/system@0005eb816d8cead1-f159dd3353c999b9.journal~ is truncated, ignoring file.
-- No entries --


The system freezes shortly after the following message is displayed. I get no tty, no login prompt of any kind. ctrl+alt+fn combinations do nothing.

starting version 251.5-1-manjaro

> 0000:03:00.0 (0300:10de:1bb1) Display controller nVidia Corporation:
--------------------------------------------------------------------------------
                  NAME               VERSION          FREEDRIVER           TYPE
--------------------------------------------------------------------------------
          video-nvidia            2021.11.04               false            PCI
    video-nvidia-470xx            2021.11.04               false            PCI
    video-nvidia-390xx            2021.11.26               false            PCI
           video-linux            2018.05.04                true            PCI
     video-modesetting            2020.01.13                true            PCI
            video-vesa            2017.03.12                true            PCI


> 0000:09:00.0 (0300:1a03:2000) Display controller ASPEED Technology Inc.:
--------------------------------------------------------------------------------
                  NAME               VERSION          FREEDRIVER           TYPE
--------------------------------------------------------------------------------
     video-modesetting            2020.01.13                true            PCI
            video-vesa            2017.03.12                true            PCI


> Installed PCI configs:
--------------------------------------------------------------------------------
                  NAME               VERSION          FREEDRIVER           TYPE
--------------------------------------------------------------------------------
          video-nvidia            2021.11.04               false            PCI
     video-modesetting            2020.01.13                true            PCI


Warning: No installed USB configs!

mhwd-gpu.conf  vfio.conf
/etc/X11/xorg.conf.d/00-keyboard.conf
/etc/X11/xorg.conf.d/90-mhwd.conf
/etc/X11/xorg.conf
/etc/X11/mhwd.d/nvidia.conf

no output at all

/usr/lib/udev/rules.d/60-nvidia.rules  /usr/lib/udev/rules.d/71-nvidia-controllers.rules
apm
archivemail
arcus
atom
breath-icon-theme
breath2-wallpaper
ccextractor
celt
celt0.5.1
ceph-libs
cura
cura-resources-materials
curaengine
davfs2
dunstify
ebtables
electron11
electron12
electron5
electron6
etcher
fontpreview-git
gcc10
gcc10-libs
gcc8
gcc8-libs
gcc9
gcc9-libs
gconf
gkrellm-themes
gkrellmoon
gn-m85
gnome-icon-theme
gnome-icon-theme-symbolic
guile2.0
haroopad
icaclient
ifenslave
ipw2100-fw
ipw2200-fw
joplin-desktop
js60
js68
kde-servicemenus-rootactions
kdepim-apps-libs
lf-git
lib32-libstdc++5
libart-lgpl
libopenaptx
libpasastro
libsavitar
makemkv
manjaro-documentation-en
manjaro-firmware
masterpdfeditor
mhwd-nvidia-340xx
mozilla-common
mutt-wizard-git
nerd-fonts-mononoki
opencolorio1
openjpeg
paho-mqtt-c
pam-gnupg
pepper-flash
powerpanel
protonmail-bridge-nogui
pspp
pth
python-pynest2d
python-sip4
python-trimesh
python2
python2-appdirs
python2-gobject
python2-m2crypto
python2-ordered-set
python2-packaging
python2-pygments
python2-pyparsing
python2-pytz
python2-rfc6555
python2-scandir
python2-selectors2
python2-setuptools
python2-six
python2-typing
python2-uritemplate
rttr
simple-mtpfs
skychart
spread-sheet-widget
ssmtp
st-luke-git
uranium
urlview
wpa_actiond
xdg-su
xerox-phaser-6000-6010
xorg-font-utils
zoom

Thank you for continuing to help me!

post output from:
cat /etc/modprobe.d/vfio.conf
cat /etc/X11/xorg.conf
cat /usr/lib/udev/rules.d/71-nvidia-controllers.rules


it looks like the journal is somehow messed up, so clean it:

sudo journalctl --rotate
sudo journalctl -m --vacuum-time=1s

and these packages can be removed:
atom is end of life and no longer supported, so you should find a replacement for it and remove it

these packages were dropped from the official repos, and some were also dropped from the AUR, so they are dead and can be removed:

breath-icon-theme
breath2-wallpaper
celt
celt0.5.1
ceph-libs
dunstify
electron11
electron5
electron6
gnome-icon-theme
gnome-icon-theme-symbolic
ifenslave
ipw2100-fw
ipw2200-fw
js60
js68
kde-servicemenus-rootactions - (this no longer works)
kdepim-apps-libs
libopenaptx
manjaro-documentation-en
manjaro-firmware
mhwd-nvidia-340xx
pth

also python2 was dropped, so if you are not using it you can remove it…


reinstall the 5.15 kernel and its modules by uninstalling them and installing them again…
also install the 5.19 kernel - not the rt one;
reboot and try again with both, if you get stuck, provide logs from the stuck boot:
journalctl -b-1 -p5 --no-pager
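
For the reinstall, a rough sketch under the assumption that Manjaro names kernel packages after the series with the dot dropped (5.15 becomes linux515) and that mhwd-kernel is used for the job; kernel_pkg is a hypothetical helper:

```shell
# Turn a kernel series such as "5.15" into the Manjaro package name "linux515".
# kernel_pkg is a made-up helper for illustration only.
kernel_pkg() {
    printf 'linux%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

# Example usage, run while booted into a different kernel (here 5.10),
# since the running kernel cannot be removed:
#   sudo mhwd-kernel -r "$(kernel_pkg 5.15)"   # remove
#   sudo mhwd-kernel -i "$(kernel_pkg 5.15)"   # install again
#   sudo mhwd-kernel -i "$(kernel_pkg 5.19)"   # add the 5.19 series
#   mhwd-kernel -li                            # list installed kernels
```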

options vfio-pci ids=10de:1c82,10de:0fb9
# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig:  version 430.40

Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0"
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
EndSection

Section "Files"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/psaux"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Keyboard0"
    Driver         "kbd"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection
# NVIDIA Shield Portable (2013 - NVIDIA_Controller_v01.01 - In-Home Streaming only)
KERNEL=="hidraw*", ATTRS{idVendor}=="0955", ATTRS{idProduct}=="7203", ENV{ID_INPUT_JOYSTICK}="1", ENV{ID_INPUT_MOUSE}="", MODE="0660", TAG+="uaccess"

# NVIDIA Shield Controller (2017 - NVIDIA_Controller_v01.04); bluetooth
KERNEL=="hidraw*", KERNELS=="*0955:7214*", MODE="0660", TAG+="uaccess"

# NVIDIA Shield Controller (2015 - NVIDIA_Controller_v01.03); USB
KERNEL=="hidraw*", ATTRS{idVendor}=="0955", ATTRS{idProduct}=="7210", ENV{ID_INPUT_JOYSTICK}="1", ENV{ID_INPUT_MOUSE}="", MODE="0660", TAG+="uaccess"
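
A quick cross-check of the vfio.conf line above against the inxi output: inxi reports the Quadro P4000 as chip-ID 10de:1bb1 and its audio function as 10de:10f0, and neither appears in the vfio-pci ids list. A sketch of that check; vfio_claims is a hypothetical helper that assumes a single "options vfio-pci ids=..." line:

```shell
# Succeed if the given vendor:device ID is listed in the vfio-pci ids= option.
# vfio_claims is a made-up helper for illustration only.
vfio_claims() {
    id=$1
    conf=$2
    grep -E '^options +vfio-pci ' "$conf" \
        | sed -E 's/.*ids=([^ ]*).*/\1/' \
        | tr ',' '\n' \
        | grep -qx -- "$id"
}

# Example usage:
#   vfio_claims 10de:1bb1 /etc/modprobe.d/vfio.conf && echo "vfio-pci is set to claim this GPU"
```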

Done!

A script of mine needs it so I kept it.


Will do when my transcoding finishes shortly and post results below.

Thanks!

$ journalctl -b -1 -p5 --no-pager ## output for the 5.15 kernel

Oct 22 02:30:41 hostname kernel: Linux version 5.15.74-3-MANJARO (builduser@fv-az457-508) (gcc (GCC) 12.2.0, GNU ld (GNU Binutils) 2.39.0) #1 SMP PREEMPT Sat Oct 15 13:39:11 UTC 2022
Oct 22 02:30:41 hostname kernel: Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.15-x86_64 root=UUID=92d022fa-7e5b-4c9e-aa2e-dfbbfd05f5c8 rw quiet ibt=off resume=UUID=3b8de6df-ba38-447e-a7eb-cae4837b2942
Oct 22 02:30:41 hostname kernel: Unknown kernel command line parameters "BOOT_IMAGE=/boot/vmlinuz-5.15-x86_64 ibt=off", will be passed to user space.
Oct 22 02:30:41 hostname kernel:  #11 #12 #13 #14 #15 #16 #17 #18 #19
Oct 22 02:30:41 hostname kernel: MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.
Oct 22 02:30:41 hostname kernel: TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.
Oct 22 02:30:41 hostname kernel: MMIO Stale Data CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/processor_mmio_stale_data.html for more details.
Oct 22 02:30:41 hostname kernel:  #21 #22 #23 #24 #25 #26 #27 #28 #29
Oct 22 02:30:41 hostname kernel: audit: type=2000 audit(1666423837.259:1): state=initialized audit_enabled=0 res=1
Oct 22 02:30:41 hostname kernel: ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
Oct 22 02:30:41 hostname kernel: ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored
Oct 22 02:30:41 hostname kernel: pci_bus 0000:ff: Unknown NUMA node; performance will be reduced
Oct 22 02:30:41 hostname kernel: pci_bus 0000:7f: Unknown NUMA node; performance will be reduced
Oct 22 02:30:41 hostname kernel: SCSI subsystem initialized
Oct 22 02:30:41 hostname kernel: VFS: Disk quotas dquot_6.6.0
Oct 22 02:30:41 hostname kernel: Initialise system trusted keyrings
Oct 22 02:30:41 hostname kernel: Key type blacklist registered
Oct 22 02:30:41 hostname kernel: Key type asymmetric registered
Oct 22 02:30:41 hostname kernel: Asymmetric key parser 'x509' registered
Oct 22 02:30:41 hostname kernel: pmd_set_huge: Cannot satisfy [mem 0xc1000000-0xc1200000] with a huge-page mapping due to MTRR override.
Oct 22 02:30:41 hostname kernel: Loading compiled-in X.509 certificates
Oct 22 02:30:41 hostname kernel: Loaded X.509 cert 'Build time autogenerated kernel key: a726d2fcedec7de7f1bb21d56ff6b06afad1c930'
Oct 22 02:30:41 hostname kernel: Key type ._fscrypt registered
Oct 22 02:30:41 hostname kernel: Key type .fscrypt registered
Oct 22 02:30:41 hostname kernel: Key type fscrypt-provisioning registered
Oct 22 02:30:41 hostname kernel: ata4.00: supports DRM functions and may not be fully accessible
Oct 22 02:30:41 hostname kernel: scsi 0:0:0:0: Direct-Access     ATA      WDC WD40EFRX-68N 0A82 PQ: 0 ANSI: 5
Oct 22 02:30:41 hostname kernel: sd 0:0:0:0: [sda] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
Oct 22 02:30:41 hostname kernel: sd 0:0:0:0: [sda] 4096-byte physical blocks
Oct 22 02:30:41 hostname kernel: sd 0:0:0:0: [sda] Write Protect is off
Oct 22 02:30:41 hostname kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 22 02:30:41 hostname kernel: ata4.00: supports DRM functions and may not be fully accessible
Oct 22 02:30:41 hostname kernel: scsi 1:0:0:0: Direct-Access     ATA      ST4000VN008-2DR1 SC60 PQ: 0 ANSI: 5
Oct 22 02:30:41 hostname kernel: sd 1:0:0:0: [sdb] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
Oct 22 02:30:41 hostname kernel: sd 1:0:0:0: [sdb] 4096-byte physical blocks
Oct 22 02:30:41 hostname kernel: sd 1:0:0:0: [sdb] Write Protect is off
Oct 22 02:30:41 hostname kernel: sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 22 02:30:41 hostname kernel: scsi 2:0:0:0: Direct-Access     ATA      ST1500DL003-9VT1 CC32 PQ: 0 ANSI: 5
Oct 22 02:30:41 hostname kernel: sd 2:0:0:0: [sdc] 2930277168 512-byte logical blocks: (1.50 TB/1.36 TiB)
Oct 22 02:30:41 hostname kernel: sd 2:0:0:0: [sdc] Write Protect is off
Oct 22 02:30:41 hostname kernel: sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 22 02:30:41 hostname kernel: scsi 3:0:0:0: Direct-Access     ATA      Samsung SSD 850  1B6Q PQ: 0 ANSI: 5
Oct 22 02:30:41 hostname kernel: sd 3:0:0:0: [sdd] 976773168 512-byte logical blocks: (500 GB/466 GiB)
Oct 22 02:30:41 hostname kernel: sd 3:0:0:0: [sdd] Write Protect is off
Oct 22 02:30:41 hostname kernel: sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 22 02:30:41 hostname kernel: ata7.00: ATA Identify Device Log not supported
Oct 22 02:30:41 hostname kernel: ata7.00: ATA Identify Device Log not supported
Oct 22 02:30:41 hostname kernel: scsi 6:0:0:0: Direct-Access     ATA      Hitachi HUA72302 A840 PQ: 0 ANSI: 5
Oct 22 02:30:41 hostname kernel: sd 6:0:0:0: [sde] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
Oct 22 02:30:41 hostname kernel: sd 6:0:0:0: [sde] Write Protect is off
Oct 22 02:30:41 hostname kernel: sd 6:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 22 02:30:41 hostname kernel: scsi 7:0:0:0: CD-ROM            HL-DT-ST BD-RE  WH16NS40  1.02 PQ: 0 ANSI: 5
Oct 22 02:30:41 hostname kernel: sd 3:0:0:0: [sdd] supports TCG Opal
Oct 22 02:30:41 hostname kernel: sd 3:0:0:0: [sdd] Attached SCSI disk
Oct 22 02:30:41 hostname kernel: sd 0:0:0:0: [sda] Attached SCSI disk
Oct 22 02:30:41 hostname kernel: sd 2:0:0:0: [sdc] Attached SCSI disk
Oct 22 02:30:41 hostname kernel: sd 1:0:0:0: [sdb] Attached SCSI disk
Oct 22 02:30:41 hostname kernel: sd 6:0:0:0: [sde] Attached SCSI disk
Oct 22 02:30:41 hostname kernel: i8042: PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp
Oct 22 02:30:41 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 22 02:30:41 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 22 02:30:41 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 22 02:30:41 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 22 02:30:41 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 22 02:30:41 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 22 02:30:41 hostname kernel: random: lvm: uninitialized urandom read (4 bytes read)
Oct 22 02:30:41 hostname kernel: audit: type=1130 audit(1666423841.009:2): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=modprobe@fuse comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 22 02:30:41 hostname kernel: audit: type=1131 audit(1666423841.009:3): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=modprobe@fuse comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 22 02:30:41 hostname kernel: sd 0:0:0:0: Attached scsi generic sg0 type 0
Oct 22 02:30:41 hostname kernel: sd 1:0:0:0: Attached scsi generic sg1 type 0
Oct 22 02:30:41 hostname kernel: sd 2:0:0:0: Attached scsi generic sg2 type 0
Oct 22 02:30:41 hostname kernel: sd 3:0:0:0: Attached scsi generic sg3 type 0
Oct 22 02:30:41 hostname kernel: sd 6:0:0:0: Attached scsi generic sg4 type 0
Oct 22 02:30:41 hostname kernel: sr 7:0:0:0: Attached scsi generic sg5 type 5
Oct 22 02:30:41 hostname kernel: audit: type=1130 audit(1666423841.032:4): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-sysusers comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 22 02:30:41 hostname kernel: audit: type=1130 audit(1666423841.062:5): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-tmpfiles-setup-dev comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 22 02:30:41 hostname kernel: audit: type=1334 audit(1666423841.062:6): prog-id=6 op=LOAD
Oct 22 02:30:41 hostname kernel: audit: type=1334 audit(1666423841.062:7): prog-id=7 op=LOAD
Oct 22 02:30:41 hostname kernel: audit: type=1130 audit(1666423841.092:8): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-journald comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 22 02:30:41 hostname kernel: audit: type=1130 audit(1666423841.122:9): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=lvm2-monitor comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 22 02:30:41 hostname kernel: audit: type=1130 audit(1666423841.135:10): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-udevd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 22 02:30:41 hostname kernel: random: crng init done
Oct 22 02:30:41 hostname kernel: power_meter ACPI000D:00: Ignoring unsafe software power cap!
Oct 22 02:30:41 hostname kernel: power_meter ACPI000D:00: hwmon_device_register() is deprecated. Please convert the driver to use hwmon_device_register_with_info().
Oct 22 02:30:41 hostname kernel: i2c i2c-0: Systems with more than 4 memory slots not supported yet, not instantiating SPD
Oct 22 02:30:42 hostname systemd-udevd[672]: event4: Failed to call EVIOCSKEYCODE with scan code 0x7c, and key code 190: Invalid argument
Oct 22 02:30:42 hostname systemd-udevd[741]: controlC1: Process '/usr/bin/alsactl restore 1' failed with exit code 2.
Oct 22 02:30:42 hostname systemd-udevd[711]: controlC0: Process '/usr/bin/alsactl restore 0' failed with exit code 2.
Oct 22 02:30:42 hostname ntfs-3g[1122]: Version 2022.5.17 external FUSE 29
Oct 22 02:30:42 hostname ntfs-3g[1122]: Mounted /dev/sde3 (Read-Write, label "shared_music", NTFS 3.1)
Oct 22 02:30:42 hostname ntfs-3g[1122]: Cmdline options: rw,nosuid,nodev
Oct 22 02:30:42 hostname ntfs-3g[1122]: Mount options: nosuid,nodev,allow_other,nonempty,relatime,rw,fsname=/dev/sde3,blkdev,blksize=4096
Oct 22 02:30:42 hostname ntfs-3g[1122]: Ownership and permissions disabled, configuration type 7
Oct 22 02:30:42 hostname kernel: nvidia: loading out-of-tree module taints kernel.
Oct 22 02:30:42 hostname kernel: nvidia: module license 'NVIDIA' taints kernel.
Oct 22 02:30:42 hostname kernel: Disabling lock debugging due to kernel taint
Oct 22 02:30:42 hostname kernel: nvidia: module verification failed: signature and/or required key missing - tainting kernel
Oct 22 02:30:42 hostname kernel:
Oct 22 02:30:42 hostname kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  520.56.06  Thu Oct  6 21:38:55 UTC 2022
Oct 22 02:30:42 hostname systemd-udevd[638]: nvidia: Process '/usr/bin/bash -c '/usr/bin/mknod -Z -m 666 /dev/nvidiactl c $(grep nvidia-frontend /proc/devices | cut -d \  -f 1) 255'' failed with exit code 1.
Oct 22 02:30:43 hostname systemd-udevd[636]: nvidia: Process '/usr/bin/bash -c 'for i in $(cat /proc/driver/nvidia/gpus/*/information | grep Minor | cut -d \  -f 4); do /usr/bin/mknod -Z -m 666 /dev/nvidia${i} c $(grep nvidia-frontend /proc/devices | cut -d \  -f 1) ${i}; done'' failed with exit code 1.

I will post the output for 5.19 sometime tomorrow.

post also output from:
journalctl -b-1 --no-pager | tail -30
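
Since `-b-1` only means "the boot before the current one", it can be worth confirming which kernel a given boot actually ran before comparing logs. A minimal sketch, assuming the log contains the usual "Linux version" line (it is logged at notice priority, so `-p4` filters it out); the helper name `kernel_release_from_log` is made up for illustration:

```shell
# Print the kernel release named in the first "Linux version" line read from stdin.
# Hypothetical helper; it only parses text, it does not call journalctl itself.
kernel_release_from_log() {
    sed -n -E 's/.*kernel: Linux version ([^ ]+).*/\1/p' | head -n 1
}

# Possible usage on the affected machine:
#   journalctl --list-boots                                  # list boot indices
#   journalctl -b -1 -k --no-pager | kernel_release_from_log  # kernel of the previous boot
```

If the printed release is not the kernel that froze, the `-b` index needs adjusting before the logs are worth reading.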

$ journalctl -b -1 -p5 --no-pager  # output for the 5.19 kernel

Oct 22 14:14:50 hostname kernel: Linux version 5.19.16-2-MANJARO (builduser@fv-az204-563) (gcc (GCC) 12.2.0, GNU ld (GNU Binutils) 2.39.0) #1 SMP PREEMPT_DYNAMIC Sat Oct 15 13:37:00 UTC 2022
Oct 22 14:14:50 hostname kernel: Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.19-x86_64 root=UUID=92d022fa-7e5b-4c9e-aa2e-dfbbfd05f5c8 rw quiet ibt=off resume=UUID=3b8de6df-ba38-447e-a7eb-cae4837b2942
Oct 22 14:14:50 hostname kernel: Unknown kernel command line parameters "BOOT_IMAGE=/boot/vmlinuz-5.19-x86_64", will be passed to user space.
Oct 22 14:14:50 hostname kernel: random: crng init done
Oct 22 14:14:50 hostname kernel:  #11 #12 #13 #14 #15 #16 #17 #18 #19
Oct 22 14:14:50 hostname kernel: MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.
Oct 22 14:14:50 hostname kernel: TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.
Oct 22 14:14:50 hostname kernel: MMIO Stale Data CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/processor_mmio_stale_data.html for more details.
Oct 22 14:14:50 hostname kernel:  #21 #22 #23 #24 #25 #26 #27 #28 #29
Oct 22 14:14:50 hostname kernel: audit: type=2000 audit(1666466087.263:1): state=initialized audit_enabled=0 res=1
Oct 22 14:14:50 hostname kernel: ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
Oct 22 14:14:50 hostname kernel: ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored
Oct 22 14:14:50 hostname kernel: pci_bus 0000:ff: Unknown NUMA node; performance will be reduced
Oct 22 14:14:50 hostname kernel: pci_bus 0000:7f: Unknown NUMA node; performance will be reduced
Oct 22 14:14:50 hostname kernel: SCSI subsystem initialized
Oct 22 14:14:50 hostname kernel: VFS: Disk quotas dquot_6.6.0
Oct 22 14:14:50 hostname kernel: Initialise system trusted keyrings
Oct 22 14:14:50 hostname kernel: Key type blacklist registered
Oct 22 14:14:50 hostname kernel: integrity: Platform Keyring initialized
Oct 22 14:14:50 hostname kernel: integrity: Machine keyring initialized
Oct 22 14:14:50 hostname kernel: Key type asymmetric registered
Oct 22 14:14:50 hostname kernel: Asymmetric key parser 'x509' registered
Oct 22 14:14:50 hostname kernel: pmd_set_huge: Cannot satisfy [mem 0xc1000000-0xc1200000] with a huge-page mapping due to MTRR override.
Oct 22 14:14:50 hostname kernel: Loading compiled-in X.509 certificates
Oct 22 14:14:50 hostname kernel: Loaded X.509 cert 'Build time autogenerated kernel key: 5e027128b890cd30e6c2c399299e05ff28f2695c'
Oct 22 14:14:50 hostname kernel: Key type ._fscrypt registered
Oct 22 14:14:50 hostname kernel: Key type .fscrypt registered
Oct 22 14:14:50 hostname kernel: Key type fscrypt-provisioning registered
Oct 22 14:14:50 hostname kernel: ata4.00: supports DRM functions and may not be fully accessible
Oct 22 14:14:50 hostname kernel: scsi 0:0:0:0: Direct-Access     ATA      WDC WD40EFRX-68N 0A82 PQ: 0 ANSI: 5
Oct 22 14:14:50 hostname kernel: sd 0:0:0:0: [sda] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
Oct 22 14:14:50 hostname kernel: sd 0:0:0:0: [sda] 4096-byte physical blocks
Oct 22 14:14:50 hostname kernel: sd 0:0:0:0: [sda] Write Protect is off
Oct 22 14:14:50 hostname kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 22 14:14:50 hostname kernel: ata4.00: supports DRM functions and may not be fully accessible
Oct 22 14:14:50 hostname kernel: scsi 1:0:0:0: Direct-Access     ATA      ST4000VN008-2DR1 SC60 PQ: 0 ANSI: 5
Oct 22 14:14:50 hostname kernel: sd 1:0:0:0: [sdb] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
Oct 22 14:14:50 hostname kernel: sd 1:0:0:0: [sdb] 4096-byte physical blocks
Oct 22 14:14:50 hostname kernel: sd 1:0:0:0: [sdb] Write Protect is off
Oct 22 14:14:50 hostname kernel: sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 22 14:14:50 hostname kernel: scsi 2:0:0:0: Direct-Access     ATA      ST1500DL003-9VT1 CC32 PQ: 0 ANSI: 5
Oct 22 14:14:50 hostname kernel: sd 2:0:0:0: [sdc] 2930277168 512-byte logical blocks: (1.50 TB/1.36 TiB)
Oct 22 14:14:50 hostname kernel: sd 2:0:0:0: [sdc] Write Protect is off
Oct 22 14:14:50 hostname kernel: scsi 3:0:0:0: Direct-Access     ATA      Samsung SSD 850  1B6Q PQ: 0 ANSI: 5
Oct 22 14:14:50 hostname kernel: sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 22 14:14:50 hostname kernel: sd 3:0:0:0: [sdd] 976773168 512-byte logical blocks: (500 GB/466 GiB)
Oct 22 14:14:50 hostname kernel: sd 3:0:0:0: [sdd] Write Protect is off
Oct 22 14:14:50 hostname kernel: sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 22 14:14:50 hostname kernel: sd 3:0:0:0: [sdd] supports TCG Opal
Oct 22 14:14:50 hostname kernel: sd 3:0:0:0: [sdd] Attached SCSI disk
Oct 22 14:14:50 hostname kernel: sd 0:0:0:0: [sda] Attached SCSI disk
Oct 22 14:14:50 hostname kernel: scsi 6:0:0:0: Direct-Access     ATA      Hitachi HUA72302 A840 PQ: 0 ANSI: 5
Oct 22 14:14:50 hostname kernel: sd 6:0:0:0: [sde] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
Oct 22 14:14:50 hostname kernel: sd 6:0:0:0: [sde] Write Protect is off
Oct 22 14:14:50 hostname kernel: sd 6:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 22 14:14:50 hostname kernel: scsi 7:0:0:0: CD-ROM            HL-DT-ST BD-RE  WH16NS40  1.02 PQ: 0 ANSI: 5
Oct 22 14:14:50 hostname kernel: sd 2:0:0:0: [sdc] Attached SCSI disk
Oct 22 14:14:50 hostname kernel: sd 6:0:0:0: [sde] Attached SCSI disk
Oct 22 14:14:50 hostname kernel: sd 1:0:0:0: [sdb] Attached SCSI disk
Oct 22 14:14:50 hostname kernel: i8042: PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp
Oct 22 14:14:51 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 22 14:14:51 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 22 14:14:51 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 22 14:14:51 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 22 14:14:51 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 22 14:14:51 hostname kernel: pstore: crypto_comp_decompress failed, ret = -22!
Oct 22 14:14:51 hostname kernel: audit: type=1130 audit(1666466090.922:2): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=modprobe@fuse comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 22 14:14:51 hostname kernel: audit: type=1131 audit(1666466090.922:3): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=modprobe@fuse comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 22 14:14:51 hostname kernel: sd 0:0:0:0: Attached scsi generic sg0 type 0
Oct 22 14:14:51 hostname kernel: sd 1:0:0:0: Attached scsi generic sg1 type 0
Oct 22 14:14:51 hostname kernel: sd 2:0:0:0: Attached scsi generic sg2 type 0
Oct 22 14:14:51 hostname kernel: sd 3:0:0:0: Attached scsi generic sg3 type 0
Oct 22 14:14:51 hostname kernel: sd 6:0:0:0: Attached scsi generic sg4 type 0
Oct 22 14:14:51 hostname kernel: sr 7:0:0:0: Attached scsi generic sg5 type 5
Oct 22 14:14:51 hostname kernel: audit: type=1130 audit(1666466090.942:4): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-sysusers comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 22 14:14:51 hostname kernel: audit: type=1130 audit(1666466090.969:5): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-tmpfiles-setup-dev comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 22 14:14:51 hostname kernel: audit: type=1334 audit(1666466090.969:6): prog-id=9 op=LOAD
Oct 22 14:14:51 hostname kernel: audit: type=1334 audit(1666466090.969:7): prog-id=10 op=LOAD
Oct 22 14:14:51 hostname kernel: audit: type=1130 audit(1666466091.005:8): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-journald comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 22 14:14:51 hostname kernel: audit: type=1130 audit(1666466091.035:9): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=lvm2-monitor comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 22 14:14:51 hostname kernel: audit: type=1130 audit(1666466091.042:10): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-udevd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 22 14:14:51 hostname kernel: power_meter ACPI000D:00: Ignoring unsafe software power cap!
Oct 22 14:14:51 hostname kernel: power_meter ACPI000D:00: hwmon_device_register() is deprecated. Please convert the driver to use hwmon_device_register_with_info().
Oct 22 14:14:51 hostname kernel: i2c i2c-0: Systems with more than 4 memory slots not supported yet, not instantiating SPD
Oct 22 14:14:51 hostname kernel: asus_wmi: fan_curve_get_factory_default (0x00110024) failed: -61
Oct 22 14:14:51 hostname kernel: asus_wmi: fan_curve_get_factory_default (0x00110025) failed: -61
Oct 22 14:14:51 hostname systemd-udevd[576]: event4: Failed to call EVIOCSKEYCODE with scan code 0x7c, and key code 190: Invalid argument
Oct 22 14:14:52 hostname systemd-udevd[608]: controlC1: Process '/usr/bin/alsactl restore 1' failed with exit code 2.
Oct 22 14:14:52 hostname ntfs-3g[1011]: Version 2022.5.17 external FUSE 29
Oct 22 14:14:52 hostname ntfs-3g[1011]: Mounted /dev/sde3 (Read-Write, label "shared_music", NTFS 3.1)
Oct 22 14:14:52 hostname ntfs-3g[1011]: Cmdline options: rw,nosuid,nodev
Oct 22 14:14:52 hostname ntfs-3g[1011]: Mount options: nosuid,nodev,allow_other,nonempty,relatime,rw,fsname=/dev/sde3,blkdev,blksize=4096
Oct 22 14:14:52 hostname ntfs-3g[1011]: Ownership and permissions disabled, configuration type 7
Oct 22 14:14:52 hostname systemd-udevd[585]: controlC0: Process '/usr/bin/alsactl restore 0' failed with exit code 2.
Oct 22 14:14:52 hostname kernel: nvidia: loading out-of-tree module taints kernel.
Oct 22 14:14:52 hostname kernel: nvidia: module license 'NVIDIA' taints kernel.
Oct 22 14:14:52 hostname kernel: Disabling lock debugging due to kernel taint
Oct 22 14:14:52 hostname kernel: nvidia: module verification failed: signature and/or required key missing - tainting kernel
Oct 22 14:14:53 hostname kernel:
Oct 22 14:14:53 hostname kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  520.56.06  Thu Oct  6 21:38:55 UTC 2022
Oct 22 14:14:53 hostname systemd-udevd[597]: nvidia: Process '/usr/bin/bash -c '/usr/bin/mknod -Z -m 666 /dev/nvidiactl c $(grep nvidia-frontend /proc/devices | cut -d \  -f 1) 255'' failed with exit code 1.

$ journalctl -b-1 --no-pager | tail -30  # for the 5.19 kernel

Oct 22 14:14:52 hostname mtp-probe[1053]: checking bus 3, device 2: "/sys/devices/pci0000:00/0000:00:14.0/usb3/3-3"
Oct 22 14:14:52 hostname mtp-probe[1052]: bus: 3, device: 6 was not an MTP device
Oct 22 14:14:52 hostname mtp-probe[1053]: bus: 3, device: 2 was not an MTP device
Oct 22 14:14:52 hostname kernel: nvidia: loading out-of-tree module taints kernel.
Oct 22 14:14:52 hostname kernel: nvidia: module license 'NVIDIA' taints kernel.
Oct 22 14:14:52 hostname kernel: Disabling lock debugging due to kernel taint
Oct 22 14:14:52 hostname kernel: nvidia: module verification failed: signature and/or required key missing - tainting kernel
Oct 22 14:14:53 hostname kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 236
Oct 22 14:14:53 hostname kernel:
Oct 22 14:14:53 hostname kernel: nvidia 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
Oct 22 14:14:53 hostname kernel: input: ROCCAT ROCCAT Kone XTD as /devices/pci0000:00/0000:00:14.0/usb3/3-14/3-14:1.0/0003:1E7D:2E22.0002/input/input20
Oct 22 14:14:53 hostname kernel: koneplus 0003:1E7D:2E22.0002: input,hiddev97,hidraw1: USB HID v1.00 Mouse [ROCCAT ROCCAT Kone XTD] on usb-0000:00:14.0-14/input0
Oct 22 14:14:53 hostname kernel: input: ROCCAT ROCCAT Kone XTD as /devices/pci0000:00/0000:00:14.0/usb3/3-14/3-14:1.1/0003:1E7D:2E22.0003/input/input21
Oct 22 14:14:53 hostname kernel: koneplus 0003:1E7D:2E22.0003: input,hidraw2: USB HID v1.11 Keyboard [ROCCAT ROCCAT Kone XTD] on usb-0000:00:14.0-14/input1
Oct 22 14:14:53 hostname kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  520.56.06  Thu Oct  6 21:38:55 UTC 2022
Oct 22 14:14:53 hostname systemd-udevd[597]: nvidia: Process '/usr/bin/bash -c '/usr/bin/mknod -Z -m 666 /dev/nvidiactl c $(grep nvidia-frontend /proc/devices | cut -d \  -f 1) 255'' failed with exit code 1.
Oct 22 14:14:53 hostname systemd-modules-load[548]: Inserted module 'nvidia'
Oct 22 14:14:53 hostname mtp-probe[1175]: checking bus 3, device 6: "/sys/devices/pci0000:00/0000:00:14.0/usb3/3-14"
Oct 22 14:14:53 hostname mtp-probe[1175]: bus: 3, device: 6 was not an MTP device
Oct 22 14:14:53 hostname kernel: mousedev: PS/2 mouse device common for all mice
Oct 22 14:14:53 hostname systemd[1]: Mounted /var.
Oct 22 14:14:53 hostname systemd[1]: Listening on Load/Save RF Kill Switch Status /dev/rfkill Watch.
Oct 22 14:14:53 hostname systemd[1]: Virtual Machine and Container Storage (Compatibility) was skipped because of a failed condition check (ConditionPathExists=/var/lib/machines.raw).
Oct 22 14:14:53 hostname systemd[1]: Reached target Local File Systems.
Oct 22 14:14:53 hostname systemd[1]: Starting Rebuild Dynamic Linker Cache...
Oct 22 14:14:53 hostname systemd[1]: Set Up Additional Binary Formats was skipped because all trigger condition checks failed.
Oct 22 14:14:53 hostname systemd[1]: Starting Flush Journal to Persistent Storage...
Oct 22 14:14:53 hostname systemd[1]: Starting Load/Save Random Seed...
Oct 22 14:14:53 hostname systemd-journald[547]: Time spent on flushing to /var/log/journal/e6439e661a9f456ca18c3f9edb0ca8c5 is 481.909ms for 1476 entries.
Oct 22 14:14:53 hostname systemd-journald[547]: System Journal (/var/log/journal/e6439e661a9f456ca18c3f9edb0ca8c5) is 48.0M, max 4.0G, 3.9G free.

Damn, I have no idea; the last normal log entries are from flushing the journal …
and this:

nvidia: Process '/usr/bin/bash -c '/usr/bin/mknod -Z -m 666 /dev/nvidiactl c $(grep nvidia-frontend /proc/devices | cut -d \  -f 1) 255'' failed with exit code

is just a warning, and it's not responsible for the freezes…
You can try booting the latest Manjaro ISO, which ships with the 5.15 kernel, to see if it is affected too…

also post output from:
cat /etc/mkinitcpio.conf

Thank you for still trying to help me. I will try a live boot with the latest Manjaro ISO when I get a chance; I am sure it will work. In the meantime, see below for my /etc/mkinitcpio.conf:

# vim:set ft=sh
# MODULES
# The following modules are loaded before any boot hooks are
# run.  Advanced users may wish to specify all system modules
# in this array.  For instance:
#     MODULES=(piix ide_disk reiserfs)
MODULES="crc32c-intel vfio_pci vfio vfio_iommu_type1 vfio_virqfd"

# BINARIES
# This setting includes any additional binaries a given user may
# wish into the CPIO image.  This is run last, so it may be used to
# override the actual binaries included by a given hook
# BINARIES are dependency parsed, so you may safely ignore libraries
BINARIES=()

# FILES
# This setting is similar to BINARIES above, however, files are added
# as-is and are not parsed in any way.  This is useful for config files.
FILES=""

# HOOKS
# This is the most important setting in this file.  The HOOKS control the
# modules and scripts added to the image, and what happens at boot time.
# Order is important, and it is recommended that you do not change the
# order in which HOOKS are added.  Run 'mkinitcpio -H <hook name>' for
# help on a given hook.
# 'base' is _required_ unless you know precisely what you are doing.
# 'udev' is _required_ in order to automatically load modules
# 'filesystems' is _required_ unless you specify your fs modules in MODULES
# Examples:
##   This setup specifies all modules in the MODULES setting above.
##   No raid, lvm2, or encrypted root is needed.
#    HOOKS=(base)
#
##   This setup will autodetect all modules for your system and should
##   work as a sane default
#    HOOKS=(base udev autodetect block filesystems)
#
##   This setup will generate a 'full' image which supports most systems.
##   No autodetection is done.
#    HOOKS=(base udev block filesystems)
#
##   This setup assembles a pata mdadm array with an encrypted root FS.
##   Note: See 'mkinitcpio -H mdadm' for more information on raid devices.
#    HOOKS=(base udev block mdadm encrypt filesystems)
#
##   This setup loads an lvm2 volume group on a usb device.
#    HOOKS=(base udev block lvm2 filesystems)
#
##   NOTE: If you have /usr on a separate partition, you MUST include the
#    usr, fsck and shutdown hooks.
HOOKS="base udev autodetect modconf block keyboard keymap resume filesystems"

# COMPRESSION
# Use this to compress the initramfs image. By default, gzip compression
# is used. Use 'cat' to create an uncompressed image.
#COMPRESSION="gzip"
#COMPRESSION="bzip2"
#COMPRESSION="lzma"
#COMPRESSION="xz"
#COMPRESSION="lzop"
#COMPRESSION="lz4"

# COMPRESSION_OPTIONS
# Additional options for the compressor
#COMPRESSION_OPTIONS=()

Edit: Noticing the vfio_iommu_type1 module above, I removed it and uninstalled/reinstalled kernel 5.15, still with no joy.

I would remove all the vfio modules from the MODULES line:
sudo nano /etc/mkinitcpio.conf

MODULES="crc32c-intel"

save the file and exit (ctrl+x, then y and Enter) and run:
sudo mkinitcpio -P
This is only for testing purposes, to see if it's related…

Also disable /etc/modprobe.d/vfio.conf by commenting out its options line:
sudo nano /etc/modprobe.d/vfio.conf

# options vfio-pci ids=10de:1c82,10de:0fb9

save it (ctrl+x, then y and Enter) and reboot
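
For anyone who prefers not to hand-edit, the two changes above can be sketched as shell functions. This is only an illustration under assumptions: the function names are made up, each function keeps a `.bak` copy next to the file it touches, and the matching is deliberately simple (any word starting with `vfio` on the `MODULES=` line, and any active `options vfio-pci` line).

```shell
# Remove every vfio* entry from the MODULES= line of an mkinitcpio.conf-style file.
# A backup of the original is kept as <file>.bak.
strip_vfio_modules() {
    local conf="$1"
    cp -- "$conf" "$conf.bak"
    sed -E \
        -e '/^MODULES=/s/[[:space:]]*vfio[A-Za-z0-9_-]*//g' \
        -e '/^MODULES=/s/^(MODULES=["(])[[:space:]]+/\1/' \
        "$conf.bak" > "$conf"
}

# Comment out any active "options vfio-pci ..." line in a modprobe.d-style file.
# Lines that are already commented are left untouched.
comment_out_vfio_options() {
    local conf="$1"
    cp -- "$conf" "$conf.bak"
    sed -E 's/^[[:space:]]*(options[[:space:]]+vfio[-_]pci.*)$/# \1/' \
        "$conf.bak" > "$conf"
}

# Possible usage as root, followed by the initramfs rebuild and a reboot:
#   strip_vfio_modules /etc/mkinitcpio.conf
#   comment_out_vfio_options /etc/modprobe.d/vfio.conf
#   mkinitcpio -P
```

Restoring the `.bak` files and running `mkinitcpio -P` again should undo the test.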

How much I wanted this to work, but it did not. Unfortunately, there was no change in behavior. Once kernel 5.10 loses support, I will probably have to reinstall the whole system from scratch and hope that it actually works. In the meantime, I will try booting a live manjaro-kde ISO (kernel 5.15) and see how that one behaves.

Thank you for spending time to troubleshoot with me. This topic remains unsolved.