why did you add this parameter:
do you need it?
also add another kernel parameter: ibt=off
save the grub config, update grub, reboot and test if it helped
if it doesn't work, provide logs from this: journalctl -b-1 -p5 --no-pager
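The ibt=off step above amounts to appending the parameter to the GRUB_CMDLINE_LINUX_DEFAULT line in /etc/default/grub and regenerating the grub config. A minimal sketch, assuming the default file location and GNU sed; the helper name add_kernel_param is made up for illustration:

```shell
# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT in a grub defaults file.
# Hypothetical helper; the second argument defaults to /etc/default/grub.
add_kernel_param() {
    param="$1"
    file="${2:-/etc/default/grub}"
    # Leave the file alone if the parameter is already on that line.
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*[\" ]${param}[\" ]" "$file"; then
        return 0
    fi
    # Insert the parameter just before the closing quote.
    sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 ${param}\"|" "$file"
}

# Intended use (needs root for the real file):
#   add_kernel_param ibt=off /etc/default/grub
#   sudo update-grub
```

It is worth trying this on a copy of the file first and checking the resulting line before running update-grub and rebooting.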
and where do you get stuck? on the dev… clean display?
can you enter a tty on the stuck screen: ctrl+alt+f2 (or any of f1-f6), enter your username/password and type: startx - if it doesn't work, take a pic of the screen; it usually also saves a log in your home folder: ~/.local/share/xorg
also provide output from:
mhwd -l && mhwd -li
ls /etc/modprobe.d
find /etc/X11/ -name "*.conf"
ls /etc/udev/rules.d/
ls /usr/lib/udev/rules.d/*nvidia*
pamac list -qm
I don’t need it anymore, so I removed it as recommended.
Unfortunately, it did not help.
Interestingly, this is what I got this time. I wonder if it has anything to do with what happened. After issuing the sudo reboot command, it took a long time for the system to reboot, so I manually power cycled it. Once the system recovered, I tried booting into kernel 5.15, which failed again, then rebooted into 5.10 before issuing the command above, with the following output:
Journal file /var/log/journal/e6439e661a9f456ca18c3f9edb0ca8c5/system@0005eb816d8cead1-f159dd3353c999b9.journal~ is truncated, ignoring file.
-- No entries --
The system freezes shortly after the following message is displayed. I get no tty, no login prompt of any kind. ctrl+alt+fn combinations do nothing.
starting version 251.5-1-manjaro
> 0000:03:00.0 (0300:10de:1bb1) Display controller nVidia Corporation:
--------------------------------------------------------------------------------
NAME                  VERSION       FREEDRIVER    TYPE
--------------------------------------------------------------------------------
video-nvidia          2021.11.04    false         PCI
video-nvidia-470xx    2021.11.04    false         PCI
video-nvidia-390xx    2021.11.26    false         PCI
video-linux           2018.05.04    true          PCI
video-modesetting     2020.01.13    true          PCI
video-vesa            2017.03.12    true          PCI
> 0000:09:00.0 (0300:1a03:2000) Display controller ASPEED Technology Inc.:
--------------------------------------------------------------------------------
NAME                  VERSION       FREEDRIVER    TYPE
--------------------------------------------------------------------------------
video-modesetting     2020.01.13    true          PCI
video-vesa            2017.03.12    true          PCI
> Installed PCI configs:
--------------------------------------------------------------------------------
NAME                  VERSION       FREEDRIVER    TYPE
--------------------------------------------------------------------------------
video-nvidia          2021.11.04    false         PCI
video-modesetting     2020.01.13    true          PCI
Warning: No installed USB configs!
also python2 was dropped, so if you are not using it you can remove it…
reinstall the 5.15 kernel and its modules by uninstalling them and installing them again…
also install the 5.19 kernel - not the rt one;
reboot and try again with both; if you get stuck, provide logs from the stuck boot: journalctl -b-1 -p5 --no-pager
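On Manjaro the kernel reinstall can be done with mhwd-kernel. A minimal sketch that only prints the commands so they can be reviewed first; the helper names kernel_pkg and reinstall_plan are made up here, and the linux515/linux519 package names assume Manjaro's usual naming scheme:

```shell
# Map a kernel series such as "5.15" to a package name such as "linux515".
# Assumes Manjaro's usual "linux" + digits naming.
kernel_pkg() {
    printf 'linux%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

# Print (do not run) the remove/install steps for one kernel series.
reinstall_plan() {
    pkg="$(kernel_pkg "$1")"
    echo "sudo mhwd-kernel -r $pkg"
    echo "sudo mhwd-kernel -i $pkg"
}

# Example:
#   reinstall_plan 5.15                          # review, then run the printed lines
#   echo "sudo mhwd-kernel -i $(kernel_pkg 5.19)"
```

mhwd-kernel should refuse to remove the kernel that is currently running, so the removal step is best done while booted into 5.10.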
$ journalctl -b-1 --no-pager | tail -30 # for kernel 5.19
Oct 22 14:14:52 hostname mtp-probe[1053]: checking bus 3, device 2: "/sys/devices/pci0000:00/0000:00:14.0/usb3/3-3"
Oct 22 14:14:52 hostname mtp-probe[1052]: bus: 3, device: 6 was not an MTP device
Oct 22 14:14:52 hostname mtp-probe[1053]: bus: 3, device: 2 was not an MTP device
Oct 22 14:14:52 hostname kernel: nvidia: loading out-of-tree module taints kernel.
Oct 22 14:14:52 hostname kernel: nvidia: module license 'NVIDIA' taints kernel.
Oct 22 14:14:52 hostname kernel: Disabling lock debugging due to kernel taint
Oct 22 14:14:52 hostname kernel: nvidia: module verification failed: signature and/or required key missing - tainting kernel
Oct 22 14:14:53 hostname kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 236
Oct 22 14:14:53 hostname kernel: 
Oct 22 14:14:53 hostname kernel: nvidia 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
Oct 22 14:14:53 hostname kernel: input: ROCCAT ROCCAT Kone XTD as /devices/pci0000:00/0000:00:14.0/usb3/3-14/3-14:1.0/0003:1E7D:2E22.0002/input/input20
Oct 22 14:14:53 hostname kernel: koneplus 0003:1E7D:2E22.0002: input,hiddev97,hidraw1: USB HID v1.00 Mouse [ROCCAT ROCCAT Kone XTD] on usb-0000:00:14.0-14/input0
Oct 22 14:14:53 hostname kernel: input: ROCCAT ROCCAT Kone XTD as /devices/pci0000:00/0000:00:14.0/usb3/3-14/3-14:1.1/0003:1E7D:2E22.0003/input/input21
Oct 22 14:14:53 hostname kernel: koneplus 0003:1E7D:2E22.0003: input,hidraw2: USB HID v1.11 Keyboard [ROCCAT ROCCAT Kone XTD] on usb-0000:00:14.0-14/input1
Oct 22 14:14:53 hostname kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 520.56.06 Thu Oct 6 21:38:55 UTC 2022
Oct 22 14:14:53 hostname systemd-udevd[597]: nvidia: Process '/usr/bin/bash -c '/usr/bin/mknod -Z -m 666 /dev/nvidiactl c $(grep nvidia-frontend /proc/devices | cut -d \ -f 1) 255'' failed with exit code 1.
Oct 22 14:14:53 hostname systemd-modules-load[548]: Inserted module 'nvidia'
Oct 22 14:14:53 hostname mtp-probe[1175]: checking bus 3, device 6: "/sys/devices/pci0000:00/0000:00:14.0/usb3/3-14"
Oct 22 14:14:53 hostname mtp-probe[1175]: bus: 3, device: 6 was not an MTP device
Oct 22 14:14:53 hostname kernel: mousedev: PS/2 mouse device common for all mice
Oct 22 14:14:53 hostname systemd[1]: Mounted /var.
Oct 22 14:14:53 hostname systemd[1]: Listening on Load/Save RF Kill Switch Status /dev/rfkill Watch.
Oct 22 14:14:53 hostname systemd[1]: Virtual Machine and Container Storage (Compatibility) was skipped because of a failed condition check (ConditionPathExists=/var/lib/machines.raw).
Oct 22 14:14:53 hostname systemd[1]: Reached target Local File Systems.
Oct 22 14:14:53 hostname systemd[1]: Starting Rebuild Dynamic Linker Cache...
Oct 22 14:14:53 hostname systemd[1]: Set Up Additional Binary Formats was skipped because all trigger condition checks failed.
Oct 22 14:14:53 hostname systemd[1]: Starting Flush Journal to Persistent Storage...
Oct 22 14:14:53 hostname systemd[1]: Starting Load/Save Random Seed...
Oct 22 14:14:53 hostname systemd-journald[547]: Time spent on flushing to /var/log/journal/e6439e661a9f456ca18c3f9edb0ca8c5 is 481.909ms for 1476 entries.
Oct 22 14:14:53 hostname systemd-journald[547]: System Journal (/var/log/journal/e6439e661a9f456ca18c3f9edb0ca8c5) is 48.0M, max 4.0G, 3.9G free.
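A tail like the one above is easier to scan when only the suspicious lines are kept. A minimal sketch, assuming the journal text is piped in on stdin; the function name journal_flags and the chosen patterns are illustrative and not part of any tool:

```shell
# Keep only journal lines that mention a failed process or a kernel taint.
# Reads journal text on stdin; the pattern list is an illustrative assumption.
journal_flags() {
    grep -Ei 'failed with exit code|taints kernel|tainting kernel' || true
}

# Example (previous boot, as used in this thread):
#   journalctl -b-1 --no-pager | tail -30 | journal_flags
```

On the log above this would leave the nvidia taint messages and the mknod line, none of which necessarily points at the cause of the freeze.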
damn, I have no idea; the last normal logs are from flushing the journal …
and this:
nvidia: Process '/usr/bin/bash -c '/usr/bin/mknod -Z -m 666 /dev/nvidiactl c $(grep nvidia-frontend /proc/devices | cut -d \ -f 1) 255'' failed with exit code
is just a warning, and it's not responsible for the freezes…
you can try booting into latest manjaro iso, which ships with the 5.15 kernel, to see if it is affected too…
Thank you for still trying to help me. I will try to live boot with the latest manjaro iso when I get a chance; I am sure it will work. In the meantime see below for my /etc/mkinitcpio.conf
# vim:set ft=sh
# MODULES
# The following modules are loaded before any boot hooks are
# run. Advanced users may wish to specify all system modules
# in this array. For instance:
# MODULES=(piix ide_disk reiserfs)
MODULES="crc32c-intel vfio_pci vfio vfio_iommu_type1 vfio_virqfd"
# BINARIES
# This setting includes any additional binaries a given user may
# wish into the CPIO image. This is run last, so it may be used to
# override the actual binaries included by a given hook
# BINARIES are dependency parsed, so you may safely ignore libraries
BINARIES=()
# FILES
# This setting is similar to BINARIES above, however, files are added
# as-is and are not parsed in any way. This is useful for config files.
FILES=""
# HOOKS
# This is the most important setting in this file. The HOOKS control the
# modules and scripts added to the image, and what happens at boot time.
# Order is important, and it is recommended that you do not change the
# order in which HOOKS are added. Run 'mkinitcpio -H <hook name>' for
# help on a given hook.
# 'base' is _required_ unless you know precisely what you are doing.
# 'udev' is _required_ in order to automatically load modules
# 'filesystems' is _required_ unless you specify your fs modules in MODULES
# Examples:
## This setup specifies all modules in the MODULES setting above.
## No raid, lvm2, or encrypted root is needed.
# HOOKS=(base)
#
## This setup will autodetect all modules for your system and should
## work as a sane default
# HOOKS=(base udev autodetect block filesystems)
#
## This setup will generate a 'full' image which supports most systems.
## No autodetection is done.
# HOOKS=(base udev block filesystems)
#
## This setup assembles a pata mdadm array with an encrypted root FS.
## Note: See 'mkinitcpio -H mdadm' for more information on raid devices.
# HOOKS=(base udev block mdadm encrypt filesystems)
#
## This setup loads an lvm2 volume group on a usb device.
# HOOKS=(base udev block lvm2 filesystems)
#
## NOTE: If you have /usr on a separate partition, you MUST include the
# usr, fsck and shutdown hooks.
HOOKS="base udev autodetect modconf block keyboard keymap resume filesystems"
# COMPRESSION
# Use this to compress the initramfs image. By default, gzip compression
# is used. Use 'cat' to create an uncompressed image.
#COMPRESSION="gzip"
#COMPRESSION="bzip2"
#COMPRESSION="lzma"
#COMPRESSION="xz"
#COMPRESSION="lzop"
#COMPRESSION="lz4"
# COMPRESSION_OPTIONS
# Additional options for the compressor
#COMPRESSION_OPTIONS=()
Edit: Noticing the vfio_iommu_type1 module above, I removed it and uninstalled/reinstalled kernel 5.15, still with no joy.
As much as I wanted this to work, it did not; there was no change in behavior. Once kernel 5.10 loses support, I will probably have to reinstall the whole system from scratch and hope that it actually works. In the meantime, I will try booting a live manjaro-kde iso (kernel 5.15) and see how that one behaves.
Thank you for spending time to troubleshoot with me. This topic remains unsolved.
good idea to try a live usb …
also you can enable early load of the nvidia drivers: kate /etc/mkinitcpio.conf
and edit the modules section to look like this:
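The snippet that followed is not preserved in this thread. A commonly used form for early loading of the proprietary nvidia modules, shown here as an assumption rather than what was actually posted, is:

```shell
# /etc/mkinitcpio.conf (fragment)
# Assumed module list for the proprietary driver; keep any entries you already need.
MODULES=(nvidia nvidia_modeset nvidia_uvm nvidia_drm)

# Then rebuild the initramfs images for all installed kernels:
#   sudo mkinitcpio -P
```

The file shown earlier in this thread uses the older quoted-string style for MODULES, so the existing entries would need to be carried over into whichever style is used.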
Booting with a live usb worked well, no issues. It is not kernel 5.15 that has the problem; it is the “rolling” part of the rolling release that seems to fail.
Unfortunately, this did not seem to change anything. I guess I will have to reinstall the OS sometime down the road. It would be a pain to reconfigure everything, VMs, etc.
Thank you for all of your ideas and suggestions, very much appreciated.
did you modify the journal settings? since the logs stop right after flushing the journal… i had the same issue, but the freezes happened randomly, not every time like yours…
post output from: cat /etc/systemd/journald.conf
I did not modify journal settings, please see below.
# This file is part of systemd.
#
# systemd is free software; you can redistribute it and/or modify it under the
# terms of the GNU Lesser General Public License as published by the Free
# Software Foundation; either version 2.1 of the License, or (at your option)
# any later version.
#
# Entries in this file show the compile time defaults. Local configuration
# should be created by either modifying this file, or by creating "drop-ins" in
# the journald.conf.d/ subdirectory. The latter is generally recommended.
# Defaults can be restored by simply deleting this file and all drop-ins.
#
# Use 'systemd-analyze cat-config systemd/journald.conf' to display the full config.
#
# See journald.conf(5) for details.
[Journal]
#Storage=auto
#Compress=yes
#Seal=yes
#SplitMode=uid
#SyncIntervalSec=5m
#RateLimitIntervalSec=30s
#RateLimitBurst=10000
#SystemMaxUse=
#SystemKeepFree=
#SystemMaxFileSize=
#SystemMaxFiles=100
#RuntimeMaxUse=
#RuntimeKeepFree=
#RuntimeMaxFileSize=
#RuntimeMaxFiles=100
#MaxRetentionSec=
#MaxFileSec=1month
#ForwardToSyslog=no
#ForwardToKMsg=no
#ForwardToConsole=no
#ForwardToWall=yes
#TTYPath=/dev/console
#MaxLevelStore=debug
#MaxLevelSyslog=debug
#MaxLevelKMsg=notice
#MaxLevelConsole=info
#MaxLevelWall=emerg
#LineMax=48K
#ReadKMsg=yes
#Audit=yes
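Every setting in the file above is commented out, so the compiled-in defaults apply unless a drop-in overrides them. A quick way to confirm that for any such file, sketched with a made-up helper name active_settings:

```shell
# Print only the active (uncommented) key=value lines of a config file.
# Hypothetical helper; prints nothing when every setting is commented out.
active_settings() {
    grep -E '^[[:space:]]*[A-Za-z][A-Za-z0-9]*=' "$1" || true
}

# Example:
#   active_settings /etc/systemd/journald.conf
# Drop-ins are covered by the command named in the file header:
#   systemd-analyze cat-config systemd/journald.conf
```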
Thank you again. I personally don’t believe the cause is the nvidia drivers, etc. I built this server/workstation in 2016 and have upgraded some of its hardware a few times over the last 6 years. This is the first machine on which I have used Manjaro as the only OS; before that I was a fedora user, and upgrading it every 6 months was a pain. Although Manjaro is a rolling release, I have always been skeptical about the rolling part, since I know junk accumulates over time and even the best rolling-release effort would eventually fail, at least in my opinion/expectations. Six years is not a bad run, and I appreciate the effort put into this project every day. It has been very stable for me and also very easy to use. I am thankful to the developers and maintainers of Manjaro. Kudos for a job well done!
Thanks for your message. I guess kernel 6.1 has not made it to my branch yet; there is only one version listed, which reads kernel 6.1.0rc2-1 and is marked as experimental. I use this computer as a hypervisor hosting many servers, and I am a little worried about trying the experimental kernel on it, as silly as that may sound. Kernel 6.0 definitely did not make any difference for the problem I am having.