System gets stuck

Hi.

My system has started to freeze occasionally over the last few days. Could you please help me troubleshoot?
When this happens, I press the power button, which initiates a shutdown. Since I can see the usual messages on the screen during shutdown, it is obviously not 100% frozen, just stuck.

I gathered the most important info - you can find it on pastebin:
journalctl -p err
cat Xorg.0.log
hardware info
journalctl -p err | grep error

Thank you very much.

The journal doesn’t help. It is cut off because of the pager.
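
One way to avoid the pager cutting lines off is to disable it and redirect the output to a file. A minimal sketch; the function name journal_errors and the output file name are only placeholders:

```shell
# Print error-level journal entries of one boot without the pager.
# boot: 0 = current boot, -1 = previous boot (default: current boot).
journal_errors() {
    boot=${1:-0}
    journalctl -p err --boot="$boot" --no-pager
}

# Example: save the errors of the previous boot for uploading.
# journal_errors -1 > journal-errors-previous-boot.txt
```

The previous boot is usually the interesting one if the system had to be shut down with the power button.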

Maybe try this script to collect data: https://notabug.org/megavolt/random-scripts/src/master/lognconf-report.sh (I wrote it)

You can also run it like that:

bash <(curl -s "https://notabug.org/megavolt/random-scripts/raw/master/lognconf-report.sh")

At the end you get 2 questions: whether to view the content, and whether to upload it or save it to a file. Both outputs are preformatted in Markdown, so you can copy and paste them more easily.

Thank you.

The output exceeds Pastebin’s limit of 512 KB, so please download the zipped logs here: Nextcloud

Is your system up to date?

sudo pacman -Syu

Do you use custom packages?

pacman -Qqm

Do you use a custom repo?

cat /etc/pacman.conf

I vaguely recall an issue with the 6.0 kernel. Test 6.1 to verify:

sudo pacman -S linux61

yes:

(base) [dejhost@Workstation ~]$ sudo pacman -Syu
[sudo] password for dejhost: 
:: Synchronizing package databases...
 core is up to date
 extra is up to date
 community is up to date
 multilib is up to date
:: Starting full system upgrade...
 there is nothing to do

(base) [dejhost@Workstation ~]$ pacman -Qqm
anaconda
anydesk-bin
birdtray
context
cuda-11.1
duplicati-latest
foxitreader
g2o
gnome-icon-theme
gnome-icon-theme-symbolic
gstreamer0.10
gstreamer0.10-base
imagej2
lib32-gtk-engine-murrine
manjaro-documentation-en
manjaro-firmware
pyqt4-common
python-pyqt4
python-sip-pyqt4
qpdfview
qt4
teamviewer
texlive-full
ttf-ms-fonts
unetbootin
visual-studio-code-bin
vivaldi-widevine
webex-bin
wireguard-dkms
xmlmind-xmleditor
(base) [dejhost@Workstation ~]$ 

cat /etc/pacman.conf

sudo pacman -S linux61
resolving dependencies...
looking for conflicting packages...

Packages (1) linux61-6.1.0rc4-1

Total Download Size:   164.58 MiB
Total Installed Size:  168.89 MiB

:: Proceed with installation? [Y/n] n

Please define “freeze” more precisely; reading your text, you most likely mean that the system slows down a lot.

I would suggest that you open 2 consoles (not terminals) using CTRL+ALT+F2/3, run one of these in each, and keep them running:

  • htop
    To monitor which process is using how much CPU.
  • journalctl -xf
    To monitor logs in case stuff gives errors.

Then whenever your system feels like “freezing” again, you can switch to the consoles to check the info provided by those programs. :vulcan_salute:
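
If switching to a console is not possible while the session is stuck, a rough alternative is to write periodic snapshots to a file and read them afterwards. This is only a sketch: the function name log_cpu_snapshot and the log file name are made up, and it assumes a procps-style ps that supports --sort. Note that pcpu in ps is an average over the lifetime of each process, not the live value htop shows.

```shell
# Append a timestamped list of the busiest processes to a log file.
# Assumes a procps-style ps that understands --sort.
log_cpu_snapshot() {
    logfile=$1
    {
        printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
        # header line plus the five processes with the highest CPU share
        ps -eo pid,comm,pcpu --sort=-pcpu | head -n 6
    } >> "$logfile"
}

# Example: take a snapshot every 10 seconds until interrupted with Ctrl+C.
# while true; do log_cpu_snapshot "$HOME/cpu-snapshots.log"; sleep 10; done
```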


Should I upgrade using sudo pacman -S linux61 ?

The system froze about 6 times during the last few hours. I was surprised that Manjaro still shows notifications while mouse and keyboard do not react. So when it happened the last time, I unplugged the USB cable and plugged it back in. This helped.

I will re-organize my usb-setup, exchanging cables. Maybe that helps.
Any suggestions what else I could do?

I am a little confused by some of the data in the inxi output:

  • XFCE desktop
  • Xwayland

I may be missing something. I didn’t know XFCE was capable of using Wayland, which just triggered a thought: if you log out, do you have an option to switch the session to Xorg?

Now that I remember CUDA and Nvidia - I am not so sure - as my suggestion relies on the prebuilt nvidia driver modules for the kernel.

Personally I would test it but since you have to ask - I withdraw the suggestion.

Should you want to test if the kernel is somehow flooded with events - you should be able to gather some intel as suggested by @TriMoon.

For information only

You should not do this if you are in any way uncomfortable with the process

My process of troubleshooting by testing the 6.1 kernel would involve the following change to the system

One will have to accept the replacement of the kernel specific nvidia package in favor of the nvidia-dkms package

sudo pacman -Syu linux60-headers dkms nvidia-dkms

Then restart the system - to test the dkms modules.

If the result is as expected continue with

sudo pacman -Syu linux61 linux61-headers

And restart to test 6.1 kernel.

If for any reason this creates issues - one can issue an emergency reboot using REISUB - then hold the shift key while booting to display the GRUB menu → navigate advanced options → select the 6.0 kernel.


First of all. I don’t see anything related to the gpu or anything else… only your network before you pressed the shutdown button:

Nov 29 10:08:59 nextcloud.nextcloud-fixer[60510]: Nextcloud is not installed - only a limited number of commands are available
Nov 29 10:09:00 nextcloud.nextcloud-fixer[60588]: Nextcloud is not installed - only a limited number of commands are available
Nov 29 10:09:01 nextcloud.nextcloud-fixer[60666]: Nextcloud is not installed - only a limited number of commands are available
Nov 29 10:09:02 nextcloud.nextcloud-fixer[60742]: Nextcloud is not installed - only a limited number of commands are available
Nov 29 10:09:02 kernel: [UFW BLOCK] IN=enp34s0 OUT= MAC=01:00:5e:00:00:fb:be:dd:68:9e:4d:c6:08:00 SRC=10.0.4.65 DST=224.0.0.251 LEN=32 TOS=0x00 PREC=0x00 TTL=1 ID=9959 PROTO=2
Nov 29 10:09:03 systemd[1]: docker-2bd7636352e76a227672e49942c16600fa97f2c6fe3a7c850cb4392bf7bc2b3b.scope: Deactivated successfully.
Nov 29 10:09:03 systemd[1]: docker-2bd7636352e76a227672e49942c16600fa97f2c6fe3a7c850cb4392bf7bc2b3b.scope: Consumed 2.511s CPU time.
Nov 29 10:09:03 kernel: br-a96c1ef81739: port 2(veth0e2b8da) entered disabled state
Nov 29 10:09:03 kernel: veth9c539e2: renamed from eth0
Nov 29 10:09:03 NetworkManager[1229]: <info>  [1669712943.5372] manager: (veth9c539e2): new Veth device (/org/freedesktop/NetworkManager/Devices/69)
Nov 29 10:09:03 avahi-daemon[1214]: Interface veth0e2b8da.IPv6 no longer relevant for mDNS.
Nov 29 10:09:03 avahi-daemon[1214]: Leaving mDNS multicast group on interface veth0e2b8da.IPv6 with address fe80::7ce7:7bff:fe12:c265.
Nov 29 10:09:03 kernel: br-a96c1ef81739: port 2(veth0e2b8da) entered disabled state
Nov 29 10:09:03 kernel: device veth0e2b8da left promiscuous mode
Nov 29 10:09:03 kernel: br-a96c1ef81739: port 2(veth0e2b8da) entered disabled state
Nov 29 10:09:03 avahi-daemon[1214]: Withdrawing address record for fe80::7ce7:7bff:fe12:c265 on veth0e2b8da.
Nov 29 10:09:03 nextcloud.nextcloud-fixer[60855]: Nextcloud is not installed - only a limited number of commands are available
Nov 29 10:09:04 systemd-logind[1219]: Power key pressed short.
Nov 29 10:09:04 systemd-logind[1219]: System is powering down.

About the inxi output:

Graphics:
  Device-1: NVIDIA GA104 [GeForce RTX 3070] vendor: ASUSTeK driver: nvidia
    v: 520.56.06 alternate: nouveau,nvidia_drm non-free: 520.xx+
    status: current (as of 2022-10) arch: Ampere code: GAxxx
    process: TSMC n7 (7nm) built: 2020-22 pcie: gen: 3 speed: 8 GT/s lanes: 16
    link-max: gen: 4 speed: 16 GT/s bus-ID: 26:00.0 chip-ID: 10de:2484
    class-ID: 0300
  Device-2: Logitech HD Pro Webcam C920 type: USB
    driver: snd-usb-audio,uvcvideo bus-ID: 1-1.2.1:10 chip-ID: 046d:082d
    class-ID: 0102 serial: <filter>
  Display: x11 server: X.Org v: 21.1.4 with: Xwayland v: 22.1.5
    compositor: xfwm v: 4.16.1 driver: N/A display-ID: :1.0 screens: 1
  Screen-1: 0 s-res: 13822x2592 s-dpi: 96 s-size: 3657x686mm (143.98x27.01")
    s-diag: 3721mm (146.49")

It looks like the driver is loaded but not by Xorg as you see driver: N/A. There could be a problem somehow.

Your /etc/mkinitcpio.conf contains

MODULES=(amdgpu)

but it seems you don’t need it.

And yeah I guess you saw it:

Nov 29 10:05:19 kernel: usb 1-1.1-port1: cannot disable (err = -71)
Nov 29 10:05:19 kernel: usb 1-1.1-port1: cannot reset (err = -71)
Nov 29 10:05:19 kernel: usb 1-1.1-port1: cannot reset (err = -71)
Nov 29 10:05:19 kernel: usb 1-1.1-port1: cannot reset (err = -71)
Nov 29 10:05:19 kernel: usb 1-1.1-port1: cannot reset (err = -71)
Nov 29 10:05:19 kernel: usb 1-1.1-port1: cannot reset (err = -71)
Nov 29 10:05:19 kernel: usb 1-1.1-port1: Cannot enable. Maybe the USB cable is bad?
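
To see which port produces these messages most often, the kernel log can be summarised per device. A small sketch, assuming the messages keep the "usb <port>: cannot ..." shape shown above; the function name count_usb_errors is hypothetical:

```shell
# Count kernel "cannot ..." USB messages per device/port, reading journal
# lines from standard input. Output: "<count> <device>" per line.
count_usb_errors() {
    awk '
        /kernel: usb / && /[Cc]annot/ {
            for (i = 1; i < NF; i++) {
                if ($i == "usb") {
                    dev = $(i + 1)
                    sub(/:$/, "", dev)   # strip the trailing colon
                    count[dev]++
                    break
                }
            }
        }
        END { for (dev in count) print count[dev], dev }
    '
}

# Example: summarise the kernel messages of the current boot.
# journalctl -k -b --no-pager | count_usb_errors
```

A port that keeps showing up here after swapping cables would point at the hub or the device rather than the cable.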

From the logs I cannot determine what your freeze is caused by.

I would guess it could be an issue with the C-states of your CPU: if it goes into a deep sleep state, it freezes.

Maybe try adding processor.max_cstate=1 to the kernel parameters. Then the CPU will not go deeper than C1, so it stays in C0 or C1.
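
With GRUB, kernel parameters normally go inside the quotes of the GRUB_CMDLINE_LINUX_DEFAULT line in /etc/default/grub, not on a new line at the end of the file. The helper below is just a sketch of that edit (the name add_kernel_param is made up, and it assumes the value is wrapped in double quotes); editing the line by hand works equally well.

```shell
# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT in a GRUB
# defaults file, unless that line already contains it.
add_kernel_param() {
    param=$1
    file=${2:-/etc/default/grub}
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qF -- "$param"; then
        return 0    # already present, leave the file untouched
    fi
    # Insert " <param>" before the closing double quote of the value.
    sed "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 ${param}\"|" "$file" > "$file.new" &&
        mv "$file.new" "$file"
}

# Example: try it on a copy first and compare with the original.
# cp /etc/default/grub /tmp/grub.test
# add_kernel_param processor.max_cstate=1 /tmp/grub.test
# diff /etc/default/grub /tmp/grub.test
```

After changing the real file, run sudo update-grub and reboot; cat /proc/cmdline then shows whether the parameter is active.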

Yes, freeze might be a strong word. I couldn’t see any movement on the screen anymore. Mouse and keyboard wouldn’t have any impact. Much later, I was surprised to see a manjaro-notification popping up.
When it last happened, htop did not show anything special. It just continued to run as usual.

btw.: CTRL+ALT+F2/3 doesn’t open a console on my machine. Do you know the command for the terminal?
I am pretty sure by now that it has to do with the USB. I am investigating this further. Unplugging and replugging one of the USB cables helps (I added a second one in order to find out which component is causing the issue).

journalctl -xf tells me many things. For example: Nov 29 11:54:59 Workstation nextcloud.nextcloud-fixer[468851]: Nextcloud is not installed - only a limited number of commands are available

Nextcloud is running just fine. Should I ignore this? Reinstall?

So I should add
processor.max_cstate=1
at the end of /etc/default/grub?

Thanks for verifying…

It was the Nextcloud server causing the messages, which I don’t need anymore. So I uninstalled it, and the log is calming down a bit…

It should show a login prompt by default, no idea if they did things differently on the XFCE version of manjaro :woman_shrugging:

But see what i just created for you :wink:
