Issues with Kernel 5.3.11-1 after upgrade

Hi

I upgraded my system earlier this evening which went through without any warnings. Upon restart, my system started to "stall" / "freeze" every minute or two for about 5-10 seconds. It basically became unusable as everything would just halt. Mouse and keyboard included.

I quickly restored a backup and everything was fine again. I decided to update again (this time from the terminal) and start troubleshooting to find out what broke on my system.

After some quick checks (and the random "stalls" returning), the only difference I could find before and after the update was the following (from sudo journalctl -p err --since today):

nov 16 23:59:10 XXX kernel: iTCO_wdt iTCO_wdt: can't request region for resource [mem 0x00c5fffc-0x00c5ffff]
nov 16 23:59:16 XXX lightdm[1005]: gkr-pam: unable to locate daemon control file
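
To narrow down which errors are actually new, the error-priority messages of two boots can be compared. This is only a minimal sketch: it assumes the journal is persistent (so that -b -1 still holds the previous boot) and that the previous boot used the working kernel. The helper name new_errors and the /tmp file names are illustrative, not part of any tool.

```shell
# Print messages that appear in the "after" file but not in the "before" file.
# Both files are expected to hold one journal message per line.
new_errors() {
    before=$1
    after=$2
    # -F fixed strings, -x whole-line match, -v invert, -f read patterns from file
    grep -Fxv -f "$before" "$after" | sort -u
}

# Usage (assumption: previous boot = working kernel, current boot = 5.3):
#   journalctl -p err -b -1 -o cat > /tmp/before.txt
#   journalctl -p err -b 0  -o cat > /tmp/after.txt
#   new_errors /tmp/before.txt /tmp/after.txt
```

The -o cat output mode drops timestamps and hostnames, so identical messages from different boots compare as equal.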

I then rebooted into the 4.19 kernel I keep installed for emergencies like this and, lo and behold, no more freezes or halts.

The last good kernel I had working here was 5.3.8, I believe (which isn't available in the Manjaro kernel manager anymore). So I'm pretty sure it's something between the kernel and my hardware.

Here is my inxi output:

System:    Kernel: 4.19.84-1-MANJARO x86_64 bits: 64 compiler: gcc v: 9.2.0 
           parameters: BOOT_IMAGE=/boot/vmlinuz-4.19-x86_64 root=UUID=d65401e5-4edf-432c-abbb-2d52a2bfe5ae rw quiet 
           udev.log_priority=3 
           Desktop: Xfce 4.14.1 tk: Gtk 3.24.12 info: xfce4-panel wm: xfwm4 dm: LightDM 1.30.0 Distro: Manjaro Linux 
Machine:   Type: Laptop System: Dell product: XPS 13 9380 v: N/A serial: <filter> Chassis: type: 10 serial: <filter> 
           Mobo: Dell model: 0KTW76 v: A00 serial: <filter> UEFI: Dell v: 1.7.0 date: 08/05/2019 
Battery:   ID-1: BAT0 charge: 17.8 Wh condition: 50.8/52.0 Wh (98%) volts: 7.5/7.6 model: LGC-LGC6.73 DELL H754V97 
           type: Li-ion serial: <filter> status: Discharging 
CPU:       Topology: Quad Core model: Intel Core i7-8565U bits: 64 type: MT MCP arch: Kaby Lake family: 6 model-id: 8E (142) 
           stepping: C (12) microcode: CA L2 cache: 8192 KiB 
           flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx bogomips: 31880 
           Speed: 1084 MHz min/max: 400/4600 MHz Core speeds (MHz): 1: 900 2: 900 3: 900 4: 900 5: 900 6: 900 7: 900 8: 900 
           Vulnerabilities: Type: itlb_multihit status: KVM: Split huge pages 
           Type: l1tf status: Not affected 
           Type: mds status: Not affected 
           Type: meltdown status: Not affected 
           Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via prctl and seccomp 
           Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer sanitization 
           Type: spectre_v2 mitigation: Enhanced IBRS, IBPB: conditional, RSB filling 
           Type: tsx_async_abort status: Not affected 
Graphics:  Device-1: Intel UHD Graphics 620 vendor: Dell driver: i915 v: kernel bus ID: 00:02.0 chip ID: 8086:3ea0 
           Display: x11 server: X.Org 1.20.5 driver: intel resolution: 1920x1080~60Hz 
           OpenGL: renderer: Mesa DRI Intel UHD Graphics (Whiskey Lake 3x8 GT2) v: 4.5 Mesa 19.2.4 compat-v: 3.0 
           direct render: Yes 
Audio:     Device-1: Intel Cannon Point-LP High Definition Audio vendor: Dell driver: snd_hda_intel v: kernel bus ID: 00:1f.3 
           chip ID: 8086:9dc8 
           Sound Server: ALSA v: k4.19.84-1-MANJARO 
Network:   Device-1: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter 
           vendor: Bigfoot Networks Killer 1435 Wireless-AC driver: ath10k_pci v: kernel port: efa0 bus ID: 02:00.0 
           chip ID: 168c:003e 
           IF: wlp2s0 state: up mac: <filter> 
Drives:    Local Storage: total: 476.94 GiB used: 114.11 GiB (23.9%) 
           ID-1: /dev/nvme0n1 vendor: Toshiba model: KXG60ZNV512G NVMe 512GB size: 476.94 GiB block size: physical: 512 B 
           logical: 512 B speed: 31.6 Gb/s lanes: 4 serial: <filter> rev: 10604105 scheme: GPT 
Partition: ID-1: / raw size: 476.18 GiB size: 467.70 GiB (98.22%) used: 114.06 GiB (24.4%) fs: ext4 dev: /dev/nvme0n1p2 
Sensors:   System Temperatures: cpu: 42.0 C mobo: N/A 
           Fan Speeds (RPM): N/A 
Info:      Processes: 233 Uptime: 22m Memory: 15.44 GiB used: 2.84 GiB (18.4%) Init: systemd v: 242 Compilers: gcc: 9.2.0 
           Shell: bash v: 5.0.11 running in: xfce4-terminal inxi: 3.0.36

I would appreciate any help to try to solve this.

Thank you

Tony

@tchavei Try checking whether any failed units are interfering with startup:
sudo systemctl --failed

Thx for your reply

it returns:

0 loaded units listed.

I don't think a service is failing to load. Something periodically freezes everything for 5 seconds and then releases it. It's like a stutter: right now I'm pausing on every 5th word as everything halts for 5 seconds (not even the keyboard buffer records my keystrokes during that time).

I never had this before on any previous update

Tony

Yes it doesn't appear to have anything to do with this... I'll check out the link.

Thank you kindly

Tony

Finally found something in dmesg:

[   25.634978] i915 0000:00:02.0: GPU HANG: ecode 9:0:0x00000000, hang on rcs0
[   25.634982] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[   25.634984] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[   25.634985] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[   25.634987] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[   25.634989] [drm] GPU crash dump saved to /sys/class/drm/card0/error
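
The last dmesg line above says where the crash dump lives. Because /sys is an in-memory filesystem, the dump does not survive a reboot, so it is worth copying out before restarting if a bug report is planned. A small sketch; save_gpu_dump and the default output file name are hypothetical choices, only the source path comes from the log:

```shell
# Copy the i915 error state to a regular file and print the file name.
# $1: source (defaults to the path named in the dmesg message)
# $2: destination (defaults to a timestamped file in the current directory)
save_gpu_dump() {
    src=${1:-/sys/class/drm/card0/error}
    dest=${2:-gpu-error-$(date +%Y%m%d-%H%M%S).txt}
    # cat streams the content, which is the usual way to read sysfs files
    cat "$src" > "$dest" && printf '%s\n' "$dest"
}

# Usage: reading the sysfs file normally needs root, so call it from a root
# shell, or do the one-off equivalent:
#   sudo cat /sys/class/drm/card0/error > gpu-error.txt
```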

Followed by a ton of:

[   69.577330] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[   79.604262] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[  115.657831] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[  177.522367] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[  203.548559] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[  241.521392] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[  251.547930] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[  261.574492] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[  305.520685] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[  313.627276] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[  347.546842] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[  363.546759] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[  393.626459] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[  405.573032] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[  415.599590] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[  425.626163] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[  465.519083] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[  475.545721] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[  509.465354] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[  545.518475] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[  571.544908] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[  587.544835] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[  595.651437] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[  609.517932] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0

which could correlate with the temporary freezes
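
To see how regular the resets are, the gaps between the timestamps can be computed from the log itself. A rough sketch, assuming the raw dmesg format shown above (seconds since boot in square brackets at the start of the line); reset_intervals is a made-up helper name:

```shell
# Read dmesg output on stdin and print the gap in seconds between
# consecutive "Resetting rcs0" lines, followed by a one-line summary.
reset_intervals() {
    awk '
        /Resetting rcs0 for hang/ {
            # The timestamp is the text between "[" and "]"
            line = $0
            sub(/^\[ */, "", line)
            sub(/\].*/, "", line)
            t = line + 0
            if (n > 0) { gap = t - prev; sum += gap; printf "%.1f\n", gap }
            prev = t
            n++
        }
        END { if (n > 1) printf "resets=%d mean_gap=%.1fs\n", n, sum / (n - 1) }
    '
}

# Usage:
#   sudo dmesg | reset_intervals
```

If the gaps line up with the freezes you notice while typing, that supports the correlation.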

EDIT: Confirmed. I ran dmesg -w and the moment the system unfreezes, another line is added to the log (Resetting rcs0 for hang on rcs0)
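
The live check described in the EDIT is a little easier to follow if only the reset lines are shown and each one is numbered as it arrives. A sketch, assuming util-linux dmesg (its -w option follows the log, -T prints wall-clock times, which can drift after suspend/resume); count_resets is a hypothetical helper name:

```shell
# Filter a stream of kernel messages down to the i915 reset lines and
# prefix each with a running count. fflush() keeps the output live.
count_resets() {
    awk '/Resetting rcs0 for hang/ { n++; printf "#%d %s\n", n, $0; fflush() }'
}

# Usage (leave it running in a terminal and watch for new lines after a freeze):
#   sudo dmesg -wT | count_resets
```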

I am not sure how relevant this is since I'm on KDE.
Everything was fine for me on kernels 4.14 and 4.19. Once I upgraded to 5.3 I had stuttering and almost-full GUI freezes for 5-10 seconds every ~30 seconds. It was horrible. The strange thing is that it wasn't happening on Openbox, only on KDE.
I was able to fix it completely by changing the KDE rendering backend to XRender instead of OpenGL.
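
For KDE users who want to try the XRender workaround from a terminal, here is a sketch. It assumes Plasma 5, where KWin reads the compositor backend from the Backend key in the [Compositing] group of kwinrc; set_kwin_backend is a hypothetical wrapper, and the key name should be checked against your Plasma version. The same setting is normally reachable through the compositor page in System Settings.

```shell
# Write the KWin compositor backend to kwinrc (assumed Plasma 5 layout).
# $1: backend name, defaults to XRender; pass OpenGL to switch back.
set_kwin_backend() {
    backend=${1:-XRender}
    kwriteconfig5 --file kwinrc --group Compositing --key Backend "$backend"
}

# Usage:
#   set_kwin_backend            # switch to XRender
#   set_kwin_backend OpenGL     # revert
# Then log out and back in so KWin picks up the change.
```

This only applies to KDE; the original poster's Xfce setup uses xfwm4, which has its own compositor settings.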

Similar issue here. I had a fresh install on a new laptop and 5.3.11-1 was fast as a bullet.
After a system update and installing some software, the kernel was updated to 5.3.12-1. Since then it has been hell. Random freezes, halts, crashes. I am lucky if I can quickly switch to a terminal (no GUI) as soon as X is loaded.
I am trying to troubleshoot it this way, but no success...

After a few re-installations I found the culprit was not the kernel update but the Nvidia video driver (440xx)... The problem is that using mhwd to install it breaks something else. Uninstalling the driver and going back to the previous one does not solve the issue; only a reinstall does.

Running an Asus FX505DT, an AMD Ryzen 7 laptop with integrated Vega graphics and a GeForce GTX 1650.

Unfortunately I have an i915, not an Nvidia.

Problems continued with 5.3.12. I'm now on the experimental 5.4-rc7, which doesn't appear to have that particular issue (which means something changed in the meantime), but yesterday the laptop hard locked after 10 minutes of use. Not sure if it's the same issue or something completely unrelated in the 5.4-rc7 kernel (it's experimental after all). No logs were written for the hard lock.

Tony

I'm also having this, since November 2019, and it is still ongoing with kernels > 4.19 (5.4 as of now). I'm experiencing it on two laptops, both with Intel i915 graphics and 7th and 8th gen CPUs. What is interesting is that it only happens for one of the users on the system, so it must be something in that profile. I know this because the second laptop was a fresh 18.1.5 install with the profile copied over afterwards, and the problem showed up immediately after I transferred the profile and rebooted into it. If I log in as one of the other users or create a new profile, it does not show. The profile in question is the oldest and was created back in 2017, if that matters. Good that 4.19 is still around and is LTS, as otherwise I would be forced to reset that profile, which would be a pain because that is my main user with lots of settings and tweaks.
