Unable to boot after system update (kmod: error while loading shared libraries: libz.so.1: cannot open shared object file: No such file or directory)

Background

I have been running Manjaro with Linux 5.15 and have partitioned my system like so:

[manjaro /]$ lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
loop0    7:0    0 130.6M  1 loop 
loop1    7:1    0 581.9M  1 loop 
loop2    7:2    0   1.5G  1 loop 
loop3    7:3    0 702.8M  1 loop 
sda      8:0    0 223.6G  0 disk 
|-sda1   8:1    0   100M  0 part /boot/efi
|-sda2   8:2    0    16M  0 part 
|-sda3   8:3    0  62.8G  0 part 
|-sda4   8:4    0   505M  0 part 
|-sda5   8:5    0 155.3G  0 part /
`-sda6   8:6    0   4.9G  0 part 
sdb      8:16   0 931.5G  0 disk 
|-sdb1   8:17   0    16M  0 part 
|-sdb2   8:18   0 443.2G  0 part 
`-sdb3   8:19   0 488.3G  0 part /home
sdc      8:32   1  28.7G  0 disk 
|-sdc1   8:33   1   2.9G  0 part 
`-sdc2   8:34   1     4M  0 part 

The Problem

After a system update, I shut down and received a kernel panic (at 2023-03-21 16:07:39), which seemed to indicate a missing library, libz.so.1.

When booting, I now get the following error about a missing libz.so.1:

[Image: screenshot of the boot error, hosted on Imgur]

The emergency shell doesn’t respond to keyboard input.

What I’ve Tried

1. Reinstalling packages

Using a Manjaro Live USB I chroot’ed into my broken system (which mounts just fine):

[manjaro@manjaro ~]$ sudo manjaro-chroot -a
grub-probe: error: cannot find a GRUB drive for /dev/sdc1.  Check your device.map.
grub-probe: error: cannot find a GRUB drive for /dev/sdc1.  Check your device.map.
==> Mounting (ManjaroLinux) [/dev/sda5]
 --> mount: [/mnt]
 --> mount: [/mnt/boot/efi]
 --> mount: [/mnt/home]

I have tried sudo pacman -Syyu and everything is up to date, and have re-installed the packages which own a library called libz.so.1:

[manjaro /]$ for name in $(locate libz.so.1 | grep "^/usr"); do sudo pacman -F ${name}; done
usr/lib/libz.so.1 is owned by core/zlib 1:1.2.13-2
usr/lib/libz.so.1.2.13 is owned by core/zlib 1:1.2.13-2
usr/lib32/libz.so.1 is owned by multilib/lib32-zlib 1.2.13-2
usr/lib32/libz.so.1.2.13 is owned by multilib/lib32-zlib 1.2.13-2
[manjaro /]$ sudo pacman -S zlib
[manjaro /]$ sudo pacman -S lib32-zlib

I have rebuilt /etc/ld.so.cache:

[manjaro /]$ sudo rm /etc/ld.so.cache && sudo ldconfig
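After rebuilding the cache, it may be worth confirming that the loader cache really maps the soname to a path. This is a minimal sketch, assuming it is run inside the chroot. The helper name cached_paths is hypothetical; it reads `ldconfig -p` style text on stdin and prints the path recorded for one soname.

```shell
# Print the path(s) that the loader cache records for a given soname.
# Input: `ldconfig -p` style text on stdin, e.g.
#   "\tlibz.so.1 (libc6,x86-64) => /usr/lib/libz.so.1"
# cached_paths is a hypothetical helper name, not part of any package.
cached_paths() {
    # $1 of each cache line is the soname, the last field is the resolved path.
    awk -v s="$1" '$1 == s { print $NF }'
}

# Usage inside the chroot:
#   ldconfig -p | cached_paths libz.so.1
```

If nothing is printed for libz.so.1, the cache does not know about the library even though the package is installed.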

2. Reinstalling kernel(s)

I have removed an outdated kernel version (5.13) and installed a new kernel (6.1):

[manjaro /]$ mhwd-kernel -li
Currently running: 5.13.19-2-MANJARO (linux513) # AFAIK this is the kernel on my live USB
The following kernels are installed in your system:
   * linux515
   * linux61

I have also run update-grub and can choose these kernels from the GRUB menu at boot (neither will boot, giving the same error about missing libz.so.1).

3. Reading logs with journalctl

I’ve looked for error messages with journalctl -b all -p7 --no-pager and journalctl -S "2023-03-21" and see suspicious-looking errors that have the same timestamp as the original kernel panic:

Mar 21 16:07:39 adam-xps159560 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=udisks2 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Mar 21 16:07:39 adam-xps159560 systemd[1]: udisks2.service: Main process exited, code=dumped, status=6/ABRT
Mar 21 16:07:39 adam-xps159560 systemd[1]: udisks2.service: Failed with result 'core-dump'.
Mar 21 16:07:39 adam-xps159560 systemd[1]: Stopped Disk Manager.
Mar 21 16:07:39 adam-xps159560 systemd[1]: udisks2.service: Consumed 15.341s CPU time.
Mar 21 16:07:39 adam-xps159560 kernel: audit: type=1131 audit(1679414859.254:639): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=udisks2 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'

However, there are quite a few similar udisks2 errors in my logs from before this crash, and later journalctl entries seem to show that the shutdown was successful:

Mar 21 16:07:45 adam-xps159560 systemd-shutdown[1]: Syncing filesystems and block devices.
Mar 21 16:07:45 adam-xps159560 systemd-shutdown[1]: Sending SIGTERM to remaining processes...
Mar 21 16:07:45 adam-xps159560 systemd-journald[269]: Received SIGTERM from PID 1 (systemd-shutdow).
Mar 21 16:07:45 adam-xps159560 systemd-journald[269]: Journal stopped

So I’m not even sure that I’m reading these messages correctly.

4. Check drive health

Suspecting a hardware issue with the drive containing my root partition, I have checked the health of /dev/sda with smartctl:

[manjaro@manjaro ~]$ sudo smartctl -t short /dev/sda
[manjaro@manjaro ~]$ sudo smartctl -t long /dev/sda
[manjaro@manjaro ~]$ sudo smartctl -H /dev/sda
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.13.19-2-MANJARO] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

[manjaro@manjaro ~]$ sudo smartctl -a /dev/sda
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.13.19-2-MANJARO] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     WD Blue / Red / Green SSDs
Device Model:     WDC WDS240G2G0B-00EPW0
Serial Number:    2021CY465813
LU WWN Device Id: 5 001b44 4a783926e
Firmware Version: UJ510000
User Capacity:    240,065,183,744 bytes [240 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      M.2
TRIM Command:     Available, deterministic
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 T13/2015-D revision 3
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Mar 22 14:52:46 2023 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      (  32)	The self-test routine was interrupted
					by the host with a hard or soft reset.
Total time to complete Offline 
data collection: 		(  120) seconds.
Offline data collection
capabilities: 			 (0x15) SMART execute Offline immediate.
					No Auto Offline data collection support.
					Abort Offline collection upon new
					command.
					No Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					No Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 (  42) minutes.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       3557
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       807
165 Block_Erase_Count       0x0032   100   100   000    Old_age   Always       -       517
166 Minimum_PE_Cycles_TLC   0x0032   100   100   ---    Old_age   Always       -       4
167 Max_Bad_Blocks_per_Die  0x0032   100   100   ---    Old_age   Always       -       0
168 Maximum_PE_Cycles_TLC   0x0032   100   100   ---    Old_age   Always       -       13
169 Total_Bad_Blocks        0x0032   100   100   ---    Old_age   Always       -       431
170 Grown_Bad_Blocks        0x0032   100   100   ---    Old_age   Always       -       0
171 Program_Fail_Count      0x0032   100   100   000    Old_age   Always       -       0
172 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0
173 Average_PE_Cycles_TLC   0x0032   100   100   000    Old_age   Always       -       4
174 Unexpected_Power_Loss   0x0032   100   100   000    Old_age   Always       -       98
184 End-to-End_Error        0x0032   100   100   ---    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   ---    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   075   066   000    Old_age   Always       -       25 (Min/Max 3/66)
199 UDMA_CRC_Error_Count    0x0032   100   100   ---    Old_age   Always       -       0
230 Media_Wearout_Indicator 0x0032   100   100   000    Old_age   Always       -       0x012f0050012f
232 Available_Reservd_Space 0x0033   100   100   005    Pre-fail  Always       -       100
233 NAND_GB_Written_TLC     0x0032   100   100   ---    Old_age   Always       -       1103
234 NAND_GB_Written_SLC     0x0032   100   100   000    Old_age   Always       -       3942
241 Host_Writes_GiB         0x0030   100   100   000    Old_age   Offline      -       1480
242 Host_Reads_GiB          0x0030   100   100   000    Old_age   Offline      -       1681
244 Temp_Throttle_Status    0x0032   000   100   ---    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      3555         -
# 2  Short offline       Aborted by host               90%      3122         -
# 3  Short offline       Aborted by host               90%       543         -
# 4  Short offline       Aborted by host               90%       543         -
# 5  Short offline       Aborted by host               90%       543         -
# 6  Short offline       Aborted by host               90%       543         -
# 7  Short offline       Aborted by host               80%       543         -
# 8  Short offline       Completed without error       00%       543         -
# 9  Short offline       Aborted by host               90%       543         -

Selective Self-tests/Logging not supported
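The attribute table above is long, so a small filter can help when only a few raw values matter. This is a minimal sketch: the helper name smart_raw is hypothetical, and it assumes the ten-column layout shown above, with RAW_VALUE starting in the tenth column.

```shell
# Print the RAW_VALUE column for one SMART attribute.
# Input: `smartctl -A` style attribute table on stdin.
# smart_raw is a hypothetical helper name.
smart_raw() {
    # Column 2 is ATTRIBUTE_NAME, column 10 is the start of RAW_VALUE.
    awk -v name="$1" '$2 == name { print $10 }'
}

# Usage on the live USB (device name taken from the post):
#   sudo smartctl -A /dev/sda | smart_raw Reallocated_Sector_Ct
```

For attributes whose raw value has trailing text, such as Temperature_Celsius, only the first number is printed.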

Summary

It appears my kernel isn’t able to find libz.so.1. I’m sure the package which owns it is correctly installed and that I’m using an up-to-date kernel. My system is fully up to date (as far as I can tell). I can chroot into the system and mount all its partitions ok, so I don’t think the drive hosting my root partition is at fault. I am out of ideas of what to try.

Related

This post had an almost identical error message but their solution didn’t work for me: https://forum.manjaro.org/t/manjaro-not-booting-after-last-update-error-loading-shared-libcrypto-so-3-unable-to-mount-new-root-cannot-access-chroot/129095

Edit:

I’d say, return to the chroot environment, and reinstall zlib, but overwrite any conflicts. That way, if there is something wrong with the file, they’ll be replaced.

pamac reinstall zlib --overwrite="/usr/lib/*,/usr/lib32/*"
$ sudo pamac reinstall zlib --overwrite "/usr/lib/*,/usr/lib32/*"

Thanks for the recommendation, but this didn’t help.

If you run ldconfig from the chroot, does it print anything? (It should not.)
Also post the output of:
pacman -Qm

Output:

[manjaro /]# ldconfig   
[manjaro /]# pacman -Qm
celt 0.11.3-4
ceph-libs 15.2.14-6
debtap 3.5.1-1
dropbox 146.4.4836-1
f5vpn 7213.2021.0526.1-1
galculator-gtk2 2.1.4-5
gcolor2 0.4-9
gnome-icon-theme 3.12.0-6
gnome-icon-theme-symbolic 3.12.0-6
google-chrome 111.0.5563.64-1
ipw2100-fw 1.3-10
ipw2200-fw 3.1-8
jags 4.3.1-1
kvantum-theme-matchama 20191118-1
libguess 1.2-4
libvterm01 0.1.4-2
manjaro-documentation-en 20181009-1
manjaro-firmware 20160419-1
metis 5.1.0.p10-2
openssl-1.0 1.0.2.u-1
python-pep517 0.13.0-1
qpdfview 0.4.18-2
qt5-styleplugins 5.0.0.20170311-25
qt5-webkit 5.212.0alpha4-18
rstudio-desktop-bin 2022.07.0.548-1
slack-desktop 4.25.1-1
udunits 2.2.28-3
visual-studio-code-bin 1.76.1-1
zoom 5.10.3.2778-1

Try reinstalling everything that requires or depends on zlib:

pacman -S binutils  boost-libs  btrfs-progs  cairo  clucene  cracklib  curl  exiv2  ffmpeg  ffmpeg4.4  file  freetype2  ghostscript  git  glib2  gnupg  gnutls  kmod  lib32-zlib  libarchive  libavif  libelf  libetonyek  libfontenc  libgit2  libid3tag  libmysofa  libodfgen  libopenmpt  libpciaccess  libpng  libproxy  librevenge  libssh  libssh2  libtar  libtiff  libunwind  libwpd  libxml2  libzip  llvm-libs  man-db  minizip  mpd  neon  nss  openexr  openjpeg2  openmpi  openpmix  openssh  pcre  pcre2  protobuf  python  qpdf  raptor  rsync  rtmpdump  ruby  sqlcipher  sqlite  sudo  taglib  tcl  unarchiver  webkit2gtk  webkit2gtk-4.1  wget  zstd  zziplib glibc

If there are no errors, reboot and test.
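A package list like the one above does not have to be typed by hand; it could be derived from the "Required By" field of `pacman -Qi zlib`. This is a minimal sketch, assuming English pacman output (hence LC_ALL=C in the usage line). The helper name required_by is hypothetical.

```shell
# Print one package name per line from the "Required By" field.
# Input: `pacman -Qi <pkg>` style text on stdin; the field may wrap
# onto indented continuation lines.
# required_by is a hypothetical helper name.
required_by() {
    awk '
        /^Required By/  { on = 1; sub(/^[^:]*:/, ""); emit(); next }
        /^[^[:space:]]/ { on = 0 }   # any other field header ends the block
        on              { emit() }   # indented continuation line
        function emit(   i) {
            for (i = 1; i <= NF; i++) if ($i != "None") print $i
        }
    '
}

# Usage inside the chroot:
#   LC_ALL=C pacman -Qi zlib | required_by
```

Only installed packages appear in that field, and any foreign (AUR) packages among them would still have to be rebuilt separately.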


EDIT:
Looking at the picture above of the kernel panic, you are booting an older 5.15.78-1 kernel; the latest stable is 5.15.102-1…
Are you up to date? Run the update again:

pacman-mirrors --fasttrack 5 && pacman -Syyu

Done, but to no effect :frowning_face:

I’ve removed the old kernel(s) while inside the chroot and have installed the following:

[manjaro /]# mhwd-kernel -li
Currently running: 5.13.19-2-MANJARO (linux513)
The following kernels are installed in your system:
   * linux515
   * linux61
[manjaro /]# pacman -Q linux515
linux515 5.15.102-1
[manjaro /]# pacman -Q linux61 
linux61 6.1.19-1

So it still gives you the same zlib error?
In that case, try reinstalling it via pacstrap. From the live USB, mount your Manjaro partition:
sudo mount /dev/sda5 /mnt
Install the arch-install-scripts package, which provides pacstrap:
sudo pacman -S arch-install-scripts
Then reinstall zlib:
sudo pacstrap -i /mnt zlib lib32-zlib
If there are no errors, reboot and see if it helped…

If not, post the logs from the chroot; maybe there will be something useful:
journalctl -b-1 -p4 --no-pager

And also the output of:
ls -l /usr/lib/libz.so*
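The `ls -l` check can also be scripted so that a dangling symlink or a zero-length file is flagged explicitly. This is a minimal sketch, assuming a `readlink` that supports `-f` (as GNU coreutils does). The helper name resolves_to_file is hypothetical.

```shell
# Succeed only if the path, after following every symlink,
# is a regular file with non-zero size.
# resolves_to_file is a hypothetical helper name.
resolves_to_file() {
    target=$(readlink -f "$1") || return 1
    [ -f "$target" ] && [ -s "$target" ]
}

# Usage inside the chroot:
#   resolves_to_file /usr/lib/libz.so.1 && echo "ok" || echo "broken"
# pacman can also compare installed files against its database:
#   pacman -Qkk zlib
```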

How is this possible, when:

…it’s not listed there. Unless:

  1. You haven’t restarted for quite some time; or
  2. this is from an old ISO image’s chroot environment.

If it’s #1, then reboot for a change.
If it’s #2, get an ISO with a still-supported kernel and try everything again.

Thanks, I tried these steps but get the same error about missing libz.so.1.

Output from ls and journalctl:

[manjaro /]# ls -l /usr/lib/libz.so*
lrwxrwxrwx 1 root root     14 Nov 17 16:39 /usr/lib/libz.so -> libz.so.1.2.13
lrwxrwxrwx 1 root root     14 Nov 17 16:39 /usr/lib/libz.so.1 -> libz.so.1.2.13
-rwxr-xr-x 1 root root 100288 Nov 17 16:39 /usr/lib/libz.so.1.2.13
[manjaro /]# journalctl -b-1 -p4 --no-pager
Mar 17 16:53:07 adam-xps159560 kernel: x86/cpu: SGX disabled by BIOS.
Mar 17 16:53:07 adam-xps159560 kernel: ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
Mar 17 16:53:07 adam-xps159560 kernel: resource sanity check: requesting [mem 0xfdffe800-0xfe0007ff], which spans more than pnp 00:07 [mem 0xfdb00000-0xfdffffff]
Mar 17 16:53:07 adam-xps159560 kernel: caller pmc_core_probe+0xb7/0x6b0 mapping multiple BARs
Mar 17 16:53:07 adam-xps159560 kernel: i8042: Warning: Keylock active
Mar 17 16:53:07 adam-xps159560 kernel: usb: port power management may be unreliable
Mar 17 16:53:08 adam-xps159560 kernel: ACPI Warning: \_SB.IETM._TRT: Return Package has no elements (empty) (20210730/nsprepkg-94)
Mar 17 16:53:08 adam-xps159560 kernel: wmi_bus wmi_bus-PNP0C14:03: WQBC data block query control method not found
Mar 17 16:53:08 adam-xps159560 kernel: i801_smbus 0000:00:1f.4: Accelerometer lis3lv02d is present on SMBus but its address is unknown, skipping registration
Mar 17 16:53:08 adam-xps159560 kernel: nvidia: loading out-of-tree module taints kernel.
Mar 17 16:53:08 adam-xps159560 kernel: nvidia: module license 'NVIDIA' taints kernel.
Mar 17 16:53:08 adam-xps159560 kernel: Disabling lock debugging due to kernel taint
Mar 17 16:53:08 adam-xps159560 kernel: 
Mar 17 16:53:08 adam-xps159560 kernel: tpm_tis: probe of MSFT0101:00 failed with error -1
Mar 17 16:53:08 adam-xps159560 kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  470.141.03  Thu Jun 30 18:45:31 UTC 2022
Mar 17 16:53:09 adam-xps159560 kernel: nvidia_uvm: module uses symbols from proprietary module nvidia, inheriting taint.
Mar 17 16:53:09 adam-xps159560 kernel: snd_hda_codec_hdmi hdaudioC0D2: Monitor plugged-in, Failed to power up codec ret=[-13]
Mar 17 16:53:10 adam-xps159560 systemd[672]: ConfigurationDirectory 'bluetooth' already exists but the mode is different. (File system: 755 ConfigurationDirectoryMode: 555)
Mar 17 16:53:10 adam-xps159560 bluetoothd[672]: profiles/audio/vcp.c:vcp_init() D-Bus experimental not enabled
Mar 17 16:53:10 adam-xps159560 bluetoothd[672]: src/plugin.c:plugin_init() Failed to init vcp plugin
Mar 17 16:53:10 adam-xps159560 bluetoothd[672]: profiles/audio/mcp.c:mcp_init() D-Bus experimental not enabled
Mar 17 16:53:10 adam-xps159560 bluetoothd[672]: src/plugin.c:plugin_init() Failed to init mcp plugin
Mar 17 16:53:10 adam-xps159560 bluetoothd[672]: profiles/audio/bap.c:bap_init() D-Bus experimental not enabled
Mar 17 16:53:10 adam-xps159560 bluetoothd[672]: src/plugin.c:plugin_init() Failed to init bap plugin
Mar 17 16:53:12 adam-xps159560 kernel: ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20210730/nsarguments-61)
Mar 17 16:53:14 adam-xps159560 kernel: kauditd_printk_skb: 32 callbacks suppressed
Mar 17 16:53:20 adam-xps159560 kernel: kauditd_printk_skb: 8 callbacks suppressed
Mar 17 16:53:21 adam-xps159560 lightdm[1007]: gkr-pam: unable to locate daemon control file
Mar 17 16:53:21 adam-xps159560 systemd-xdg-autostart-generator[1024]: Configuration file /home/adam/.config/autostart/jetbrains-toolbox.desktop is marked executable. Please remove executable permission bits. Proceeding anyway.
Mar 17 16:53:21 adam-xps159560 systemd-xdg-autostart-generator[1024]: /home/adam/.config/autostart/light-locker.desktop:64: Unknown key name 'NotShownIn' in section 'Desktop Entry', ignoring.
Mar 17 16:53:21 adam-xps159560 systemd-xdg-autostart-generator[1024]: Exec line is empty
Mar 17 16:53:21 adam-xps159560 systemd-xdg-autostart-generator[1024]: /home/adam/.config/autostart/light-locker.desktop: not generating unit, error parsing Exec= line: Invalid argument
Mar 17 16:53:24 adam-xps159560 gnome-keyring-daemon[1103]: discover_other_daemon: 1
Mar 17 16:53:26 adam-xps159560 kernel: kauditd_printk_skb: 13 callbacks suppressed
Mar 17 16:53:26 adam-xps159560 pipewire[1315]: mod.rt: Can't find xdg-portal: (null)
Mar 17 16:53:26 adam-xps159560 pipewire[1315]: mod.rt: found session bus but no portal
Mar 17 16:54:06 adam-xps159560 pulseaudio[1858]: stat('/etc/pulse/default.pa.d'): No such file or directory
Mar 17 17:23:22 adam-xps159560 systemd[1018]: xdg-desktop-portal-gtk.service: Failed with result 'exit-code'.
Mar 17 17:23:22 adam-xps159560 systemd[1018]: xdg-desktop-portal-gnome.service: Failed with result 'exit-code'.
Mar 17 17:23:23 adam-xps159560 NetworkManager[700]: <warn>  [1679073803.3655] dispatcher: (8) failed (after 0.001 sec): Refusing activation, D-Bus is shutting down.

@Mirdarthos it’s #2 - Thanks, I will update the ISO, enter chroot and try to reinstall zlib and then do mkinitcpio -P.

If that still doesn’t fix the issue then I will just reinstall from a fresh ISO image as this is causing me to tear my hair out now :joy:
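Because the message comes from kmod during early boot, one thing that could be checked after running mkinitcpio -P is whether the regenerated initramfs image itself contains libz.so.1. This is a minimal sketch, assuming lsinitcpio (shipped with mkinitcpio) is available in the chroot. The image path in the usage line is an assumption about the usual naming, not something taken from this thread, and the helper name initramfs_has_lib is hypothetical.

```shell
# Succeed if a file with the given name appears in an initramfs listing.
# Input: one path per line on stdin, as printed by `lsinitcpio <image>`.
# initramfs_has_lib is a hypothetical helper name.
initramfs_has_lib() {
    # Compare only the last path component, so a leading "/" does not matter.
    awk -F/ -v s="$1" '$NF == s { found = 1 } END { exit !found }'
}

# Usage inside the chroot (image path is an assumption; adjust to what is in /boot):
#   lsinitcpio /boot/initramfs-5.15-x86_64.img | initramfs_has_lib libz.so.1 \
#       && echo "libz.so.1 is in the image" || echo "libz.so.1 is missing"
```

If the library turns out to be missing from the image while it is present under /usr/lib, that would point at initramfs generation rather than at the zlib package.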

Sometimes that is less painful and quicker to do. As long as you learn something from this.

I ended up having to go nuclear and do a fresh install from the ISO. At least I was able to avoid overwriting my original home partition so I haven’t lost any data.

Given that I still don’t know why my system broke, I can’t really say that I learned something from this :person_shrugging: oh well, at least I’m back in business!

Thank you both for your help!

There is that, yes.
