Background
I have been running Manjaro with Linux 5.15 and have partitioned my system like so:
[manjaro /]$ lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
loop0    7:0    0 130.6M  1 loop
loop1    7:1    0 581.9M  1 loop
loop2    7:2    0   1.5G  1 loop
loop3    7:3    0 702.8M  1 loop
sda      8:0    0 223.6G  0 disk
|-sda1   8:1    0   100M  0 part /boot/efi
|-sda2   8:2    0    16M  0 part
|-sda3   8:3    0  62.8G  0 part
|-sda4   8:4    0   505M  0 part
|-sda5   8:5    0 155.3G  0 part /
`-sda6   8:6    0   4.9G  0 part
sdb      8:16   0 931.5G  0 disk
|-sdb1   8:17   0    16M  0 part
|-sdb2   8:18   0 443.2G  0 part
`-sdb3   8:19   0 488.3G  0 part /home
sdc      8:32   1  28.7G  0 disk
|-sdc1   8:33   1   2.9G  0 part
`-sdc2   8:34   1     4M  0 part
The Problem
After a system update, I shut down and received a kernel panic (at 2023-03-21 16:07:39) which seemed to indicate a missing library, libz.so.1.
When booting, I now get an error about a missing libz.so.1.
The emergency shell doesn’t respond to keyboard input.
What I’ve Tried
1. Reinstalling packages
Using a Manjaro Live USB, I chroot’ed into my broken system (which mounts just fine):
[manjaro@manjaro ~]$ sudo manjaro-chroot -a
grub-probe: error: cannot find a GRUB drive for /dev/sdc1. Check your device.map.
grub-probe: error: cannot find a GRUB drive for /dev/sdc1. Check your device.map.
==> Mounting (ManjaroLinux) [/dev/sda5]
--> mount: [/mnt]
--> mount: [/mnt/boot/efi]
--> mount: [/mnt/home]
I have run sudo pacman -Syyu, which reports that everything is up to date, and have reinstalled the packages that own a library called libz.so.1:
[manjaro /]$ for name in $(locate libz.so.1 | grep "^/usr"); do sudo pacman -F ${name}; done
usr/lib/libz.so.1 is owned by core/zlib 1:1.2.13-2
usr/lib/libz.so.1.2.13 is owned by core/zlib 1:1.2.13-2
usr/lib32/libz.so.1 is owned by multilib/lib32-zlib 1.2.13-2
usr/lib32/libz.so.1.2.13 is owned by multilib/lib32-zlib 1.2.13-2
[manjaro /]$ sudo pacman -S zlib
[manjaro /]$ sudo pacman -S lib32-zlib
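A possible follow-up check, sketched under the assumption that it runs inside the chroot: pacman -Qkk compares a package's installed files against its recorded metadata, which would show whether a reinstall actually left intact files on disk. The helper names owner_pkgs and verify_libz_owners are hypothetical; only the pacman -F output format is taken from the transcript above.

```shell
#!/bin/sh
# owner_pkgs: read "PATH is owned by [repo/]pkg version" lines on stdin
# (the pacman -F / pacman -Qo format) and print each package name once.
owner_pkgs() {
    awk '/ is owned by / {
        for (i = 1; i < NF; i++)
            if ($i == "by") { n = split($(i + 1), p, "/"); print p[n] }
    }' | sort -u
}

# verify_libz_owners: look up the owners of both libz.so.1 copies and
# verify each owning package's files. Defined only; call it by hand.
verify_libz_owners() {
    for pkg in $(pacman -F /usr/lib/libz.so.1 /usr/lib32/libz.so.1 | owner_pkgs); do
        pacman -Qkk "$pkg"   # check installed files against package metadata
    done
}
```

Running verify_libz_owners as root inside the chroot should print one summary line per package; any reported mismatches would point at files that differ from what the package shipped.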
I have rebuilt /etc/ld.so.cache:
[manjaro /]$ sudo rm /etc/ld.so.cache && sudo ldconfig
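Rebuilding the cache does not by itself prove that the library file is usable. A minimal sketch of a sanity check follows, assuming GNU coreutils inside the chroot; check_lib is a hypothetical helper name. It follows the symlink chain and confirms the target is a non-empty file beginning with the ELF magic bytes, since a truncated or zero-length file after an interrupted update would still be listed by locate.

```shell
#!/bin/sh
# check_lib PATH: resolve symlinks and report whether PATH ends at a
# non-empty regular file that starts with the ELF magic bytes (\177ELF).
check_lib() {
    target=$(readlink -f -- "$1") || { echo "broken link: $1"; return 1; }
    [ -s "$target" ] || { echo "missing or empty: $target"; return 1; }
    # od -c prints the first byte as octal 177 followed by the letters E L F
    magic=$(head -c 4 "$target" | od -An -c | tr -d ' \n')
    [ "$magic" = '177ELF' ] || { echo "not an ELF file: $target"; return 1; }
    echo "ok: $1 -> $target"
}
```

For example, check_lib /usr/lib/libz.so.1 and check_lib /usr/lib32/libz.so.1 cover both copies found above.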
2. Reinstalling kernel(s)
I have removed an outdated kernel version (5.13) and installed a new kernel (6.1):
[manjaro /]$ mhwd-kernel -li
Currently running: 5.13.19-2-MANJARO (linux513) # AFAIK this is the kernel on my live USB
The following kernels are installed in your system:
* linux515
* linux61
I have also run update-grub and can choose these kernels from the GRUB menu at boot. Neither will boot; both give the same error about a missing libz.so.1.
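If the error is raised during early boot, before the root filesystem is mounted, the library would be loaded from the initramfs image rather than from /usr/lib, so it may be worth checking what each image contains. The sketch below assumes the mkinitcpio tooling (lsinitcpio) is installed in the chroot and that images live at /boot/initramfs-*.img; both are assumptions, and listing_has_lib and check_images are hypothetical helper names.

```shell
#!/bin/sh
# listing_has_lib NAME: read a file listing on stdin (one path per line)
# and succeed if any entry's final path component is exactly NAME.
listing_has_lib() {
    awk -v name="$1" -F/ '$NF == name { found = 1 } END { exit !found }'
}

# check_images: list each initramfs image with lsinitcpio and report
# whether libz.so.1 is packed inside it. Defined only; call it by hand.
check_images() {
    for img in /boot/initramfs-*.img; do
        [ -e "$img" ] || continue   # glob matched nothing
        if lsinitcpio "$img" | listing_has_lib libz.so.1; then
            echo "$img: libz.so.1 present"
        else
            echo "$img: libz.so.1 MISSING"
        fi
    done
}
```

If an image turns out to lack the library, regenerating the images inside the chroot (for example with mkinitcpio -P) would be a plausible next step to try.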
3. Reading logs with journalctl
I’ve looked for error messages with journalctl -b all -p7 --no-pager and journalctl -S "2023-03-21", and see suspicious-looking errors that have the same timestamp as the original kernel panic:
Mar 21 16:07:39 adam-xps159560 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=udisks2 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Mar 21 16:07:39 adam-xps159560 systemd[1]: udisks2.service: Main process exited, code=dumped, status=6/ABRT
Mar 21 16:07:39 adam-xps159560 systemd[1]: udisks2.service: Failed with result 'core-dump'.
Mar 21 16:07:39 adam-xps159560 systemd[1]: Stopped Disk Manager.
Mar 21 16:07:39 adam-xps159560 systemd[1]: udisks2.service: Consumed 15.341s CPU time.
Mar 21 16:07:39 adam-xps159560 kernel: audit: type=1131 audit(1679414859.254:639): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=udisks2 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
However, there are quite a few similar errors from udisks2 in my logs before this crash, and later journalctl entries seem to show that the shutdown was successful:
Mar 21 16:07:45 adam-xps159560 systemd-shutdown[1]: Syncing filesystems and block devices.
Mar 21 16:07:45 adam-xps159560 systemd-shutdown[1]: Sending SIGTERM to remaining processes...
Mar 21 16:07:45 adam-xps159560 systemd-journald[269]: Received SIGTERM from PID 1 (systemd-shutdow).
Mar 21 16:07:45 adam-xps159560 systemd-journald[269]: Journal stopped
So I’m not even sure that I’m reading these messages correctly.
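One way to judge whether the udisks2 crash is related to the failed boot or just background noise is to count how often it happened on earlier days. The sketch below assumes the default short journal output format shown above; coredumps_per_day is a hypothetical helper name. Note also that -p7 selects priority "debug" and everything more severe, so it does not filter anything out; -p err (or -p3) would narrow the output to errors.

```shell
#!/bin/sh
# coredumps_per_day UNIT: read short-format journal lines on stdin and
# print "Mon DD count" for each day on which UNIT failed with a core dump.
coredumps_per_day() {
    awk -v unit="$1" '
        index($0, unit ": Failed with result '\''core-dump'\''") {
            day = $1 " " $2
            if (!(day in n)) order[++k] = day   # remember first-seen order
            n[day]++
        }
        END { for (i = 1; i <= k; i++) print order[i], n[order[i]] }
    '
}
```

A possible invocation is journalctl -u udisks2 --no-pager | coredumps_per_day udisks2.service. If the unit has been dumping core for weeks, the entries at 16:07:39 are less likely to be the cause of the boot failure.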
4. Checking drive health
Suspecting a hardware issue with the drive containing my root partition, I have checked the health of /dev/sda with smartctl:
[manjaro@manjaro ~]$ sudo smartctl -t short /dev/sda
[manjaro@manjaro ~]$ sudo smartctl -t long /dev/sda
[manjaro@manjaro ~]$ sudo smartctl -H /dev/sda
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.13.19-2-MANJARO] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
[manjaro@manjaro ~]$ sudo smartctl -a /dev/sda
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.13.19-2-MANJARO] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: WD Blue / Red / Green SSDs
Device Model: WDC WDS240G2G0B-00EPW0
Serial Number: 2021CY465813
LU WWN Device Id: 5 001b44 4a783926e
Firmware Version: UJ510000
User Capacity: 240,065,183,744 bytes [240 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Form Factor: M.2
TRIM Command: Available, deterministic
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2 T13/2015-D revision 3
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Wed Mar 22 14:52:46 2023 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (  32) The self-test routine was interrupted
                                        by the host with a hard or soft reset.
Total time to complete Offline
data collection:                (  120) seconds.
Offline data collection
capabilities:                    (0x15) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Abort Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  42) minutes.
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       3557
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       807
165 Block_Erase_Count       0x0032   100   100   000    Old_age   Always       -       517
166 Minimum_PE_Cycles_TLC   0x0032   100   100   ---    Old_age   Always       -       4
167 Max_Bad_Blocks_per_Die  0x0032   100   100   ---    Old_age   Always       -       0
168 Maximum_PE_Cycles_TLC   0x0032   100   100   ---    Old_age   Always       -       13
169 Total_Bad_Blocks        0x0032   100   100   ---    Old_age   Always       -       431
170 Grown_Bad_Blocks        0x0032   100   100   ---    Old_age   Always       -       0
171 Program_Fail_Count      0x0032   100   100   000    Old_age   Always       -       0
172 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0
173 Average_PE_Cycles_TLC   0x0032   100   100   000    Old_age   Always       -       4
174 Unexpected_Power_Loss   0x0032   100   100   000    Old_age   Always       -       98
184 End-to-End_Error        0x0032   100   100   ---    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   ---    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   075   066   000    Old_age   Always       -       25 (Min/Max 3/66)
199 UDMA_CRC_Error_Count    0x0032   100   100   ---    Old_age   Always       -       0
230 Media_Wearout_Indicator 0x0032   100   100   000    Old_age   Always       -       0x012f0050012f
232 Available_Reservd_Space 0x0033   100   100   005    Pre-fail  Always       -       100
233 NAND_GB_Written_TLC     0x0032   100   100   ---    Old_age   Always       -       1103
234 NAND_GB_Written_SLC     0x0032   100   100   000    Old_age   Always       -       3942
241 Host_Writes_GiB         0x0030   100   100   000    Old_age   Offline      -       1480
242 Host_Reads_GiB          0x0030   100   100   000    Old_age   Offline      -       1681
244 Temp_Throttle_Status    0x0032   000   100   ---    Old_age   Always       -       0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      3555         -
# 2  Short offline       Aborted by host               90%      3122         -
# 3  Short offline       Aborted by host               90%       543         -
# 4  Short offline       Aborted by host               90%       543         -
# 5  Short offline       Aborted by host               90%       543         -
# 6  Short offline       Aborted by host               90%       543         -
# 7  Short offline       Aborted by host               80%       543         -
# 8  Short offline       Completed without error       00%       543         -
# 9  Short offline       Aborted by host               90%       543         -
Selective Self-tests/Logging not supported
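Since the full report is long, the handful of attributes that usually matter for a failing drive can be pulled out with a small filter. This is a sketch that assumes the ten-column attribute layout shown above (name in column 2, raw value starting in column 10); smart_raw is a hypothetical helper name.

```shell
#!/bin/sh
# smart_raw NAMES: read `smartctl -A` output on stdin and print
# "ATTRIBUTE_NAME RAW_VALUE" for each attribute in the comma-separated
# list NAMES. Only rows whose first column is a numeric ID are considered.
smart_raw() {
    awk -v want="$1" '
        BEGIN { split(want, w, ","); for (i in w) keep[w[i]] = 1 }
        $1 ~ /^[0-9]+$/ && ($2 in keep) { print $2, $10 }
    '
}
```

A possible invocation is sudo smartctl -A /dev/sda | smart_raw Reallocated_Sector_Ct,Reported_Uncorrect,UDMA_CRC_Error_Count. For the report above, all three raw values are 0.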
Summary
It appears my kernel isn’t able to find libz.so.1. I’m sure the package which owns it is correctly installed and that I’m using an up-to-date kernel. My system is fully up to date (as far as I can tell). I can chroot into the system and mount all its partitions OK, so I don’t think the drive hosting my root partition is at fault. I am out of ideas for what to try.
Related
This post had an almost identical error message, but their solution didn’t work for me: https://forum.manjaro.org/t/manjaro-not-booting-after-last-update-error-loading-shared-libcrypto-so-3-unable-to-mount-new-root-cannot-access-chroot/129095