BTRFS ran out of space during update, now I can't boot, GRUB does not find kernel

kamarada · 14 August 2024 00:52

Hi! I’m new to Manjaro, coming from openSUSE Leap.

I’ve made some free space on my SSD to install and give Manjaro a try. So now I have a 30GB BTRFS partition with Manjaro (/dev/sda8), a 70GB BTRFS partition with my previous (still working) openSUSE Leap install (/dev/sda5), among other partitions (I can list them all with fdisk -l in case that helps somehow).

I installed many packages and was using Manjaro for my daily tasks. Then one day, when I updated the system, the update didn’t finish, saying the disk was full. I rebooted (I don’t remember why now, but maybe because I needed to do something and switched to openSUSE Leap for a while), and when I tried to go back to Manjaro, GRUB said it couldn’t find the kernel.

From GRUB, I’m able to boot to Manjaro if I select the last snapshot.

I’ve googled a little how to revert to the last snapshot, and tried some suggested solutions (like this), but I can’t advance because the disk is full.

I’m not sure some suggestions are for me (like this one), because AFAIK I’m not using snapper, but Timeshift.

Any ideas on how I can solve it? Preferably in place (without using another SSD/HDD/flash drive).

Thank you in advance!

kamarada · 14 August 2024 01:14

The error I get if I try to boot as usual:

It reads, in Brazilian Portuguese:

error: file '/@/boot/vmlinuz-6.9-x86_64' not found.
error: you need to load the kernel first.

Press any key to continue...

But if I select the last snapshot:

I’m able to reach the GNOME desktop.

Aragorn · 14 August 2024 01:20

Welcome to the forum!

There are several things you can do. First of all, as you update the system, packages are downloaded to /var/cache/pacman/pkg. Normally, only the last two generations of the packages will be kept there, but it could be that you’ve got more (old) packages in there. So best is to clean that out, like so…

sudo paccache -rvk0

Another thing that tends to run away with disk space is if you have a lot of btrfs snapshots, so it is recommended to periodically delete your older snapshots.

As for the missing kernel, this is due to the way the Arch package manager works when updating the system.

First it downloads the packages, then it removes the old kernel(s) and initramfs image(s), then it updates all packages from what it has downloaded into /var/cache/pacman/pkg, and only at the end of that process will it copy the new kernel image into /boot, recreate the initramfs, and update the boot loader.
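A minimal sketch for checking whether an interrupted update left /boot without a kernel image. The function name list_kernels is hypothetical, and the sketch assumes kernel images are named vmlinuz-*, as in the GRUB error quoted earlier in the thread.

```shell
#!/bin/sh
# List kernel images in a boot directory; return non-zero if there are none.
# list_kernels is a hypothetical helper name. The directory defaults to /boot;
# from a live session, pass the mounted system's boot directory instead.
list_kernels() {
    boot_dir="${1:-/boot}"
    found=1
    for image in "$boot_dir"/vmlinuz-*; do
        # If the glob matched nothing, the loop sees the literal pattern.
        [ -e "$image" ] || continue
        printf '%s\n' "$image"
        found=0
    done
    return "$found"
}
```

For example, `list_kernels /mnt/boot || echo "no kernel image found"` reports the situation described above, where the old kernel is gone and the new one was never copied in.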

The following HowTo will explain to you how to recover from an interrupted update…

Lastly, yet another reason why you may run out of space is due to how btrfs allocates space. It essentially divides the available space on the volume into 1 GiB chunks, but even though btrfs is very clever, this could mean that, due to the copy-on-write nature of btrfs, you may run out of available unallocated blocks.

This can be remedied by balancing the filesystem. What this means is that btrfs will then attempt to move blocks together — this is however not a defragmentation — in order to free up more unallocated blocks. For details, see…

man btrfs-balance

Note: The command is “btrfs balance”, but the man page uses hyphens for the different subcommands of the btrfs tool.
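A minimal sketch of a stepwise filtered balance, assuming the filesystem is mounted at the given path. The function name stepwise_balance and the thresholds 0/10/25/50 are illustrative assumptions, not something recommended in this thread; -dusage and -musage are the usage filters documented in the btrfs-balance man page. By default the sketch only prints the commands.

```shell
#!/bin/sh
# Print (or, with RUN=1, execute) a series of filtered balance runs.
# Each step only touches chunks that are at most N percent full, so the
# cheap steps run first and may free unallocated space for the later ones.
# stepwise_balance and the threshold list are illustrative assumptions.
stepwise_balance() {
    mountpoint="${1:-/}"
    for pct in 0 10 25 50; do
        if [ "${RUN:-0}" = "1" ]; then
            btrfs balance start "-dusage=$pct" "-musage=$pct" "$mountpoint" || return 1
        else
            printf 'btrfs balance start -dusage=%s -musage=%s %s\n' "$pct" "$pct" "$mountpoint"
        fi
    done
}
```

Running `stepwise_balance /` shows the plan; `RUN=1 stepwise_balance /` as root would actually execute it.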

kamarada · 14 August 2024 01:46

Thank you for the directions!

I’ve booted Manjaro GNOME from a USB flash drive with Ventoy.

I’ve successfully chrooted into my BTRFS partition following the instructions of your how-to, with just a few changes:

$ sudo su -
# lsblk --fs
# mount -t btrfs -o subvol=@ /dev/sda8 /mnt
# btrfs subvolume list /mnt
# mount -t btrfs -o subvol=@cache /dev/sda8 /mnt/var/cache
# mount -t btrfs -o subvol=@log /dev/sda8 /mnt/var/log
# mount -t vfat  /dev/sda1 /mnt/boot/efi
# mount --bind /dev /mnt/dev
# mount -t proc proc /mnt/proc
# mount -t sysfs sysfs /mnt/sys
# mount -t efivarfs efivarfs /mnt/sys/firmware/efi/efivars
# chroot /mnt /bin/bash
# echo 'nameserver 8.8.8.8' > /etc/resolv.conf

First I tried:

# paccache -rvk0

bash: paccache: command not found

# pacman -S pacman-contrib

resolving dependencies...
looking for conflicting packages...

Packages (1) pacman-contrib-1.10.6-1

Total Download Size:   0.05 MiB
Total Installed Size:  0.12 MiB

:: Proceed with installation? [Y/n] 
error: Partition /var/cache too full: 5133 blocks needed, 1503 blocks free
error: failed to commit transaction (not enough free disk space)
Errors occurred, no packages were upgraded.

Then I tried following your how-to:

# [ -f /var/lib/pacman/db.lck ] && rm -f /var/lib/pacman/db.lck
# pacman-mirrors -f && pacman -Syyu

::INFO Downloading mirrors from Manjaro
::INFO => Mirror pool: https://repo.manjaro.org/mirrors.json
::INFO => Mirror status: https://repo.manjaro.org/status.json
::INFO Using default mirror file
::INFO Querying mirrors - This may take some time
  1.720 United_States  : https://ohioix.mm.fcix.net/manjaro/
  0.173 Brazil         : https://manjaro.c3sl.ufpr.br/

...

resolving dependencies...
looking for conflicting packages...

Packages (5) chromium-127.0.6533.119-1  gnome-shell-extension-dash-to-dock-93-1  wine-9.14-1  xapp-2.8.5-1  xfsprogs-6.9.0-1

Total Download Size:   101.09 MiB
Total Installed Size:  923.89 MiB
Net Upgrade Size:       -0.07 MiB

:: Proceed with installation? [Y/n] 
error: Partition /var/cache too full: 31001 blocks needed, 1503 blocks free
error: failed to commit transaction (not enough free disk space)
Errors occurred, no packages were upgraded.

My disk is really full…

Do you have any ideas on what I should try next?

Aragorn · 14 August 2024 02:10

paccache is only a script. You can just as easily clean out the package cache with…

sudo pacman -Scc
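A small sketch for seeing how many package files are sitting in the cache before cleaning it. The function name count_cached_packages is hypothetical; the default path is the cache directory mentioned earlier in the thread, and the *.pkg.tar.* file name pattern is an assumption about how cached packages are named.

```shell
#!/bin/sh
# Count cached package files (signature files excluded) in a pacman cache.
# count_cached_packages is a hypothetical helper name.
count_cached_packages() {
    cache_dir="${1:-/var/cache/pacman/pkg}"
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    find "$cache_dir" -maxdepth 1 -type f -name '*.pkg.tar.*' ! -name '*.sig' |
        grep -c . || true
}
```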

kamarada · 14 August 2024 03:27

pacman -Scc worked.

Then, following with the how-to:

# [ -f /var/lib/pacman/db.lck ] && rm -f /var/lib/pacman/db.lck
# pacman-mirrors -f && pacman -Syyu

::INFO Downloading mirrors from Manjaro
...
::INFO Mirror list generated and saved to: /etc/pacman.d/mirrorlist
:: Synchronizing package databases...
...
:: Proceed with installation? [Y/n] 
:: Retrieving packages...
...
(5/5) checking available disk space                    [############################] 100%
:: Running pre-transaction hooks...
(1/1) Creating Timeshift snapshot before upgrade...
First run mode (config file not found)
Selected default snapshot type: BTRFS
E: System disk not found!
Unable to run timeshift-autosnap! Please close Timeshift and try again. Script will now exit...
error: command failed to execute correctly
error: failed to commit transaction (failed to run transaction hooks)
Errors occurred, no packages were upgraded.

Aragorn · 14 August 2024 03:50

Well, the automatic generation of btrfs snapshots when updating the system is something I have no experience with. It is triggered by a pacman hook, but I don’t have that functionality installed here — my btrfs setup is also very different from the defaults.

Maybe @andreas85 will be able to help you with that.

Molski · 14 August 2024 04:39

You have to be careful booting into all these read/write snapshots, you are just making them take up more space. And most important of all, you can’t use Timeshift restore on a live boot, or from booting into a snapshot via GRUB.

I’m not even convinced the latest root volume is even in a sane state?

Have you freed the space? Btrfs needs ample space to work its magic. How much is free? And especially for this, don’t use df. Use btrfs filesystem usage / (btrfs commands can always be shortened too, btrfs fi us /.)

As Aragorn said, the space is probably in old snapshots. You can free space by deleting the oldest X number of snapshots. You can even delete them from the command line, e.g. sudo btrfs sub del /mnt/timeshift-btrfs/snapshots/2024-01-01_00-00-00/@home.

First free up the space, and see where you’re at. A restore to a previous snapshot is possible, but you cannot use the Timeshift GUI, or the CLI, for a restore. (Booting normally, you can use it all you want!) If it comes to requiring a restore and you can’t boot your root volume, it is still easy to restore (for many? some?). But it takes a series of terminal commands run from a live boot/chroot, many of which use btrfs-tools.
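A minimal sketch for picking out the oldest Timeshift snapshots from the output of btrfs subvolume list, assuming the timeshift-btrfs/snapshots/<timestamp>/ layout and that the timestamped names sort chronologically. The function name old_snapshots is hypothetical. It only prints candidate paths; nothing is deleted.

```shell
#!/bin/sh
# Read `btrfs subvolume list` output on stdin and print the Timeshift
# snapshot paths that would remain to be deleted if only the newest
# $1 snapshots (default 1) were kept. old_snapshots is a hypothetical name.
old_snapshots() {
    keep="${1:-1}"
    awk '$NF ~ /^timeshift-btrfs\/snapshots\// { print $NF }' |
        sort |
        awk -v keep="$keep" '
            { lines[NR] = $0 }
            END { for (i = 1; i <= NR - keep; i++) print lines[i] }'
}
```

Something like `btrfs subvolume list / | old_snapshots 1` would list the candidates, which can then be reviewed and removed one by one with btrfs subvolume delete.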

kamarada · 14 August 2024 18:26

I booted the 2024-08-08 snapshot just once, just to check that things were there. I’ve made no changes to the system then.

All that I’m trying and reporting to you is from a Manjaro GNOME live boot, after chrooting to my BTRFS Manjaro partition, as I described.

# btrfs filesystem usage /

Overall:
    Device size:		  30.00GiB
    Device allocated:		  30.00GiB
    Device unallocated:		   1.00MiB
    Device missing:		     0.00B
    Device slack:		     0.00B
    Used:			  26.58GiB
    Free (estimated):		   2.75GiB	(min: 2.75GiB)
    Free (statfs, df):		   2.75GiB
    Data ratio:			      1.00
    Metadata ratio:		      2.00
    Global reserve:		  61.23MiB	(used: 0.00B)
    Multiple profiles:		        no

Data,single: Size:27.94GiB, Used:25.19GiB (90.17%)
   /dev/sda8	  27.94GiB

Metadata,DUP: Size:1.00GiB, Used:713.84MiB (69.71%)
   /dev/sda8	   2.00GiB

System,DUP: Size:32.00MiB, Used:16.00KiB (0.05%)
   /dev/sda8	  64.00MiB

Unallocated:
   /dev/sda8	   1.00MiB

# btrfs subvolume list /

ID 256 gen 4487 top level 5 path @
ID 257 gen 4484 top level 5 path @cache
ID 258 gen 4484 top level 5 path @log
ID 264 gen 4418 top level 5 path timeshift-btrfs/snapshots/2024-08-06_19-59-08/@
ID 265 gen 4418 top level 5 path timeshift-btrfs/snapshots/2024-08-07_22-45-16/@
ID 266 gen 4469 top level 5 path timeshift-btrfs/snapshots/2024-08-08_19-41-02/@

# btrfs subvolume delete --subvolid 264 /

Delete subvolume 264 (no-commit): '//timeshift-btrfs/snapshots/2024-08-06_19-59-08/@'

# btrfs subvolume delete --subvolid 265 /

Delete subvolume 265 (no-commit): '//timeshift-btrfs/snapshots/2024-08-07_22-45-16/@'

# btrfs subvolume list /

ID 256 gen 4488 top level 5 path @
ID 257 gen 4484 top level 5 path @cache
ID 258 gen 4484 top level 5 path @log
ID 266 gen 4469 top level 5 path timeshift-btrfs/snapshots/2024-08-08_19-41-02/@

# btrfs filesystem usage /

Overall:
    Device size:		  30.00GiB
    Device allocated:		  30.00GiB
    Device unallocated:		   1.00MiB
    Device missing:		     0.00B
    Device slack:		     0.00B
    Used:			  23.88GiB
    Free (estimated):		   5.25GiB	(min: 5.25GiB)
    Free (statfs, df):		   5.25GiB
    Data ratio:			      1.00
    Metadata ratio:		      2.00
    Global reserve:		  61.23MiB	(used: 0.00B)
    Multiple profiles:		        no

Data,single: Size:27.94GiB, Used:22.69GiB (81.22%)
   /dev/sda8	  27.94GiB

Metadata,DUP: Size:1.00GiB, Used:607.66MiB (59.34%)
   /dev/sda8	   2.00GiB

System,DUP: Size:32.00MiB, Used:16.00KiB (0.05%)
   /dev/sda8	  64.00MiB

Unallocated:
   /dev/sda8	   1.00MiB

It seems I’ve gained a few GiB by deleting the 2 oldest snapshots (I had 3).

But then, trying to go further with the system update, again:

# pacman -Syyu

:: Synchronizing package databases...
...
:: Running pre-transaction hooks...
(1/1) Creating Timeshift snapshot before upgrade...
First run mode (config file not found)
Selected default snapshot type: BTRFS
E: System disk not found!
Unable to run timeshift-autosnap! Please close Timeshift and try again. Script will now exit...
error: command failed to execute correctly
error: failed to commit transaction (failed to run transaction hooks)
Errors occurred, no packages were upgraded.

kamarada · 14 August 2024 18:34

Searching, I found this:

So I tried:

# SKIP_AUTOSNAP=1 pacman -Syyu

:: Synchronizing package databases...
...
:: Running pre-transaction hooks...
(1/1) Creating Timeshift snapshot before upgrade...
==> skipping timeshift-autosnap due SKIP_AUTOSNAP environment variable being set.
:: Processing package changes...
(1/6) upgrading brave-browser                       [######################] 100%
(2/6) upgrading chromium                            [######################] 100%
(3/6) upgrading gnome-shell-extension-dash-to-dock  [######################] 100%
(4/6) upgrading wine                                [######################] 100%
(5/6) upgrading xapp                                [######################] 100%
(6/6) upgrading xfsprogs                            [######################] 100%
:: Running post-transaction hooks...
(1/8) Registering binary formats...
  Skipped: Running in chroot.
(2/8) Reloading system manager configuration...
  Skipped: Running in chroot.
(3/8) Arming ConditionNeedsUpdate...
(4/8) Updating fontconfig cache...
(5/8) Updating 32-bit fontconfig cache...
(6/8) Compiling GSettings XML schema files...
(7/8) Updating icon theme caches...
(8/8) Updating the desktop file MIME type cache...

It seems it worked!

# update-grub

Generating grub configuration file ...
Found theme: /usr/share/grub/themes/manjaro/theme.txt
Warning: os-prober will be executed to detect other bootable partitions.
Its output will be used to detect bootable binaries on them and create new boot entries.
Found openSUSE Leap 15.6 on /dev/sda5
Adding boot menu entry for UEFI Firmware Settings ...
Detecting snapshots ...
Found snapshot: 2024-08-08 19:41:02 | timeshift-btrfs/snapshots/2024-08-08_19-41-02/@ | ondemand | {timeshift-autosnap} {created before upgrade} |
Found 1 snapshot(s)
Unmount /tmp/grub-btrfs.dihB0RprcO .. Success
Found memtest86+ image: /boot/memtest86+/memtest.bin
Found memtest86+ EFI image: /boot/memtest86+/memtest.efi
done

# exit

Let’s see…

kamarada · 14 August 2024 19:03

After rebooting, Manjaro’s GRUB does not appear. Instead, I get a blank/black screen for a few seconds, then the computer reboots into the EFI setup.

Using the boot menu (F12, on my laptop), I’m able to access openSUSE’s GRUB. I can boot openSUSE from there.

I went to openSUSE, updated openSUSE’s GRUB, and then rebooted and tried to boot Manjaro using openSUSE’s GRUB, without success:

It reads, in Brazilian Portuguese:

error: ../../grub-core/fs/btrfs.c:2159:file '/boot/memtest86+/memtest.bin' not found.

Press any key to continue...

Arrababiski · 14 August 2024 19:31

With btrfs, don’t look at free space; look at what’s “Unallocated”. If it’s below 10% of the total disk space, you have a problem.

Take a look at “btrfs balance” command because that’s what you might need.
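A minimal sketch of that check, assuming the "Device size:" and "Device unallocated:" lines of btrfs filesystem usage as shown in this thread, and assuming the tool is run with -b so both values are plain byte counts. The function name unallocated_percent is hypothetical.

```shell
#!/bin/sh
# Read `btrfs filesystem usage -b <mountpoint>` output on stdin and print
# the unallocated space as a percentage of the device size.
# unallocated_percent is a hypothetical helper name.
unallocated_percent() {
    awk '
        /Device size:/        { size = $NF }
        /Device unallocated:/ { unalloc = $NF }
        END {
            if (size > 0) printf "%.1f\n", 100 * unalloc / size
            else exit 1
        }'
}
```

Usage would look like `btrfs filesystem usage -b / | unallocated_percent`. For a 30 GiB device with 2 GiB unallocated, this works out to roughly 6.7.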

Aragorn · 14 August 2024 21:17

kamarada:

Overall:
    Device size:		  30.00GiB
    Device allocated:		  30.00GiB
    Device unallocated:		   1.00MiB <===========
    Device missing:		     0.00B
    Device slack:		     0.00B
    Used:			  26.58GiB
    Free (estimated):		   2.75GiB	(min: 2.75GiB) <==========
    Free (statfs, df):		   2.75GiB

Notice the discrepancy?

As I said earlier already…

Aragorn:

Lastly, yet another reason why you may run out of space is due to how btrfs allocates space. It essentially divides the available space on the volume into 1 GiB chunks, but even though btrfs is very clever, this could mean that, due to the copy-on-write nature of btrfs, you may run out of available unallocated blocks.

This can be remedied by balancing the filesystem. What this means is that btrfs will then attempt to move blocks together — this is however not a defragmentation — in order to free up more unallocated blocks. For details, see…
man btrfs-balance
Note: The command is “btrfs balance”, but the man page uses hyphens for the different subcommands of the btrfs tool.

kamarada · 15 August 2024 16:45

Sorry… I tried other solutions first, but I’m going to do that now.

# btrfs filesystem usage /

Overall:
    Device size:		  30.00GiB
    Device allocated:		  30.00GiB
    Device unallocated:		   1.00MiB
    Device missing:		     0.00B
    Device slack:		     0.00B
    Used:			  25.27GiB
    Free (estimated):		   3.87GiB	(min: 3.87GiB)
    Free (statfs, df):		   3.87GiB
    Data ratio:			      1.00
    Metadata ratio:		      2.00
    Global reserve:		  58.58MiB	(used: 0.00B)
    Multiple profiles:		        no

Data,single: Size:27.94GiB, Used:24.07GiB (86.16%)
   /dev/sda8	  27.94GiB

Metadata,DUP: Size:1.00GiB, Used:612.75MiB (59.84%)
   /dev/sda8	   2.00GiB

System,DUP: Size:32.00MiB, Used:16.00KiB (0.05%)
   /dev/sda8	  64.00MiB

Unallocated:
   /dev/sda8	   1.00MiB

# btrfs balance start -musage=50 -dusage=50 /

Done, had to relocate 4 out of 37 chunks

# btrfs filesystem usage /

Overall:
    Device size:		  30.00GiB
    Device allocated:		  28.00GiB
    Device unallocated:		   2.00GiB
    Device missing:		     0.00B
    Device slack:		     0.00B
    Used:			  25.27GiB
    Free (estimated):		   3.87GiB	(min: 2.87GiB)
    Free (statfs, df):		   3.87GiB
    Data ratio:			      1.00
    Metadata ratio:		      2.00
    Global reserve:		  58.00MiB	(used: 0.00B)
    Multiple profiles:		        no

Data,single: Size:25.94GiB, Used:24.07GiB (92.80%)
   /dev/sda8	  25.94GiB

Metadata,DUP: Size:1.00GiB, Used:612.17MiB (59.78%)
   /dev/sda8	   2.00GiB

System,DUP: Size:32.00MiB, Used:16.00KiB (0.05%)
   /dev/sda8	  64.00MiB

Unallocated:
   /dev/sda8	   2.00GiB

Now I have both free and unallocated disk space. Let’s see if I’m able to boot Manjaro…

kamarada · 15 August 2024 17:01

All the same happened again:

kamarada:

After rebooting, Manjaro’s GRUB does not appear. Instead, I get a blank/black screen for a few seconds, then the computer reboots into the EFI setup.

Using the boot menu (F12, on my laptop), I’m able to access openSUSE’s GRUB. I can boot openSUSE from there.

I went to openSUSE, updated openSUSE’s GRUB, and then rebooted and tried to boot Manjaro using openSUSE’s GRUB, without success:


It reads, in Brazilian Portuguese:
error: ../../grub-core/fs/btrfs.c:2159:file '/boot/memtest86+/memtest.bin' not found.

Press any key to continue...

From the Manjaro GNOME live image and the chrooted environment, trying once more…

# SKIP_AUTOSNAP=1 pacman -Syyu

:: Synchronizing package databases...
 core                  140.1 KiB  61.5 KiB/s 00:02 [######################] 100%
 extra                   7.8 MiB   767 KiB/s 00:10 [######################] 100%
 multilib              145.9 KiB  82.7 KiB/s 00:02 [######################] 100%
:: Starting full system upgrade...
warning: nano-syntax-highlighting: local (2020.10.10+10+g1aa64a8-2) is newer than extra (2020.10.10-2)
resolving dependencies...
looking for conflicting packages...

Packages (1) gtksourceview-pkgbuild-5-2

Total Installed Size:  0.04 MiB
Net Upgrade Size:      0.00 MiB

:: Proceed with installation? [Y/n] 
(1/1) checking keys in keyring                     [######################] 100%
(1/1) checking package integrity                   [######################] 100%
(1/1) loading package files                        [######################] 100%
(1/1) checking for file conflicts                  [######################] 100%
(1/1) checking available disk space                [######################] 100%
:: Running pre-transaction hooks...
(1/1) Creating Timeshift snapshot before upgrade...
==> skipping timeshift-autosnap due SKIP_AUTOSNAP environment variable being set.
:: Processing package changes...
(1/1) upgrading gtksourceview-pkgbuild             [######################] 100%
:: Running post-transaction hooks...
(1/2) Arming ConditionNeedsUpdate...
(2/2) Updating the MIME type database...

# update-grub

Generating grub configuration file ...
Found theme: /usr/share/grub/themes/manjaro/theme.txt
Warning: os-prober will be executed to detect other bootable partitions.
Its output will be used to detect bootable binaries on them and create new boot entries.
Found openSUSE Leap 15.6 on /dev/sda5
Adding boot menu entry for UEFI Firmware Settings ...
Detecting snapshots ...
Found snapshot: 2024-08-08 19:41:02 | timeshift-btrfs/snapshots/2024-08-08_19-41-02/@ | ondemand | {timeshift-autosnap} {created before upgrade} |
Found 1 snapshot(s)
Unmount /tmp/grub-btrfs.fsBchvzhQ3 .. Success
Found memtest86+ image: /boot/memtest86+/memtest.bin
Found memtest86+ EFI image: /boot/memtest86+/memtest.efi
done

Now let’s see…

kamarada · 15 August 2024 17:17

…and again.

@Aragorn @Molski @Arrababiski thank you for your attention and efforts so far.

In short, now I have both free and unallocated disk space in my BTRFS Manjaro partition, and my Manjaro system (which I can access only via chroot) has all the updates installed. It seems that now I need to somehow reinstall Manjaro’s GRUB.

Do you have any idea what I could do next?

If not, I’m considering this: delete the 30GB Manjaro partition, delete the 70GB openSUSE partition, create a 100GB (maybe EXT4) partition and reinstall Manjaro on it.

kamarada · 15 August 2024 17:30

What if I restore that 2024-08-08 snapshot? Is it possible from the Manjaro GNOME live image? Or from openSUSE Leap?

Arrababiski · 15 August 2024 18:11

What’s the output of the efibootmgr -v command?
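A minimal sketch of what to look for in that output: whether the firmware still has a boot entry for Manjaro. The function name has_boot_entry is hypothetical, and the sketch assumes entries appear as lines starting with Boot followed by four hexadecimal digits and a label.

```shell
#!/bin/sh
# Read `efibootmgr -v` output on stdin; succeed if any BootXXXX entry
# mentions the given label (case-insensitive).
# has_boot_entry is a hypothetical helper name.
has_boot_entry() {
    label="$1"
    grep -i -q -- "^Boot[0-9A-Fa-f]\{4\}.*$label"
}

# If the entry is missing, reinstalling GRUB from the chroot is one option.
# Assumption, not confirmed in this thread: the ESP is mounted at /boot/efi
# (as in the mount commands earlier) and the bootloader id is "manjaro":
#   grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=manjaro --recheck
#   update-grub
```

For example, `efibootmgr -v | has_boot_entry manjaro || echo "no Manjaro entry"`.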

Molski · 15 August 2024 18:21

You have two choices…

1. Attempt to fix your current root volume.
   As it sounds like you filled up your disk during an upgrade, it should be easily fixable. I would at least attempt this first. Have you tried Aragorn’s tutorial, which you posted yourself earlier, after you freed some space? It’s really just updating again from a chroot’d live image.
2. Restore the one snapshot you have left now.
   This part is only tricky because you can’t boot a snapshot, or boot a live image, and then use Timeshift’s simple restore button. (It works fine when you boot normally.) The manual rollback has saved me quite a few times. It involves booting a live image, moving your root volumes around, and of course updating GRUB.
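A rough sketch of what such a manual rollback could look like, under the assumption that the root subvolume is @ and the snapshot lives under timeshift-btrfs/snapshots/, as in the subvolume listing earlier in the thread. This is a generic btrfs approach and not necessarily the exact procedure meant above. The function name rollback_plan is hypothetical, and it only prints the commands so they can be reviewed before anything is run from a live session.

```shell
#!/bin/sh
# Print the commands for a manual rollback; nothing is executed.
# $1: btrfs partition, $2: snapshot directory name under
# timeshift-btrfs/snapshots. rollback_plan is a hypothetical helper name.
rollback_plan() {
    device="$1"
    snapshot="$2"
    # Mount the top-level subvolume, set the broken root aside, then
    # create a writable snapshot of the saved state under the name @.
    cat <<EOF
mount -t btrfs -o subvolid=5 $device /mnt
mv /mnt/@ /mnt/@broken
btrfs subvolume snapshot /mnt/timeshift-btrfs/snapshots/$snapshot/@ /mnt/@
umount /mnt
EOF
}
```

For instance, `rollback_plan /dev/sda8 2024-08-08_19-41-02` prints the four steps for the one remaining snapshot. Afterwards one would still chroot into the new @ and run update-grub, and the old root kept as @broken continues to occupy space until it is deleted.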

kamarada · 21 August 2024 19:30

Yes, I tried that.

Sorry, my friends, but I gave up trying to fix this and decided to reinstall Manjaro the way I had been planning:

Manjaro is up and running again, this time on a 100GB BTRFS partition.

Lesson learned: BTRFS eats disk space for dinner! Unless you have plenty of disk space for your root (/) partition, use another filesystem, such as the old but gold EXT4.

Again, thank you for your time.

I’m now trying to solve another problem:

In case you have any ideas, any help with that is appreciated.