Can't boot LUKS encrypted install after 2023-03-31 stable update

The issue

Normally I don’t post screenshots but here you guys go:

“no such cryptodisk found” is a common error with grub and FDE (full disk encryption). A search turns up reports from Manjaro, Ubuntu, Garuda, Artix, EndeavourOS, openSUSE, Debian and many more. What does it mean?

Grub has a very slow development cycle. A new version comes out more or less every 2 years. And 2 years is a long time for security issues and other problems that may arise when libraries and other stuff get updated.

Looking at Distributions

Distributions more or less have to maintain grub on their own. Let’s take a very non-rolling release like Debian and look at the changelog of their grub package. As you can see, they have done 8 releases of 2.06 so far. Most are security patches and fixes on top of the version released 2 years ago. Now you can download their source tar (don’t know why they use http for it but well …). Looking at the series file in the patches folder, they add 116 patches to that grub version.

Now we look at Fedora and how they handle grub. Well, you will find that the 2.06 release got 92 rebuilds so far! You can also look at their commits. At the time of writing, Fedora adds 328 patches on top of the 2.06 version.

Arch on the other hand simply uses grub-git. Wow! Really? REALLY! Take a look here. 8 June 2022 marked the change in how Arch maintains grub. Before that you barely had any issues with the grub package, unless you cared about security fixes. However, back then the huge security issues hit mainstream news and Arch had to react somehow: either adopt something like what Fedora, Ubuntu or Suse do and have a really hard time keeping up with what to patch in grub and what not, or simply use git master.

Well, who else used git master that way for a very long time? Guess who. During the 2.04 cycle I switched very early to git master. When @Yochanan joined grub packaging, he mostly adopted what Arch was doing. However, he also decided to use git master when the big security issue popped up. So you either carry a lot of patch sets or use git master after testing it as well as you can.

The real issue with grub

Grub, the bootloader, is a program most of our users won’t notice anyway, as it is hidden by default. Your system boots, shows you the vendor logo and you’re in Manjaro after some time. That is the ideal case. Then there is the advanced case, where you have full disk encryption enabled. You will see a grub prompt asking for your password, then you have to wait some seconds before the vendor logo comes up and the boot process continues. This grub install was most likely done a while ago, when you installed Manjaro via Calamares.

But what happens when there are grub updates?

It is simple: the content of the grub package gets installed and update-grub gets called. In rare cases you end up with an unbootable system, as reported here: FS#75701 : grub 2:2.06.r322.gd9b4638c5-1 issue

So what happened in that case? Well, Arch demands that you know how to maintain your bootloader and run grub-install the same way you did when you installed Arch. Don’t they recommend documenting the commands you issued when you did that in the first place? Wait a minute - say what? Well, there is still the Arch Wiki for those who forgot …

Calm down my friends. Maybe you weren’t affected by the UEFI settings menu boot loop, and you might have issued grub-install at some point anyway. But what was the issue? Well, upstream decided to add a flag to fwsetup, and if you didn’t have the latest version of it installed in your master boot record or UEFI system, the older version ignored the flag, as it didn’t know it to begin with, and you landed in your UEFI setup menu - and that in a continuous loop. I managed to convince upstream to partly revert the change, but the project lead of Arch Linux mentioned that Manjaro and all the derivatives out there (EOS, Garuda and what else) manage grub updates the wrong way anyway, and hence it is not necessary to make wrong installs work with such workarounds. Personally I don’t care what most others say and patched it anyway.

So what is really really the issue with grub on Manjaro?

Well, you, the user community, might install grub in many ways, and therefore we can’t track how you did it the first time. Fedora for example created grubby to gain that data. Suse uses fancy scripts to do that. So why not simply run grub-install in a hook? Well, it is more complicated than that, as Morton from Arch already stated. Unless Arch changes to a more monolithic approach, we won’t install grub the proper way - well, only once on initial install.
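To illustrate what “tracking how you installed grub” could mean, here is a minimal Python sketch. It is not how Manjaro, Fedora’s grubby or Suse’s scripts actually work. The GrubInstallRecord class, its JSON format and the file name are hypothetical; only the grub-install options (--target, --efi-directory, --bootloader-id) are real. The sketch only builds the command line, it never runs it.

```python
import json
import tempfile
from dataclasses import dataclass, asdict
from pathlib import Path
from typing import List, Optional


@dataclass
class GrubInstallRecord:
    """Parameters of the original grub-install call (hypothetical record format)."""
    target: str                           # e.g. "x86_64-efi" or "i386-pc"
    efi_directory: Optional[str] = None   # UEFI installs only, e.g. "/boot/efi"
    bootloader_id: Optional[str] = None   # UEFI installs only, e.g. "manjaro"
    device: Optional[str] = None          # BIOS installs only, e.g. "/dev/sda"

    def to_command(self) -> List[str]:
        """Rebuild the grub-install command line from the stored parameters."""
        cmd = ["grub-install", f"--target={self.target}"]
        if self.efi_directory:
            cmd.append(f"--efi-directory={self.efi_directory}")
        if self.bootloader_id:
            cmd.append(f"--bootloader-id={self.bootloader_id}")
        if self.device:
            cmd.append(self.device)
        return cmd


def save_record(record: GrubInstallRecord, path: Path) -> None:
    """Write the record to disk as JSON."""
    path.write_text(json.dumps(asdict(record), indent=2))


def load_record(path: Path) -> GrubInstallRecord:
    """Read a record previously written by save_record."""
    return GrubInstallRecord(**json.loads(path.read_text()))


if __name__ == "__main__":
    # Example values are assumptions, not what your installer used.
    record = GrubInstallRecord(
        target="x86_64-efi", efi_directory="/boot/efi", bootloader_id="manjaro"
    )
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "grub-install.json"  # hypothetical location
        save_record(record, path)
        print(" ".join(load_record(path).to_command()))
```

With such a record, a hook could replay the exact call after a grub package update. Without it, a hook would have to guess the target and the ESP location, which is where the complication lies.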

However, we will continue to call grub-mkconfig to update the menu on kernel updates or BTRFS snapshots via snapper, as otherwise you won’t have an updated grub menu. A menu generated by the new package but read by an older installed grub is the actual reason why grub may break.
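As a rough illustration of why the menu has to be regenerated, the following Python sketch checks whether any kernel image is newer than grub.cfg. This is only a heuristic of my own, not what update-grub or the pacman hooks do. The function name menu_is_stale and the vmlinuz-* file pattern are assumptions; the grub-mkconfig call itself is the real command.

```python
import os
import tempfile
from pathlib import Path

# The real command that regenerates the menu.
REGENERATE_CMD = ["grub-mkconfig", "-o", "/boot/grub/grub.cfg"]


def menu_is_stale(boot_dir: Path) -> bool:
    """Return True if grub.cfg is missing or older than any kernel image.

    Heuristic only: it compares modification times and assumes kernel
    images are named vmlinuz-* directly inside boot_dir.
    """
    cfg = boot_dir / "grub" / "grub.cfg"
    if not cfg.exists():
        return True
    cfg_mtime = cfg.stat().st_mtime
    return any(k.stat().st_mtime > cfg_mtime for k in boot_dir.glob("vmlinuz-*"))


if __name__ == "__main__":
    # Demo on a throwaway directory instead of the real /boot.
    with tempfile.TemporaryDirectory() as tmp:
        boot = Path(tmp)
        (boot / "grub").mkdir()
        (boot / "grub" / "grub.cfg").write_text("# generated menu\n")
        kernel = boot / "vmlinuz-6.1-x86_64"  # example file name
        kernel.write_bytes(b"")
        os.utime(kernel, (2_000_000_000, 2_000_000_000))  # force a newer kernel
        if menu_is_stale(boot):
            print("menu is stale, would run:", " ".join(REGENERATE_CMD))
```

Note that this only covers the menu. It says nothing about whether the grub installed in your MBR or ESP matches the package that generated that menu.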

How may we fix this?

a) we don’t, as you’re already in the driver’s seat: you decided that you want to use FDE, and you’re able to fix the issues that come with that decision
b) we document stuff and possible solutions in a way that lets you follow easy steps and stop when your problem is fixed. You’re still in the driver’s seat
c) we adopt a different approach to how grub gets installed, like the one from Fedora or Suse, which may make Manjaro not 100% compatible with Arch anymore
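To give an idea of what documented steps under b) could look like for a LUKS install on UEFI, here is a small Python sketch that only prints a command sequence to run from a live USB. It is not an official procedure. It assumes the root filesystem sits directly on the LUKS container (no LVM, no BTRFS subvolumes), an unencrypted ESP mounted at /boot/efi, and a bootloader id of “manjaro”. The device names /dev/sdX1 and /dev/sdX2, the mapper name and the function name recovery_plan are placeholders.

```python
from typing import List


def recovery_plan(
    luks_device: str,
    esp_device: str,
    mapper_name: str = "cryptroot",   # placeholder mapper name
    mount_point: str = "/mnt",
    bootloader_id: str = "manjaro",   # assumption, check your ESP
) -> List[List[str]]:
    """Return the commands, in order, to unlock, mount and reinstall grub."""
    chroot = ["manjaro-chroot", mount_point]
    return [
        # 1. Unlock the encrypted root partition.
        ["cryptsetup", "open", luks_device, mapper_name],
        # 2. Mount the unlocked root and the EFI system partition.
        ["mount", f"/dev/mapper/{mapper_name}", mount_point],
        ["mount", esp_device, f"{mount_point}/boot/efi"],
        # 3. Reinstall grub from inside the installed system.
        chroot + [
            "grub-install",
            "--target=x86_64-efi",
            "--efi-directory=/boot/efi",
            f"--bootloader-id={bootloader_id}",
        ],
        # 4. Regenerate the menu with the freshly installed grub.
        chroot + ["grub-mkconfig", "-o", "/boot/grub/grub.cfg"],
    ]


if __name__ == "__main__":
    # Placeholder devices: replace with your own after checking lsblk.
    for step in recovery_plan("/dev/sdX2", "/dev/sdX1"):
        print(" ".join(step))
```

Printing instead of executing is deliberate: you can stop after any step once your problem is fixed, and nothing touches your disk until you type the command yourself.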

Conclusion

We followed how Arch uses grub as closely as possible. We adopted grub-git well before Arch switched to the master branch, to provide a better grub user experience and early support to our users. The chance of ending up with an unbootable system due to a kernel and grub update is actually very slim. Things only get more complicated when you use stuff like FDE and SecureBoot, so it is nice to see users with similar setups come together here to help each other out as best they can.

Tips

  • maybe put grub into your package ignore list, as what works doesn’t need to be updated
  • keep a Manjaro USB Stick handy
  • keep an eye on the testing and unstable threads regarding updates. Most issues have been mentioned there already, and our team tries to keep them from landing in the stable branch as best as we can
  • be open to breaking your system once in a while when you decide to go the extra mile and encrypt your system - you’re more or less at a higher risk than those who don’t. Also, restoring your system may take extra steps, as you have to decrypt your disk before you can change files on it
  • keep it up and stay friendly, as you guys do. Most likely it is not just a Manjaro problem - do a search once in a while to see if it is also commonly known in other distros
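For the first tip, the ignore list lives in the IgnorePkg option of /etc/pacman.conf. The Python sketch below shows one way to add grub to it. It works on the file content as a string and never writes to /etc itself; the function name add_ignore_pkg is made up for this example, and it only looks at the first IgnorePkg line it finds.

```python
import re

# Matches "IgnorePkg = ..." and the commented-out default "#IgnorePkg   =".
_IGNORE_RE = re.compile(r"^\s*(#?)\s*IgnorePkg\s*=\s*(.*)$")


def add_ignore_pkg(conf_text: str, package: str = "grub") -> str:
    """Return pacman.conf content with `package` added to IgnorePkg."""
    lines = conf_text.splitlines()
    for i, line in enumerate(lines):
        match = _IGNORE_RE.match(line)
        if match:
            commented, value = match.groups()
            # Values on a commented-out line were never active, so drop them.
            pkgs = [] if commented else value.split()
            if package not in pkgs:
                pkgs.append(package)
            lines[i] = "IgnorePkg = " + " ".join(pkgs)
            break
    else:
        # No IgnorePkg line at all: add one right below [options].
        for i, line in enumerate(lines):
            if line.strip() == "[options]":
                lines.insert(i + 1, f"IgnorePkg = {package}")
                break
        else:
            raise ValueError("no [options] section found")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = "[options]\n#IgnorePkg   =\n#IgnoreGroup =\n"
    print(add_ignore_pkg(sample), end="")
```

Keep in mind that holding a package back means you also skip its security fixes, so remove it from the list again once you have reinstalled grub properly.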