Can't boot LUKS encrypted install after 2023-03-31 stable update

I’m not far behind you. Playing with tumbleweed in a VM right now to see if it is my next distro or not…

It’s a nice distro, my #1 complaint was that they don’t set up any kind of PolKit, and EVERYTHING wants the root password, which is dumb.

Also, you set the system hostname in the summary screen, so it’s really easy to miss.

Also, you have to set up sudo manually, and IIRC, you have to install the man pages.

If you don’t mind my #1 issue with it, the rest are easy to get past and it’s a nice distro.

I’m going to another arch-based distro because I need bleeding-edge kernels for my 12th gen i7 laptop-and-portable-space-heater. :joy:

Yes, I’ve noticed the root pass thing. Must be some way to address that. Also, YaST is fugly and weird, but I can get over it. Sudo is working right out of the box. I did a custom install with MATE desktop, installed gnome-software so I can get all the flatpak stuff, etc. Just messing around and trying to make sure I can get all my absolutely necessary software installed on it.

sudo hostname ****** worked as expected.

Good luck with the re-install, mate. FYI tumbleweed gave me kernel 6.2.8 by default.

I can’t say how to fix it, but I wouldn’t recommend trying. I attempted the “accepted” answer of adding cryptodisk to grub and updating grub from chroot, but now I can’t mount the partition at all: “device-mapper: reload ioctl on mnt (254:0) failed: invalid argument”.

I was impatient waiting for the 1.5 GB install of Timeshift (wtf, why is it 1.5 GB from the live USB?) to fix it by restoring, so I followed that answer instead.

What I’ve learned: don’t try to fix Manjaro, use Timeshift. If you didn’t have Timeshift: reinstall and use it next time.

Guess it’s time to format…

I call it the “accepted” answer because the update post FROM MANJARO directly links to it as an implied solution.

Please be aware that Timeshift restores are impossible if you choose full disk encryption (meaning your /boot is also encrypted). It is literally not supported.

The issue

Normally I don’t post screenshots but here you guys go:

“no such cryptodisk found” is a common error with grub and FDE. A search turns up reports from Manjaro, Ubuntu, Garuda, Artix, EndeavourOS, openSUSE, Debian and many more. What does it mean?

Grub has a very slow development cycle. More or less every 2 years a new version comes out. And 2 years is a long timeframe for security issues and other issues which may come when libraries and other stuff gets updated.

Looking at Distributions

Distributions more or less have to maintain grub on their own. Let’s take a very non-rolling release like Debian and see the changelog of their grub package. As you can see, they did 8 releases of 2.06 so far. Most are security patches and fixes on top of the version released 2 years ago. Now you can download their source tar (don’t know why they use http for it but well …). Looking at the series file in their patches folder, they add 116 patches to the grub version.

Now we look at Fedora and how they handle grub. Well, you will find out that the 2.06 release got 92 rebuilds so far! You can also look at their commits. At the time of writing, Fedora adds 328 patches on top of the 2.06 version.

Arch on the other hand simply uses grub-git. Wow! really? REALLY! Take a look here. 8 June 2022 marked the change in how Arch maintains grub. Before that you barely had any issues with the grub package, as long as you didn’t care about security fixes. However, back then the huge security issues hit mainstream news and Arch had to react, somehow. Either adopt something like what Fedora, Ubuntu or Suse do and have a really hard time keeping up with what to patch in grub and what not, or simply use git master.

Well, who else used git master that way for a very long time? Guess who. During the 2.04 cycle I switched very early on to git master. When @Yochanan joined grub packaging he mostly adopted what Arch was doing. However, he also decided to use git master when the big security issue popped up. So you either use a lot of patch sets or use git master after testing it as well as you can.

The real issue with grub

Grub, the bootloader, is the program most of our users won’t notice anyway, as by default it is hidden away. Your system boots, shows you the vendor logo and you’re in Manjaro after some time. That is the ideal case. Then there is the advanced case if you have full disk encryption enabled. You will see a grub dialog asking for your password, then you have to wait some seconds and the vendor logo will come up and the boot process continues. This grub install was most likely done a while ago when you installed Manjaro via Calamares.

But what happens when there are grub updates?

It is simple: the content of the grub package gets installed and update-grub gets called. In rare cases you have an unbootable system like reported here: FS#75701 : grub 2:2.06.r322.gd9b4638c5-1 issue

So what happened in that case? Well, Arch demands that you know how to maintain your bootloader and use grub-install the same way as you used it when you installed Arch. Don’t they recommend documenting the commands you issued when you did that in the first place? Wait a minute - say what? Well, there is still the Arch Wiki for those who forgot …

Calm down, my friends. Maybe you weren’t affected by the UEFI settings menu boot loop, and you might have issued grub-install at some point anyway. But what was the issue? Well, upstream decided to add a flag to the fwsetup binary, and if you didn’t have the latest version of it installed in your master boot record or UEFI system, the older version ignored the flag, as it doesn’t have it to begin with, and you landed in your UEFI setup menu, in a continuous loop. I managed to convince upstream to partly revert the change, but the project lead of Arch Linux mentioned that Manjaro and all the derivatives out there (EOS, Garuda and what else) manage grub updates the wrong way anyway, and hence workarounds to make wrong installs work are not needed. Personally I don’t care what most others say and patched it anyway.

So what is really really the issue with grub on Manjaro?

Well, you, the user community, might install grub in many ways, and therefore we can’t track how you did it the first time. Fedora, for example, created grubby to gain that data. Suse uses fancy scripts to do that. So why not simply run grub-install in a hook? Well, it is more complicated than it seems, as Morton from Arch already stated. Unless Arch changes to a more monolithic approach, we won’t install grub the proper way - well, only once on initial install.

However we will continue to call grub-mkconfig to update the menu on kernel updates or BTRFS snapshots via snapper. Otherwise you won’t have an updated grub menu and that is the actual reason why grub may break.
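One way to see whether a menu update actually picked up a new kernel is to list the kernel images the generated menu refers to. This is only a sketch: `menu_kernels` is a made-up helper name, and the default path `/boot/grub/grub.cfg` is an assumption about where your menu file lives.

```shell
# Sketch: list the kernel images referenced by a generated grub menu file.
# menu_kernels is a hypothetical helper, not part of grub.
menu_kernels() {
    local cfg="${1:-/boot/grub/grub.cfg}"
    # Menu entries load the kernel with a "linux" (or "linuxefi") line;
    # the second field is the path to the kernel image.
    awk '$1 == "linux" || $1 == "linuxefi" { print $2 }' "$cfg" | sort -u
}
```

Running `menu_kernels` after a kernel update (it may need root to read the file) should show the new vmlinuz path if the menu was regenerated.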

How may we fix this?

a) we don’t as you’re already in the driver seat, decided that you want to use FDE and you’re able to fix issues by that decision
b) we document stuff and possible solutions in a way you can follow easy steps and stop when your problem is fixed. You’re still in the driver seat
c) adopt a different approach how grub gets installed, like from Fedora or Suse, which may make Manjaro not 100% compatible with Arch anymore

Conclusion

We followed how Arch is using grub as closely as possible. We adopted grub-git well before Arch even switched to the master branch, to provide a better grub user experience and early support to our users. The chance that you end up with an unbootable system due to a kernel and grub update is actually very slim. Only you may complicate things by using stuff like FDE and SecureBoot, so the user community of similar characters comes nicely together here to help each other out as best as they can.

Tips

  • maybe put grub into your package ignore list, as what works doesn’t need to be updated
  • keep a Manjaro USB stick handy
  • keep an eye on the testing and unstable threads regarding updates. Most issues got mentioned there already, and our team tries to keep them from landing in the stable branch as best as we can
  • be open to breaking your system once in a while when you decided to go the extra mile and encrypt your system - you’re more or less at a higher risk than those who don’t. Also, restoring your system may take extra steps, as you have to decrypt your disk before you can change files on it
  • keep it up and stay friendly, as you guys do. Most likely it is not just a Manjaro problem - do a search once in a while to see if it is also commonly known in other distros

Your filesystem is full EXT4, not btrfs. But why did you install grub-btrfs?

Remove grub-btrfs!

@philm Thanks, Phil, for taking the time to post a thorough response - it certainly puts things in perspective for me & I know now what to look out for. More power to Manjaro & everyone contributing to the project…

I have a hard time processing… :dizzy_face:


If I summarize:

Did I get that right?
Also, can the fix be applied before rebooting in order to prevent the issue?

Thank you!!!

I made myself an account just to thank you! I had a nice start to Fool’s Day, but your kindness in sharing saved my day!

It is not clear whether any changes are needed to get your system booting. Most say: “Hey, I get this cryptodisk message asking me to press any key, and then it boots normally.” Others claim their system is not bootable. For example, I have a similar issue when I update grub on my MacBook Air M1 running our ARM version of Manjaro natively. There the issue is that grubenv is not valid. Since we know how the partitioning is done there, grub actually gets installed to UEFI when it is updated, thanks to a special version of update-grub. So I have to fix that manually until the next update-grub call. There, our regular update-grub is named update-grub-menu, as that is what the regular script actually does.

Installing grub to MBR/UEFI is needed to keep the distributed binary files in sync with the version you have installed to boot your system. If you don’t want to bother with the bootloader, simply put it into the ignore list and skip updating it altogether. If security issues are your concern, you need to use grub-install plus the flags and info you used when you installed grub the first time - well, most of us let Calamares do that. So they might not even know how grub got installed…

So what to do:

  1. check if something has changed in /etc/default/grub and fix that
  2. try to update your grub menu via update-grub
  3. always consider installing grub to MBR/UEFI as your last resort
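For step 1, a possible starting point is to compare your config with the copy the package manager may have left next to it. The sketch below assumes pacman wrote a `/etc/default/grub.pacnew` file during the update (it only does so when the packaged default changed and your copy was modified); `show_pacnew_diff` is a made-up helper name.

```shell
# Sketch: show differences between a config file and a pending .pacnew copy.
# show_pacnew_diff is a hypothetical helper, not a pacman tool.
show_pacnew_diff() {
    local conf="${1:-/etc/default/grub}"
    if [ -f "$conf.pacnew" ]; then
        # diff exits non-zero when the files differ; that is expected here,
        # so the function reports success either way.
        diff -u "$conf" "$conf.pacnew"
        return 0
    fi
    echo "no $conf.pacnew found"
}
```

Lines starting with `+` in the output are settings the new default has and your file lacks. After merging what you need, step 2 is `sudo update-grub` as in the list above.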

Having a Linux Live-USB is always handy, even if you forgot or never had a Windows password when you have physical access to a PC or Laptop :smile_cat:

Thanks for the additional comments. Is there a risk to setting “IgnorePkg = grub” in the Pacman config file? I don’t want to replace one risk with another? Thanks, R
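For anyone wanting to confirm what their config currently says before or after editing it, here is a minimal sketch that checks whether a package name appears on an active `IgnorePkg` line. `pkg_is_ignored` is a made-up helper name; it only does exact name matching and does not handle glob patterns, which `IgnorePkg` also accepts.

```shell
# Sketch: succeed if a package is listed on an uncommented IgnorePkg line.
# pkg_is_ignored is a hypothetical helper, not part of pacman.
pkg_is_ignored() {
    local pkg="$1" conf="${2:-/etc/pacman.conf}"
    # Take active IgnorePkg lines, keep the part after "=",
    # put one package per line and look for an exact match.
    grep -E '^[[:space:]]*IgnorePkg[[:space:]]*=' "$conf" 2>/dev/null \
        | cut -d= -f2- \
        | tr -s ' \t' '\n' \
        | grep -qxF -- "$pkg"
}
```

Usage: `pkg_is_ignored grub && echo "grub updates are held back"`.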

Had same issue and did changes to /etc/default/grub
Also did:

$ sudo mkinitcpio -P
$ sudo update-grub

But issue persisted.

Ran:

$ lsblk

Got:

NAME                                          MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
nvme0n1                                       259:0    0 931.5G  0 disk  
└─nvme0n1p1                                   259:1    0 931.5G  0 part  
  └─luks-9d1a0ad4-ef26-41c2-bafe-4d3dcf7a750c 254:0    0 931.5G  0 crypt /

Then ran:

$ sudo grub-install /dev/nvme0n1

Problem fixed.
Hope it helps other newbies like me…
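Which form of grub-install applies depends on whether the machine booted via UEFI or legacy BIOS. A small sketch to check that: the kernel exposes `/sys/firmware/efi` only when booted in UEFI mode. `boot_mode` is a made-up helper name, and its optional argument exists only so it can be tried against a test directory.

```shell
# Sketch: report the firmware mode the running system booted in.
# boot_mode is a hypothetical helper name.
boot_mode() {
    local sysroot="${1:-/sys}"
    # /sys/firmware/efi is only present on a UEFI boot.
    if [ -d "$sysroot/firmware/efi" ]; then
        echo uefi
    else
        echo bios
    fi
}
```

Note that inside a chroot from a live USB this reports how the live session was booted, not how the installed system normally boots.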


Moderator edit: In the future, please use proper formatting: [HowTo] Post command output and file content as formatted text

Thanks @kentwillumsen - that’s a useful suggestion to add into the mix.

Thank you kentwillumsen!!!
This suggestion fixed the problem for me…

Even though I kept fairly detailed notes of my initial installation procedure, there was nothing about grub-install. So, I followed the arch link that Phil posted and discovered some strange things, namely grub-install command has the path rather than the device.

# grub-install --target=x86_64-efi --efi-directory=esp --bootloader-id=GRUB

Then, they have this note.

You might note the absence of a device_path option (e.g.: /dev/sda) in the grub-install command. In fact any device_path provided will be ignored by the GRUB UEFI install script. Indeed, UEFI boot loaders do not use a MBR bootcode or partition boot sector at all.

Because of that note, I decided to follow arch wiki. (Side note: My EFI partition is not encrypted, i.e. /boot/efi is on /dev/nvme0n1p1 and the encrypted partition is on /dev/nvme0n1p2.) The problem is that there are two directories in /boot/efi/EFI: boot and Manjaro. Which id to use then? Interestingly, the former contains bootx64.efi and the latter has grubx64.efi but these two files are identical. I went with Manjaro.

# sudo grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=Manjaro

After reboot, I entered the passphrase and ended up in grub rescue mode after it showed something like “cannot find symbol grub-debug-malloc”.

I had to create a bootable stick, chroot into my system and copy /boot/efi/EFI/Manjaro/grubx64.efi into /boot/efi/EFI/boot/bootx64.efi. After that I was able to boot without any problems.
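The copy step described above could be sketched like this, with a backup of the old fallback loader so the change can be undone. `sync_fallback_loader` is a made-up helper name, and the directory names (`EFI/Manjaro`, `EFI/boot`) and the `/boot/efi` mount point are taken from this post; yours may differ.

```shell
# Sketch: copy GRUB's EFI binary over the fallback loader, keeping a backup.
# sync_fallback_loader is a hypothetical helper; paths follow the post above.
sync_fallback_loader() {
    local esp="${1:-/boot/efi}"
    local src="$esp/EFI/Manjaro/grubx64.efi"
    local dst="$esp/EFI/boot/bootx64.efi"
    [ -f "$src" ] || { echo "missing $src" >&2; return 1; }
    # Keep the previous fallback loader so the change can be reverted.
    [ -f "$dst" ] && cp -- "$dst" "$dst.bak"
    cp -- "$src" "$dst"
}
```

Run as root with the EFI partition mounted. Afterwards, `cmp /boot/efi/EFI/Manjaro/grubx64.efi /boot/efi/EFI/boot/bootx64.efi` prints nothing if the two files are identical.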

The whole thing is rather confusing. I have no idea why those two files need to be identical and why grub needs them both to function. On my older system without FDE, those files are not identical.

Idk, I feel like @philm is hiding something about what Calamares does to grub :rofl:

While it is true that it’s unsupported, it’s 100% false to say it’s impossible. I’ve used Timeshift to restore Manjaro FDE multiple times when an update left me ■■■■■■■. Up until recently, Timeshift’s check/error for /boot being encrypted was non-deterministic and could be bypassed easily (which always led to successful restores for me). Unfortunately that doesn’t seem to be the case anymore IN GUI; through the terminal I was still able to restore and thankfully didn’t need to wipe (the disk randomly decided it was fine and I was able to mount it and restore through the terminal).

efi/boot/bootx64.efi is the “fallback” path that is used when UEFI configuration in bios is wrong. So this file works same as MBR boot, in that when you say to bios “boot from this disk” it boots from that path. Normal UEFI boot procedure is to boot the path stored in bios (here should be /EFI/Manjaro/grubx64.efi) .

Finally! This one solved the issue for me. Thank you! Don’t really know why the fallback is being used though… My bios configuration hasn’t changed since the stable update. I guess this will reappear next time I run grub-install, so I need to remember that. Don’t really like the ‘solution’ though. Feels like it should work without it. Does anyone have an idea how this can be fixed permanently? What I’ve noticed was that the first unlock prompt was referencing the UUID without dashes, while the other unlock prompt (after the cryptodisk error message) was referencing the UUID with dashes. I guess that’s why the decryption failed and showed the cryptodisk error message. When I cp /boot/efi/EFI/Manjaro/grubx64.efi /boot/efi/EFI/boot/bootx64.efi, the first password prompt had the UUID displayed with dashes and then the further decryption worked.

UPDATE:
Just ran efibootmgr -v and it seems that bootx64.efi is set as first in the boot order

Timeout: 1 seconds
BootOrder: 0001,0000
Boot0000* manjaro       HD(1,GPT,426c0f14-e61f-9249-97a9-4496a9a94d7f,0x1000,0x96000)/File(\EFI\manjaro\grubx64.efi)
      dp: 04 01 2a 00 01 00 00 00 00 10 00 00 00 00 00 00 00 60 09 00 00 00 00 00 14 0f 6c 42 1f e6 49 92 97 a9 44 96 a9 a9 4d 7f 02 02 / 04 04 36 00 5c 00 45 00 46 00 49 00 5c 00 6d 00 61 00 6e 00 6a 00 61 00 72 00 6f 00 5c 00 67 00 72 00 75 00 62 00 78 00 36 00 34 00 2e 00 65 00 66 00 69 00 00 00 / 7f ff 04 00
Boot0001* UEFI OS       HD(1,GPT,426c0f14-e61f-9249-97a9-4496a9a94d7f,0x1000,0x96000)/File(\EFI\BOOT\BOOTX64.EFI)0000424f
      dp: 04 01 2a 00 01 00 00 00 00 10 00 00 00 00 00 00 00 60 09 00 00 00 00 00 14 0f 6c 42 1f e6 49 92 97 a9 44 96 a9 a9 4d 7f 02 02 / 04 04 30 00 5c 00 45 00 46 00 49 00 5c 00 42 00 4f 00 4f 00 54 00 5c 00 42 00 4f 00 4f 00 54 00 58 00 36 00 34 00 2e 00 45 00 46 00 49 00 00 00 / 7f ff 04 00
    data: 00 00 42 4f

Will change to Manjaro and see if it works ok. Will update this post with results.

UPDATE 2:
After changing the boot order everything is working fine. :+1:
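To check this on another machine without reading the whole dump, a sketch like the following picks out the entry that comes first in BootOrder. `first_boot_entry` is a made-up helper name; it reads efibootmgr output on stdin.

```shell
# Sketch: print the boot entry that comes first in BootOrder.
# first_boot_entry is a hypothetical helper; feed it efibootmgr output.
first_boot_entry() {
    awk '
        # "BootOrder: 0001,0000" -> remember the first number.
        /^BootOrder:/ { split($2, order, ","); first = order[1] }
        # "Boot0001* UEFI OS ..." -> store each entry line by its number.
        /^Boot[0-9A-Fa-f]+\*? / { entries[substr($1, 5, 4)] = $0 }
        END { if (first in entries) print entries[first] }
    '
}
```

Usage: `efibootmgr | first_boot_entry`. With the numbers from the output above, `sudo efibootmgr -o 0000,0001` would put the manjaro entry first; the entry numbers on your system will likely differ.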

@varikonniemi thanks for the explanation, @Ancestr thanks for the pointers. I still don’t quite understand it. The order changed and the fallback was working albeit with the ‘no such cryptodisk found’ error message. However, after installing the new grub into Manjaro, the fallback stopped working. Shouldn’t they be independent? In any case, I also changed the order and it works.