Bizarre boot chaos across two drives (raw disk clone with dd)

I’ve got some truly bizarre behaviour with grub and booting going on at the moment.

Backstory is, I’m somewhat new to Linux etc and definitely working near the limits of my ability, but I’d been trying to diagnose some weird hardware failures and in the process needed to cart my Manjaro around to different hard drives via dd (plus some manual grub installing). I’m now trying to clean up the mess, and experiencing a very strange set of behaviours.

My situation is this - I’ve got two drives, one with my original Manjaro install (and Windows; call it drive A) and one with my backup copy (and some other storage; call it drive B). I’m trying to get it such that I can boot from drive A both alone and with drive B inserted, but… grub is behaving very badly. Currently, I can boot from A when B is not installed in the machine, but if I insert B it will boot from B regardless of anything I try to do. I’ve now set up B where it can be booted on its own, but for a while it was even booting to B using the EFI partition on A. At one point, I had a grub bootloader menu where it properly recognised the installs on A and B, but both options booted the install on B. Now, with just A connected (and having just now chrooted in with a live USB and run the grub install process on A again), I have a grub menu that only lists the install on B but still boots A when that option is selected. The only way I can boot into A is by physically removing B, apparently regardless of any attempts to set A or B up correctly with grub.

In the process, I also discovered that running grub-install --efi-directory=/boot/efi still installs it to /boot, and not to /boot/efi. Also, at no point in this process has grub ever noticed the Windows install on A, which it would at least be nice to have access to even if I don’t really need it all that critically.

What do I need to do here to restore sanity?

(Just to be fully clear, my goals are this - I want to be able to boot A with B connected; I don’t care about B being bootable; and it would be nice if I could have grub give me the option to boot the Windows install on A also.)

For one thing, make sure this line is uncommented in /etc/default/grub:

GRUB_DISABLE_OS_PROBER=false

(and run sudo update-grub again)
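
A quick way to check that setting from a terminal, as a minimal sketch. `os_prober_enabled` is a hypothetical helper name (not part of GRUB), and the pattern assumes the value is written as false, optionally in double quotes. As far as I know, the os-prober package also has to be installed for Windows to show up in the menu.

```shell
# Hypothetical helper: succeeds only if the given file contains an
# uncommented GRUB_DISABLE_OS_PROBER=false line (double quotes optional).
os_prober_enabled() {
    grep -Eq '^[[:space:]]*GRUB_DISABLE_OS_PROBER="?false"?[[:space:]]*$' "$1"
}

# Check the real config file; a missing file is reported as "not enabled".
if os_prober_enabled /etc/default/grub 2>/dev/null; then
    echo "os-prober is enabled in /etc/default/grub"
else
    echo "os-prober is disabled, commented out, or the file is missing"
fi
```

If it reports the second case, edit the file, then run sudo update-grub as mentioned above.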

For the rest … you probably need to include some actual information, such as blkid, lsblk, cat /etc/fstab, etc.

Since you are talking about cloning a partition with dd, I am about 95% sure you made a mess with the UUIDs. Either you have partitions with the same UUID, or two different installations whose /etc/fstab entries reference the same UUIDs.

Those identifiers should be unique. Otherwise strange disasters can happen, the least of which is starting to boot in one OS and continuing the boot in the other. You are effectively randomly mixing the two installs.
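
To see whether any filesystem UUID really does appear twice, something like the following could be run with both drives connected. This is only a sketch: `find_duplicate_uuids` is a made-up helper name, and it simply groups whatever NAME/UUID pairs lsblk reports.

```shell
# Hypothetical helper: reads "NAME UUID" pairs on stdin and prints every
# UUID that occurs more than once, followed by the devices carrying it.
find_duplicate_uuids() {
    awk 'NF >= 2 { devs[$2] = devs[$2] " " $1; count[$2]++ }
         END { for (u in count) if (count[u] > 1) print u ":" devs[u] }'
}

# lsblk: -n drops the header, -r gives raw space-separated columns,
# -o selects the columns. Partitions without a UUID are skipped by awk.
lsblk -nro NAME,UUID 2>/dev/null | find_duplicate_uuids
```

No output means no duplicates were found. Any line that is printed names the partitions that share a UUID.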

That is what we will see after you provide the info; obviously cscs is thinking along the same lines.

P.S. Don’t forget to write which output is A and which is B.

Comment on the title - Bizarre boot chaos … (raw disk clone with dd) - yes you get that because you have duplicated the root UUID onto two different disks.

When a systemd-based system boots, the disks are detected in arbitrary order, and the root filesystem is mounted from the first disk found with the UUID specified on the kernel command line.
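
To check which install actually got booted, the UUID on the kernel command line can be compared with the device mounted at /. A rough sketch, assuming the root is given as root=UUID=... (which is what a default grub.cfg normally uses); `root_uuid_from_cmdline` is a hypothetical helper name.

```shell
# Hypothetical helper: extract the value of a root=UUID=... argument
# from a kernel command line passed as a single string.
root_uuid_from_cmdline() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^root=UUID=//p'
}

# UUID the kernel was told to use as root.
root_uuid_from_cmdline "$(cat /proc/cmdline 2>/dev/null)"

# Device (and its UUID) that is actually mounted at / right now.
findmnt -no SOURCE,UUID / 2>/dev/null || true
```

With duplicated UUIDs both values will match even when the device shown is on the wrong drive, so the SOURCE column is the part to look at.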

In ANY case - to avoid the confusion you must change the UUIDs for the second disk.

If your intention is to have two separate installations, you must change some system files on the second system after you have changed the UUIDs.

This can be done in a variety of ways (must be root or using sudo - also change sdy to the device in question)

sgdisk --randomize-guids /dev/sdy

But that is not enough - you will also need to change some files on the second system.

  • /etc/default/grub - rebuild grub.cfg
  • /etc/fstab file

You can retrieve the new UUID using

lsblk -f /dev/sdy
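
Once the new UUID is known, the fstab on the second system can be updated by hand or with a small substitution. A minimal sketch, where `replace_uuid` is a hypothetical helper and /mnt is only an assumed mount point for the second system's root:

```shell
# Hypothetical helper: replace UUID=<old> with UUID=<new> in a file,
# keeping a backup copy next to it as <file>.bak. Uses GNU sed -i.
replace_uuid() {
    old=$1
    new=$2
    file=$3
    cp -- "$file" "$file.bak" && sed -i "s/UUID=$old/UUID=$new/g" "$file"
}

# Example use as root, with the second system mounted at /mnt (assumption):
#   replace_uuid OLD-UUID NEW-UUID /mnt/etc/fstab
# Then, inside a chroot of that system, regenerate grub.cfg:
#   update-grub
```

The example calls are left as comments on purpose, since the paths and UUIDs depend on your setup.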

In fact, there is another discussion on the UUID subject. I don’t know if you have access, as it is in the Member Hub section: https://forum.manjaro.org/t/off-the-shelf-duplicate-uuids/150933

The brackets were added by me for search engine purposes, because it is what it is :)

Ah, yes, that would explain it; I didn’t realise dd copied across the partition UUID also. Is the solution then to boot into / chroot into the install on B, run that sgdisk command on B, and then fix grub and fstab in the B install?

That would be my plan. Fix the UUID without chrooting, then chroot and insert the new value in fstab and, if needed, reinstall grub (maybe update-grub will be enough, not sure).

Honestly, I would physically remove the original drive, boot up and see if it works.
Not sure, but changing the UUID could potentially break things you have not thought of.

That way you literally run on a cloned version.
THEN insert the old drive, boot into live and change UUID or clear partition tables on the “old” drive.

Edit
Maybe I misunderstand here. The above solution assumes you have 2 identical disks (the DISKS are cloned, not only selected partitions), copied with dd. Maybe that is not the case and you can disregard above.

Two options:

  1. Boot into a live USB, change B’s UUID
  2. Boot into B (or a live USB), change A’s UUID, edit fstab to use the new UUID, and run sudo update-grub

If B is meant to be a backup then it would be better to use a backup utility.

Probably just changing B’s UUID would be enough, yeah. It wasn’t originally meant to be a backup, so that’s why I didn’t use a backup utility; I’d just like to keep it around since I already have it.

Thank y’all! I think I have my answers now (^^)

That’s not the partition UUID; that’s the filesystem UUID. It is stored in the filesystem itself, and that’s why it got carried over.

A partition UUID (PARTUUID) is something else: it exists only on GPT-partitioned drives, it resides in the partition table itself, and it is unique. ;)
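
Both identifiers can be listed side by side with lsblk. As far as I know, the sgdisk --randomize-guids command mentioned earlier regenerates the GPT disk and partition GUIDs (the PARTUUID column), so the filesystem UUID needs a filesystem-specific tool instead. The sketch below only prints a suggested command and runs nothing on the disk. `fs_uuid_change_cmd` is a hypothetical helper, /dev/sdy2 is a placeholder, and the target filesystem has to be unmounted before the real command is run (tune2fs may also ask for an e2fsck -f first).

```shell
# Show filesystem UUID and partition UUID next to each other.
lsblk -o NAME,FSTYPE,UUID,PARTUUID 2>/dev/null || true

# Hypothetical helper: print (do not run) a command that gives a
# filesystem a new UUID. $1 = type as shown by lsblk -f, $2 = partition.
fs_uuid_change_cmd() {
    case "$1" in
        ext2|ext3|ext4) echo "tune2fs -U random $2" ;;
        btrfs)          echo "btrfstune -u $2" ;;
        xfs)            echo "xfs_admin -U generate $2" ;;
        swap)           echo "swaplabel -U \$(uuidgen) $2" ;;
        *)              echo "no suggestion for filesystem type '$1'" >&2
                        return 1 ;;
    esac
}

# Example: what would be needed for an ext4 root on the cloned drive.
fs_uuid_change_cmd ext4 /dev/sdy2
```

After changing a filesystem UUID, fstab and grub.cfg on that install still need the new value, as described above.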
