Default btrfs mount options and subvolume layout

My goal is to make easy and sustainable snapshotting available in Calamares for a future Manjaro release. The roadmap has the following subtasks:

  1. Make the Calamares swapfile generation work with btrfs ([WIP] Btrfs swapfile handling by Chrysostomus · Pull Request #1597 · calamares/calamares)
    • Without its own subvolume with CoW disabled, a swapfile can lead to file system corruption.
  2. Enable a distro-configurable subvolume layout in Calamares so we can have a more sophisticated layout
    • The current layout wastes space by snapshotting rapidly changing, low-value data like the pacman cache, and restoring snapshots also loses the log files.
  3. Automatically install grub-btrfs, timeshift-autosnap and btrfs-maintenance if btrfs is chosen
    • Btrfs-maintenance is needed to free up space from accumulating metadata. Regular users should not need to worry about this manually. Timeshift-autosnap and grub-btrfs provide easy bootable snapshots on every update.
  4. Let users choose the filesystem (Allow selecting the FS from GUI · Issue #847 · calamares/calamares)
  5. Patch timeshift to preserve the subvolume layout when restoring a snapshot (Preserve default subvolume name in snapshot restoration · Issue #694 · teejee2008/timeshift)
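
For reference, the swapfile setup from the first subtask could look roughly like the sketch below. This is not what the Calamares pull request does, only the commonly documented sequence for a btrfs swapfile. The subvolume name @swap, the mount point /swap, the device name and the size are assumptions for illustration. With DRY_RUN=1 (the default) the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch: create a swapfile on btrfs inside its own subvolume with CoW disabled.
# Assumed names (not from the thread): @swap, /swap, the device and the size.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

make_btrfs_swapfile() {
    top="$1"    # where the top-level subvolume (subvolid=5) is mounted
    dev="$2"    # btrfs block device
    size="$3"   # swapfile size, e.g. 4G

    run btrfs subvolume create "$top/@swap"
    run mkdir -p /swap
    run mount -o subvol=@swap "$dev" /swap
    # The file must still be empty when +C is set.
    run truncate -s 0 /swap/swapfile
    run chattr +C /swap/swapfile
    run fallocate -l "$size" /swap/swapfile
    run chmod 600 /swap/swapfile
    run mkswap /swap/swapfile
    run swapon /swap/swapfile
}

# Usage (prints the commands unless DRY_RUN=0):
#   make_btrfs_swapfile /mnt/btrfs-top /dev/sdX2 4G
```

Keeping the swapfile in its own subvolume also keeps it out of snapshots of @.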

This topic is for discussing what would be the best default subvolume layout for btrfs systems.

  • @ for / and @home for /home are required for timeshift to work.
    • Should @home use nodatacow by default? CoW causes abysmal performance for virtual machines and databases, which are commonly stored under the /home directory. CoW can also be disabled on a smaller scale (per directory or per file), but virtual machine directories don’t yet exist at install time. You lose some nifty features like reflink copies, but it might be easier for new users.
  • @log for /var/log would preserve log files between snapshot restorations. On the other hand, the system state would no longer be reflected in the logs. Pacman log might show you installed a package, but that package is no longer installed because you restored a snapshot.
  • @cache for /var/cache would prevent the differences between snapshots from growing too much, saving a lot of space.
  • @tmp for /var/tmp, same reason as cache subvolume
  • @mt for /var/lib/manjaro-tools would keep manjaro-tools from blocking automatic snapshot deletion. Not too interesting for most users, but very useful for developers.
  • Some distros use a subvolume for /srv, but we are not a server distro so I don’t think that matters.
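
To make the proposal above concrete, here is a minimal sketch that prints the /etc/fstab lines such a flat layout would need. The UUID and the mount options ("defaults") are placeholders, not a recommendation; only the subvolume names and mount points come from the list above.

```shell
#!/bin/sh
# Sketch: print /etc/fstab lines for the flat subvolume layout proposed above.
# The UUID and the mount options are placeholders (assumptions).
print_flat_layout_fstab() {
    uuid="$1"
    opts="${2:-defaults}"
    # subvolume:mountpoint pairs taken from the proposal
    for pair in @:/ @home:/home @log:/var/log @cache:/var/cache \
                @tmp:/var/tmp @mt:/var/lib/manjaro-tools; do
        subvol="${pair%%:*}"
        mnt="${pair#*:}"
        printf 'UUID=%s\t%s\tbtrfs\t%s,subvol=%s\t0 0\n' \
            "$uuid" "$mnt" "$opts" "$subvol"
    done
}

print_flat_layout_fstab 0000-EXAMPLE-UUID
```

Because every subvolume sits directly under the top level, each one needs its own fstab entry, but each can also be snapshotted, replaced or deleted independently.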

Does anyone have any other suggestions for btrfs support in manjaro? @eugen-b?

Keep in mind that BTRFS only uses the mount options of the first subvolume that gets mounted; I don’t know if this applies to the whole system or just a single drive.
I remember reading about using a separate subvolume for the swap file, but I think you needed to set the “nocow” flag on the created file instead of on the subvolume…

This could be mounted as a tmpfs IMHO :wink:

I would also suggest using full names (with slashes replaced by underscores) for the subvolumes with respect to their mount points, e.g. @var_log for /var/log.
Or perhaps create sub-volumes inside of sub-volumes to better preserve the tree hierarchy.
You would not need to use @ in the names of the sub-volumes inside other sub-volumes, the log sub-volume inside the @var sub-volume can be addressed as @var/log

Disclaimer:

I have been fiddling with BTRFS in the past, but am not using it anymore…

chattr +C /home?

Good idea

I would very specifically want to avoid nested subvolumes because it

  • makes restoring snapshots really messy
  • prevents unused subvolumes from being automatically pruned
  • is not supported by timeshift

Reading chattr(1) that might work:

(Note: For btrfs,…
…If the ‘C’ flag is set on a directory, it will have no effect on the directory, but new files created in that directory will have the No_COW attribute set.)
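
The behaviour quoted from chattr(1) can be tried out with a small sketch like the following. The helper names are hypothetical, and it only reports whether the 'C' attribute shows up in lsattr; it makes no claim about any particular setup.

```shell
#!/bin/sh
# Sketch: mark a directory No_COW so that files created in it later inherit
# the attribute. Helper names are hypothetical.

# Succeeds if lsattr reports the 'C' (No_COW) attribute on the given path.
has_nocow_flag() {
    attrs="$(lsattr -d "$1" 2>/dev/null | cut -d' ' -f1)"
    case "$attrs" in
        *C*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Create the directory if needed, set +C, and report whether it took effect.
make_nocow_dir() {
    mkdir -p "$1" || return 1
    chattr +C "$1" 2>/dev/null
    if has_nocow_flag "$1"; then
        echo "No_COW set on $1"
    else
        echo "No_COW not set on $1 (not btrfs, or chattr failed)" >&2
        return 1
    fi
}

# Usage (the path is only an example):
#   make_nocow_dir "$HOME/VirtualMachines"
```

Files that already exist in the directory keep their current attributes, so this is best done on an empty directory.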

I can’t say it will or won’t work, you would need to test it out to be confident :wink:

Don’t understand why it would be messy, because it is just like a bind-mount over a mount point.
Never knew BTRFS auto-pruned unused sub-volumes…
Besides the sub-volumes you create will be populated anyhow, so no unused ones there.

I never used that app, but I’m sure it should be able to handle it the same way as /proc etc…
E.g. either by staying on the same filesystem or by excluding directories.
Anyway, I’m not familiar with it.

Timeshift automatically deletes old snapshots to save space. If there are subvolumes nested under a snapshot/subvolume, you can’t delete that subvolume before deleting the subvolumes under it, and timeshift does not do that.

Have you tried it? At a bare minimum, you should not be able to delete the old @ if you have subvolumes nested inside it. And you will want to delete it eventually if you need to restore a snapshot, otherwise you end up with an ever-growing diff that just eats up your disk uselessly.
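
Whether anything is nested under a subvolume can be checked before trying to delete it. The sketch below filters the output of btrfs subvolume list (run against the mounted top-level subvolume) for paths below a given parent. The function name is hypothetical, and it assumes subvolume paths contain no spaces.

```shell
#!/bin/sh
# Sketch: read `btrfs subvolume list` output on stdin and print the paths of
# all subvolumes nested under the given parent path.
# Assumes subvolume paths without spaces (the path is the last field).
nested_under() {
    awk -v parent="$1" '{ p = $NF; if (index(p, parent "/") == 1) print p }'
}

# Usage, with the top-level subvolume mounted at /mnt/btrfs (an assumption):
#   btrfs subvolume list /mnt/btrfs | nested_under @
```

With a flat layout the command prints nothing for @, so the old @ can be deleted directly after a restore.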

Is there some specific advantage to having a nested subvolume layout instead of a flat one?

https://www.spinics.net/lists/linux-btrfs/msg54931.html

going to bed so just a quick response to that…
The benefit would be at least no need for a mount definition in /etc/fstab :wink:

I don’t really see that as a major benefit, considering that fstab gets written by the installer.

OK, I just read that link, and I can indeed see how it can cause a problem now… :+1:

You could of course move the @ to a new name and reconstruct the final structure using bind mounts, but it would indeed take a bit more effort to use a snapshot.
Anyway, it was just an idea that popped up in my mind back when I fiddled with BTRFS a bit; I have never tried its snapshot functionality.
So let’s just drop that idea then :wink:

I was actually wondering when Manjaro was going to provide support for BTRFS, and I am happy someone took the initiative. I am looking forward to seeing it come true, and all the best with development.

On another note, please be aware of the grubenv limitation on BTRFS that I have hit myself. I opened a thread about it, but have fixed it in the meantime. You might want to take Fedora’s approach of making a separate ext4 partition for /boot. Feel free to correct my posts if you find any mistakes.

Why are people who never used BTRFS suggesting wrong ideas here?
If you haven’t used BTRFS, and don’t even know how Timeshift and snapshots work, don’t suggest wrong ideas. This just wastes others’ time. Instead, get a fresh partition, do an installation on BTRFS, and experiment with it.

@Chrysostomus @TriMoon /var/tmp shouldn’t be mounted as tmpfs. /tmp is mounted as tmpfs. /var/tmp is for temporary files that should persist between reboots. This is specified in the Filesystem Hierarchy Standard (FHS). So if an app wants to store temporary files, it should use /tmp, but if they are needed between reboots, then /var/tmp. Erasing /var/tmp at shutdown will cause unwanted effects.

@Chrysostomus Storing snapshots of a subvolume inside itself is not a good idea in my opinion.

Let’s say storing snapshots of @ (mounted at /) in a directory on itself, e.g. /snapshots. It is not a good idea because it can be easily messed up. The directory structure would be messy because every snapshot would contain empty directories in place of all the previous snapshots, but not the actual previous snapshots.

@
└── snapshots
   ├── 1
   │  └── snapshots
   ├── 2
   │  └── snapshots
   │     └── 1
   └── 3
      └── snapshots
         ├── 1
         └── 2

It’s really messy.
The biggest problem with it is that if one snapshot is restored by

mv @/snapshots/2  @

then all the other snapshots are deleted, because nested subvolumes are excluded from snapshots and all the snapshots are stored on @, so inside each snapshot the /snapshots directory contains only empty directories. It is possible to move directories around when restoring, but that is too hacky and error-prone. The rollback functions of backup programs are very error-prone and are broken in many situations.

On the other hand, if snapshots are on a separate subvolume, e.g. @snapshots, it is possible to roll back the system permanently, without reboot. It can be mounted on /mnt/snapshots or similar.
The directory structure would be much cleaner:

@
@snapshots
├── 1
├── 2
└── 3

It would be easy to

mv  @snapshots/2  @

All other snapshots would be preserved, and no hacky directory moving is involved.

Let’s say, I have a broken update, which is unable to boot, so I can’t boot @, but I have a backup from before the update in @snapshots/@/before-update.
I boot into it from GRUB BTRFS menu (so the system is temporarily restored).
I mount the partition root into /mnt/btrfs, then cd into this mount point, then

mv @snapshots/@/before-update  @

And my running system is permanently restored to its previous version. If I restart the system, it will boot into the restored version (assuming the subvol mount option is used, and not subvolid).
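
The steps above can be sketched as a small script. One detail is an assumption on top of the description: since mv onto an existing directory moves the source inside it, the sketch first renames the broken @ out of the way. The device name, the default mount point and the ".broken" suffix are placeholders. With DRY_RUN=1 (the default) the commands are only printed.

```shell
#!/bin/sh
# Sketch: restore a snapshot by moving it into place, run from a system that
# was booted into a snapshot. Device, mount point and suffix are placeholders.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

rollback_by_move() {
    dev="$1"                  # btrfs block device
    snap="$2"                 # snapshot path relative to the top level
    top="${3:-/mnt/btrfs}"    # where to mount the top-level subvolume
    suffix="$(date +%Y%m%d-%H%M%S)"

    run mkdir -p "$top"
    run mount -o subvolid=5 "$dev" "$top"
    # Assumption: keep the broken root under a new name instead of overwriting it.
    run mv "$top/@" "$top/@.broken-$suffix"
    run mv "$top/$snap" "$top/@"
}

# Usage (prints the commands unless DRY_RUN=0):
#   rollback_by_move /dev/sdX2 @snapshots/@/before-update
```

The renamed @.broken-… subvolume can be deleted later, once the restored system is known to work.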

This easy moving is not possible with nested subvolume layout, or if snapshots of a subvolume are stored on itself.

Sorry if it was hard to understand, I will write a blog post about it with more examples.

I agree with you, as you can see from the preceding thread. There is no good reason to nest subvolumes, but many good reasons not to.

Not sure who this is directed at, but I’ve been using btrfs as my main filesystem for about four years or so.

It’s not directed at you, but somebody suggested a really dumb idea (that could result in broken installations), without trying it.

I recommend using Snapper instead of Timeshift.
It allows more flexible configuration:

  • importance of snapshots
  • pre-post snapshot comparison
  • cron based snapshot like Timeshift
  • rules to clean up snapshots based on
    • free space
    • space used by snapshots
    • counting snapshots
    • importance
    • etc.

It also has a GUI.
I suggest having the following layout for it:
@snapshots mounted to /.snapshots
@home-snapshots mounted to /home/.snapshots

Snapper does have its advantages, as you pointed out. On the other hand, timeshift brings the following to the table:

  • default writable and therefore bootable snapshots. Synergy with grub-btrfs is great. If we used snapper, this would have to be tackled somehow.
  • filesystem browsing for snapshots
  • more beginner friendly interface
  • works also without btrfs
  • already shipped with manjaro for several years

So, switching over to snapper and snapper-gui would require careful planning. Ideally this could be user choice, perhaps as part of manjaro-hello? @Ste74 what do you think?

Hi

But that’s a problem: in my opinion, writable snapshots are an unsafe option for the goal we want.
You can set up the system to use read-only (ro) snapshots and even boot into these ro-snapshots, all the way into the DE/GUI, with only minor problems. And that’s a safety advantage.

  • Filesystem browsing: this feature is always present fully independent of Timeshift, you can use any filemanager to browse your snapshots

Snapper is good, but there are some caveats for normal users:

  1. If you use it the way the SUSE developers planned, you have to use a snapshot/subvolume layout with nested child subvolumes, i.e. a hierarchical layout, and such a layout is generally more complicated, as “notramo” correctly said above. Personally, I don’t set up my BTRFS layout the SUSE way.
  2. With Snapper you have to use a bootloader that supports the default-subvolume setup, like the patched GRUB from openSUSE. Otherwise you can’t boot into the rollback snapshots it takes.
  3. The two points above lead to a way of using Snapper where we never use Snapper itself for rollback, because we roll back manually on the CLI (only two commands are needed, but that’s too much for a normal user without knowledge of BTRFS). On my setups I can use Snapper to make rollbacks, but I can’t use it fully, meaning boot into rollback snapshots, because Snapper/SUSE bases its logic entirely on the BTRFS default-subvolume setup.

I use BTRFS, snapper, snapper-gui, snap-pac, grub-btrfs and snap-sync on 5 machines. All machines use a central BTRFS backup storage and SSH tunnels. This setup needs knowledge of BTRFS, but I think it is lightweight and generally easy to understand, with most of the features we want enabled.

  • Flat BTRFS layout for @ → ‘/’, @home → ‘/home’, @var → ‘/var’, @snapshots → no mount point.
  • Hierarchical BTRFS layout for snapshots. I use Snapper configurations for “root” and “home”, and thus the subvolumes @snapshots/root and @snapshots/home.
  • /boot/efi is the mount point for the UEFI FAT32 partition; everything else in /boot is included in the snapshots of / (the root system).
  • Swap partition instead of a swapfile in BTRFS if we don’t use system encryption; that’s easier. With system encryption, a @swap subvolume with a BTRFS swapfile, so that there is only one LUKS-encrypted partition for the whole system.
  • /var lives entirely in a separate @var subvolume. Instead of using many different subvolumes for individual folders under /var in order to split out the one folder we really need in our root snapshots, I separate @var → /var completely from the root snapshots and include only the folder we need in them: /var/lib. /var/lib must be snapshotted together with the root system so that we don’t break the dependencies of installed library configuration or, more importantly, the pacman/pamac config files of the packages installed on the current system. To get this behavior I copy /var/lib to /usr/var/lib and bind-mount /usr/var/lib to /var/lib in fstab. This is then the only point we have to consider. My /var has the +C attribute set.
  • I use an fstab entry that mounts subvolid=5 to the /btrfs folder, meaning I mount the BTRFS root filesystem (the top-level subvolume) to /btrfs. Most btrfs commands I use access the filesystem through this /btrfs folder, to ensure I act on the right place in the BTRFS filesystem hierarchy.
  • Now this system can boot ro-snapshots from GRUB into the DE/GUI.
  • I make rollbacks by hand. The easy case:
sudo btrfs subvolume delete /btrfs/@
sudo btrfs subvolume snapshot /btrfs/@snapshots/root/15/snapshot /btrfs/@
  • Because we exchange our @ subvolume on rollbacks, everything set up in GRUB, grub-btrfs and so on keeps working, even if we booted into a ro-snapshot. We don’t have to run update-grub afterwards, because we always boot our working system from @.
  • Thus I don’t have to tweak GRUB at startup and edit its menu entries at boot, no patched GRUB is needed like in SUSE, no update-grub after rollbacks, and so on.

Even better would be a solution where, in case of damage, you boot into a ro-snapshot with GRUB and start a small scriptlet that performs the above rollback automatically and reboots.
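
Such a scriptlet could look roughly like the sketch below. It is untested and rests on assumptions that are not from this thread: that the snapshot entry was booted with rootflags=subvol=… on the kernel command line, and that the top-level subvolume is mounted at /btrfs as in my fstab. The function names are hypothetical. With DRY_RUN=1 (the default) the commands are only printed.

```shell
#!/bin/sh
# Sketch: automatic version of the manual rollback, meant to be run after
# booting into a ro-snapshot. Assumes rootflags=subvol=... on the kernel
# command line and the top-level subvolume mounted at the given path.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Extract the subvol= value from a kernel command line string.
booted_subvol() {
    printf '%s\n' "$1" | tr ' ' '\n' |
        sed -n 's/^rootflags=.*subvol=\([^,]*\).*/\1/p'
}

auto_rollback() {
    top="$1"       # mount point of the top-level subvolume, e.g. /btrfs
    cmdline="$2"   # kernel command line, e.g. "$(cat /proc/cmdline)"
    snap="$(booted_subvol "$cmdline")"
    case "$snap" in
        @snapshots/*) ;;
        *) echo "not booted from a snapshot, nothing to do" >&2; return 1 ;;
    esac
    # Same two commands as the manual rollback; the new @ is a writable copy.
    run btrfs subvolume delete "$top/@"
    run btrfs subvolume snapshot "$top/$snap" "$top/@"
    run systemctl reboot
}

# Usage (prints the commands unless DRY_RUN=0):
#   auto_rollback /btrfs "$(cat /proc/cmdline)"
```

The check on the booted subvolume is there so that the script refuses to delete @ while the system is actually running from it.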

On my USB backup drive I installed a bootable Manjaro system identical to the one on all my machines. The only difference is an additional subvolume, @backups. Into this @backups subvolume I store the backups of my machines with snap-sync, like @backups/($hostname)/root/#/snapshot.
When one of my machines is really badly damaged, I attach this USB drive, boot its Manjaro installation with its @backups subvolume and do my recovery by hand. Because we never know exactly what errors we will get in the future, we can’t rely on a fully automatic tool to do this recovery for us.

// sudo btrfs su li /btrfs
// subvolumes
ID 1372 gen 73430 top level 5 path @
ID 669 gen 73432 top level 5 path @home
ID 259 gen 73420 top level 5 path @var
ID 327 gen 43489 top level 5 path @snapshots
ID 504 gen 72482 top level 327 path @snapshots/root
ID 505 gen 73417 top level 327 path @snapshots/home
// snapshots
ID 1147 gen 34449 top level 504 path @snapshots/root/1/snapshot
ID 1148 gen 34450 top level 505 path @snapshots/home/1/snapshot
ID 1149 gen 34487 top level 505 path @snapshots/home/2/snapshot
ID 1211 gen 38203 top level 505 path @snapshots/home/48/snapshot

// /etc/fstab
# Swap
UUID=06686a06-c069-49f7-86e4-7a962740b364       none                    swap            defaults                                                                        0 0

# UEFI
UUID=447C-E2BC                                  /boot/efi               vfat            noatime,codepage=437,iocharset=iso8859-1,shortname=mixed,utf8                   0 2

# System
UUID=26c8751d-2747-4a4d-b857-32c82d67b20a       /btrfs                  btrfs           noatime,ssd,compress-force=zstd:5,subvolid=5                                    0 0
UUID=26c8751d-2747-4a4d-b857-32c82d67b20a       /                       btrfs           noatime,ssd,compress-force=zstd:5,subvol=@                                      0 0
UUID=26c8751d-2747-4a4d-b857-32c82d67b20a       /home                   btrfs           noatime,ssd,compress-force=zstd:5,subvol=@home                                  0 0
UUID=26c8751d-2747-4a4d-b857-32c82d67b20a       /var                    btrfs           noatime,ssd,nodatacow,subvol=@var                                               0 0
UUID=26c8751d-2747-4a4d-b857-32c82d67b20a       /.snapshots             btrfs           noatime,ssd,compress-force=zstd:5,subvol=@snapshots/root                        0 0
UUID=26c8751d-2747-4a4d-b857-32c82d67b20a       /home/.snapshots        btrfs           noatime,ssd,compress-force=zstd:5,subvol=@snapshots/home                        0 0

# var/lib mount into subvol=@
/usr/var/lib                                    /var/lib                none            defaults,bind                                                                   0 0

# Backup
UUID=d49e1730-5137-473c-8e28-a76cf14e9830       /media/Backups          btrfs           nofail,noatime,ssd,compress-force=zstd:5,subvol=@backups                        0 0

You see, the @snapshots subvolume is outside of the subvolumes we take snapshots of; this is the most often suggested layout for BTRFS. openSUSE goes another way, and I don’t know why SUSE chooses the more complicated way.

My setup requires knowledge on the part of the user. Using Snapper, its manual setup, the trick with the bind mount of /var/lib, the manual rollback, the (manual) recovery with the help of the backups: all of these are small traps.

In my opinion, using BTRFS is the same discussion as using cryptography. In both cases you have to know about it; you have to learn some basic stuff to handle it correctly. Easy, no-knowledge use is currently not possible. The needed toolchain isn’t ready yet. Using Timeshift seems like a nice idea at first glance, but on second look it’s not the right choice. The goal is system stability; safety and writable snapshots contradict each other in my eyes.

But on the other hand: if Manjaro were the first major distribution to manage these pitfalls with BTRFS, snapshots and backups and make them easily available to common users, it would be a big win for Manjaro.
I was a Windows user for about 25 years, but now, with BTRFS and its features used correctly, I would never go back. My decision to choose Manjaro as my Linux distribution is based on the use of BTRFS.

PS: sorry for my english :wink:

No, I understand you fully; you are saying what I mean :wink:

As shown above, I use

@snapshots/root -> /.snapshots
@snapshots/home -> /home/.snapshots

and their parent is the subvolume @snapshots. I think it’s even easier to understand and cleaner. If we expand this concept in the future, we have the choice to create new subvolumes, for example @srv or @opt, at the same level as @, @home and @var. Or, if we change the Snapper config to add a new snapshot target, we attach it under @snapshots, for example @snapshots/srv. So: a flat BTRFS layout on the outside for all system-relevant parts, and a hierarchical BTRFS layout for all snapshots with one parent, @snapshots. This parent subvolume can only be seen and accessed through the /btrfs mount point of the BTRFS root filesystem.

Honestly, I have never used BTRFS… maybe it is time to try it out. After choosing the structure, we of course have to think about the system for managing it in Manjaro by default. Offering an alternative in hello makes no sense to me, because most users don’t know much about BTRFS apart from the ability to manage snapshots. Power users can set up the system themselves, in my POV.