Do I understand btrfs compression correctly?

mithrial · 3 March 2023 16:32

Hi,

today, I’ve converted my separate home partition from ext4 to btrfs. The main reason is that it brings compression, CoW, and maybe snapshots in the future. Also, I’m eager to spice things up a bit after a few months of a perfectly running system.

For formatting, I used

mkfs.btrfs -f --label home --checksum blake2 --data single /dev/mapper/crypthome

inside a new luks2-container encrypted by a keyfile from the encrypted root partition.
There are no subvolumes because why would I need one?

Now, in the /etc/fstab, I’ve used the following options:

UUID=... /home btrfs defaults,noatime,compress=zstd,autodefrag 0 2

After changing all the UUIDs in fstab and crypttab, everything worked right away (weird flex, but hey).

So now to my actual question. I’ve set up zstd with the default compression level 3 (as you see above). In a few earlier tests on a subset of my home directory, level 3 seemed good enough. I tried 8 and 12, but the gain in compression ratio was negligible whereas the speed was halved or worse.

For all my finished projects, I have a special folder where I put in a <project>.tar.zst of each project to have it not pollute my system with a few thousand files and also regain disk space by compressing it.
From time to time, I want to go back to the project and check a few files. Usually, I use fuse-archive (a package from the AUR) to mount the zst-file so that I don’t have to extract it.
But now, the underlying filesystem does the compression already. So, would it be enough to simply store the .tar file without compressing it myself? This would speed up creating and accessing immensely.
I don’t care that I could maybe save a few MB or even a few GB by compressing the project myself at an “ultra” level; I want convenience.

The backups are done with restic which also compresses the files, so the backup storage is not overloaded.

Did I understand this correctly?

Aragorn · 3 March 2023 17:00

If this is an SSD, then I’d drop that autodefrag mount option if I were you. It’s pointless on an SSD because there is no latency, and it only wears out your SSD faster.

There is only so much compression any data file can take, and sometimes higher compression ratios actually yield bigger files due to the complexity of the compression algorithm.

In addition to that, it is indeed pointless to use compression on files that are stored on a filesystem that itself uses compression. Just trust btrfs and its own zstd compression algorithm for the greatest efficiency.
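A minimal sketch of that workflow, assuming GNU tar and an optional compsize install; the directory name demo-project is made up for the demo and is not from this thread. It writes a plain .tar (no zstd at the file level) and, if compsize is available, asks what the archive actually occupies on disk.

```shell
#!/bin/sh
# Sketch: archive a finished project as a plain .tar and let btrfs compress it.
# "demo-project" is a hypothetical directory created only for this demo.
work=$(mktemp -d)
mkdir "$work/demo-project"
printf 'hello\n' > "$work/demo-project/notes.txt"

# No -z / --zstd here: the archive stays uncompressed at the file level.
tar -cf "$work/demo-project.tar" -C "$work" demo-project

# List the contents without extracting anything.
tar -tf "$work/demo-project.tar"

# compsize (if installed) reports on-disk usage after btrfs compression.
# It only works on btrfs, so a failure elsewhere is ignored.
if command -v compsize >/dev/null 2>&1; then
    compsize "$work/demo-project.tar" || true
fi
```

On a compress=zstd mount, the compsize numbers for the plain .tar are the ones to compare against the size of a hand-made .tar.zst.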

andreas85 · 3 March 2023 17:04

Yes

You even may omit the tar

~~This could be a reason to use a subvolume with different parameters for zstd.~~ read the docs

~~compression 3 for /home~~
compression 9 for /home/…/closed/projects

If you use subvolumes it is possible to use

separate snapshot-strategies, and
separate backup strategies
for different subvolumes.

It is possible to do (fast, compressed, differential) backups of snapshots of complete subvolumes (I have never before seen anything like this)

This is not what timeshift or snapper do. But it is possible with btrfs send/receive

P.S. On compression zstd:

most times it makes writing slower (depending on the compression level)
most times it makes reading faster (if your SSD is slower than your RAM)

So, if you mostly write, use 3. If you mostly read, use 9

mithrial · 3 March 2023 19:03

Thank you all for your valuable input!

This is exactly why I asked this question. I’ve removed it, thank you.

That would be perfect.
How would I create such two subvolumes?

One complication: I’m using systemd-homed, which runs after the disks are mounted, so I can only add /home to /etc/fstab. At mount time, the filesystem contains the directory /home/username.homedir, which is then mounted somehow (?) by systemd to /home/username. I can’t add a subvolume mount to /etc/fstab.
(Not in my case, but the /home/username.homedir could potentially be an encrypted luks container.)

I just looked at it, and apparently, systemd automatically creates a subvolume for my user when it sees that it’s running on btrfs.
I need to investigate where to put this “closed project” subvolume.

This is the output of findmnt --real -l --no-truncate (root and boot excluded)

/home  /dev/mapper/crypthome       btrfs  rw,noatime,compress=zstd:3,ssd,discard=async,space_cache=v2,autodefrag,subvolid=5,subvol=/
/home/username
       /dev/mapper/crypthome[/username.homedir]
                                   btrfs  rw,nosuid,nodev,noatime,idmapped,compress=zstd:3,ssd,discard=async,space_cache=v2,autodefrag,subvolid=5,subvol=/

I know about this sending/receiving, but where can I send these snapshots? I have a remote storage similar to S3 for my incremental backups with restic. But I guess these snapshots are not encrypted prior to sending, right? And doesn’t the receiving side need to understand what I’m sending?

Aragorn · 3 March 2023 19:44

Just create it inside your home directory. It’ll show itself as a directory, and it doesn’t even need to be explicitly mounted, because it’ll be accessible from when your home directory is mounted.

Of course, one caveat here is that a btrfs snapshot of your home directory will not include the data in the subvolume. So you’ll need to be aware of that when planning your backup strategy.

No, it doesn’t have anything to do with networking. It’s for sending snapshots onto another btrfs drive on the same machine.

mithrial · 3 March 2023 20:08

Cool, how do I set the compression option when not during mounting?

Aragorn · 3 March 2023 20:18

Ah, that’s not possible without explicitly mounting the subvolume separately, because that’s only settable via the mount options, and if you do not mount the subvolume separately, it’ll inherit the mount options from the parent.

mithrial · 3 March 2023 20:26

I created a subvolume in /home/username/work/archive. I could set the compression by setting the property. (I guess?!)
But I can’t verify because the subvolume is not shown in /sys/fs/btrfs.
It is also not shown anywhere in any mount information, so I don’t trust that it’s actually valid. However, the directory is created and I can write to it.
A small test with compsize might indicate that the higher compression level is applied, but I wouldn’t trust myself (I hear deduplication might interfere).

andreas85 · 3 March 2023 20:29

That is only half of the truth. You can pipe

btrfs send →
over pv →
over ssh →
to btrfs receive (on another machine)

I do this with my differential backups (the other way)
It is even faster than rsync (but can not resume, when interrupted)
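Roughly, that pipeline could look like the sketch below. The function names and all arguments are placeholders, not anything from my setup. It assumes the snapshot was created read-only (btrfs subvolume snapshot -r), that pv and ssh are installed, and that the receiving path is on a btrfs filesystem; root is usually needed on both ends.

```shell
#!/bin/sh
# Sketch only: send_full / send_incremental are hypothetical helper names.

# Full send of one read-only snapshot to another machine.
# $1 = snapshot path, $2 = ssh target, $3 = destination directory on the target
send_full() {
    snap=$1; host=$2; dest=$3
    [ -d "$snap" ] || { echo "no such snapshot: $snap" >&2; return 1; }
    btrfs send "$snap" | pv | ssh "$host" "btrfs receive '$dest'"
}

# Incremental send: only the difference between the parent snapshot and the new one.
# The parent must already exist on the receiving side.
# $1 = parent snapshot, $2 = new snapshot, $3 = ssh target, $4 = destination directory
send_incremental() {
    parent=$1; snap=$2; host=$3; dest=$4
    [ -d "$parent" ] && [ -d "$snap" ] || { echo "missing snapshot" >&2; return 1; }
    btrfs send -p "$parent" "$snap" | pv | ssh "$host" "btrfs receive '$dest'"
}

# Usage (placeholder paths and host):
#   send_full /snapshots/home.1 backuphost /backup/snapshots
#   send_incremental /snapshots/home.1 /snapshots/home.2 backuphost /backup/snapshots
```

The stream itself is not encrypted by btrfs; ssh only protects it in transit, and the receiving side has to be btrfs to run btrfs receive.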

TriMoon · 4 March 2023 12:51

IIRC, you are able to set the compression property of a subvolume using the btrfs property set -t s command
See: man btrfs-property

Of course, keep in mind that changing compression like that only takes effect for newly created files; it will not change existing files.

Example output:

$  btrfs property get /mnt/btrfs_rootfs/Backups/@LutrisGames/
ro=false
compression=zstd
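For reference, setting the property and reading it back could look like this sketch. The helper name set_zstd is made up. As far as I read man btrfs-property, compression is listed as an inode property, so the sketch leaves out -t and lets btrfs detect the object type; treat that as an assumption.

```shell
#!/bin/sh
# Sketch: set_zstd is a hypothetical helper, not part of btrfs-progs.
# $1 = directory or subvolume path on a btrfs filesystem
set_zstd() {
    dir=$1
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    # Set the compression algorithm for files created under $dir from now on,
    # then read the property back to confirm it was stored.
    btrfs property set "$dir" compression zstd &&
        btrfs property get "$dir" compression
}

# Usage (path taken from earlier in this thread):
#   set_zstd /home/username/work/archive
```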

Zesko · 4 March 2023 13:06

There is a known issue with the property:

github.com/kdave/btrfs-progs

Creating a new subdirectory and files do not follow the zstd compression level in recursive btrfs properties.

opened 12:30PM - 09 Jan 23 UTC

Zesko

1. Create a new Btrfs filesystem on the USB stick.
2. Create a new root-directory in this filesystem.
3. Set the compression level: `zstd:10` for this root directory:
```
$ btrfs property set root/ compression zstd:10
$ btrfs property get root/
compression=zstd:10
```
4. Create a new file or subdirectory in this root directory and check its property:
```
$ btrfs property get root/file
compression=zstd
```
```
$ btrfs property get root/subdirectory/
compression=zstd
```
But they show their compression zstd **without** level 10. I am not sure if it is bug.

The property of a subdirectory does not follow the zstd level.

andreas85 · 4 March 2023 13:08

https://btrfs.readthedocs.io/en/latest/btrfs-property.html

It is only possible to set compression on/off, or type. (not the level)

The level can be set via mount options when mounting a subvolume (read the doc)

But they are always shared for the whole filesystem (btrfs volume)

TriMoon · 4 March 2023 13:10

I missed the part about compression LEVEL indeed
Ahhh, but the OP is using the default level anyhow

BTRFS is still a WIP (like most bleeding-edge technologies); this could of course be improved to allow per-subvolume settings, so that compression levels can be chosen independently
Maybe someone already proposed it to the devs of BTRFS who knows…
Anyhow it is getting better and better over time…

Thanks for updating my personal info about this aspect

mithrial · 4 March 2023 15:25

The higher compression level would be nice for the archive subvolume but it’s not a deal breaker.

How can I verify that the subvolume is actually used and persists across reboots? The show command doesn’t list a parent, but it should, right?

And it’s a subvolume inside a subvolume. How does it know it should be mounted?

Aragorn · 4 March 2023 15:40

It is always being used, because to userspace, there is no difference between the subvolume and its directory name. As soon as you put a file in the directory, you are writing to the subvolume.
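If you want to double-check that a directory really is a subvolume, here is a small sketch. It assumes GNU stat and relies on the fact that the top directory of a btrfs subvolume has inode number 256; the helper name is_subvolume is made up.

```shell
#!/bin/sh
# Sketch: is_subvolume is a hypothetical helper.
# Succeeds if $1 is a directory on btrfs whose inode number is 256,
# which is the inode number of a subvolume's top directory.
is_subvolume() {
    dir=$1
    [ -d "$dir" ] || return 1
    [ "$(stat -f -c %T "$dir")" = "btrfs" ] || return 1
    [ "$(stat -c %i "$dir")" -eq 256 ]
}

# Usage (path taken from earlier in this thread):
#   is_subvolume /home/username/work/archive && echo "subvolume"
# More detail, usually as root:
#   btrfs subvolume show /home/username/work/archive
#   btrfs subvolume list /home
```

A nested subvolume that passes this check persists across reboots like any other directory, whether or not it ever appears in the mount table.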

Also, if it’s listed in /etc/fstab, then an explicit mount of a subvolume will survive a reboot. Do however note that subvolumes are not block devices, and that therefore you have to use the UUID of the main btrfs filesystem. The explicit mount of a subvolume is done via the subvol and/or subvolid mount options — see…

man mount

… for details.

I’m not currently using any nested subvolumes — I have distinct partitions that each have their own dedicated and independent btrfs filesystem on them — but the UNIX-standard mount command should list the explicit subvolume mounts.

mount | grep btrfs

A nested subvolume and its contents are always visible and accessible when the parent is mounted, but in order to mount the subvolume with differing mount options — e.g. read-only — it must be explicitly mounted via /etc/fstab, or if you will, manually from the command line.

mithrial · 4 March 2023 15:56

Thanks, I’ll experiment with this. The subvolume didn’t show up in the mount command.
Probably, I’ll have to revert switching to systemd-homed and put it manually into fstab.

Aragorn · 4 March 2023 16:09

Then it’s not explicitly mounted, because a nested subvolume, when explicitly mounted, is actually a kind of bind-mount.

When systemd-homed was released as a systemd add-on, I looked up at the sky and whispered “Lord, please let this chalice pass me by”, and I’m not even religious.

Zesko · 4 March 2023 17:12

fstab has some limitations.

You can use systemd-mount. The mount points known to systemd are listed with: systemd-mount --list

You can edit the systemd-mount of systemd-homed if it exists.

systemctl edit --full [systemd-homed-mount]

[Unit]
After=[systemd-homed-login.mount?]

[Mount]
What=/dev/mapper/...
Where=/home/[user]
Type=btrfs
# here, you can configure these mount options
Options=subvol=/[@user],noatime,compress=zstd:9

Maybe it would help.

dalto · 4 March 2023 19:14

An important point to keep in mind is that compression managed via mount options applies to the whole filesystem, not the individual subvolume. Essentially, this means that whichever subvolume is mounted first controls the compression applied. This will usually be the subvolume mounted at /. You can easily check this with the mount or findmnt commands after booting. You should see they all have the same compression type set.
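A quick way to run that check, as a sketch: the helper name compress_opt is made up, and the sample option string is the one posted earlier in this thread.

```shell
#!/bin/sh
# Sketch: compress_opt is a hypothetical helper.
# Print the compress= or compress-force= entry from a comma-separated option string.
compress_opt() {
    printf '%s\n' "$1" | tr ',' '\n' | grep -E '^compress(-force)?='
}

# Example with the /home options posted earlier in this thread.
# prints: compress=zstd:3
compress_opt 'rw,noatime,compress=zstd:3,ssd,discard=async,space_cache=v2,autodefrag,subvolid=5,subvol=/'

# On a live system, list every btrfs mount with its options and compare them.
# findmnt exits non-zero when there is no btrfs mount, so that case is ignored.
if command -v findmnt >/dev/null 2>&1; then
    findmnt -t btrfs -n -o TARGET,OPTIONS || true
fi
```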

It makes no difference if you use fstab, a systemd mount unit or the mount command.

Another important thing to note is that compression is only applied to future changes so if you converted an ext4 fs to btrfs, none of that data will be compressed.

You can use the btrfs filesystem defrag command to compress existing files though.
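As a sketch, assuming zstd is the algorithm you want and using a made-up helper name (recompress_tree):

```shell
#!/bin/sh
# Sketch: recompress_tree is a hypothetical helper.
# $1 = directory on a mounted btrfs filesystem
recompress_tree() {
    dir=$1
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    # -r: recurse into the directory, -v: print each file as it is processed,
    # -czstd: rewrite the data with zstd compression.
    btrfs filesystem defragment -r -v -czstd "$dir"
}

# Usage (usually as root):
#   recompress_tree /home
```

Be aware that defragmenting rewrites the data, so extents shared with snapshots or reflink copies get unshared and can take up extra space afterwards.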

Aragorn · 4 March 2023 19:25

I wouldn’t recommend that on an SSD, though. On the other hand, on a spinning HDD it’s all fine.