Confusion about btrfs maintenance

jackdinn · 10 June 2024 10:56

I don’t know if I even need these automated, but I’m really confused trying to use btrfsmaintenance.

Firstly, there are 2 packages called btrfsmaintenance:
1 aur/btrfsmaintenance 0.5-2
2 extra/btrfsmaintenance 1:0.5.1-1

It says I have both installed?

Secondly, Btrfs Assistant only shows 3 of the 4 (scrub, trim, balance, defrag); it does not show trim.

So I just tried to use the shell. I edited /etc/default/btrfsmaintenance, changed the setting to BTRFS_TRIM_PERIOD="weekly" and ran sudo /usr/share/btrfsmaintenance/btrfsmaintenance-refresh-cron.sh, but sudo systemctl status btrfs-trim.timer still shows a monthly timer.
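
One way to double-check what the config file actually contains before looking at the timers is a minimal Python sketch like the one below. It assumes the file uses plain shell-style `KEY="value"` lines, as the `BTRFS_TRIM_PERIOD="weekly"` setting above suggests. The helper name `read_periods` is made up for illustration and is not part of btrfsmaintenance.

```python
import re
from pathlib import Path

# Matches shell-style assignments such as BTRFS_TRIM_PERIOD="weekly".
# Commented-out lines (starting with '#') do not match.
ASSIGNMENT = re.compile(r'^\s*([A-Z_]+_PERIOD)\s*=\s*"?([^"#\s]*)"?')

def read_periods(text):
    """Return {variable: value} for every *_PERIOD assignment in text."""
    periods = {}
    for line in text.splitlines():
        match = ASSIGNMENT.match(line)
        if match:
            periods[match.group(1)] = match.group(2)
    return periods

if __name__ == "__main__":
    config = Path("/etc/default/btrfsmaintenance")
    if config.exists():
        for name, value in sorted(read_periods(config.read_text()).items()):
            print(f"{name} = {value}")
    else:
        print(f"{config} not found")
```

If the value printed here differs from what systemctl status reports for the matching timer, the refresh script has probably not applied the change.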

I think it was supposed to create something in /etc/systemd/system, but there are only btrfs-scrub.timer.d and btrfs-balance.timer.d.

Then I noticed errors in journalctl whenever this tries to execute any of the commands:

/usr/share/btrfsmaintenance/btrfs-balance.sh: line 47: run_task: command not found

So after loads of digging I found 2 errors in the script /usr/share/btrfsmaintenance/btrfsmaintenance-functions

I’m not sure what I am doing here and I’m losing the will. Do I need this, or am I doing it all wrong?

andreas85 · 10 June 2024 12:00

You can find good information about Btrfs in the wiki

and in

Note that Btrfs is not a Windows file system!

Best to check with the source (Btrfs developers)

Defragmentation is rarely needed with Btrfs. It should not be used if snapshots or compression are in use.
Scrub is not beneficial on its own. It can only help if you have RAID 1, and even then only if you really think something is wrong.
Trimming is best left to systemd ~~Btrfs~~, which does it automatically.
Balance is needed in some cases. But Btrfs will balance itself as long as you don’t exceed 80% fullness.

Used carelessly, these commands only shorten the lifespan of the hard drive/SSD and otherwise do nothing more than produce waste heat.

Aragorn · 10 June 2024 12:04

Normally, this will be taken care of by systemd’s fstrim.timer, which will trim all read/write-mounted filesystems once a week, at midnight on the night from Sunday to Monday.

jackdinn · 10 June 2024 12:24

[quote=“[HowTo] make btrfs faster, post:1, topic:162462”]
Be sure to use the noatime mount option !
[/quote]

What, in the fstab mount options for the subvolumes? OK.

Yeah, I’ve been trying to get through it and understand it for the last 6-7 days now.

Oh right, yeah, I do have that service & timer.

Can someone just give me reasonable time periods for the other 3 (if all 3 are wanted/needed), and which filters to use? Thx.

Never mind, sorry, I’m tired. I shall just stop going down this path.

Aragorn · 11 June 2024 02:49

Defragmentation should also never be used on an SSD, because…

it’s completely pointless; and…
it unnecessarily increases wear.

As @andreas85 says, only scrub if you think something’s wrong.

Likewise for balance. As long as your filesystem isn’t 80% full, there is no need, because btrfs will balance itself.
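
To see where a filesystem stands relative to that 80% rule of thumb, a minimal Python sketch using `shutil.disk_usage` could look like this. It only reports what the kernel exposes for the mount point, which on btrfs is an approximation and says nothing about how much space is allocated to chunks; `btrfs filesystem usage` gives the detailed picture. The 80% threshold is simply the figure quoted in this thread, and `percent_full` is a hypothetical helper name.

```python
import shutil

THRESHOLD_PERCENT = 80.0  # the rule-of-thumb figure quoted in this thread

def percent_full(used, total):
    """Return used space as a percentage of total space."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * used / total

if __name__ == "__main__":
    usage = shutil.disk_usage("/")  # named tuple: total, used, free (bytes)
    pct = percent_full(usage.used, usage.total)
    verdict = "above" if pct > THRESHOLD_PERCENT else "at or below"
    print(f"/ is {pct:.1f}% full, {verdict} the {THRESHOLD_PERCENT:.0f}% mark")
```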

Molski · 11 June 2024 04:27

Regular scrubs don’t hurt. They have been known to catch faulty hardware. Even my Synology box does a btrfs scrub every 6 months by default. I increase it.

andreas85 · 11 June 2024 14:02

A scrub is really useful if you are using RAID1. Then the scrub not only detects the error, but can also fix it. The damaged copy is deleted and the undamaged copy is used to rewrite the area elsewhere.

Attention:
A scrub does not test all sectors of a drive, but only reads (and tests) the sectors that btrfs is currently using.

The following applies to all RAID levels:
Hardware that reports errors during a scrub should definitely be replaced promptly.

Molski · 11 June 2024 21:49

It was my understanding that you need that for correcting errors. But the btrfs metadata itself contains checksums of chunks of data, so you can still detect errors.

Correcting them will need a redundant multi-device setup. Snapshots on the same device can also be of use in some cases (something such as a bad CPU cache causing corrupt data). I know @Zesko has the Snapper commands to repair this type of situation, even from a previous snapshot. (I can’t seem to find the post by searching.)

But even on my single-device btrfs filesystem:

[tdell@mbox pacman.d]$ sudo btrfs fi show /
Label: none  uuid: 310e0472-f96e-485c-8fa1-585bcccfb624
        Total devices 1 FS bytes used 544.54GiB
        devid    1 size 896.78GiB used 737.09GiB path /dev/nvme0n1p2

[tdell@mbox pacman.d]$ sudo btrfs inspect-internal dump-super /dev/nvme0n1p2 | grep csum
csum_type               0 (crc32c)
csum_size               4
csum                    0xffc3fac5 [match]

andreas85 · 12 June 2024 12:44

A btrfs snapshot points to the same data. So if the original data is corrupted and scrub discovers this, the data in the snapshot is also lost because the pointers point to the same data.

Only with the metadata does btrfs protect itself by default by keeping 2 copies. This can prevent damage to the file system.

Only a RAID1 setup with different devices can benefit from a scrub. But even then, the following applies: quickly replace devices that produce errors! Otherwise, it is like playing with fire.

The only reason for a scrub that I can think of is if the computer was turned off by a power failure or something similar while data was being written.

Molski · 12 June 2024 19:57

You’re right, of course, about the snapshot part; I was thinking of another situation.

But I was still emphasising the scrub for checking. Dmesg will output these as warnings on the fly, but I thought scrubbing does a checksum of the b-tree data. (Which is not as thorough as a file system check.)

From the man page:

Scrub is a pass over all filesystem data and metadata and verifying the checksums. If a valid copy is available (replicated block group profiles) then the damaged one is repaired. All copies of the replicated profiles are validated.

So this does do a checksum validity check on a single device scrub, right? (Which is not the same thing as an fsck.) And the repair is with redundant data.

But you said scrubbing does nothing here, and I thought one of the main features of btrfs was checksum validation, which trades some performance for data integrity.

andreas85 · 13 June 2024 15:27

Yes, but you only have redundant data if you use RAID1.

And if you use RAID1, the data will be repaired on any read of it (not only during a scrub).

Molski · 13 June 2024 18:20

I said two things. I tried multiple times to leave the repair and everything about RAID out of it.

A single device does this too (it happens with RAID as well, where you would think the checksum determines which copy of the data is the non-corrupt one).

It’s been said twice that scrubbing without RAID does nothing. I’m trying to say that isn’t the case. If I’m wrong, I’d like to know, as I’m always learning. Surely the man page can’t be wrong??

I’m trying to say that any scrub does a crc32c check (by default) on the data that’s been written, even on a single device.
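
As a toy illustration of that distinction between detecting and repairing, here is a minimal Python sketch. It is not how btrfs is implemented: `zlib.crc32` (plain CRC-32) stands in for crc32c, and the `scrub` function and its return values are invented for this example. It only shows that a stored checksum is enough to detect corruption on a single copy, while a repair needs a second, valid copy.

```python
import zlib

def checksum(block: bytes) -> int:
    # zlib.crc32 is plain CRC-32, used here only as a stand-in for crc32c
    return zlib.crc32(block)

def scrub(copies, stored_csum):
    """Verify every copy of a block against the stored checksum.

    Returns (status, copies), where status is "ok", "repaired"
    or "unrecoverable".
    """
    good = [c for c in copies if checksum(c) == stored_csum]
    if len(good) == len(copies):
        return "ok", copies
    if good:
        # At least one valid copy exists: overwrite the bad ones with it
        return "repaired", [good[0]] * len(copies)
    # Corruption is detected, but there is nothing valid to repair from
    return "unrecoverable", copies

original = b"important data"
csum = checksum(original)
corrupted = b"importent data"  # one byte changed

print(scrub([original], csum)[0])             # ok
print(scrub([corrupted], csum)[0])            # unrecoverable
print(scrub([original, corrupted], csum)[0])  # repaired
```

The second call is the single-device case from this discussion: the error is found and can be reported, but not fixed.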