Does anyone know if there are confirmed problems with BTRFS and Timeshift snapshots causing the system to freeze up? I’m having this problem where the desktop freezes for long periods (several minutes), without any particular load on either the CPU or the disk. It then unfreezes for a short period and freezes again.
I once managed to fix the issue by running “timeshift --check”, which got rid of corrupted/incomplete snapshots and returned my system to normal. But this time the situation was so bad that I had to power off with the power button in the middle of the checking/fixing procedure, which caused corruption so bad that my system does not boot anymore.
I wonder if anyone else has experienced similar problems while using BTRFS and very frequent timeshift snapshots (I was running mine on an hourly basis).
I ended up with a dirty filesystem, with a number of “extent holes” that do not get repaired even with the --repair option. I will try “--init-extent-tree” from a live distro later.
I wonder how btrfs has some major sponsors such as OpenSUSE and Synology, if it is not stable yet even on a simple notebook. I thought to myself: If Synology uses it on their commercial NAS systems, then it must be stable enough for me to run on a notebook…
Maybe they use some more stable version than we have here at Manjaro??
I warn you that it will take very long, several days. And it probably won’t repair the damage anyway. Do it only if you have a different machine to work on. You can run it only from a different OS (Manjaro live USB e.g.)
I think in a professional scenario the admin will set up proper backup, not just rely on btrfs snapshots. But the snapshots are nice (when they work).
Thanks Eugen. Makes me think twice… The system is working now, despite the dirty filesystem. If I could know that the existing “extent hole” errors will not propagate and cause further problems, I think I would leave them as they are…
Hi Eugen. I’m still having some freeze issues, but they seem fewer and farther between this week, following what seemed to be a partially successful “btrfsck --repair” from a live distro last week. I now managed to get dmesg output right after the computer returned to normal and found this:
Interestingly, while the computer was frozen, I noticed hardly any disk activity at all (this notebook has an LED that flashes on disk access), so it did not look as if btrfs was doing any housekeeping either. It just completely froze…
Do you understand what this log talks about? (pretty much all of it is marked red in the original output). Thank you so much.
Another weekend toiling with this issue. It is now clear that the freezes relate to the snapshot creation process with Timeshift and btrfs.
Hourly snapshots were bringing my computer to a total freeze. I reduced them to boot-time only, but even that causes the notebook to freeze for about 20 minutes, with a few short hiccups in the meantime, but mostly frozen all the way.
Disk activity remains very low during the freeze and CPU usage also seems to remain low. On some other forums (mostly SuSE) I found users having the exact same problem, but they tend to report high CPU usage with btrfs-transacti and btrfs-cleaner, neither of which seems to be installed on Manjaro (I might be wrong).
This seems to be a relatively prevalent matter. I see numerous posts on Google but none with a clear solution.
Turn off Timeshift’s “Enable BTRFS qgroups (recommended)” option.
It’s in Settings -> Users. Sorry, I cannot recommend this feature.
Then turn off btrfs’ quotas:
‘sudo btrfs quota disable .’
btrfs-transacti & btrfs-cleaner appear to be related to the kernel, they don’t seem to be executables. They certainly are part of manjaro. It’s also difficult to see them in the various monitoring tools, but good old ‘top’ shows them fine.
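For anyone who wants to check this quickly, here is a minimal sketch of how those kernel threads can be picked out of a process listing. The function name `btrfs_threads` is just something I made up for illustration; the only real tools involved are `ps` and `grep`.

```shell
#!/bin/sh
# Filter a process listing down to the btrfs kernel threads mentioned above.
# Kernel thread names are capped at 15 visible characters, which is why
# "btrfs-transaction" shows up as "btrfs-transacti".
btrfs_threads() {
    grep -E 'btrfs-(transacti|cleaner)'
}

# Usage (feeds a "pid, command name, CPU%" listing from ps into the filter):
#   ps -e -o pid,comm,pcpu | btrfs_threads
```

If these show noticeable CPU usage during a freeze, that would point at the same behaviour the SuSE users describe.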
I found this little quote from the btrfs IRC channel:
Zygo: quota is not on my “btrfs features that work” list
You can check to see if btrfs quotas are turned on using:
‘btrfs qgroup show .’
It is not well documented in the man pages, at least not in my opinion as you need to know to check the btrfs-qgroup man page instead of the btrfs-quota man page.
Without turning off the qgroups feature in Timeshift first, it does little good to turn quotas off manually, as Timeshift will turn them back on.
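The order of operations above can be sketched roughly as a small script. This is only a sketch under a few assumptions: that Timeshift stores the setting as `"btrfs_use_qgroup" : "true"` in its JSON config (usually /etc/timeshift/timeshift.json on recent versions, /etc/timeshift.json on older ones), and that `btrfs qgroup show` fails when quotas are off. The function names are hypothetical, and it needs to run as root to talk to btrfs.

```shell
#!/bin/sh
# Sketch only: check Timeshift's qgroup setting first, then the btrfs quota state.

# Succeeds if the given Timeshift config file has btrfs_use_qgroup set to "true".
# (Key name and string-valued boolean are assumptions about the config format.)
timeshift_qgroups_enabled() {
    grep -Eq '"btrfs_use_qgroup"[[:space:]]*:[[:space:]]*"true"' "$1"
}

# Succeeds if quotas are enabled on the btrfs filesystem mounted at $1.
# Relies on "btrfs qgroup show" returning an error when quotas are off
# (it also fails without root, so run this as root).
btrfs_quotas_enabled() {
    btrfs qgroup show "$1" >/dev/null 2>&1
}

# Disable quotas on mount point $1, but only if the Timeshift config $2
# will not simply turn them back on.
disable_quotas() {
    mnt=$1
    config=$2
    if timeshift_qgroups_enabled "$config"; then
        echo "Timeshift still has qgroups enabled; turn that off first." >&2
        return 1
    fi
    if btrfs_quotas_enabled "$mnt"; then
        btrfs quota disable "$mnt"
    else
        echo "Quotas already disabled on $mnt."
    fi
}
```

Something like `disable_quotas / /etc/timeshift/timeshift.json` would then be the call, adjusted to wherever the config lives on your system.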
My system had been suffering for weeks. After figuring this out, it has been running fine for the last day. No more mysterious freezes. Hopefully it will continue to stick. I figured it was worth weighing in on what I’d discovered, and hope this helps everyone else.
Also, I tried using various kernels 5.10, 5.9, & 5.4 but the problem remained.
I’m having the same problem: with btrfs and Timeshift with quota enabled, I had extreme freezes. Eventually my data started to become corrupt. I managed to fix some of the corruption with “btrfs check --repair”, but the problem came back. I found this suggestion to disable quota and it seems to have worked so far. No freezes.
Basically I ran “btrfs quota disable” on the filesystem and made sure Timeshift had qgroups disabled too, as above.
I had the same problem a few years ago. I used snapper and had activated quotas. I did not find a solution then. But today I think the quotas (together with a lot of snapshots) may be the source of the problem.
If I recall correctly, the freezes occurred when snapshots were being deleted.
Not 100% sure if the 0B size problem relates to this qgroups fix, something else, or maybe related to adding compress=zstd:1 to my fstab?
Anyone else run into this?
Edit: The above issue is directly related to disabling qgroups, they are required to calculate the snapshot diff sizes it seems. Re-enabling qgroups will cause the missing diff sizes to be displayed/filled in retroactively, but of course will bring back the system freezes. ¯\_(ツ)_/¯
Also there may be other side effects for snapshot deletion when disabling qgroups, check these github issues I found for more info.
I notice the freeze issue when deleting snapshots in Timeshift on my PC, which uses an SSD.
I tested deleting snapshots in the same Manjaro KDE setup in a VM: there was no freeze, and it was pretty fast.
I do not know why.
I have been using BTRFS on laptops, homePCs of my extended family and for my homeserver. I follow the best practices and have quotas disabled.
Never had issues for the past 3 years. In my opinion it’s the ideal filesystem for most home users, as it allows the easiest snapshots/restore/backup, working without interrupting the user.
The only thing I had issues with (but this was on Ubuntu) was Snapper. In my opinion you should stay away from it. It used to install in obscure locations (they fixed that), was very hard to remove/update, and the folder structure it uses for snapshots makes absolutely zero sense and is just confusing. It often led to filled-up disks because snapshots were hidden in subsubsubfolders within the subvolume and never got deleted.
Instead, Timeshift works nicely, is easy to use. Manjaro has integrated snapshots nicely in the Grub boot menu. And for proper backups (a snapshot is only a backup if you move it to a different drive or different system), btrbk is the way to go in my opinion. It can be installed via AUR.
BTRFS quotas are an optional feature. They are not necessary for any of those actions at all. If something is optional, decide why you would need it. I believe Timeshift used to have it enabled by default just because it wants to get the file size of snapshots for its GUI. That’s not really a valid reason to enable quotas.
I believe for new installs (not updates) they disable it by default (or I just hope they do).
With quotas/qgroups enabled, deletion of snapshots can become very slow. If you yourself have no specific reason to use quotas, disable them in Timeshift (if you disable them via the btrfs command but they are still enabled in Timeshift, Timeshift will enable them again). Timeshift doesn’t need them.
I moved many unnecessarily heavy resources (e.g. Steam games and many VMs) from the subvolume @home into a new subvolume @nosnapshot, which does not need to be snapshotted.
This makes @home snapshots lighter and faster.
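In case it helps, here is a rough sketch of how such a subvolume could be set up. Everything specific in it is a placeholder or an assumption: the device /dev/sdXN, the UUID, the mount point, the mount options and the helper name `fstab_entry` are only for illustration, so adapt them to your own layout before running anything as root.

```shell
#!/bin/sh
# Print an fstab entry that mounts a top-level subvolume at a mount point.
# Arguments: filesystem UUID, subvolume name, mount point (all illustrative).
fstab_entry() {
    printf 'UUID=%s  %s  btrfs  subvol=/%s,defaults,noatime  0  0\n' "$1" "$3" "$2"
}

# Example steps (run as root, with your own device, UUID and paths):
#   mount -o subvolid=5 /dev/sdXN /mnt         # mount the top level of the filesystem
#   btrfs subvolume create /mnt/@nosnapshot    # new subvolume next to @ and @home
#   fstab_entry <uuid> @nosnapshot /home/<user>/nosnapshot >> /etc/fstab
```

Since btrfs snapshots stop at subvolume boundaries, anything kept in the separate subvolume stays out of the @home snapshots.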