OK, so now that you have the solution, after all this time, you'll just leave and say "OK, I don't need to waste your time, guys, problem solved"? And you will not explain anything?
I'm sorry — you really helped me. Thank you. I created a new topic here: Over 100GB of timeshift snapshot files. RSNYC, because this topic was originally about something else and probably no one would come down this far. I used the combined information from you (@omano) and @MrLavender to formulate an answer. I hope this is in your interest. Please comment on that post or help people there, because I could only tell them what I've learned here, and as I wrote there: please correct the post if something is wrong or missing.
For people who don’t know how to do the basic cleaning, have a look into this: System maintenance - ArchWiki
Speaking of the journal
You can vacuum the storage (before the snapshot) with
sudo journalctl --vacuum-size=50M
You can limit the storage size by editing the file /etc/systemd/journald.conf and setting the maximum log size:
SystemMaxUse=500M   # going lower is not recommended
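For reference, systemd also reads drop-in files, which is cleaner than editing the main file directly. A minimal sketch (the drop-in file name is my own choice; the 500M value is just the example from above):

```ini
# /etc/systemd/journald.conf.d/size.conf  (hypothetical drop-in file name)
[Journal]
SystemMaxUse=500M
```

After saving, `sudo systemctl restart systemd-journald` applies the limit.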
I recommend keeping ONE. If, after an update, you have an issue with a specific package, you can try reinstalling the previous version, and file a report that the current package has an issue and that reinstalling the previous one fixes it.
I would recommend our WIKI instead:
Also, I don't know why you opened a new thread, as it is basically the continuation of Why are so many programs appearing when I search for them? But here you start with the solution. It makes it cleaner, as there are a lot of posts there, but maybe a moderator will not see it the same way.
I would not go below 500MB, personally. That is the sensible minimum for debugging/troubleshooting purposes - at least a couple of days in the past are important.
If one does not care about logs from previous boots, there is
Storage=volatile for that. But 50 MB is absolutely useless.
It is good to vacuum all the logs before you take a snapshot, so they don't waste space in the snapshot.
EDIT: when I take a snapshot, I remove all the caches for
pacman and Pamac, the AUR builds, things like that — whatever is not needed to restore the system from the snapshot I remove, including journals.
Related to this hardlink behaviour, I think it's best to first delete the snapshot and then create the new snapshot… if we do it vice versa, there will be a hard link in the newest snapshot that leads to non-existing file(s), right?
(BTW, I didn't read the whole topic; I hope this problem wasn't discussed already.)
No, you’re confusing them with symlinks.
It’s like a house with multiple doors, if you block one off you can still access the house through the other door(s). When you block off all doors, it’s marked for deletion/reuse.
I think I still don't really understand what hard links are…
It still looks like some kind of shortcut to me, because the wiki article (too bad it's only in German) says:
A common use of hard links is the creation of snapshots. Instead of a complete copy of all files (complete backup), only new or changed files are backed up (incremental backup), and older files are represented as hard links to files that were already backed up earlier (backup set). Since hard links require hardly any storage space, significantly less storage is needed compared to a full backup, yet all changes to a directory tree can still be reconstructed.
In other words, identical files are replaced with hard links, which also take up almost no space.
So as an example: when I create a snapshot and already have 2 snapshots in Timeshift,
the third snapshot includes a few hard links that point into the 2 older snapshots, because some files were identical.
But what happens when I delete those 2 older snapshots now and click restore on my last available snapshot (the third), which was created with all those hard links?
When a hard link is there to save disk space, how can the last snapshot restore those files (hard links)
without data corruption?
When I compare this to my example, there is no house anymore; the doors may exist, but all doors can only lead to fragments now… (fragments = only where the files were identical in the older snapshots, because all other files get restored correctly if they were changed)
To be clear, this regards hard links (as used by rsync), not btrfs snapshots. This is roughly the way I think about it.
A file consists of the contents and a hardlink, so all files start with 1 hardlink. You can add more hardlinks, each hardlink points to the same contents, but with a different name/location etc within the filesystem. If you delete one hardlink the others and the content remain. There’s no difference between the original hardlink and any extra hardlinks.
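You can see this for yourself in a throwaway directory (the file names here are made up):

```shell
#!/bin/sh
set -e
dir=$(mktemp -d)                # scratch directory
echo "hello" > "$dir/file1"     # the original name (the first hard link)
ln "$dir/file1" "$dir/file2"    # a second hard link to the same content
rm "$dir/file1"                 # delete the "original" name
cat "$dir/file2"                # prints: hello -- the content survives
```

Deleting file1 only removed one name; the content stays reachable through file2.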
snapshot1:
file1 → content of file1
file2 → content of file2

snapshot2:
file1 → content of file1
file2 → content of file2
The snapshots are identical. Delete either snapshot and the other remains unchanged. You save space, as you don't need 2 copies of the content. In this context there are no fragments: there's either 1 or more hardlinks plus the contents of the file… or nothing.
Contrast that with not using hardlinks:
snapshot1:
file1 → content of file1
file2 → content of file2

snapshot2:
file1 → copy of content of file1
file2 → copy of content of file2
Hopefully that makes sense.
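You can also measure the space difference directly. A small sketch (the 1 MiB size and the names are arbitrary):

```shell
#!/bin/sh
set -e
dir=$(mktemp -d)
dd if=/dev/zero of="$dir/big" bs=1M count=1 2>/dev/null  # 1 MiB of data
ln "$dir/big" "$dir/big.link"   # hard link: no extra data blocks used
cp "$dir/big" "$dir/big.copy"   # real copy: a second 1 MiB of data
du -sh "$dir"                   # about 2 MiB total, not 3: the link is free
```

du counts the hard-linked content only once, which is exactly the space saving the snapshots get.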
This is not completely accurate; if memory serves, it's got something to do with inodes, but I don't pretend to understand it on that level.
So in short, you can delete any snapshot; all the others will remain functional.
No, if a file has been deleted, then it will not include a link to it. It can’t, because a hard-link is only another name for an existing file. If the file does not exist, then you cannot create a new path to it.
See my tutorial regarding UNIX filesystems…
I'm using rsync too, btw.
Thanks for your time and for trying to explain it to me, but I can't follow your example. I think at this point I'll just concede before I waste someone's time, give up on understanding the function behind it, and file it under: it's just magic.
I just don't get how disk space can be saved with multiple snapshots. I mean, the real content must be saved somewhere, if not directly in the snapshot1 or snapshot2 folder… just in case you randomly delete all other snapshots until one is left.
I think the best way for me to finally understand it is by checking the properties of all the folders. At this moment, from my understanding, there must be a secret snapshot0 folder somewhere (and this folder auto-syncs when I delete a random snapshot), and this folder with the main files only gets fully deleted when "all other snapshots" get deleted too.
One Folder to rule them all, One Folder to find them, One Folder to bring them all and in the darkness bind them.
At least this is how my brain imagines the disk space being saved, and why it would be safe to blindly delete random snapshots in Timeshift rsync.
Creating a new topic without bothering to link to this thread is inconsiderate of all the time our helpful volunteer users have spent to help you. I’ve merged the threads and the posts are in chronological order. The solution is also unmarked as obviously it did not solve your issue.
Read… my… tutorial…!
Seriously, I’m not kidding. It’s all in there.
Concretely, in UNIX, the filename is not stored in the file itself, nor in any file table, as is the case in DOS or Wintendo. In UNIX, a file is identified by its inode, and the inode contains a link counter. For most files, this link counter will have the value 1, because there is a directory somewhere that contains an entry with a filename and a pointer next to it, pointing at the inode in question. Therefore, every filename is itself a hard-link.

So, if you create another hard-link to a file, then you are incrementing the link counter in the inode by adding another name for the file, in some directory.
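You can watch both the inode number and the link counter directly with ls -i and stat (the file names here are invented):

```shell
#!/bin/sh
set -e
dir=$(mktemp -d)
echo "data" > "$dir/name1"
ln "$dir/name1" "$dir/name2"    # a second directory entry for the same inode
ls -i "$dir"                    # both names show the SAME inode number
stat -c %h "$dir/name1"         # prints: 2 -- the inode's link counter
```

Two names, one inode, one copy of the data.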
The way timeshift works is that the target volume contains a directory
./timeshift with, inside it, a directory
snapshots. Below that, you will find directories named after the timestamp of when they were made.
[nx-74205:/dev/pts/3][/root] # mount /mnt
[nx-74205:/dev/pts/3][/root] # cd /mnt/timeshift/snapshots/
[nx-74205:/dev/pts/3][/mnt/timeshift/snapshots] # ls -l
drwx------ 1 root root 112 Aug 31 12:53 2023-08-31_12-50-00
drwx------ 1 root root 112 Sep 2 17:11 2023-09-02_13-05-00
drwx------ 1 root root 112 Sep 2 17:11 2023-09-02_17-10-00
drwx------ 1 root root 112 Sep 13 12:55 2023-09-05_09-15-00
drwx------ 1 root root 112 Sep 10 12:53 2023-09-06_19-40-00
drwx------ 1 root root 112 Sep 7 20:23 2023-09-07_20-20-00
drwx------ 1 root root 112 Sep 9 20:16 2023-09-09_17-30-00
drwx------ 1 root root 112 Sep 10 15:02 2023-09-10_14-50-00
drwx------ 1 root root 112 Sep 13 12:55 2023-09-12_01-25-00
drwx------ 1 root root 112 Sep 12 16:14 2023-09-12_16-10-00
drwx------ 1 root root 112 Sep 13 12:54 2023-09-13_12-40-00
drwx------ 1 root root 112 Sep 14 19:10 2023-09-14_19-05-00
If I pick one of those directories — say, the most recent one — then we get this…
[nx-74205:/dev/pts/3][/mnt/timeshift/snapshots] # cd 2023-09-14_19-05-00/
[nx-74205:/dev/pts/3][/mnt/timeshift/snapshots/2023-09-14_19-05-00] # ls -l
-rw------- 1 root root 1141 Sep 14 19:05 exclude.list
-rw------- 1 root root 294 Sep 14 19:10 info.json
drwxr-xr-x 1 root root 116 Mar 31 09:12 localhost
-rw------- 1 root root 70625027 Sep 14 19:08 rsync-log
-rw------- 1 root root 60830 Sep 14 19:08 rsync-log-changes
The localhost directory in that list is the one with the hard-links and files. But it doesn't matter whether a file was deleted between the penultimate snapshot and the last snapshot, because the penultimate snapshot still contains a link to the original copy of the file.
The file itself is not stored in the directory (only its name is), and as long as the link counter of the file is not 0 (i.e. there is still a directory entry somewhere that has a name and a link pointer for it), the file will continue to exist. If there isn't, then the link counter becomes 0, the file is considered deleted, and its extents (blocks) will be freed for reuse by something else.
timeshift only makes the initial copy, and from there on, with every subsequent snapshot, it checks whether the original file (the one on the storage medium that is being backed up) has been modified, yes or no.

If it has been modified, rsync (because that's what timeshift is using in the background) will make a new copy of the file.
If the file has not been modified, it will make a hard-link to the original copy of the file.
If the file has been deleted in the meantime, it will neither be copied nor hard-linked.
If the file did not exist yet in the previous snapshot, then it will be copied.
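This is what rsync's --link-dest option implements for timeshift. You can simulate the effect of deleting the older snapshot with GNU cp -al, which copies a tree as hard links (all the names here are invented):

```shell
#!/bin/sh
set -e
src=$(mktemp -d)
echo "unchanged" > "$src/keep"  # pretend this is the system being backed up
cp -a  "$src" "$src.snap1"      # first snapshot: a real copy of the data
cp -al "$src.snap1" "$src.snap2"  # second snapshot: hard links, ~no extra space
rm -rf "$src.snap1"             # delete the OLDER snapshot...
cat "$src.snap2/keep"           # prints: unchanged -- snap2 is still intact
```

Deleting snap1 only removed one set of names; snap2 still holds its own links to the content, which is why any snapshot can be deleted safely.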
Yoo annastanda now, orra yoo wanna me to drawa a peecture?
From my viewpoint this is a pretty new/tough topic… because I have never looked into the UNIX filesystem before, let alone tried to understand it fundamentally.
It's a whole new world for someone like me, who grew up with a 486 PC and, after using DOS/Windows for 30 years, is looking beyond that horizon today.
I'd be lying if I said I understand everything right now, but I can follow you pretty closely.
I don’t mind if you want to draw me a picture
Well, this is a bit off-topic, but you could say that UNIX and DOS/Windows evolved from opposite ends of the spectrum.
DOS was an unauthorized 16-bit rewrite (by Tim Paterson of Seattle Computer) of the 8-bit CP/M operating system developed by Gary Kildall of Digital Research, and CP/M didn’t even support hard disks or directories. Bill Gates bought 86DOS — as it was called back then — from Paterson for USD $63’000, rebranded it to MS-DOS, and offered Paterson a job at Microsoft.
Windows was a graphical user interface that ran on top of DOS, and as of Windows 3.x, its appearance was modeled after OS/2 1.x, which was developed by IBM in cooperation with Microsoft. But OS/2 was a real operating system, whereas Windows was only a shell around DOS. And up until Windows NT came along — but still including Windows 95, 98 and ME, which were DOS-based — Windows ran everything in ring 0 of the processor — i.e. the kernel ring — and all in the same address space. Programs could therefore overwrite each other’s in-memory data, and a single misbehaving program could crash the whole machine.
UNIX on the other hand was developed on a minicomputer — a machine big enough to cover an entire wall — as a scaled-down version of Multics, which was itself a mainframe operating system.
So UNIX was already far more sophisticated from the first moment on, offering multitasking and concurrent multi-user access via time-sharing, with security and access control built into the filesystem itself by way of the user/group/others permissions model and file ownership, and process privilege and memory separation by way of the processor’s privilege rings. If a program misbehaved, then the worst that could happen was that the program would crash and lose its own unsaved data, but that was about it. The rest of the system would continue functioning normally.
Then some people at MIT didn't like passwords, so they kept leaving them empty.
(legend has it those same people still recommend no passwords on things like your wireless network)
Well, more correctly, they set a blank password. One has to have a password set in order to be able to log in, and a blank password is a password too.
Modern UNIX systems like GNU/Linux do however allow for a minimum password length to be required — to be determined by way of a PAM setting, and one can even make it far stricter by requiring certain combinations of uppercase, lowercase, numbers and special characters.
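For example, on systems using pam_pwquality, those requirements live in /etc/security/pwquality.conf (the values here are just an illustration, not a recommendation):

```ini
# /etc/security/pwquality.conf -- excerpt; values are examples
minlen = 12      # minimum acceptable password length
dcredit = -1     # require at least one digit
ucredit = -1     # require at least one uppercase letter
```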
Windows users hate UNIX.