BorgBackup Shows More Data Than There Should Be

After using BorgBackup three times to back up my /home (excluding all first-level hidden files and directories), I noticed on completion of the most recent backup that the “All archives” size is reported as ~700GB. I checked the size of the drive containing the repo with Thunar; the properties tab says it’s >600GB. My entire /home (including hidden files/directories) is just over 200GB. I thought Borg was supposed to deduplicate files. I haven’t added much data to my /home since the first backup, and subsequent backups are supposed to save only the new files. Am I misunderstanding the way Borg works? Here is my Borg command and its output:

[*user*@*device* ~]$ borg create --stats --progress --compression lzma,9 \
> --exclude '/home/*user*/.audacity-data' \
> --exclude '/home/*user*/.bash_history' \
> --exclude '/home/*user*/.bash_logout' \
> --exclude '/home/*user*/.bash_profile' \
> --exclude '/home/*user*/.bashrc' \
> --exclude '/home/*user*/.cache' \
> --exclude '/home/*user*/.config' \
> --exclude '/home/*user*/.deepin-screen-recorder' \
> --exclude '/home/*user*/.dir_colors' \
> --exclude '/home/*user*/.dmrc' \
> --exclude '/home/*user*/.gnupg' \
> --exclude '/home/*user*/.googleearth' \
> --exclude '/home/*user*/.gphoto' \
> --exclude '/home/*user*/.gtk-recordmydesktop' \
> --exclude '/home/*user*/.ICEauthority' \
> --exclude '/home/*user*/.lesshst' \
> --exclude '/home/*user*/.local' \
> --exclude '/home/*user*/.lynxrc' \
> --exclude '/home/*user*/.mozilla' \
> --exclude '/home/*user*/.pki' \
> --exclude '/home/*user*/.profile' \
> --exclude '/home/*user*/.sc_history' \
> --exclude '/home/*user*/.sc-iminfo' \
> --exclude '/home/*user*/.thunderbird' \
> --exclude '/home/*user*/.viminfo' \
> --exclude '/home/*user*/.w3m' \
> --exclude '/home/*user*/.Xauthority' \
> --exclude '/home/*user*/.Xclients' \
> --exclude '/home/*user*/.xinitrc' \
> --exclude '/home/*user*/.xsession-errors' \
> --exclude '/home/*user*/.xsession-errors.old' \
> /run/media/*user*/Elements/backup::backup-{user}-{now} /home/*user*/
------------------------------------------------------------------------------                                                                                                                
Archive name: backup-*user*-2020-09-05T15:21:55
Archive fingerprint: 619a3b80b012840567572f795d281130bcac9c9b2b54fc0d1c9d2ecc401382ec
Time (start): Sat, 2020-09-05 15:21:56
Time (end):   Sat, 2020-09-05 15:22:35
Duration: 39.22 seconds
Number of files: 80880
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:              272.90 GB            236.87 GB             61.79 MB
All archives:              854.79 GB            746.22 GB            233.57 GB

                       Unique chunks         Total chunks
Chunk index:                  166811               538400
------------------------------------------------------------------------------
[*user*@*device* ~]$ borg list /run/media/*user*/Elements/backup
backup-*user*-2020-08-30T22:37:51 Sun, 2020-08-30 22:37:51 [9974e24669872a951b546c93fcbcd8b99f8fc0bbbb4be14f7a436668a0fd2864]
backup-*user*-2020-09-02T16:57:49 Wed, 2020-09-02 16:57:50 [8dce20e003e236a6b8a291816a323cb2ea7e56e6bb6a2d29a62adecdeb734ae3]
backup-*user*-2020-09-05T15:21:55 Sat, 2020-09-05 15:21:56 [619a3b80b012840567572f795d281130bcac9c9b2b54fc0d1c9d2ecc401382ec]
[*user*@*device* ~]$ borg info /run/media/*user*/Elements/backup
Repository ID: 6bceca12db4bf89a8f23e8b19e6237dbd0a30d3d5ca03afda627df5230bc39e0
Location: /run/media/*user*/Elements/backup
Encrypted: No
Cache: /home/*user*/.cache/borg/6bceca12db4bf89a8f23e8b19e6237dbd0a30d3d5ca03afda627df5230bc39e0
Security dir: /home/*user*/.config/borg/security/6bceca12db4bf89a8f23e8b19e6237dbd0a30d3d5ca03afda627df5230bc39e0
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
All archives:              854.79 GB            746.22 GB            233.57 GB

                       Unique chunks         Total chunks
Chunk index:                  166811               538400
[*user*@*device* ~]$

Your output says the deduplicated size for all archives is 233.57 GB.
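If it helps to pick that one figure out of the stats block, here is a minimal sketch that reads the text output on stdin and prints the last column of the "All archives" row. The helper name `dedup_size` is made up for this example, and it assumes the output layout shown in the first post.

```shell
# Print the "Deduplicated size" column of the "All archives" row.
# dedup_size is a hypothetical helper name, not part of Borg.
# It assumes the row layout shown earlier in this thread.
dedup_size() {
    # The last two whitespace-separated fields are the number and its unit.
    awk '/^All archives:/ { print $(NF-1), $NF; exit }'
}

# Sample row copied from the output in the first post.
printf 'All archives:              854.79 GB            746.22 GB            233.57 GB\n' | dedup_size
# prints: 233.57 GB
```

With a real repository the same function could be fed from `borg info`, for example `borg info /run/media/*user*/Elements/backup | dedup_size`.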

I’m also surprised by the size of my /home/ backup.
It seems to come mostly from changes in offline browser files,
even though I excluded all the obvious cache locations.
I didn’t exclude some offline files because I didn’t want to exclude plugin settings etc., and I don’t understand what half of those files are …

What @j77h said but let’s go through the numbers in detail:

                       Original size      Compressed size    Deduplicated size
This archive:              272.90 GB            236.87 GB             61.79 MB
All archives:              854.79 GB            746.22 GB            233.57 GB
  1. All archives, deduplicated size: 233.57 GB is the total space that Borg is currently taking for all of your backups. Please verify this with the following command:

    du --max-depth=0 --human /run/media/*user*/Elements/backup/
    

    (Any small difference between the two comes from units: du --human reports sizes in powers of 1024 (GiB), whereas Borg reports decimal GB, i.e. powers of 1000.)

  2. All archives, original size: 854.79 GB is the total amount of data backed up across all archives, before compression and deduplication.
    This is a purely theoretical number: it indicates the space another backup program would use if it made full backups every time, without compression or deduplication.
    The same goes for All archives, compressed size: it only indicates how much space would be used with compression but no deduplication.

  3. For your current backup:

                           Original size      Compressed size    Deduplicated size
    This archive:              272.90 GB            236.87 GB             61.79 MB
    
  • Original size is the amount of data to be backed up and should be about equal to your home directory minus the exclusions.
  • Compressed size is the amount it would take if fully compressed (meaning: about 13% saved through compression), again without deduplication!
  • Deduplicated size is the amount of data actually stored during this backup session and is the delta between this backup and your previous backup, so just 61.79 MB was actually changed in the Borg repository while making this particular backup.
  4. On top of what you’re doing, I also do:

    --exclude "snap" \
    --exclude "Downloads" \
    

    because:

  • snaps generally install 3 versions of themselves (currently I have no snaps, but if I would accidentally install one, I’d want them excluded)
  • Downloads is a temporary directory and I delete everything in there older than 1 month about 1/month as the stuff I need has been copied elsewhere in the meantime. (and if it wasn’t, I’ll download it again!)
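To double-check the percentages discussed in point 3, here is a small shell sketch of the arithmetic. The helper name `percent_saved` is made up for this illustration; the input figures are the ones from the stats table above.

```shell
# Percentage saved going from size $1 to size $2 (same unit for both).
# percent_saved is a hypothetical helper name used only for this illustration.
percent_saved() {
    LC_ALL=C awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (1 - after / before) * 100 }'
}

# "This archive": original vs. compressed size, in GB.
percent_saved 272.90 236.87   # prints 13.2

# "All archives": original vs. deduplicated size, in GB.
percent_saved 854.79 233.57   # prints 72.7
```

So compression alone saves roughly 13% on one archive, while compression plus deduplication means the three archives together occupy about 73% less space than three uncompressed full copies would.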


Okay, I was interpreting the deduplicated size as referring to the amount of data that was cut from the backup. Your explanation makes much more sense.

Your du command indeed shows the appropriate size:

[*user*@*device* ~]$ du --max-depth=0 --human /run/media/*user*/Elements/backup/
218G    /run/media/*user*/Elements/backup/
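As a cross-check on the units, here is a minimal sketch of the conversion, assuming Borg's figure is in decimal GB (10^9 bytes) and du --human reports powers of 1024. The helper name `gb_to_gib` is made up for this example.

```shell
# Convert decimal gigabytes (10^9 bytes) to binary gibibytes (2^30 bytes).
# gb_to_gib is a hypothetical helper name used only for this illustration.
gb_to_gib() {
    LC_ALL=C awk -v gb="$1" \
        'BEGIN { printf "%.2f\n", gb * 1000000000 / 1073741824 }'
}

# Borg's "All archives" deduplicated size, in GB.
gb_to_gib 233.57   # prints 217.53
```

About 217.5 GiB is consistent with the 218G that du reports, if du rounds its human-readable figure up.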

Thanks for the tip about Snaps. I try to avoid them whenever I can.
