Difficulty: ★★★★☆
BTRFS doesn’t have an online, in-band deduplication feature like ZFS. It can still save space, however, by marking identical extents as “shared” and referencing a single copy of them.
An example
mkdir -pv /path/to/btrfs/mountpoint/testdir
cd /path/to/btrfs/mountpoint/testdir
# Create 5 zeroed images
for x in $(seq 1 5); do dd if=/dev/zero of=${x}G.img bs=${x}MB count=1024 status=progress; done
# Check the size
$ sudo compsize .
Processed 5 files, 116014 regular extents (116014 refs), 0 inline.
Type Perc Disk Usage Uncompressed Referenced
TOTAL 3% 453M 14G 14G
zstd 3% 453M 14G 14G
# Remove duplicate extents
sudo duperemove -r -d .
# Check the size again
$ sudo compsize .
Processed 5 files, 1179 regular extents (117189 refs), 0 inline.
Type Perc Disk Usage Uncompressed Referenced
TOTAL 3% 4.6M 147M 14G
zstd 3% 4.6M 147M 14G
Comparison:
state | extents | disk usage | references | uncompressed | compressed |
---|---|---|---|---|---|
before | 116014 | 453 MB | 116014 | 14 GB | 453 MB |
after | 1179 | 4.6 MB | 117189 | 147 MB | 4.6 MB |
There are now 1175 more references, but the number of extents and the disk usage are drastically reduced, even though the data was already compressed by 97% (that is, it occupied only 3% of its uncompressed size on disk).
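The arithmetic behind the comparison can be re-checked with a few lines of shell. This is only a sketch: the numbers are copied from the two compsize runs above, and `percent_of` is a hypothetical helper written for this illustration, not part of any tool.

```shell
#!/bin/sh
# Figures copied from the compsize output before and after deduplication.
refs_before=116014
refs_after=117189
extents_before=116014
extents_after=1179

# percent_of PART WHOLE: print PART as a percentage of WHOLE (two decimals).
# Hypothetical helper, only used for this illustration.
percent_of() {
    awk -v part="$1" -v whole="$2" 'BEGIN { printf "%.2f\n", part / whole * 100 }'
}

# prints: additional references: 1175
echo "additional references: $((refs_after - refs_before))"
echo "extents remaining: $(percent_of "$extents_after" "$extents_before")%"
# 453 and 4.6 are the disk usage values (in MB) from the table.
echo "disk usage remaining: $(percent_of 4.6 453)%"
```

Both the extent count and the disk usage end up at roughly 1% of their values before deduplication.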
du or any file manager will still report the full size (uncompressed and non-deduplicated):
$ du -hs .
15G .
A closer look:
$ sudo btrfs filesystem du -s .
Total Exclusive Set shared Filename
14.30GiB 146.81MiB 448.00KiB .
As you can see, the total size here matches the referenced size reported by compsize, and the exclusive size is the uncompressed but deduplicated size.
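To pull just these two values out of the summary, a small filter is enough. This is a sketch that assumes the two-line layout shown above (one header line, then one summary row); `summarize_btrfs_du` is a hypothetical name chosen for this example.

```shell
#!/bin/sh
# summarize_btrfs_du: read `btrfs filesystem du -s` output on stdin and print
# the Total and Exclusive columns of the summary row (the second line).
# Hypothetical helper; it assumes the header/row layout shown in this article.
summarize_btrfs_du() {
    awk 'NR == 2 { printf "total=%s exclusive=%s\n", $1, $2 }'
}

# Demonstration with the sample output from this article.
# On a real system: sudo btrfs filesystem du -s . | summarize_btrfs_du
printf '%s\n' \
    'Total Exclusive Set shared Filename' \
    '14.30GiB 146.81MiB 448.00KiB .' | summarize_btrfs_du
```

Here `total` corresponds to the referenced size and `exclusive` to the deduplicated, uncompressed size.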
A short explanation of duperemove
duperemove is still in development and is beta software, but it can be considered stable enough for daily usage. In any case, use it at your own risk.
duperemove doesn’t perform the deduplication itself. What it actually does is gather information, compute checksums, and pass this information to the BTRFS module, which then does the actual work.
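Note that deduplication can increase free space, but the downside is a higher fragmentation rate. On HDDs, running a defragmentation is therefore recommended. Be aware, though, that defragmenting BTRFS can break up shared extents again, which undoes the deduplication for the affected files.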
Installation
There are two AUR packages that can be used: duperemove-git and duperemove-service.
Install them as needed:
pamac build duperemove-git duperemove-service
Or use the source directly:
- GitHub - markfasheh/duperemove: Tools for deduping file systems
- Mek101/duperemove-service - duperemove-service - Codeberg.org
Usage
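A minimal sketch of a typical run, following the example earlier in this article: `-r` scans directories recursively, `-d` actually submits the duplicates for deduplication (without it, duperemove only reports them), `-h` prints human-readable sizes, and `--hashfile` stores the checksums in a file so that later runs can reuse them. The `dedupe_dir` wrapper and the `DRY_RUN` variable are hypothetical and exist only for this illustration; the paths are placeholders.

```shell
#!/bin/sh
# dedupe_dir DIR [HASHFILE]: hypothetical wrapper around duperemove.
# With DRY_RUN=1 it only prints the command instead of running it.
dedupe_dir() {
    dir=$1
    hashfile=${2:-}
    # Build the argument list step by step.
    set -- duperemove -r -d -h
    if [ -n "$hashfile" ]; then
        set -- "$@" "--hashfile=$hashfile"
    fi
    set -- "$@" "$dir"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        sudo "$@"
    fi
}

# Show what would be executed without touching the filesystem.
DRY_RUN=1 dedupe_dir /path/to/btrfs/mountpoint/testdir
```

Calling `dedupe_dir` without `DRY_RUN=1` runs the command with sudo, as in the example above. Passing a second argument, for instance a hash file path of your choice, adds the `--hashfile` option.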
This is a wiki article. You are free to copy, share, or edit the content without restrictions as long as it doesn’t miss the main topic.