How to switch to the checksum xxHash instead of crc32c for BTRFS and AMD Ryzen?

Hi everyone,

Every Manjaro Linux kernel currently uses crc32c as the default checksum algorithm, with a CPU-specific implementation, e.g. for Intel.

❯ sudo lsmod | grep crc32c

libcrc32c              16384  3 nf_conntrack,nf_nat,btrfs
crc32c_generic         16384  0
crc32c_intel           24576  2

❯ sudo dmesg | grep crc 

[    2.998606] Btrfs loaded, crc32c=crc32c-intel, zoned=yes, fsverity=yes

I know that crc32c-intel supports Intel well, but what about AMD Ryzen?

BTRFS uses the checksum crc32c by default.

❯ sudo btrfs inspect-internal dump-super /dev/nvme0n1p2 | grep csum_type  

csum_type               0 (crc32c)

There is one interesting alternative called xxhash that I want to switch to. Linux kernel 5.3+ supports it. I know Fedora supports xxhash for BTRFS.

I tried adding xxhash to MODULES in the config file /etc/mkinitcpio.conf.
Then I ran sudo mkinitcpio -P linux and rebooted.
But the checksum algorithm has not changed; it is still crc32c.

❯ cat /sys/fs/btrfs/b83453d-153e-4e0c-9956-65343455/checksum 
crc32c (crc32c-intel)
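
To see what every mounted BTRFS filesystem reports, the sysfs file above can be read in a loop. This is only a minimal sketch, assuming the /sys/fs/btrfs/<fsid>/checksum layout shown in the output above; the function name btrfs_checksums and its optional root argument are hypothetical and exist only for illustration.

```shell
# Print "<fsid>: <checksum>" for every Btrfs filesystem exposed in sysfs.
# The optional first argument overrides the sysfs root (a hypothetical
# parameter, added only so the function can be tried on a fake tree).
btrfs_checksums() {
    root="${1:-/sys/fs/btrfs}"
    for f in "$root"/*/checksum; do
        # Skip the unexpanded glob when nothing matches, and any
        # directory that has no readable checksum file.
        [ -r "$f" ] || continue
        fsid=$(basename "$(dirname "$f")")
        printf '%s: %s\n' "$fsid" "$(cat "$f")"
    done
}
```

Calling btrfs_checksums with no argument reads the real /sys/fs/btrfs tree and needs no root privileges.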

Does anyone know how to switch from crc32c to xxhash?

I think I should give up, because many checksums computed with the old algorithm are already stored in the existing metadata after using the filesystem for so long.
You should not change the checksum algorithm in the kernel for BTRFS, otherwise a lot of old metadata could go wrong.

I would assume BTRFS behaves in the same way as ZFS. Each block contains a pointer/reference to its stored-state attributes, such as compression algorithm, compression level, checksum algorithm, etc.

Therefore, existing blocks should not be affected. They will continue to use the same checksum to verify their integrity. Newly written blocks, however, will use the new checksum (if you apply such a change).


I am confused by this “kernel” checksum vs. BTRFS checksum, however. Aren’t they two separate things?

That is a good question. I am confused too.

Previously I would have guessed that BTRFS reads a “checksum” parameter from the kernel config at installation time and then sets that checksum in its own configuration. That was only my guess; there is no clear documentation about it.

I would not think this is the behavior. It wouldn’t be appropriate (or even make sense.)

For example, in ZFS, the “checksum” property is defined with ZFS commands. It has nothing to do with the kernel, nor even matter if you use FreeBSD, Linux, or whatever.

I would assume the same thing applies to BTRFS.

I’m not as familiar with BTRFS, but here’s an example of viewing and changing the checksum property for a ZFS dataset. (I edited some of the text to make it less confusing.)

Query the current value:

zfs get checksum myPool/backups/photos

NAME                   PROPERTY  VALUE
myPool/backups/photos  checksum  fletcher4

Set the new value:

zfs set checksum=sha512 myPool/backups/photos

Query the new value:

zfs get checksum myPool/backups/photos

NAME                   PROPERTY  VALUE
myPool/backups/photos  checksum  sha512

There’s likely a BTRFS method of doing the same thing as above.

UPDATE: The terminology is different. A ZFS “dataset” is similar to a BTRFS “subvolume”.


Ouch!

Looks like BTRFS doesn’t currently support changing the checksum for existing filesystems. You must define it at the time of creation! :astonished:

For example, mkfs.btrfs --csum xxhash
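
To try this without touching a real disk, one option is to format a scratch image file and inspect its superblock. The following is a rough sketch, assuming a btrfs-progs version recent enough to know the xxhash type; the function name make_xxhash_image and the default image path are hypothetical.

```shell
# Sketch: create a scratch Btrfs image that uses the xxhash checksum and
# print the csum_type line from its superblock. The function name and the
# default image path are hypothetical; nothing here touches a real disk.
make_xxhash_image() {
    img="${1:-/tmp/btrfs-xxhash-test.img}"
    if ! command -v mkfs.btrfs >/dev/null 2>&1 ||
       ! command -v btrfs >/dev/null 2>&1; then
        echo "btrfs-progs not found; install it first" >&2
        return 1
    fi
    # Create (or resize to) a 512 MiB sparse file to format.
    truncate -s 512M "$img" || return 1
    # --force overwrites whatever the image file contained before.
    mkfs.btrfs --quiet --force --csum xxhash "$img" || return 1
    btrfs inspect-internal dump-super "$img" | grep csum_type
}
```

Formatting a regular file like this should not need root. Moving an existing filesystem to xxhash would still mean creating a new filesystem this way and copying the data over.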


Apparently, the BTRFS developers didn’t use foresight to deal with different digest sizes. So basically it’s a “hard lock” on the checksum you use at the time of creating a new BTRFS filesystem.

ZFS developers, however, thought ahead, and thus you can easily change the checksum at any time, as many times as you desire, even with different digest sizes.


CRC32C digest is 32-bit

xxHash digest is 64-bit


Some people said that the BTRFS configuration is hardcoded, unlike the ZFS configuration.
But BTRFS would need some converter or a rebuild …

Sounds risky, not worth it.

This again illustrates how ZFS is more mature and flexible. I’m serious when I say that we’d all be using ZFS instead of BTRFS if there were no legal issues and it was part of mainline kernel development.


EDIT: We replied at the same time. I explained above why this is not possible with BTRFS, sadly. :pensive:


If it means anything, @Zesko, you’re using the “hardware accelerated” method. If you were not, it would instead read: crc32c-generic

The word “intel” can be misleading. It’s still supported by modern AMD CPUs. It might use the word “intel” because the Intel company was the first to introduce and develop it.

Think about “AES-NI hardware acceleration” for encryption. It was originally developed by Intel, yet it is fully supported by AMD CPUs as well.
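
If you want to confirm this on a Ryzen machine, the CPU flags can be checked directly. The crc32c-intel driver relies on the CRC32 instruction from SSE4.2, so a minimal sketch is to look for the sse4_2 flag in /proc/cpuinfo; the function name has_sse42 and its optional file argument are hypothetical.

```shell
# Report whether the CPU advertises SSE4.2, whose CRC32 instruction the
# crc32c-intel driver relies on. The optional argument (hypothetical, for
# experimenting) names a cpuinfo-style file; the default is /proc/cpuinfo.
has_sse42() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # -w matches sse4_2 as a whole word, so sse4_1 alone does not count.
    if grep -m1 '^flags' "$cpuinfo" | grep -qw 'sse4_2'; then
        echo "sse4_2 present: hardware CRC32C is available"
    else
        echo "sse4_2 missing: expect crc32c-generic"
        return 1
    fi
}
```

Run has_sse42 with no argument to check the machine you are on.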


There is a measurable benchmark difference between crc32c and crc32c-intel.

For benchmarks of the crc32c implementations, see:

https://ext4.wiki.kernel.org/index.php/Ext4_Metadata_Checksums#Benchmarking

I can clearly see a difference in the benchmark between CRC32C (software) and CRC32C-Intel (hardware acceleration in the CPU).

