Is BTRFS unstable? (Update: It's not unstable, the problem is probably something else)

I’m again getting errors saying that a .zshsomething folder (it crashed before i could read the whole name) can’t be locked and is read-only, and that another folder can’t be written to, which then leads to a crash.

Should i back up my data and use something else?

I only went with BTRFS because of system restore as that’s supposed to be faster to make a restore point, but it seems i’m getting folder ownership and other errors which never happened on ext4. The nvme drive is new, so i doubt it has corruption on it.

Or is this something else related?

By now i have accumulated some data i don’t wish to lose, so it would be nice to know the problem root before i commit more data to this drive…

System:
  Kernel: 6.1.4-x64v1-xanmod1-MANJARO arch: x86_64 bits: 64 compiler: gcc
    v: 12.2.0 parameters: BOOT_IMAGE=/@/boot/vmlinuz-manjaro-xanmod
    root=UUID=27a6f9c0-8b45-42c5-85e1-be095307048f rw rootflags=subvol=@
    amdgpu.gpu_recovery=1 audit=0
    resume=UUID=b58bc35c-1a93-4c4c-a1ec-5eefe535dea6 udev.log_priority=3
    amd_iommu=on vfio-pci.ids=1002:6658,1002:aac0
  Desktop: KDE Plasma v: 5.26.4 tk: Qt v: 5.15.7 wm: kwin_x11 vt: 1 dm: SDDM
    Distro: Manjaro Linux base: Arch Linux
Machine:
  Type: Desktop Mobo: ASRock model: B550M Pro4 serial: <superuser required>
    UEFI: American Megatrends LLC. v: P2.30 date: 02/24/2022
CPU:
  Info: model: AMD Ryzen 5 5600G with Radeon Graphics bits: 64 type: MT MCP
    arch: Zen 3 gen: 4 level: v3 note: check built: 2021-22
    process: TSMC n7 (7nm) family: 0x19 (25) model-id: 0x50 (80) stepping: 0
    microcode: 0xA50000C
  Topology: cpus: 1x cores: 6 tpc: 2 threads: 12 smt: enabled cache:
    L1: 384 KiB desc: d-6x32 KiB; i-6x32 KiB L2: 3 MiB desc: 6x512 KiB
    L3: 16 MiB desc: 1x16 MiB
  Speed (MHz): avg: 3691 high: 3900 min/max: 1400/4464 boost: enabled
    scaling: driver: acpi-cpufreq governor: performance cores: 1: 2988 2: 3900
    3: 3900 4: 3900 5: 3900 6: 3213 7: 3900 8: 3900 9: 3900 10: 3900 11: 3900
    12: 3002 bogomips: 93422
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
  Vulnerabilities:
  Type: itlb_multihit status: Not affected
  Type: l1tf status: Not affected
  Type: mds status: Not affected
  Type: meltdown status: Not affected
  Type: mmio_stale_data status: Not affected
  Type: retbleed status: Not affected
  Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via
    prctl
  Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer
    sanitization
  Type: spectre_v2 mitigation: Retpolines, IBPB: conditional, IBRS_FW,
    STIBP: always-on, RSB filling, PBRSB-eIBRS: Not affected
  Type: srbds status: Not affected
  Type: tsx_async_abort status: Not affected
Graphics:
  Device-1: AMD Bonaire XTX [Radeon R7 260X/360] vendor: PC Partner / Sapphire
    driver: vfio-pci v: N/A alternate: radeon,amdgpu arch: GCN-2
    code: Sea Islands process: GF/TSMC 16-28nm built: 2013-17 pcie: gen: 3
    speed: 8 GT/s lanes: 16 bus-ID: 01:00.0 chip-ID: 1002:6658 class-ID: 0300
  Device-2: AMD Cezanne [Radeon Vega Series / Radeon Mobile Series]
    driver: amdgpu v: kernel arch: GCN-5.1 code: Vega-2 process: TSMC n7 (7nm)
    built: 2018-21 pcie: gen: 3 speed: 8 GT/s lanes: 16 link-max: gen: 4
    speed: 16 GT/s ports: active: DP-1,HDMI-A-1 empty: DP-2 bus-ID: 06:00.0
    chip-ID: 1002:1638 class-ID: 0300 temp: 35.0 C
  Display: x11 server: X.Org v: 21.1.6 with: Xwayland v: 22.1.7
    compositor: kwin_x11 driver: X: loaded: amdgpu unloaded: modesetting,radeon
    alternate: fbdev,vesa dri: radeonsi gpu: amdgpu display-ID: :0 screens: 1
  Screen-1: 0 s-res: 5120x1440 s-dpi: 96 s-size: 1354x381mm (53.31x15.00")
    s-diag: 1407mm (55.38")
  Monitor-1: DP-1 mapped: DisplayPort-0 pos: primary,left
    model: AOC Q3279WG5B serial: <filter> built: 2020 res: 2560x1440 dpi: 90
    gamma: 1.2 size: 725x428mm (28.54x16.85") diag: 842mm (33.1") ratio: 15:9
    modes: max: 2560x1440 min: 720x400
  Monitor-2: HDMI-A-1 mapped: HDMI-A-0 pos: right model: AOC Q3279WG5B
    serial: <filter> built: 2020 res: 2560x1440 dpi: 90 gamma: 1.2
    size: 725x428mm (28.54x16.85") diag: 842mm (33.1") ratio: 15:9 modes:
    max: 2560x1440 min: 720x400
  API: OpenGL v: 4.6 Mesa 22.3.1 renderer: AMD Radeon Graphics (renoir LLVM
    14.0.6 DRM 3.49 6.1.4-x64v1-xanmod1-MANJARO) direct render: Yes
Audio:
  Device-1: AMD Tobago HDMI Audio [Radeon R7 360 / R9 OEM]
    vendor: PC Partner / Sapphire driver: vfio-pci bus-ID: 3-2.2:4
    alternate: snd_hda_intel chip-ID: 1235:8200 pcie: class-ID: 0103
    speed: Unknown lanes: 63 link-max: gen: 6 speed: 64 GT/s bus-ID: 01:00.1
    chip-ID: 1002:aac0 class-ID: 0403
  Device-2: AMD Renoir Radeon High Definition Audio driver: snd_hda_intel
    v: kernel pcie: gen: 3 speed: 8 GT/s lanes: 16 link-max: gen: 4
    speed: 16 GT/s bus-ID: 06:00.1 chip-ID: 1002:1637 class-ID: 0403
  Device-3: AMD Family 17h/19h HD Audio vendor: ASRock driver: snd_hda_intel
    v: kernel pcie: gen: 3 speed: 8 GT/s lanes: 16 link-max: gen: 4
    speed: 16 GT/s bus-ID: 06:00.6 chip-ID: 1022:15e3 class-ID: 0403
  Device-4: Focusrite-Novation Scarlett 2i4 USB type: USB
    driver: snd-usb-audio
  Sound API: ALSA v: k6.1.4-x64v1-xanmod1-MANJARO running: yes
  Sound Interface: sndio v: N/A running: no
  Sound Server-1: JACK v: 1.9.21 running: yes
  Sound Server-2: PulseAudio v: 16.1 running: yes
  Sound Server-3: PipeWire v: 0.3.63 running: yes
Network:
  Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
    vendor: ASRock driver: r8169 v: kernel pcie: gen: 1 speed: 2.5 GT/s lanes: 1
    port: e000 bus-ID: 04:00.0 chip-ID: 10ec:8168 class-ID: 0200
  IF: enp4s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
  IF-ID-1: virbr0 state: down mac: <filter>
Bluetooth:
  Device-1: Cambridge Silicon Radio Bluetooth Dongle (HCI mode) type: USB
    driver: btusb v: 0.8 bus-ID: 3-2.1:3 chip-ID: 0a12:0001 class-ID: e001
  Report: rfkill ID: hci0 rfk-id: 0 state: up address: see --recommends
Drives:
  Local Storage: total: 698.65 GiB used: 305.19 GiB (43.7%)
  SMART Message: Unable to run smartctl. Root privileges required.
  ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Kingston model: SNV2S500G
    size: 465.76 GiB block-size: physical: 512 B logical: 512 B speed: 63.2 Gb/s
    lanes: 4 type: SSD serial: <filter> rev: SBI02102 temp: 35.9 C scheme: GPT
  ID-2: /dev/sda maj-min: 8:0 vendor: Samsung model: SSD 860 EVO 250GB
    size: 232.89 GiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
    type: SSD serial: <filter> rev: 1B6Q scheme: GPT
Partition:
  ID-1: / raw-size: 448.97 GiB size: 448.97 GiB (100.00%)
    used: 305.19 GiB (68.0%) fs: btrfs dev: /dev/nvme0n1p2 maj-min: 259:2
  ID-2: /boot/efi raw-size: 300 MiB size: 299.4 MiB (99.80%)
    used: 608 KiB (0.2%) fs: vfat dev: /dev/nvme0n1p1 maj-min: 259:1
  ID-3: /home raw-size: 448.97 GiB size: 448.97 GiB (100.00%)
    used: 305.19 GiB (68.0%) fs: btrfs dev: /dev/nvme0n1p2 maj-min: 259:2
  ID-4: /var/log raw-size: 448.97 GiB size: 448.97 GiB (100.00%)
    used: 305.19 GiB (68.0%) fs: btrfs dev: /dev/nvme0n1p2 maj-min: 259:2
Swap:
  Kernel: swappiness: 30 (default 60) cache-pressure: 50 (default 100)
  ID-1: swap-1 type: partition size: 16.5 GiB used: 0 KiB (0.0%)
    priority: -2 dev: /dev/nvme0n1p3 maj-min: 259:3
Sensors:
  System Temperatures: cpu: 34.5 C mobo: 33.0 C gpu: amdgpu temp: 35.0 C
  Fan Speeds (RPM): fan-1: 0 fan-2: 1109 fan-3: 0 fan-4: 0 fan-5: 0 fan-6: 0
    fan-7: 0
Info:
  Processes: 341 Uptime: 5m wakeups: 0 Memory: 30.74 GiB
  used: 4.32 GiB (14.1%) Init: systemd v: 252 default: graphical
  tool: systemctl Compilers: gcc: 12.2.0 clang: 14.0.6 Packages: pm: pacman
  pkgs: 1835 libs: 499 tools: pamac pm: flatpak pkgs: 0 Shell: Zsh v: 5.9
  default: Bash v: 5.1.16 running-in: yakuake inxi: 3.3.24

I would say no; I have been using it for years without issues.
Here you can see what is stable and what is not:
https://btrfs.wiki.kernel.org/index.php/Status


Well, how do i check for errors then?

I think it’s unstable by design. You can’t get filesystem manipulation features without introducing some instability.


Unstable shouldn’t really be a design choice when dealing with something that’s supposed to save your data. :confused:

How can i check my filesystem for errors? Just to see if there’s any problems there.

In my setup btrfs is stable :wink:

(over 30,000 devices in use with btrfs as the root filesystem)

  • Sometimes with power off while writing
  • no maintenance done at all

Please be careful with btrfs check!

You can find good information about Btrfs in the wiki and in:

BTRFS - Checksum errors (updated 2023/10/17)


You can run fsck on the filesystem and/or btrfs scrub (it verifies data against its checksums; it can only repair a block when a second good copy exists, for example with the DUP or RAID1 profiles)

https://btrfs.readthedocs.io/en/latest/fsck.btrfs.html

https://btrfs.readthedocs.io/en/latest/btrfs-scrub.html


Maybe your custom kernel is the cause of the instability :thinking:


If you don’t feel confident in using it, stay with ext4. It is stable and won’t let you down.
Create a VM and get used to btrfs inside a safe environment.

You can install btrfs-desktop-notification-git from the AUR. It notifies you immediately when any Btrfs warning or error message appears in the kernel log. You can then click the notification’s “open” button to open a terminal that shows the logs.


Can you show us the output of journalctl --no-pager -p 3 | grep 'BTRFS'?
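
If the log turns out to be long, a small filter can summarise how many Btrfs messages of each severity it contains. This is only a sketch: the helper name btrfs_log_levels is made up here, and it assumes kernel messages of the form "BTRFS critical|error|warning (device ...)".

```shell
# Hypothetical helper: count Btrfs kernel messages per severity.
# Reads log text on stdin and prints one "level count" pair per line.
btrfs_log_levels() {
    awk 'match($0, /BTRFS (critical|error|warning)/) {
             # Skip the 6 characters of "BTRFS " to keep only the level word.
             n[substr($0, RSTART + 6, RLENGTH - 6)]++
         }
         END { for (level in n) print level, n[level] }' | sort
}

# Made-up demo line in the assumed format, only to show the output shape.
printf '%s\n' 'kernel: BTRFS critical (device sdXY): example message' | btrfs_log_levels
# prints: critical 1
```

On a real system it could be fed from the journal, for example: journalctl --no-pager -k | btrfs_log_levels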


Thanks for the links i’ll check those out. :slight_smile:

Well, fsck returns this:

If you wish to check the consistency of a BTRFS filesystem or
repair a damaged filesystem, see btrfs(8) subcommand 'check'.

And i don’t know how to use the btrfs command, can someone give me an example? It’s looking for arguments, don’t know what to tell it to do exactly.

Not ruling that out. But i’d like to check for file system integrity first. If it’s fine, i’ll change the kernel.

Didn’t know confidence was required to use a filesystem. :wink: I mean, doesn’t it come default on Manjaro? Been a while since the install, i forgot…
I mean, it should just work without my intervention, had i known that i require special something to use BTRFS i’d definitely stick with ext4. But the filesystem might not even be the problem, might be corrupted drive, kernel, who knows… Need to troubleshoot it first.

Ah, cool, thanks for the tip, i’ll install it.

Well…

sij 18 14:51:53 manjaro-linux kernel: BTRFS critical (device nvme0n1p2): corrupt leaf: root=257 block=1000865792 slot=4 ino=1603, name hash mismatch with key, have 0x000000003fa4e081 expect 0x00000000e2e14a39

Sounds serious… :frowning: Is it the kernel or the file system?

Btrfs detected a corrupted leaf key in its B-tree.
The error message is very general, though, and leaves many people confused about the cause. I doubt that the disk is faulty.

Check which file is corrupted:

$ sudo btrfs inspect-internal inode-resolve 1603 /
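
The inode number comes from the ino= field of the kernel message, and root= is the ID of the subvolume it belongs to. As a rough sketch, the fields can be pulled out of such a line like this; btrfs_log_field is a hypothetical helper name, and it assumes the "name=number" layout seen in the log line above.

```shell
# Hypothetical helper: extract a numeric field (root, block, slot, ino)
# from a Btrfs "corrupt leaf" kernel message read on stdin.
btrfs_log_field() {
    # $1 = field name; the field must be preceded by a space or a comma.
    sed -n "s/.*[ ,]$1=\([0-9][0-9]*\).*/\1/p"
}

# The log line posted above, used as sample input.
line='sij 18 14:51:53 manjaro-linux kernel: BTRFS critical (device nvme0n1p2): corrupt leaf: root=257 block=1000865792 slot=4 ino=1603, name hash mismatch with key, have 0x000000003fa4e081 expect 0x00000000e2e14a39'

printf '%s\n' "$line" | btrfs_log_field ino    # prints: 1603
printf '%s\n' "$line" | btrfs_log_field root   # prints: 257
```

Inode numbers are per subvolume, so the path given to inode-resolve should lie inside the subvolume named by root=.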

Then run a scrub; btrfs-desktop-notification will notify you of any Btrfs errors that get logged.

$ sudo btrfs scrub start /

//etc/pam.d/chpasswd

Umm, that sounds like an important thing to be corrupted…

Running scrub now, do i need to start the desktop notification somehow? Or is it already running after install?

Also, can i use my computer now that it’s scanning the file system?

Well you can also run:

sudo watch -n1 btrfs scrub status / 

And it will show the current progress, refreshed every second.

Yes… scrub verifies the files against their checksums and repairs them if possible. It is something like chkdsk, except that it only checks for file corruption, not for bad sectors. It can be run online, while the filesystem is mounted and in use.


Is it possible it’s done already?

UUID:             27a6f9c0-8b45-42c5-85e1-be095307048f
Scrub started:    Wed Jan 18 19:02:19 2023
Status:           finished
Duration:         0:02:21
Total to scrub:   304.83GiB
Rate:             2.16GiB/s
Error summary:    no errors found

EDIT: nevermind, it’s doing something…

How long does this usually take?

Yepp… it took 2 minutes to check and no files are corrupted.
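
To check the result from a script instead of reading it by eye, something along these lines might work. It is a minimal sketch: scrub_is_clean is a made-up helper name, and it assumes the status layout shown above, with "Status:" and "Error summary:" lines.

```shell
# Hypothetical helper: succeed only if scrub status text (on stdin)
# says the scrub has finished and reported no errors.
scrub_is_clean() {
    awk -F': *' '
        $1 == "Status"        { status = $2 }
        $1 == "Error summary" { errors = $2 }
        END { exit !(status == "finished" && errors == "no errors found") }
    '
}

# Two lines taken from the status output posted above.
sample='Status:           finished
Error summary:    no errors found'

if printf '%s\n' "$sample" | scrub_is_clean; then
    echo "scrub finished cleanly"
else
    echo "scrub still running, or errors were reported"
fi
```

On a live system the input would come from the real command: sudo btrfs scrub status / | scrub_is_clean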

It depends on how fast your computer and the drive connection are. A 2 TB RAID0 Btrfs array on HDDs takes 1.5 h for me.

Maybe, and that happens sometimes with Btrfs: the space cache is corrupted, not the file itself. I am not much of an expert here, but I ran this on an unmounted partition (so offline, not online):

sudo btrfs check --clear-space-cache v1 /dev/sdXY

or for v2:

sudo btrfs check --clear-space-cache v2 /dev/sdXY

I have no idea when you installed it, but today v2 is the default; previously it was v1.
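
Which version is in use can usually be read from the mount options: v1 shows up as space_cache and v2 as space_cache=v2. A small sketch, with space_cache_version as a made-up helper name:

```shell
# Hypothetical helper: report the space cache version from a
# comma-separated mount option string passed as $1.
space_cache_version() {
    case ",$1," in
        *,space_cache=v2,*) echo v2 ;;
        *,space_cache,*)    echo v1 ;;
        *)                  echo unknown ;;
    esac
}

# Usage on a mounted filesystem (findmnt is part of util-linux):
#   space_cache_version "$(findmnt -no OPTIONS /)"
```

This only reads the options and changes nothing, so it is safe to run on the mounted root.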

Can you check cat /etc/pam.d/chpasswd?

I guess the corrupted leaf went away after the automatic scrub a few hours ago.

AFAIK, when Btrfs reads corrupted metadata it can repair it automatically (“self-healing”), provided the metadata uses the “DUP” profile.

The “DUP” profile is the default for metadata.
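
Whether the metadata on a given filesystem really uses DUP can be read from btrfs filesystem df, which prints one line per block group type, such as "Metadata, DUP: total=..., used=...". A sketch for picking out just the profile; metadata_profile is a hypothetical helper name:

```shell
# Hypothetical helper: print the profile (DUP, single, RAID1, ...) from
# the "Metadata, <profile>: ..." line of `btrfs filesystem df` output on stdin.
metadata_profile() {
    sed -n 's/^Metadata, \([^:]*\):.*/\1/p'
}

# Usage on a mounted filesystem:
#   sudo btrfs filesystem df / | metadata_profile
```

If it prints single rather than DUP, there is only one copy of the metadata, so there is no second copy to repair from.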

Well good to know no files are corrupted… :slight_smile:

If you mean when i installed btrfs, it was a couple of months ago so… V2?


#%PAM-1.0
auth            sufficient      pam_rootok.so
auth            required        pam_unix.so
account         required        pam_unix.so
session         required        pam_unix.so
password        required        pam_unix.so sha512 shadow

That is all ok.
