IO error on btrfs subvolume, how to recover?

Dear all,

I’m running current Manjaro with a 5.16 kernel on a Ryzen 9 5900HX system. My main drive is a btrfs-formatted NVMe SSD. While working with my KVM disk images, I tried to copy a large file, but the copy fails with an IO error.

rsync: [sender] read errors mapping "/home/christian/VM-Storage/win10.qcow2": Input/output error (5)

The copy of the 100 GB file runs for a while but eventually fails. Most of the data is OK; I can even boot that VM. The Windows filesystem inside isn’t completely used, so there is a chance that most, if not all, of my data is safe.

Now, how can I get rid of that IO error? I understand that fsck isn’t recommended with btrfs, so I ran btrfs scrub on the (mounted) filesystem instead, but that didn’t help.

What should I do?

This is not completely clear to me.

  • Does the Windows VM reside on a btrfs filesystem?
  • What does btrfs scrub report?
  • Can you copy the VM image to /dev/null?

The rsync error may occur because reading fails OR because writing fails!

  • Where did you try to write the VM to?
  • Did it fail because of no space left?
  • Is this a btrfs RAID0 or RAID1?
  • What does btrfs filesystem usage /home tell you?

Please provide this information:
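For instance (assuming /home is where the affected file lives; adjust the path to your setup):

$ sudo btrfs filesystem usage /home
$ sudo btrfs device stats /home
$ sudo dmesg | grep -i btrfs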

You probably need to rebalance your btrfs filesystem. :arrow_down:
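A minimal sketch of a filtered rebalance (the usage threshold is just an example value; it rewrites only data chunks that are at most 50% full):

$ sudo btrfs balance start -dusage=50 /home
$ sudo btrfs balance status /home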

The NVMe SSD is btrfs-formatted with a number of subvolumes.

$ sudo btrfs subvolume list -t /
ID	gen	top level	path	
--	---	---------	----	
256	11896	5		@
257	11896	5		@home
258	11896	5		@cache
259	11896	5		@log
270	11733	256		var/lib/docker/btrfs/subvolumes/b6ca7cf06d6b4d41f8ecf6be35623fd4cdddb61aa98850001eebd0da48a95674
271	11733	256		var/lib/docker/btrfs/subvolumes/46e38d188bee2f3bfa790ed57ef31be6574ce71ee024b87ffbdd4915a5fea005
272	11733	256		var/lib/docker/btrfs/subvolumes/682d95db8f7dde23e1c5c34143de0edeebda178e75540b981b4afcb2e3e27ef0-init
273	11733	256		var/lib/docker/btrfs/subvolumes/682d95db8f7dde23e1c5c34143de0edeebda178e75540b981b4afcb2e3e27ef0
280	11780	5		timeshift-btrfs/snapshots/2022-02-20_10-38-29/@
281	11780	5		timeshift-btrfs/snapshots/2022-02-20_10-39-00/@
282	11780	5		timeshift-btrfs/snapshots/2022-02-24_09-38-46/@

The VM storage file is located under my user’s home directory. I initially ran into the IO error when I tried to cp the file, and then used rsync as an alternative.

scrub result

Running scrub gives me:

$ sudo btrfs scrub start -B -r /dev/nvme0n1p2
scrub done for b7a8a09e-5bed-46bc-99f3-5284934ef9fb
Scrub started:    Thu Feb 24 16:32:30 2022
Status:           finished
Duration:         0:00:57
Total to scrub:   185.02GiB
Rate:             3.24GiB/s
Error summary:    csum=1
  Corrected:      0
  Uncorrectable:  0
  Unverified:     0

The summary now shows no uncorrectable errors (just the single csum error); previously I had a number of “uncorrectable” ones. However, there are a number of warnings/errors in dmesg.

[ 7946.082476] BTRFS info (device nvme0n1p2): scrub: started on devid 1
[ 7988.988319] BTRFS warning (device nvme0n1p2): checksum error at logical 146570256384 on dev /dev/nvme0n1p2, physical 147652386816, root 257, inode 53306, offset 17113878528, length 4096, links 1 (path: christian/VM-Storage/win10.qcow2)
[ 7988.988325] BTRFS error (device nvme0n1p2): bdev /dev/nvme0n1p2 errs: wr 0, rd 0, flush 0, corrupt 1438, gen 0
[ 8003.157295] BTRFS info (device nvme0n1p2): scrub: finished on devid 1 with status: 0
[ 8096.115335] BTRFS warning (device nvme0n1p2): csum failed root 257 ino 53306 off 17113878528 csum 0x7a427bdd expected csum 0xdc789b4c mirror 1
[ 8096.115343] BTRFS error (device nvme0n1p2): bdev /dev/nvme0n1p2 errs: wr 0, rd 0, flush 0, corrupt 1439, gen 0
[ 8096.129440] BTRFS warning (device nvme0n1p2): csum failed root 257 ino 53306 off 17113878528 csum 0x7a427bdd expected csum 0xdc789b4c mirror 1
[ 8096.129449] BTRFS error (device nvme0n1p2): bdev /dev/nvme0n1p2 errs: wr 0, rd 0, flush 0, corrupt 1440, gen 0
[ 8096.131429] BTRFS warning (device nvme0n1p2): csum failed root 257 ino 53306 off 17113878528 csum 0x7a427bdd expected csum 0xdc789b4c mirror 1
[ 8096.131432] BTRFS error (device nvme0n1p2): bdev /dev/nvme0n1p2 errs: wr 0, rd 0, flush 0, corrupt 1441, gen 0
[ 8096.134617] BTRFS warning (device nvme0n1p2): csum failed root 257 ino 53306 off 17113878528 csum 0x7a427bdd expected csum 0xdc789b4c mirror 1
[ 8096.134620] BTRFS error (device nvme0n1p2): bdev /dev/nvme0n1p2 errs: wr 0, rd 0, flush 0, corrupt 1442, gen 0

There’s enough free space:

$ df -h /home 
Filesystem      Size  Used Avail Use% Mounted on
/dev/nvme0n1p2  923G    196G  725G   22% /home

Yet I can’t copy the file:

$ cp win10.qcow2 win10.backup.qcow2
cp: error reading 'win10.qcow2': Input/output error

I tried the rebalancing, but it didn’t help. Thanks nonetheless.

This is not what I asked for :wink:

We need:

sudo btrfs filesystem usage /home

This will show internal information about your btrfs volume!

Sorry, my bad. Here you go.

$ sudo btrfs filesystem usage /home
Overall:
    Device size:		 922.42GiB
    Device allocated:		 198.02GiB
    Device unallocated:		 724.39GiB
    Device missing:		     0.00B
    Used:			 195.41GiB
    Free (estimated):		 724.81GiB	(min: 362.62GiB)
    Free (statfs, df):		 724.81GiB
    Data ratio:			      1.00
    Metadata ratio:		      2.00
    Global reserve:		 284.48MiB	(used: 0.00B)
    Multiple profiles:		        no

Data,single: Size:194.01GiB, Used:193.59GiB (99.78%)
   /dev/nvme0n1p2	 194.01GiB

Metadata,DUP: Size:2.00GiB, Used:934.61MiB (45.64%)
   /dev/nvme0n1p2	   4.00GiB

System,DUP: Size:8.00MiB, Used:48.00KiB (0.59%)
   /dev/nvme0n1p2	  16.00MiB

Unallocated:
   /dev/nvme0n1p2	 724.39GiB

So you have enough unallocated space left :+1:

This is not completely clear to me.

  • Does the Windows VM reside on a btrfs filesystem? YES
  • What does btrfs scrub report? OK
  • Can you copy the VM image to /dev/null?

The rsync error may occur because reading fails OR because writing fails! → reading fails

  • Where did you try to write the VM to? same disk
  • Did it fail because of no space left? No
  • Is this a btrfs RAID0 or RAID1? RAID0
  • What does btrfs filesystem usage /home tell you? enough room

Conclusion:

The file has at least one block that is not clean:

  • wrong checksum (the scrub summary shows csum=1, and dmesg points at that very file)

  • or a read error (this would mean your SSD has read errors :cold_sweat:)

  • Do you have a snapshot of this file?

  • Do you have a backup of this file?
    If not, you could try to copy what is left. There are programs that can read what is left of a broken file even when I/O errors appear, but I only know photorec (which works for FAT).

For the future, you may consider using btrfs with RAID1 :wink:
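That would need a second device of similar size; roughly like this (the device name is only a placeholder):

$ sudo btrfs device add /dev/sdX2 /home
$ sudo btrfs balance start -dconvert=raid1 -mconvert=raid1 /home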

There’s an IO error when reading the file, no matter where I write to (I even tried another device). Why would writing to /dev/null make a difference?

However, here you go:

$ cat win10.qcow2 | pv > /dev/null
cat: win10.qcow2: Input/output error
15,9GiB 0:00:04 [3,73GiB/s] [    <=>                ]

This was only to exclude all other possible errors :wink:

As said, it can be a single block, and then the VM may not care at all.
You just need a program that reads the file through to the end block by block, despite the read errors.
Photorec is used to recover SD cards. If I remember correctly, there was a program associated with it that reads the file in small chunks. When a read error occurs, it does not try the next block, but some other one, and works its way through until it has rescued as many blocks as possible.
The program can then be aborted after, say, 3 hours. The rescued data is there, and the rest is zeros.
The problem is that every IO error takes a relatively long time, which is why it is not advantageous to read the next block immediately after a read error (at least that was the case with hard disks).

Good luck

rsync can copy the first readable part with -P but not more :frowning:

P.S.: The best tool would have been ddrescue

DDRescue-GUI - A simple GUI (Graphical User Interface) for ddrescue.
Ddrescueview - A graphical viewer for GNU ddrescue mapfiles.
Ddrutility - A set of tools designed to work with ddrescue to aid with data recovery.

Found again while browsing Tools Overview | root.nix.dk
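For a single corrupt file, a typical ddrescue invocation looks like this (the output and mapfile names are just examples); the mapfile records which blocks remain unreadable and lets you stop and resume:

$ ddrescue win10.qcow2 win10.rescued.qcow2 win10.map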

So I was able to find two ways of copying corrupt data:

rsync -Ph --progress <source> <dest> # -P (keep the partial file) does the real work; -h and --progress are only for pretty output

will copy the data but end with an error. The file remains at <dest> nonetheless.

dd if=<source> of=<dest> conv=noerror,sync

will also copy the file, reporting a number of errors as it goes; conv=noerror,sync continues past read errors and pads each unreadable block with zeros.

In the end, both copied files have the same byte size but different md5 sums, presumably because the two tools fill the unreadable region differently.
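To see where the two copies actually diverge, something like this works (file names are placeholders; cmp -l lists the offsets and values of differing bytes):

$ cmp -l <copy1> <copy2> | head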

I then deleted the corrupt file and wrote back one of the copies I had created. That worked without a problem; I was lucky.
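Re-running a scrub should confirm that the csum error is gone (the error summary should now be clean):

$ sudo btrfs scrub start -B /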

I also ordered a replacement SSD, just in case.

