External USB SATA HDD enclosure "Buffer I/O errors" on kernel 6.10

I recently upgraded from kernel 6.6 to kernel 6.10 (Linux 6.10.13-3) to fix issues related to hibernation (see my post here).

However, on kernel 6.10, I’m now experiencing issues with my USB SATA HDD enclosure that I use for backups:

Oct 12 15:16:06 host kernel: usb 6-1: new SuperSpeed USB device number 3 using xhci_hcd
Oct 12 15:16:06 host kernel: usb 6-1: New USB device found, idVendor=174c, idProduct=5106, bcdDevice= 0.01
Oct 12 15:16:06 host kernel: usb 6-1: New USB device strings: Mfr=2, Product=3, SerialNumber=1
Oct 12 15:16:06 host kernel: usb 6-1: Product: AS2105
Oct 12 15:16:06 host kernel: usb 6-1: Manufacturer: ASMedia
Oct 12 15:16:06 host kernel: usb-storage 6-1:1.0: USB Mass Storage device detected
Oct 12 15:16:06 host kernel: scsi host9: usb-storage 6-1:1.0
Oct 12 15:16:06 host kernel: usbcore: registered new interface driver usb-storage
Oct 12 15:16:06 host kernel: usbcore: registered new interface driver uas
Oct 12 15:16:07 host kernel: sd 9:0:0:0: [sdf] Very big device. Trying to use READ CAPACITY(16).
Oct 12 15:16:07 host kernel: sd 9:0:0:0: [sdf] 35156656128 512-byte logical blocks: (18.0 TB/16.4 TiB)
Oct 12 15:16:07 host kernel: sd 9:0:0:0: [sdf] Write Protect is off
Oct 12 15:16:07 host kernel: sd 9:0:0:0: [sdf] Mode Sense: 23 00 00 00
Oct 12 15:16:07 host kernel: sd 9:0:0:0: [sdf] No Caching mode page found
Oct 12 15:16:07 host kernel: sd 9:0:0:0: [sdf] Assuming drive cache: write through
Oct 12 15:16:07 host kernel:  sdf: sdf1
Oct 12 15:16:07 host kernel: sd 9:0:0:0: [sdf] Attached SCSI disk
[unrelated messages cut]
Oct 12 15:22:08 host kernel: usb 6-1: reset SuperSpeed USB device number 3 using xhci_hcd
Oct 12 15:22:46 host kernel: usb 6-1: reset SuperSpeed USB device number 3 using xhci_hcd
Oct 12 15:23:01 host kernel: INFO: task kworker/u50:0:92 blocked for more than 122 seconds.
Oct 12 15:23:01 host kernel:       Tainted: G           OE      6.10.13-3-MANJARO #1
Oct 12 15:23:01 host kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 12 15:23:01 host kernel: task:kworker/u50:0   state:D stack:0     pid:92    tgid:92    ppid:2      flags:0x00004000
[very long stack trace cut]
Oct 12 15:23:01 host kernel: Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings
Oct 12 15:23:23 host kernel: usb 6-1: reset SuperSpeed USB device number 3 using xhci_hcd
Oct 12 15:23:30 host kernel: sd 9:0:0:0: [sdf] tag#0 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=217s
Oct 12 15:23:30 host kernel: sd 9:0:0:0: [sdf] tag#0 CDB: Write(16) 8a 00 00 00 00 00 00 00 8a f7 00 00 05 3b 00 00
Oct 12 15:23:30 host kernel: I/O error, dev sdf, sector 35575 op 0x1:(WRITE) flags 0x800 phys_seg 1339 prio class 0
Oct 12 15:23:30 host kernel: Buffer I/O error on dev sdf1, logical block 759, lost async page write
Oct 12 15:23:30 host kernel: Buffer I/O error on dev sdf1, logical block 760, lost async page write
[further errors cut]
Oct 12 15:23:30 host kernel: Buffer I/O error on dev sdf1, logical block 761, lost async page write
Oct 12 15:23:30 host kernel: Buffer I/O error on dev sdf1, logical block 768, lost async page write
Oct 12 15:24:01 host kernel: usb 6-1: reset SuperSpeed USB device number 3 using xhci_hcd
Oct 12 15:24:32 host kernel: usb 6-1: reset SuperSpeed USB device number 3 using xhci_hcd
[further resets cut]
Oct 12 15:26:24 host kernel: usb 6-1: reset SuperSpeed USB device number 3 using xhci_hcd
Oct 12 15:27:02 host kernel: usb 6-1: reset SuperSpeed USB device number 3 using xhci_hcd
Oct 12 15:27:08 host kernel: sd 9:0:0:0: [sdf] tag#0 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=218s
Oct 12 15:27:08 host kernel: sd 9:0:0:0: [sdf] tag#0 CDB: Write(16) 8a 00 00 00 00 00 00 00 90 32 00 00 07 37 00 00
Oct 12 15:27:08 host kernel: I/O error, dev sdf, sector 36914 op 0x1:(WRITE) flags 0x800 phys_seg 1847 prio class 0
Oct 12 15:27:08 host kernel: buffer_io_error: 1329 callbacks suppressed
Oct 12 15:27:08 host kernel: Buffer I/O error on dev sdf1, logical block 2098, lost async page write
Oct 12 15:27:08 host kernel: Buffer I/O error on dev sdf1, logical block 2099, lost async page write
[further errors cut]
Oct 12 15:27:08 host kernel: Buffer I/O error on dev sdf1, logical block 2106, lost async page write
Oct 12 15:27:08 host kernel: Buffer I/O error on dev sdf1, logical block 2107, lost async page write
Oct 12 15:27:08 host kernel: xhci_hcd 0000:27:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0013 address=0xa7020000 flags=0x0000]
Oct 12 15:27:44 host kernel: xhci_hcd 0000:27:00.3: xHCI host not responding to stop endpoint command
Oct 12 15:27:44 host kernel: xhci_hcd 0000:27:00.3: xHCI host controller not responding, assume dead
Oct 12 15:27:44 host kernel: xhci_hcd 0000:27:00.3: HC died; cleaning up
Oct 12 15:27:44 host kernel: usb 6-1: USB disconnect, device number 3
Oct 12 15:27:44 host kernel: buffer_io_error: 1837 callbacks suppressed
Oct 12 15:27:44 host kernel: Buffer I/O error on dev sdf1, logical block 3961, lost async page write
Oct 12 15:27:44 host kernel: Buffer I/O error on dev sdf1, logical block 3962, lost async page write
[further errors cut]
Oct 12 15:27:44 host kernel: Buffer I/O error on dev sdf1, logical block 3969, lost async page write
Oct 12 15:27:44 host kernel: Buffer I/O error on dev sdf1, logical block 3970, lost async page write
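
For reference, the failing request can be read straight off the "CDB: Write(16)" lines in the log. A minimal sketch, assuming the standard Write(16) layout (byte 0 is the opcode 0x8a, bytes 2-9 hold the starting LBA, bytes 10-13 the transfer length in blocks); decode_write16 is a helper name made up for this example:

```shell
# Decode a Write(16) CDB given as 16 space-separated hex bytes.
# decode_write16 is a hypothetical helper, not an existing tool.
decode_write16() {
    local -a b=($1)            # split the argument on whitespace
    if [ "${#b[@]}" -ne 16 ] || [ "${b[0]}" != "8a" ]; then
        echo "not a 16-byte Write(16) CDB" >&2
        return 1
    fi
    local lba_hex="" len_hex="" i
    for i in 2 3 4 5 6 7 8 9; do lba_hex+="${b[i]}"; done   # 8-byte big-endian LBA
    for i in 10 11 12 13; do len_hex+="${b[i]}"; done       # 4-byte transfer length
    echo "lba=$((16#$lba_hex)) blocks=$((16#$len_hex))"
}

decode_write16 "8a 00 00 00 00 00 00 00 8a f7 00 00 05 3b 00 00"
# → lba=35575 blocks=1339
```

The LBA matches the "sector 35575" in the I/O error line, and 1339 blocks of 512 bytes is a write of about 670 KiB near the start of the disk.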

The drive is practically unusable: every drive-related task (e.g. fsck) takes forever and ultimately crashes.

However:
The drive works fine when booting the system with kernel 6.6 (so that’s my workaround for now).

Are newer kernels known to have issues with USB devices (or SATA devices connected via USB or with ASMedia chipsets)? Is there a fix?

Help would be much appreciated.

As it seems related to async I/O, and therefore to the filesystem cache, you could test with the udev-usb-sync scripts.

Either get it from Frede H / Udev Usb Sync · GitLab or build the custom package from AUR.

pamac build udev-usb-sync

Strict limit of write cache / 0s sync time policy for usb devices by default
Search results for 'udev-usb-sync' - Manjaro Linux Forum
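
To see what the kernel currently allows for the device before installing anything, you can look at the per-device writeback settings in sysfs. This is only a sketch for inspecting them, not the udev-usb-sync script itself. show_bdi is a made-up helper name, sdf is the device name from the log above, and the strict_limit and max_bytes attributes are assumed to be present (they should be on kernel 6.2 and later):

```shell
# Print the writeback limits of one block device, if the attributes exist.
# show_bdi is a hypothetical helper; it only reads sysfs and changes nothing.
show_bdi() {
    local dir="/sys/block/$1/bdi" attr
    if [ ! -d "$dir" ]; then
        echo "no such block device: $1" >&2
        return 1
    fi
    for attr in strict_limit max_bytes max_ratio; do
        if [ -r "$dir/$attr" ]; then
            printf '%s=%s\n' "$attr" "$(cat "$dir/$attr")"
        fi
    done
}

# Usage (device name assumed from the log): show_bdi sdf
```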

I don’t think so. The usb reset here:

which blocked a kernel process:

It seems to me that the driver waits for the usb controller.

In the end, there is a forced disconnect by the host (or user):

This results in unwritten data.

Also, this is very important:

So it seems there is a change in the kernel code which does not work well with this specific USB controller.
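
To follow that sequence (resets, blocked task, dead host controller, disconnect) without the noise, the kernel log can be filtered down to the relevant messages. A small sketch; usb_failure_timeline is a name invented here, and the patterns are simply taken from the log lines posted above:

```shell
# Keep only the kernel messages that mark the stages of the failure.
# Reads log lines on stdin; usb_failure_timeline is a hypothetical helper.
usb_failure_timeline() {
    grep -E 'reset SuperSpeed USB device|blocked for more than|IO_PAGE_FAULT|not responding|HC died|USB disconnect'
}

# Typical use, on the kernel log of the current boot:
#   journalctl -k -b --no-pager | usb_failure_timeline
```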

Or it is just a coincidence that it works under 6.6, and the real problem is a loose connection. That is also possible. :man_shrugging:

I’m actually not entirely sure what’s going on.

I tested the device using kernel 6.10 again today, and it operated normally for about 10-15 minutes before throwing errors again. Yesterday, it threw errors almost right from the start on multiple runs.

However, on kernel 6.6, it seems to never throw any errors (also tested this again today, drive operated normally for over 1h).

One thought, which may or may not be relevant.

I’m currently trying to recover as much data as possible from a badly scratched Blu-ray disc, using a USB-attached reader. I’m getting similar sorts of messages (albeit related to reading) while dd is trying to access the scratched areas, such as:

Oct 13 13:33:54 antwerp kernel: Buffer I/O error on dev sr0, logical block 5181896, async page read

I wonder if you might have a damaged disc and the errors are appearing when you’re trying to write to the damaged areas? This would probably be at random times, so it might be just coincidence that it’s when you’re using 6.10.

I wonder if badblocks might reveal something useful? According to its manpage, it’s best run via e2fsck using the -c flag. e2fsck’s manpage says to specify -c twice to scan using a non-destructive read-write test.
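
As a rough sketch of what that could look like: this assumes the partition is ext2/3/4 (e2fsck does not handle other filesystems), that it is unmounted, and that it is /dev/sdf1 as in the log above. check_disc is just a wrapper name made up here so that a typo in the device name fails early:

```shell
# Run e2fsck with a non-destructive read-write bad block scan (-c given twice).
# check_disc is a hypothetical wrapper; the filesystem must be unmounted.
check_disc() {
    if [ ! -b "$1" ]; then
        echo "not a block device: $1" >&2
        return 1
    fi
    sudo e2fsck -c -c "$1"
}

# Usage (device name assumed from the log): check_disc /dev/sdf1
```

Expect a scan like this to take a long time on an 18 TB drive.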

There again, this may be a complete red herring.

I have 2 links:

This one describes your problem, or at least shows that it is an ASMedia-specific chip problem on both Windows and Linux. The driver update includes a workaround/quirk.

On Linux, I found this:

But it is not modified. However, it is known that this device has unusual behavior which needs a quirk. US_FL_NEEDS_CAP16 is needed because this controller reports a wrong capacity when it is queried with READ CAPACITY(10), whose 32-bit block count cannot describe large drives, so the kernel has to use READ CAPACITY(16) instead (as far as I understand). Without the quirk, the device and the controller are not able to communicate properly.
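
To put numbers on that limit, here is a small back-of-the-envelope check. The block count and block size are taken from the "[sdf] 35156656128 512-byte logical blocks" line in the log; needs_rc16 is a name invented for this example:

```shell
# A READ CAPACITY(10) reply carries a 32-bit block address, so it can
# describe at most 2^32 blocks.
max_rc10_blocks=$(( 1 << 32 ))
drive_blocks=35156656128     # from the log
block_size=512               # bytes, from the log

# Succeeds if a drive with $1 blocks does not fit into the 32-bit field.
needs_rc16() {
    [ "$1" -ge "$max_rc10_blocks" ]
}

echo "READ CAPACITY(10) limit: $(( max_rc10_blocks * block_size / 1024**4 )) TiB"
if needs_rc16 "$drive_blocks"; then
    echo "drive needs READ CAPACITY(16)"
fi
```

With 512-byte blocks the limit works out to 2 TiB, which is why the 18 TB drive triggers the "Very big device. Trying to use READ CAPACITY(16)" message.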

However, I can’t tell you why it doesn’t work in 6.10.

Try a drive that is smaller than 2 TB, or a USB 2.0 device. Those should work, at least I believe so.


Thanks, but these are more than 10 years old and don’t contain any usable findings, let alone solutions. The first one (from the kernel mailing list) already came up during my investigation prior to posting here on the forum.

Not an option. I’m using the external enclosure for backups, thus it needs to be both large enough (in terms of storage space) and fast.

In the meantime, I tried a couple of things, e.g. different USB cables, different power brick, but the issue remains.

Still, the only concrete clue I got is that everything works fine on kernel 6.6, but not so on kernel 6.10 (a simple fsck right after attaching the enclosure is enough to provoke errors).

2 Options:

  1. Stay on v6.6, which goes EOL in Dec 2026. Plenty of time.
  2. Report the issue upstream to the devs and hope that someone has that old chipset and is able to fix it.

That is my current workaround - I have both kernel 6.6 and 6.10 installed and use 6.6 for backups.
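
To avoid accidentally running a backup while booted into 6.10, a guard like the following could go at the top of the backup script. This is only a sketch; kernel_in_series is a made-up helper name and the backup script itself is hypothetical:

```shell
# Succeeds only if kernel release $2 belongs to series $1 (e.g. 6.6).
# kernel_in_series is a hypothetical helper.
kernel_in_series() {
    case "$2" in
        "$1"|"$1".*|"$1"-*) return 0 ;;
        *) return 1 ;;
    esac
}

# At the top of a (hypothetical) backup script:
#   kernel_in_series 6.6 "$(uname -r)" || { echo "boot kernel 6.6 first" >&2; exit 1; }
```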

I might do that, but I honestly don’t think that there is much to gain there. As you said, you’d have to hope that someone has the exact same device/chipset (which is around 10 years old).