External USB HDD fails to mount with Manjaro 6.1 and 5.x kernels, works fine in Manjaro 4.19 kernel

Thank you, just making sure. I really don’t know anymore, so all I can do is suggest that you run a SMART check on it:

sudo smartctl --test=short /dev/sdf

…and when the test is finished provide the complete output of:

sudo smartctl --all /dev/sdf

…not just the parts that are relevant in your opinion.

OK, here’s the full output

$ sudo smartctl --all /dev/sdf -d usbcypress
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.1.55-1-MANJARO] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Scorpio Blue Serial ATA
Device Model:     WDC WD1600BEVT-00A23T0
Serial Number:    WD-WX21A30W4141
LU WWN Device Id: 5 0014ee 2aeef9e4c
Firmware Version: 01.01A01
User Capacity:    160 041 885 696 bytes [160 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database 7.3/5528
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Sun Oct 22 13:39:06 2023 EEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		( 5580) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 (  68) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x7037)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   153   152   021    Pre-fail  Always       -       1325
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       182
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       63
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       109
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       62
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       404
194 Temperature_Celsius     0x0022   114   093   000    Old_age   Always       -       29
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%        63         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

The above only provides legacy SMART information - try 'smartctl -x' for more

The only relevant difference compared to the output I posted at the beginning seems to be

82c81,82
< No self-tests have been logged.  [To run self-tests, use: smartctl -t]
---
> Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
> # 1  Short offline       Completed without error       00%        63         -
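For anyone skimming the attribute table, the raw values can be pulled out with a small awk filter instead of reading the whole dump. This is only a sketch: `smart_raw` is a made-up helper name, and it assumes the ten-column attribute layout shown in the output above.

```shell
# Print the RAW_VALUE column for one attribute from `smartctl --all` output on stdin.
# smart_raw is a hypothetical helper, not part of smartmontools.
# Assumes the ten-column attribute table shown above; if a raw value contains
# spaces, only its first token is printed.
smart_raw() {
  awk -v name="$1" '$2 == name { print $10 }'
}

# Example using two rows copied from the table in this thread.
smart_raw Reallocated_Sector_Ct <<'EOF'
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
EOF
```

On a live system the same function could be fed directly, for example `sudo smartctl --all /dev/sdf -d usbcypress | smart_raw Current_Pending_Sector`.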

Well, I honestly don’t know anymore. Sorry.

:sob:

Perhaps someone else does

Thank you for trying, I appreciate it :pray:

Perhaps the older kernel in Linux Mint knows automatically to use some older usbcypress compatibility when connecting the device, which the newer kernel doesn’t provide anymore?

Even on Linux Mint, smartctl needs the argument -d usbcypress given explicitly for the command to finish, it doesn’t detect the type automatically. Could this be a clue?

You could just install an older kernel to test…:man_shrugging:

It might. However, to what…

:man_shrugging:

Is the recent issue “Changes to default password hashing algorithm and umask settings” possibly related?

Good idea :+1:

It works great with the linux419 kernel (Linux 4.19.295-1)! The disk is mounted and everything is fine.

In addition to linux419, I tested these:

  • linux54 Linux 5.4.257-1 → only blank screen right after grub, doesn’t boot
  • linux510 Linux 5.10.197-1 → similar problems as with linux61 (Sense Key : Illegal Request [current])
  • linux515 Linux 5.15.133-1 → similar problems as with linux61

So now the problem seems to be… how to make linux61 connect it like linux419?

:man_shrugging:

I suspect this is related to the recent issue with NTFS/FAT/EXFAT external drives not being mounted; the in-kernel driver ntfs3 wasn’t working as expected in recent kernel versions.

The workaround was to reinstall ntfs-3g and blacklist the ntfs3 driver:

sudo pacman -S ntfs-3g
sudo bash -c 'echo "blacklist ntfs3" > /etc/modprobe.d/disable-ntfs3.conf'
# or, if you prefer:
echo 'blacklist ntfs3' | sudo tee /etc/modprobe.d/disable-ntfs3.conf
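To confirm that the blacklist entry actually landed in a modprobe configuration file, something like the following could be used. It is a minimal sketch: `is_blacklisted` is a made-up helper name, and it only looks for a plain `blacklist <module>` line in `*.conf` files of one directory.

```shell
# Report whether any *.conf file in a modprobe.d directory blacklists a module.
# is_blacklisted is a hypothetical helper; the directory defaults to /etc/modprobe.d.
is_blacklisted() {
  module=$1
  dir=${2:-/etc/modprobe.d}
  # -q: no output, -s: stay silent if no *.conf file exists
  grep -Eqs "^[[:space:]]*blacklist[[:space:]]+${module}[[:space:]]*$" "$dir"/*.conf
}

if is_blacklisted ntfs3; then
  echo "ntfs3 is blacklisted"
else
  echo "ntfs3 is not blacklisted"
fi
```

After a reboot, `lsmod | grep ntfs3` printing nothing would additionally suggest that the module is not loaded.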

Thanks, I tried, but unfortunately no change after these (and reboot) - still the same errors in dmesg

[   42.346861] sd 10:0:0:0: [sdf] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[   42.346863] sd 10:0:0:0: [sdf] tag#0 Sense Key : Illegal Request [current] 
[   42.346864] sd 10:0:0:0: [sdf] tag#0 Add. Sense: Invalid command operation code
[   42.346866] sd 10:0:0:0: [sdf] tag#0 CDB: Read(6) 08 00 00 3f 08 00
[   42.346867] critical target error, dev sdf, sector 63 op 0x0:(READ) flags 0x0 phys_seg 4 prio class 2
[   42.346870] Buffer I/O error on dev sdf1, logical block 0, async page read
[   42.346873] Buffer I/O error on dev sdf1, logical block 1, async page read
[   42.346874] Buffer I/O error on dev sdf1, logical block 2, async page read
[   42.346875] Buffer I/O error on dev sdf1, logical block 3, async page read
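The CDB bytes in that log can be decoded by hand. The sketch below assumes the standard 6-byte READ(6) layout (opcode 0x08, a 21-bit LBA, a one-byte transfer length, then a control byte); `decode_read6` is a made-up name used only for illustration.

```shell
# Decode a 6-byte SCSI READ(6) CDB given as space-separated hex bytes.
# decode_read6 is a hypothetical helper, not an existing tool.
decode_read6() {
  set -- $1            # split the hex string into positional parameters
  [ "$#" -eq 6 ] || { echo "expected 6 bytes" >&2; return 1; }
  opcode=$((0x$1))
  [ "$opcode" -eq 8 ] || { echo "not a READ(6) CDB" >&2; return 1; }
  # LBA: low 5 bits of byte 1, then bytes 2 and 3
  lba=$(( ((0x$2 & 0x1f) << 16) | (0x$3 << 8) | 0x$4 ))
  blocks=$((0x$5))
  # A transfer length byte of 0 means 256 blocks in READ(6)
  if [ "$blocks" -eq 0 ]; then blocks=256; fi
  echo "opcode=0x$1 lba=$lba blocks=$blocks"
}

decode_read6 "08 00 00 3f 08 00"
# prints: opcode=0x08 lba=63 blocks=8
```

So the failing request is a READ(6) for 8 blocks starting at LBA 63, which matches the "sector 63" in the following log line. Since the sense data says "Invalid command operation code", this may point at the USB bridge rejecting the READ(6) command itself rather than at unreadable media, though that is an interpretation and not something confirmed in this thread.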

I see. It’s difficult to follow so many apparent issues in a single thread. However, that sure smells like a partition or filesystem error. Is the drive seated properly?

Perhaps try fsck to scan and attempt a repair, if needed. See man fsck for usage.

Yes, if I boot into Manjaro kernel 4.19 the external drive works perfectly. On newer kernels (I tested 5.x and 6.1) it fails with the error messages above.

You may also try to start with fallback-initrd (select it in grub)

:man_shrugging:

OK, I selected fallback (kernel 6.1.55-1) in grub menu - no difference, still only works in kernel 4.19

Then I’ve gotta officially say:
:man_shrugging:

@aragorn why remove the Kernel category?

Because it’s not a kernel problem. It may be related to the kernel version, but that doesn’t mean that the kernel doesn’t work. There’s more involved than just the kernel when it comes to mounting issues.

P.S.: Don’t pay too much attention to the category description. :wink:

I pose this question not for the OP but for those who actually might know the answer:

When was the ntfs3 driver introduced to the kernel; or, specifically, which kernel was it introduced in?

Am I correct in assuming it’s not present in 4.19 kernel?

That being the case, my previous suggestion to use ntfs-3g and blacklist ntfs3 should only be applicable to 5.x kernel and upwards, or even 6.1 kernel and upwards.

Can someone either confirm or deny this with due accuracy?

Yes, you are correct. It was only much more recently that ntfs3 became the default. Definitely not before the release of 5.15, I would say.
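As far as I know, ntfs3 was merged upstream in Linux 5.15, so a rough check of whether a given kernel release string is new enough to possibly ship it could look like this. It is a sketch only: `kernel_may_have_ntfs3` is a made-up name, and whether the module is actually built depends on the distribution's kernel configuration.

```shell
# Return success if a kernel release string is 5.15 or newer.
# kernel_may_have_ntfs3 is a hypothetical helper; the 5.15 threshold is an assumption
# based on when ntfs3 was merged upstream, not on any particular distro's config.
kernel_may_have_ntfs3() {
  ver=${1:-$(uname -r)}
  major=${ver%%.*}            # text before the first dot
  rest=${ver#*.}              # text after the first dot
  minor=${rest%%[!0-9]*}      # leading digits of the remainder
  [ "$major" -gt 5 ] || { [ "$major" -eq 5 ] && [ "$minor" -ge 15 ]; }
}

if kernel_may_have_ntfs3; then
  echo "$(uname -r): new enough to possibly include ntfs3"
else
  echo "$(uname -r): older than 5.15, ntfs3 is not expected"
fi
```

Running `modinfo ntfs3` on the booted kernel is a more direct check, since it reports whether the module file is actually present.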

Thank you.

Then my suggestion for the OP is to install the 6.1 LTS kernel again, reboot, and then apply the procedure I previously mentioned:

sudo pacman -S ntfs-3g
sudo bash -c 'echo "blacklist ntfs3" > /etc/modprobe.d/disable-ntfs3.conf'

Or, this as an alternative to forcing bash:

sudo pacman -S ntfs-3g
echo 'blacklist ntfs3' | sudo tee /etc/modprobe.d/disable-ntfs3.conf

Reboot again, and see if the apparent mount failure still persists.

Nothing ventured; nothing gained. Cheers.
