Re: Ntfs3 keeps corrupting my ntfs partitions

This relates to the thread here. Since it is closed, I was only able to start a new thread. This is it.

I have seen similar symptoms with the ntfs3 driver using the linux66 kernel package.

I had been using the (commercial) Paragon (ufsd) driver on Ubuntu, but decided to give ntfs3 a go after switching to Manjaro, not least because Paragon’s install.sh caps compatibility at kernel 6.4.

Alas, it seems mine will be just another thread sharing details about something going wrong. It’s not clear what exactly is going wrong. I only see the symptoms.

So it all started when I downloaded Kali, Tails and some other stuff via qbittorrent. I downloaded them onto an NTFS drive mounted read-write via the ntfs3 driver. The mount options, as per the mount output at the time, were:

rw,nosuid,nodev,noexec,noatime,uid=1000,gid=1000,dmask=0007,fmask=0117,windows_names,iocharset=utf8,user
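
For readers unfamiliar with the mask options: dmask and fmask work like a umask, clearing the listed permission bits for directories and files respectively. A minimal Python sketch of that arithmetic, assuming the masks are applied against full 0777 permissions (the helper name is made up for illustration):

```python
def effective_mode(mask: str) -> str:
    """Return the octal permission bits left after applying a umask-style mask."""
    # Assumption: the mask is cleared from full rwxrwxrwx (0o777) permissions.
    return format(0o777 & ~int(mask, 8), "04o")


dir_mode = effective_mode("0007")   # dmask from the mount options above
file_mode = effective_mode("0117")  # fmask from the mount options above
print(dir_mode, file_mode)
```

Under that assumption, directories come out as 0770 and files as 0660, owned by uid/gid 1000.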

I have a particular directory called ISO where I downloaded those images mentioned above and which at the time also contained a dozen or so Windows-related ISOs (downloaded through my VS subscription).

Suddenly and without any explanation I could find, the majority of those Windows-related ISO files were gone. Alright, not ideal but I can download them again. Alas, it also turned out that not all of the files that were currently being downloaded or had just finished downloading would appear in the directory.

“That’s strange”, I thought to myself and started investigating. My first suspicion was that I had accidentally jailed qbittorrent somehow via firejail/firetools. Alas, lsns showed that there were no mount namespaces that would explain the weirdness going on. Some peeking into /proc/$PID for the process in question also didn’t yield anything out of the ordinary.
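
The namespace check boils down to comparing the mnt namespace link of the suspect process with that of one's own shell. A rough Python sketch, assuming a Linux /proc layout; the function names are invented for illustration, and reading the link of another user's process may require root:

```python
import os


def mnt_namespace(pid="self"):
    """Return the mount-namespace link target of a process, e.g. 'mnt:[<inode>]'."""
    return os.readlink(f"/proc/{pid}/ns/mnt")


def shares_mount_namespace(pid):
    """True if the given process is in the same mount namespace as this script."""
    return mnt_namespace(pid) == mnt_namespace()
```

If shares_mount_namespace() returns True for the downloading process, a private mount namespace (for example one set up by a sandbox) can be ruled out as the explanation for files that one program sees and another does not.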

What’s even weirder, Firefox had the same issue. Because I also wanted to download Tiny Core Linux (also released as of late), I used a browser (they don’t seem to offer torrents).

The individual Firefox processes were in all sorts of namespaces, just no mnt namespaces involved either …

So I tried what one would usually do: I told both Firefox and qbittorrent to open the location of the respective downloaded file(s).

With Firefox I landed in the said ISO directory and it was missing plenty of the files in question. With qbittorrent one of them landed me inside a subdirectory of ISO and according to Dolphin it contained two files. Copying the file paths to the clipboard I was even able to see those files from the terminal. Bummer.

[Image: ISO-subdirectory]

But inside the alleged parent directory ISO there was no such subdirectory to be seen. Schroedinger’s subdirectory, so to speak.
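
The “reachable by path, absent from the listing” symptom can be checked mechanically. The following Python sketch assumes nothing about ntfs3 itself; it only compares a direct lookup of a path with what the parent's directory listing returns. The function name is hypothetical.

```python
import os


def missing_from_listing(path):
    """True if `path` can be looked up directly but its parent does not list it."""
    path = os.path.abspath(path)
    parent, name = os.path.split(path)
    reachable = os.path.lexists(path)     # direct lookup by name
    listed = name in os.listdir(parent)   # what the directory listing reports
    return reachable and not listed
```

On a healthy file system this should always return False for an ordinary file or directory; a True result for the subdirectory would match the inconsistency described above.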

Alright, since I have a license for the Paragon NTFS driver for Linux, I decided to make a few changes to its install.sh so I could employ its chkntfs utility. For obvious reasons I wanted to refrain from using the ntfsfix utility from Tuxera, because the manual page makes it clear it isn’t equipped to fix certain conditions (and I had not had any negative experiences with chkntfs).

And lo and behold:

$ sudo chkntfs -f --verbose /dev/nvme0n1p1
(Mar  5 2024 21:00:06)
GetMount(/proc/mounts): "/dev/nvme0n1p1" is not mounted
GetMount(/etc/mtab): "/dev/nvme0n1p1" is not mounted
fstat("/dev/nvme0n1p1") returns mode 60660 and size 0
ioctl("/dev/nvme0n1p1",BLKSSZGET) returns 512
"/dev/nvme0n1p1": discardzeroes is not supported by this device
"/dev/nvme0n1p1": disk size 0x1d1c1000000 bytes. ~1863GB, sector size 0x200
Checking Volume /dev/nvme0n1p1...
Type of the filesystem is NTFS.
Serial number 3C8022DA-80229A82.
Volume label is: FASTDATA.2TB.
Verifying 45696 records...
Verifying 5709 file(s) with EAs...
Verifying 361 folders...
Correcting error in index 0x30 "$I30" for file 0x7d79 (ISO).
Updating bitmap attribute "$I30" in file 0x7d79 (ISO).
Sorting index 0x30 "$I30" in file 0x7d79 (ISO).
info: device "/dev/nvme0n1p1" does not zeroes while discarding
Minor inconsistencies detected on the volume. 24 files contain incorrect links count.
Recovering 24 orphan files...
Recovering orphaned file "tails-amd64-6.0-img" (0x34cf) into directory "ISO" (0x7d79).
Recovering orphaned file "tails-amd64-6.0-iso" (0x34d0) into directory "ISO" (0x7d79).
Recovering orphaned file "neon-user-20240228-1346.iso" (0x34d1) into directory "ISO" (0x7d79).
Recovering orphaned file "SHA256SUM.kali" (0x34d2) into directory "ISO" (0x7d79).
Recovering orphaned file "nethunterpro-2024.1-pinephone-phosh-img-tar-xz" (0x34d5) into directory "ISO" (0x7d79).
Recovering orphaned file "tails-amd64-6.0.iso.torrent" (0x5e23) into directory "ISO" (0x7d79).
Recovering orphaned file "nethunterpro-2024.1-pinephone-phosh.img.tar.xz.torrent" (0x5e26) into directory "ISO" (0x7d79).
Recovering orphaned file "kali-linux-2024.1-raspberry-pi5-arm64.img.xz.torrent" (0x5e27) into directory "ISO" (0x7d79).
Recovering orphaned file "nethunterpro-2024.1-pinephone-phosh.img.tar.xz" (0x5e29) into directory "ISO" (0x7d79).
Recovering orphaned file "TinyCore-current.iso" (0x5e2b) into directory "ISO" (0x7d79).
Skipping further messages about recovering orphans.
Recovered 24 orphans.
Verifying files security...
$UpCase file is formatted for use in Windows 7 and later versions
Checking volume bitmap (59617 kBytes)
Free clusters 0x55cf170 - 0x55cf171 marked as used.
Free cluster 0x102093f3 marked as used.
Used clusters 0x2d22 - 0x2d23 marked as free.
Correcting $Bitmap data attribute.
   1863.02 GB total disk space.
   1629.38 GB in 6819 files.
      1972 KB in 363 directories.
    631996 KB in use by the system.
       512 MB occupied by the log/journal file.
         4 KB in each allocation unit.
 488378367 total allocation units on volume (1863.02 GB).
   61090771 allocation units available on volume (233.04 GB).
The volume /dev/nvme0n1p1 has been repaired successfully.

chkntfs returns 2. run 0 seconds

So that was the immediate cause of the symptoms: a corrupted directory index ($I30) on ISO, which had left 24 files orphaned.


I know the ntfs3 driver is said to be a complete rewrite, and Paragon even claims to test its consistency against their other NTFS driver (called ufsd, with u for universal, because it also supports HFS+), which is said to derive from their original NTFS for DOS driver. But something doesn’t hold up here.

Now at this point I am well into the realm of speculation, because my Linux kernel endeavors can only be described as superficial compared to my Windows kernel driver experience. However, it looks like moving files may be the underlying cause, as per this Launchpad issue for Ubuntu. I haven’t tried to confirm my suspicion by further risking the data integrity on my drive or by validating it in the qbittorrent and Firefox source, but a common way to download something is to download it into a file - Firefox appends .part - and then move it into place. So a move would be involved under these conditions.
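
For illustration, the download-then-move pattern can be sketched in Python as follows. This shows the general scheme only and is not code taken from Firefox or qbittorrent; the .part suffix mirrors what Firefox does, and the function name is made up.

```python
import os


def save_download(target, chunks):
    """Write chunks to '<target>.part', then rename the finished file into place."""
    part = target + ".part"
    with open(part, "wb") as fh:
        for chunk in chunks:
            fh.write(chunk)
        fh.flush()
        os.fsync(fh.fileno())   # push the data to disk before the rename
    os.replace(part, target)    # the move step: a rename within the same directory
    return target
```

The rename is the step that updates the directory's entries, and the directory index is where chkntfs reported the damage. Whether that is actually the trigger remains speculation.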

It’s still beyond me what then causes the corruption or how. I will try to dig through my logs a bit and hopefully something will turn up which helps to narrow it down further.

PS: generally I found this blog entry while researching this topic and it links to other resources including the Launchpad ticket linked above. But no solution to the file system corruption issue …

If this is a multiboot scenario … were you vigilant about properly unmounting in between?
Specifically and to the point - the main ntfs3 change that ‘bites’ people is its refusal to mount partitions marked ‘dirty’, such as after booting using ‘fast boot’/‘fast startup’, or similar.
Did you boot windoze with fast boot before trying to access ntfs partitions using ntfs3 ?

Yes, everything was properly unmounted and no, there was no Windows boot in between. And no I am not using suspend or hibernate on the Linux or the Windows side either (fast boot and all that crap is also disabled; it’s one of the first things I do after installing Windows).

Also, said symptoms occurred live, right before my eyes. I.e. while Manjaro was running. I had ls’d (well, technically I use lsd but the readdir() call ought to be equivalent) the folder some minutes before, then suddenly as qbittorrent had finished its first downloads the files disappeared both in Dolphin and “from ls’ view” - in hindsight this only reinforces the notion about moving files being the culprit, btw.

I only rarely boot Windows these days, but I do it. Consequently I need a proper (journaled) file system to share between both. exFAT is not that file system and all Windows-side solutions for Linux file systems have been borderline unusable and very very fragile (to the extent that monthly security updates were able to break some of them).

PS: the comment by @omano from here resonates a lot with me:

I wouldn’t work on Linux file systems from Windows, and I would trust more the NTFS3 driver by a corporation that made working on file systems for almost 30 years its core business, than a driver rewritten from scratch on a GitHub project by one man. At the end of the day you do what you want but I don’t see how that makes more sense “the other way around”.

It reflects the position I held right up until I encountered the issues described above. But given that even Linus Torvalds describes ntfs3 as still fairly solidly experimental, I think I’d rather revert to ntfs-3g and live with the somewhat lower performance. Data integrity is also a performance trait to me. To find the source of this quote, search for:

Subject: Re: [GIT PULL] ntfs3: bugfixes for 6.0
Date: Wed, 17 Aug 2022 14:55:38 -0700

I am unable to include links in my replies, it seems.

One more finding. I had copied a few of the ISO files onto another target disk that was also mounted via ntfs3. I did that via rsync and I just ran a chkntfs on that disk and there were no similar issues to those encountered on the drive to which I had downloaded the files.

This is somewhat odd, because I was under the impression that rsync uses a similar scheme of copying to some (randomly named) file during the process but then moving it into place with its desired target name.

If exchanging files between platforms is a core necessity - I suggest using vfat exfat as it is the de-facto file system for exchanging such data.

Has OP tried ntfs-3g (GitHub: tuxera/ntfs-3g, “NTFS-3G Safe Read/Write NTFS Driver”)? It is in the Manjaro repo.
I’ve been using it for years. For example to move Linux distros and textfiles. Have not noticed any corruption and it should be pretty obvious if it did happen. Distro ISO that is corrupted, well, of course it wouldn’t work. And text-files I assume would get weird letters or symbols. Hard to miss.
For at least a decade I’ve had NTFS and Ext4 partitions and have been moving files from Ext4 to NTFS.

@linux-aarhus thanks for the recommendation but FAT (no matter which of the versions that are covered by vfat) isn’t able to deliver what I want or need. Hardlinks are a hard minimum requirement for me, the ability to have reparse points on the Windows side is nice (but not mandatory, of course), too. Data safety via a journaled FS is also something I expect these days. To the best of my knowledge none of these are offered by any FAT version supported under the moniker vfat. exFAT is barely better.

My main misconception was that the ntfs3 driver would perform at least as well (not just speed-wise) as the commercial counterpart. And when I had contacted the Paragon support months ago they practically assured me that their performance (ntfs3 vs. ufsd) is on par. This is clearly not the case, so if and when it becomes available for a recent Linux kernel I may try their ufsd driver again; until then I will settle for ntfs-3g.

@zhongsiu yes he has :wink:, and that’s what I am settling for (mentioned here), for now anyway. But I was used to the better performance of ufsd (Paragon’s commercial offering) before and had hoped that the ntfs3 driver contributed by them would perform at the same level. It does not, so I am reverting to the time-proven solution as you suggest, even if it means certain features that I rarely expect on the Linux side won’t be available. Since I also have some Linux setups where I don’t own the machine (and so can’t use my commercial license), I had ntfs-3g also in use on some of those machines and also never encountered data corruptions. The most obvious shortcoming of ntfs-3g is the inability to remount, because I use that a lot. It always takes a umount followed by a mount instead, which also means I have to be able to umount in the first place (remounting something rw or ro however, works without issue even when the mount point is technically “busy”).

Thanks everyone. I was less looking for a solution (I am by now convinced it’s a defect in the ntfs3 driver) than I wanted to make sure to add what bits of information I had gathered in addition to the info already out there.

Yes - there are some limitations with the FAT file system - but with the exFAT version (file size > 4GiB) it is quite reliable for platform data exchange.

ntfs drivers outside Windows will always be a reverse engineering of a proprietary file system - with all the uncertainty that brings.

But hey - that is your prerogative

On a personal opinion I ditched Windows as my primary system years ago. The only place I use Windows is a win10 vm running Visual Studio for maintaining a .NET4/MSSQL backend running on IIS10.

Allow me to chip in here: it is not a matter of prerogative! You can’t offer a file system driver that loses data. The ntfs3 driver is faulty, at least according to my, Vidarr’s and other users’ findings.

Before other users find out that they lost their data, it would be wise to temporarily disable or remove the ntfs3 driver until the issue has been fixed. Anything else would be deliberately causing data corruption, with all its consequences.

It most certainly is - you choose to use it - NTFS - despite the risks - this is your choice.

May I remind everyone, the Linux kernel is GPL

  • read on to understand the implications
  • in this context you have no right to complain or demand …

https://www.gnu.org/licenses/gpl-3.0.en.html

https://www.gnu.org/licenses/gpl-faq.html

Aside:- Some torrent applications only move finished file(s) to the target directory. In the interim, while files are downloading, they remain in a subdirectory which is often hidden. This could also explain your experience, based solely on these quoted portions of your post. qBittorrent, for example, does this (manually configurable).

Interestingly, this ‘corporation that made working on file systems for almost 30 years its core business’ most likely grew from the initial efforts of one man working on a driver from scratch.

It’s also possible a driver might not have fundamentally changed in 30 years, and the focus remains on a yearly redesign of the GUI for marketing purposes.

:endofmusings

This might be of related interest:

Cheers.

Apologies, I didn’t initially notice this thread is getting a little long-in-the-tooth.

As of late, so to speak. Depending on the distro (and no, I am not solely using Manjaro where I get the latest stuff), exFAT can be an issue. Not by itself, but once you need to work with a file system that wasn’t cleanly unmounted etc.

However, since Microsoft opened up the specification for exFAT in 2019, I guess it is the safer bet for the future.

While that is true, having worked with the commercially offered driver by Paragon before, I was quite surprised to find that the upstreamed one was causing such issues.

As I understood it the company did a rewrite for this driver whereas the commercial one was adapted from their original NTFS for DOS driver. But still they literally have decades of experience with NTFS and must have accumulated many many test cases (which they claim both drivers run against). And since they don’t keep pace with the Linux kernel the way Manjaro does, the commercial driver simply won’t build and link against the kernels that I get with latest Manjaro (stable).

On the other hand, it feels like the “fairly solidly experimental” aspect that Mr. Torvalds so candidly voiced doesn’t exactly come across when consuming the kernel through a distro.

I get your points and also the point about the GPL is well understood - although not everyone is a techie and not all techies “speak” C etc. (and even techies only get 24 hours per day :wink:) - which is why I posted this as a warning to other users. My “complaint”, if you will, is less about the fact that the driver may not “be there” in terms of data integrity and so on, rather than how this isn’t communicated when you start using it. Btw, the Linux kernel is under GPL2, not 3.

Either way I am grateful you and others have shared their experience and advice. Thank you!

@soundofthunder thank you for pointing that out. Perhaps qBittorrent would then be a good way to reproduce this. I shall give it a try in a VM, perhaps with different versions of the kernel/driver. On that other point: according to Paragon themselves, the drivers don’t share the same lineage (the GPL’d one seems to be a rewrite, whereas the commercial one traces its roots to their NTFS for DOS driver).

I’m aware of the licenses. I am also very aware of the difficulties for developers to develop NTFS drivers under Linux.
What I do not agree with is how laxly this serious bug is being dealt with. The whole world was upside down when the xz package contained malicious code. GitHub went as far as to ban the original maintainer, although he was just a victim. Yet the xz vulnerability hardly affected any user (luckily it seems to have been caught in time). The original maintainer and the Linux community reacted quickly - kudos!

The ntfs3 kernel driver has been repeatedly reported to lose files. The OP mentioned a bug report on the Ubuntu Launchpad site. After half a year without a response, the person who reported the bug posted the following comment in German, which I translate:
“Apparently. My bug report is more than six months old, has given me an insane amount of work, and doesn’t interest a single specimen of the species porcus. After all, it’s just about data loss; there are much bigger problems. Utterly ridiculous.”

I have not added my bug report to the Ubuntu site since I use Manjaro. Which brings me to the question: where do you report bugs on Manjaro? Is it “anywhere on this forum”, as suggested in the How to join Manjaro-Development section or how to report bugs? thread?

What is of no less concern to me is that I have not seen any announcement by the developer about the bug, nor on whether or not it has been addressed / fixed. A two-year-old article in The Register does not inspire confidence in the ntfs3 kernel driver either.

Over at bugzilla.kernel.org, there is a list of reported ntfs3 issues but all are marked as “NEW”, with no “resolution”. This surely builds confidence.

Am I to understand that data loss is a minor issue? I guess we all have so much data, we can lose a little here and there.

As you pointed out so nicely by posting the GPLs, it is my choice to use or not to use the ntfs3 driver. It would, however, be useful to see a sign of life from the developer / maintainer like “yes, I saw the bug report” or “no, I don’t have time/cannot reproduce/lost interest/I’m sick”, in other words some form of acknowledgement and if there is a chance that the bug will be addressed.

Let me emphasize: The xz vulnerability was a mere package issue. The ntfs3 data loss is on the Linux kernel level.

I should point out that I have used the ntfs3 driver without issue for a fair while; the luck of the draw, I suppose. Sooner or later (as with most software) there will be an issue or two, a bug, perhaps. What I’ve noticed is that many of the so-called bug reports stem from ignorance of how ntfs3 actually works.

I sometimes see, for example:

The not-a-bug in this case, is that ntfs3 actively prevents mounting when a dirty bit is detected. The best way to solve that is to run chkdsk from within Windows to fix any filesystem errors and clear the dirty bit. In this example, the poster claims they “tried running, chkdsk from Windows, but that did not change anything”.

The likely scenario is that only the most basic scan was used (chkdsk x: /f), and that a deeper (and more time-intensive) scan was needed, yet not performed.

The point though, is that the mere fact of not being allowed to mount and access their ntfs volume translates as a bug to those unaware of the difference. This potentially wastes much time for developers; there is little wonder that many of these remain effectively ignored.

Of course, I can’t speak to the intent or otherwise of kernel developers, but at least from my user perspective, confidence is generally satisfactory with regard to the ntfs module.

Just my 2 cents. Keep the change.

Cheers.

Thanks for sharing your thoughts. In my case I received “Looks like your dir is corrupt” errors (checking with dmesg) that I could fix using chkdsk /F within Windows. See Vidarr’s original post and the link to my blog entry.

Losing data is considered a serious thing. Even if there is a way to recover it, you never know when it will actually become irrecoverable.

I used the ntfs3 drivers in scripts, so the only thing I had to change was a line in a script. I shared more details on my website, including how I use the driver and which options I use. I also did the RTFM and changed the mount options to the somewhat different ntfs3 options (different from the ntfs-3g driver).

My NTFS drives typically hold between 5,000 and 260,000 files, so I’m not really inclined to use experimental stuff.

The problem can easily be solved by switching back to the ntfs-3g driver. Until I hear that the file/directory corruption or orphaned files bug has been fixed I consider the ntfs3 kernel driver unsafe. Unfortunately I have found neither an acknowledgement of the bug by the maintainer/developer nor a release note saying that the bug has been fixed. Perhaps I have been looking in the wrong place?
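
Before and after switching drivers it can help to confirm which NTFS volumes are actually handled by the kernel driver. A minimal Python sketch, assuming the usual /proc/mounts layout and that ntfs3 mounts report the file system type ntfs3 (ntfs-3g mounts normally show up as fuseblk instead, a type other FUSE file systems share); the helper name is hypothetical:

```python
def mounts_by_fstype(mounts_text, fstype="ntfs3"):
    """Return (device, mountpoint) pairs from /proc/mounts-style text for one fstype."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device mountpoint fstype options dump pass
        if len(fields) >= 3 and fields[2] == fstype:
            found.append((fields[0], fields[1]))
    return found
```

Feeding it the contents of /proc/mounts lists every volume still mounted via ntfs3; an empty result after the switch means no volume is using the kernel driver any more.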

May I ask what kernel version you use? The developer seems to improve the driver constantly (see the commit history of torvalds/linux on GitHub). But yeah, the driver being considered stable doesn’t mean it is production ready. When a kernel dev says “stable”, he means “public beta”. LTS is intended for production.

However, I use linux 6.6 atm and I cannot reproduce any errors on my win11 partition, which came preinstalled. Could you tell me how to produce such “corrupt” errors? Do you still use linux 5.15? Fixes have probably not been backported yet.