Re: Ntfs3 keeps corrupting my ntfs partitions

I recall encountering similar directory and file corruption in years gone by, even when using the Microsoft driver. In my case, the disk was the culprit, despite it showing no obvious signs of any distress. SMART revealed it to be well within its service life, and undamaged, and yet the issue persisted.

Finally I performed a chkdsk scan for bad sectors; after successful completion the condition still existed, and eventually I performed a low-level format (took several days back then) and afterwards, no more issues became obvious.

Of course, that’s a different situation entirely. Frankly, I find it difficult to believe that ntfs3 is actually the cause - though, if it were the Microsoft driver, I might have little doubt.

I’m not using LVM on Windows! I create empty, unformatted LVM volumes under Linux and let Windows format the “partitions” using NTFS. Windows sees the LVM volume just as if it was a normal partition.

My hash scripts use the snapshot feature of LVM and don’t modify the NTFS partitions at all.

I’ve been using rsync or the rsync-based Luckybackup utility for more than a decade, together with the ntfs-3g driver. I never had any issue with that. Only after switching to the ntfs3 driver did I lose files. I have since reverted to the ntfs-3g driver and all looks fine.

Using my hash scripts also allows me to detect bit rot or file corruption, in addition to keeping track of all my files. If all goes well (which usually happens), after I back up my computer to remote or external drive(s), I know for sure that the backup contains an exact copy of the source (same number of files, same file names, same hashes for each file). The hashes, file names, etc. are stored in a cloud-backed folder and also copied to the external / remote drives in question.
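
A minimal sketch of the kind of check described above, not the actual hash scripts (which are not shown in this thread). The function names `make_manifest` and `compare_manifests` and the choice of `sha256sum` are assumptions made for illustration.

```shell
# Hypothetical helpers; names and hash algorithm are assumptions.

# Print "<sha256>  <relative path>" for every regular file under $1,
# sorted by path so that two manifests can be compared line by line.
make_manifest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort -k 2 )
}

# Return 0 only if both trees contain the same files with the same hashes.
# Returns 1 on any difference and 2 if a directory or temp file is missing.
compare_manifests() {
    [ -d "$1" ] && [ -d "$2" ] || return 2
    src_list=$(mktemp) || return 2
    dst_list=$(mktemp) || return 2
    make_manifest "$1" > "$src_list"
    make_manifest "$2" > "$dst_list"
    rc=0
    diff "$src_list" "$dst_list" || rc=1
    rm -f "$src_list" "$dst_list"
    return "$rc"
}
```

Calling `compare_manifests /path/to/source /path/to/backup` prints the differing manifest lines, if any, so a missing file and a changed hash both show up in the same output.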

I usually have at least 3 backups.

erm… what backup script? I see only a mount script… How do you copy files? By hand with a file manager? Which one?

What I am asking: Please give me exact instructions on how to reproduce such errors. Thank you.

dm-27 must be some sort of fakeraid; it could also be Microsoft’s LVM. So ntfs3 with fakeraid is probably an edge case and therefore not well tested, meaning this edge case is not included in Paragon’s test scenarios, which are used for testing the driver on Linux. Also, what you are showing is not a “basic partition” in Windows terms.

PS: Please stop using pictures of text of the terminal. There are code blocks and you can copy and paste them. Thanks. Now everyone has to copy your stuff by hand instead of being able to copy&paste it.

This topic is a great example of the importance of information.

  • NTFS

Only when challenged you provide extra information

  • logical volume - this is also a feature Windows provides

When challenged with the choice of using Windows logical volumes

  • we are informed about the creation of an LVM volume group on Linux, which was then attached to Windows and formatted by Windows with NTFS

I think I have mentioned earlier the conflict of interest with Paragon Software.

When any software is reverse engineered there is an important rule - relating to copyright.

The process must be documented so the developers can prove how they reached the result and that the code is not just a result of decompiling parts of the application and then using that source to create the functionality.

That process and result is required to be different than the original - otherwise you become a target of intellectual property infringement.

This means the derived code is likely to behave differently and may therefore create issues.

When you choose to use NTFS as an important filesystem for sharing data between Windows and Linux, you must have your reasons. But before complaining about issues in such a specialised configuration, you should ensure you do not apply updates without thinking of the consequences.

Kernel 5.15, released 2021-10-31, introduced ntfs3. Even so, that does not imply it is production ready, as also mentioned before.

Thanks for your comments and questions. I didn’t want to hijack the thread, just share similar experiences as the OP. In fact, the OP commented on my website where I posted about the ntfs3 issue.

First in reply to @megavolt: Reading my own post on my website I noticed that I made some mistakes in my replies above, thanks to @megavolt pointing out the dm-27.

Here is the corrected version of events, as described in my post and as far as I can recall now:

  1. I have a LVM volume called media-photo_raw that is formatted to NTFS and used entirely for backing up the picture files as they are imported by Adobe Lightroom (the “make a second copy to…” option on the LR import screen). The drive had been formatted to NTFS within the Windows VM. Of course, the files were copied while running the Windows VM.
  2. After I import the photos from memory card to my NVMe-based work drive called vmvg-workdrive, I go through them and delete those that I don’t want to keep. However, I do NOT touch the files on the media-photo_raw volume holding the backup on import. Obviously the media-photo_raw volume fills up rapidly. Once the photos on the vmvg-workdrive are processed, I move them to long-term storage on large HDDs (LVM volume in RAID1 that’s been formatted to NTFS by Windows 10).
  3. After each step I run a hash script that creates a list of files and the corresponding hashes, for each storage media/volume.
  4. Over time, the original backup files are not needed anymore, as I have multiple backups of the long-term storage drives. So I wrote a script that deletes photos on the media-photo_raw volume based on various criteria to free up space.
  5. When I ran the script the first time, it deleted around 27,000 files on media-photo_raw.
  6. About a month later, after importing more photos, I ran another script to update/add the hashes for the new files. This is where I got the “input/output errors”. I checked it by executing: find . -type f -printf '%T+\t%s\t%i\t%p\n' > "$lof-new"
  7. At first I thought this was hardware failure of the HDD. I ran an extended SMART test and it came out clean. This is when I started to suspect the ntfs3 driver and did some Internet search to find that I am not alone.
    (Sorry for the lengthy introduction, but it might be relevant.)
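
A hedged sketch of how the listing from step 6 could be run so that “input/output error” messages end up in their own file instead of scrolling past. The function name `list_with_errors` and the three-argument layout are assumptions for illustration. It relies on GNU find for `-printf`, as the original command does.

```shell
# Hypothetical helper: same find listing as in step 6, but with stderr
# captured separately. Prints the number of error lines and returns
# non-zero if there were any.
#   $1 = directory to scan, $2 = listing output file, $3 = error log file
list_with_errors() {
    dir=$1
    list=$2
    errs=$3
    ( cd "$dir" && find . -type f -printf '%T+\t%s\t%i\t%p\n' ) > "$list" 2> "$errs"
    n=$(wc -l < "$errs")
    echo "$n"
    [ "$n" -eq 0 ]
}
```

A non-zero count is a cue to look at the error log and at `dmesg` before running anything that writes to the volume.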

Now, below are the answers to your questions:

What backup script? I use Luckybackup (basically a front-end for rsync), which uses a bash script I wrote to mount the LVM-based NTFS partition. The command that mounts the file system is: mount -t ntfs3 -o "$rw_mode",iocharset=utf8,dmask=027,fmask=137,uid=$(id -u $user),gid=$(id -g $user),discard "$mount_dev" "$mount_path"
Important: I did not use any backup script on the media-photo_raw in question.

Here is the code snippet of the bash script that deleted the 27,000 files:

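# Excerpt from one branch of a case statement; the trailing ;; closes that branch.
cnt=0
# IFS= and -r keep leading spaces and backslashes in file names intact
while IFS= read -r file; do
	echo "Deleting $file"
	rm -- "$file"
	((cnt++))
done < "$lof-discard"
echo "$cnt files deleted from raw backup media"
# Name the start directory explicitly instead of relying on GNU find's default
echo "Removing $(find . -depth -type d -empty | wc -l) empty folders"
find . -depth -type d -empty -delete
echo "Removed empty folders";;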

dm-27: This is the media-photo_raw LVM volume that was mounted by my script using the following mount option:

mount_op="-t ntfs3 -o ${rw},iocharset=utf8,dmask=027,fmask=137,uid=$user,gid=$group,discard"
mount ${mount_op} "${lvmount:-$lv_path}" "$mounton"

(About using photos: Totally agree!)

@linux-aarhus

NTFS - Only when challenged you provide extra information: I will gladly help where possible, but didn’t know if this here was the right place. To summarize my different NTFS use cases:

  1. LVM raw volume created by Linux, formatted by the Windows VM to NTFS.
  2. Native NTFS volume, typically a pre-formatted external HDD (like a WD Elements external drive).

Logical volume: I thought I made it clear that I use the Linux LVM. Microsoft doesn’t care much about open standards, so their proprietary version of LVM called “Storage Spaces” is a no-go for me. LVM stands for “Logical Volume Manager” and should be a familiar term to most Linux users.
As mentioned before, I use a mount script and a launcher to mount or unmount my NTFS volumes. The code that selects the right Windows partition to mount is here:

    kpartx -av "$vm_path"
    if [ -b "${vm_path}p1" ]; then
        dev="$(lsblk -no NAME,SIZE ${vm_path}p? | sort -h -k 2 | awk '{ print $(NF-1) }')"
        dev="$(echo $dev | awk '{ print $NF }')"
    else
        if [ -b "${vm_path}1" ]; then
            dev="$(lsblk -no NAME,SIZE ${vm_path}? | sort -h -k 2 | awk '{ print $(NF-1) }')"
            dev="$(echo $dev | awk '{ print $NF }')"
        else
            dev="$vm_volume"
        fi
    fi

As you can see, I use kpartx as device mapper (it supports even nested LVM volumes).
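
The selection logic above boils down to “take the largest partition on the volume”. A minimal sketch of the same idea, assuming sizes are requested in bytes so that a plain numeric sort is enough. The function name `largest_partition` is hypothetical and not part of the original script.

```shell
# Hypothetical helper: read "NAME SIZE_IN_BYTES" lines on stdin and print
# the NAME with the largest size. Prints nothing for empty input.
largest_partition() {
    sort -k 2,2n | tail -n 1 | awk '{ print $1 }'
}

# Possible usage (assumption: $vm_path is set as in the script above):
#   dev=$(lsblk -lnb -o NAME,SIZE "${vm_path}"p? | largest_partition)
```

Asking `lsblk` for bytes (`-b`) avoids comparing rounded human-readable sizes such as `49.9G` and `50G`.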

I think I have mentioned earlier the conflict of interest with Paragon Software: Of course I understand and appreciate the efforts that are going into the development of the ntfs3 driver. There must be a way to track that development, especially with regards to bugs.

Is my case of using NTFS on LVM an edge case? Perhaps, though I have no clue if the use of LVM or the device mapper kpartx has any bearing on the functionality of the ntfs3 driver. As I repeatedly mentioned, I used the ntfs-3g driver for around 12 years without an issue.

Could the large number of files I deleted be the issue? Perhaps.

I believe that at least one external drive was also affected. As explained before, these external drives don’t use LVM but were purchased pre-formatted with NTFS.

Important: The OP who started this thread does not use LVM! Unfortunately I didn’t screenshot or copy the output of the chkdsk /F command under Windows, but I believe it mentioned “orphaned files”.

Does this answer the questions? If not I’m happy to add if and where I can.

That makes me think, and so does your whole blog post…

For the Linux host to be able to mount a NTFS partition created by Windows on an LVM-formatted drive, you need to use a utility that reads the partition table on the LVM volume and creates device maps for those partitions. I’m using the kpartx utility which is probably the easiest way to mount those partitions.

Windows doesn’t know what LVM (Logical Volume Manager) is, but it knows what LDM (Logical Disk Manager) is. LVM is Linux-only and LDM is Windows-only, although there is a tool called ldmtool on Linux. You assigned it with kpartx somehow. But clearly, if you formatted the drive with NTFS on a VM with Windows, then it can never be LVM, but LDM or a basic partition.

I thought you created an LVM volume and NTFS partitions on Linux, but based on your blog that is totally wrong, because Windows cannot handle Linux LVM; there is not even a driver for that on Windows.

LVM != LDM

So your scenario is very very niche, and most don’t use LDM+NTFS on Linux.

@heiko_s

I didn’t realize you - sort of - hijacked the thread - I am embarrassed :frowning:

So your whole intention is to create a debate which supports your point of view - subsequently linking to your blog - you are naughty.

So the issue is a Linux LV - which is then assigned a LD on Windows and formatted accordingly.

With Windows possibly knowing what an LVM is - due to the Ubuntu cooperation - then you may be looking at an issue which is entirely self-inflicted - and your argumentation about the kernel ntfs3 driver - falls with a BANG.

As I said earlier - ntfs is not natively supported on Linux - for native support it requires the filesystem to be open-sourced - which ntfs is not.

Another consideration.

I believe you mentioned earlier that you created a Virtual Machine to access a physical volume; the particular context I’m uncertain of now (I think you perform chkdsk in that environment).

However, it may be worth noting that this kind of raw disk access can have unpredictable results. I’m also uncertain which VM software was used, so for the sake of analogy I’ll refer to VirtualBox:

VirtualBox can allow raw hard disk access to (non-USB) physical disks or partitions as virtual disks. Raw hard disk access in VirtualBox is implemented as part of VMDK image support.

This involves creating a VMDK file which defines where the data is to be stored. With raw partition support, any partitioning information is stored inside the VMDK image, and the host’s partitioning information is not affected.

Note that Oracle classes this capability as experimental.

Using this method to access an NTFS (or any) filesystem is potentially dangerous. It’s difficult to predict issues that might arise. It’s possible that the corruption you describe might actually have been caused during one of these sessions.

Just something to consider.

Cheers.

Furthermore - These posts would probably be best moved to a separate thread; it’s clearly diverging from the OP’s original intention.

I’m not very familiar with Windows - gave up years ago. I find Linux 100x easier and better documented.

As to LVM versus LDM: I had to read up on LDM. During installation Windows 10 creates several partitions, including at least one or two hidden partitions. These must be basic partitions, as far as I understand now.

Windows VMs (any VM, for that matter) can be installed on qcow2 files, on bare metal drives, ZFS, LVM raw volumes, it really doesn’t matter as long as you use the right driver, which is usually a virtio driver under Windows. Here is the xml configuration for my Windows 10 VM pertaining to the media-photo_raw volume mentioned in my last post (the volume that had the errors):

<disk type="block" device="disk">
  <driver name="qemu" type="raw" cache="none" io="native" iothread="1"/>
  <source dev="/dev/media/photo_raw"/>
  <target dev="vdc" bus="virtio"/>
  <address type="pci" domain="0x0000" bus="0x0a" slot="0x00" function="0x0"/>
</disk>

So your scenario is very very niche, and most don’t use LDM+NTFS on Linux:

First of all, now that I start to understand what LDM means, let’s forget about it. I’m not using LDM and never have used it.

I totally disagree with you regarding niche case! Anyone using a Windows VM and mounting Windows NTFS partitions on Linux is in the same boat as I am.

For example, most new VFIO users (i.e. those running Windows in a VM using VFIO or virtio drivers) will opt for the qcow2 file as the default storage option for the VM. You can mount Windows NTFS partitions created inside the qcow2 file on Linux, much the same as you can mount Windows NTFS partitions residing inside LVM volumes.
Let’s put it this way: The main reason for using the ntfs3 driver is to access partitions on dual-boot systems or, nowadays, partitions used by a Windows VM. People who occasionally need to transfer files - on a USB stick or external drive - from Linux to Windows or vice versa will likely use exFAT. To share files in a heterogeneous network (or when running a Linux host and a Windows VM), you can use Samba (smb).

I have a good idea of how much interest VFIO / passthrough generates since I wrote several tutorials on that, the first one 12 years ago with over 600,000 readers. Most people today would rather run Windows in a VM than dual-boot. Companies like Proxmox and Unraid make a living off virtualization.

And who would use Windows to back up files when you have all the utilities built into the Linux OS - free of charge? Using LVM I can snapshot the Windows VM and do a backup using a cron job or systemd or whatever. I’ve used ntfs-3g thousands of times over the last decade or two, most often together with LVM-based NTFS partitions.

I hope I explained myself better now.

EDIT in response to your replies that happened to overlap with this reply:

This will be my last post on this thread. If anyone has questions or suggestions to me, you can send me a message.

@linux-aarhus
The OP had made a comment on my website with a link to this thread. This is why I’ve chipped in here, and as it turns out, taken over the discussion.

My intention for all this is very simple: Share the concern of the OP that there is a bug in the ntfs3 driver that causes files to be orphaned / lost.

I must have explained myself very badly. For one thing, it doesn’t really matter if I use LVM or not. Windows doesn’t know anything about the underlying media, all it sees is the VirtIO drive for which Windows has the virtio storage driver.

The second point is: ntfs-3g just works. This by itself should raise a flag regarding the ntfs3 kernel driver.

The 3rd point is: Some people have put a lot of work into the ntfs3 kernel driver. All I want to do here is honor that effort and help point out a serious flaw (losing files is a serious flaw). I guess I tried to do that in the wrong place.

@soundofthunder
Forget about VMDK, I’m not using VB. I run my Windows VM on kvm/qemu.

For reference, here is a diagram: The Linux Storage Stack Diagram (Linux Kernel 6.2)

You can click on the sections and get more info. As you can see, the file system driver (whether ext4, ntfs or whatever) has nothing to do with the block layer or the (optional) device mapper. They work independently.

Hope this helps.

Useless information, I’m afraid, as my reference was purely:

Likewise, your helpful suggestion is contextually misplaced. I certainly do not have a need for guidance. However, I do wish you well in your efforts to elicit patronage for your site.

Cheers.

LTS kernel 6.1 and below use NTFS-3G by default anyway.
NTFS3 was introduced later, or did I miss something?

You missed something.
The ntfs3 kernel module was introduced with kernel 5.15.x :wink: (recently confirmed by a Team member)

All right, thanks for the correction. So it was introduced in kernel 5.15, but the default NTFS mount was still directed to NTFS-3G.

Well, no. According to NTFS:

…which is consistent with my understanding. From the same page:

…and this indicates that ntfs-3g utilities can be installed for use with ntfs3 (via the ntfsprogs-ntfs3 package). This might also suggest that the ntfs3 driver has been used since the patch (see also Ufsd) was released and open-sourced in 2020.

It’s a little confusing though.

I recall reading both opinions: that ntfs3 was default, and also that ntfs-3g was. I had presumed ntfs3; however, as it was recently announced that ntfs3 is only now the default, I imagine that means I was wrong.

@Vidarr (if you ever return to this thread)

It occurred to me that there are two settings in qBittorrent which, if not checked, could easily lead to damage to an ntfs volume, especially when transferring large files (archives, ISO files).

  • Pre-allocate disk space for all files
  • Append .!qB extension to incomplete files

Well, actually, it’s the first of these that could lead to filesystem damage. Checking that option should allow each file to be downloaded to its own contiguous space, rather than being intermingled with other downloading streams with the chance of cross-linking. As you have found, the files are still there, only not accessible.

This is something that can (and does) happen regardless of the ntfs driver used. That also includes the Microsoft driver.

That is all. Cheers.

I don’t believe that info… with LTS kernel 6.1 my NTFS partitions are not mounted with NTFS3 by default… and I remember many, many people saying the same on this forum in the past.

With LTS 6.1 there is no need to blacklist NTFS3 driver, because NTFS-3G will be used.
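
Which driver actually handles a given mount can be checked rather than remembered. A minimal sketch, assuming the usual `/proc/mounts` layout (device, mount point, filesystem type, options). The function name `mount_fstype` is hypothetical, and the optional second argument exists only so the function can be tried against a sample file.

```shell
# Hypothetical helper: print the filesystem type recorded for mount point $1.
# $2 defaults to /proc/mounts. Returns non-zero if the mount point is absent.
#   "ntfs3"   = the in-kernel ntfs3 driver
#   "fuseblk" = a FUSE driver; ntfs-3g mounts appear this way, but so do
#               other FUSE-based filesystems, so this is a hint, not proof
mount_fstype() {
    awk -v mp="$1" '
        $2 == mp { t = $3 }
        END { if (t != "") print t; else exit 1 }
    ' "${2:-/proc/mounts}"
}
```

On a running system, `findmnt -no FSTYPE /path/to/mountpoint` reports the same field.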

Well, I recall it differently; at least, I think I do.

The only reason (as far as I am aware) that blacklisting the NTFS3 module in favour of NTFS-3G became popular, is that nobody understood why NTFS3 was actively preventing NTFS volumes from mounting.

The easiest ‘not-a-solution’ was to blacklist NTFS3 and install/use NTFS-3G, which ignored the dirty bit. This was also the go-to suggestion by many people equally ignorant of the reasons for this NTFS3 security measure.

For those interested, see: [Primer] NTFS on Linux for related information.

Maybe these are on systems originally installed and set up with older kernels? I don’t know, but I can say I’ve not yet had any issues with a disk in the dock which still has some NTFS partitions on it, which I’ve been using as “spill-over” storage for many years now.

Thanks for pointing that out. Is “Pre-allocate disk space” supposed to create them as sparse files? Or as non-sparse files with the full size of the to-be-downloaded file pre-allocated?

Whoa, I never encountered that on Windows. Interesting. Perhaps because I don’t usually use Bittorrent from Windows?! :thinking:

I’ll put it on my TODO list and tinker with it a bit inside of a VM. Hopefully I’ll be able to reproduce the issue and perhaps even create a sensible bug report. Thanks for your pointers!

The size of the to-be-downloaded file is pre-allocated; as to the qBit internals, you would need to check with the developer.
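
For anyone who wants to see the sparse versus non-sparse distinction on their own filesystem, here is a small sketch. It says nothing about qBittorrent internals. The function names are hypothetical, and writing random data with `head` merely stands in for real pre-allocation so that the demo does not depend on `fallocate` support. It assumes GNU coreutils (`truncate`, `stat -c`).

```shell
# Hypothetical demo helpers.

# Create a sparse file: the size is set, but no data blocks are written.
make_sparse() { truncate -s "$2" "$1"; }

# Create a file of the same size whose blocks are all really written.
make_full() { head -c "$2" /dev/urandom > "$1"; }

# Print "<apparent size in bytes> <allocated blocks>" for a file.
file_usage() { stat -c '%s %b' "$1"; }
```

Running `file_usage` on both files should report the same apparent size, with the sparse file typically showing far fewer allocated blocks.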

My reference was to the types of damage and not specifically qBit. I suppose that didn’t translate well in the wee hours. However, I find that if the previously mentioned option is not checked, it allows a greater possibility of miswritten data (particularly at higher speeds).

Using an NTFS qBit download destination, in Linux, is less than ideal.

Actually we fixed it in Kernel 6.8.8-2 to use ntfs-3g again if ntfs3 kernel driver is not your thing …
