The issue was identified in the Debian 12.3 “Bookworm” release but the same kernel version is currently in use in Manjaro stable.
The bug may exist in other kernels as well.
So it would be a good idea to check whether the Manjaro kernels also suffer from this bug and, if so, push kernel version 6.1.66, which reverts the problematic code and is already in the testing branch, to stable.
Uff, that's horrible news… How can I know which files on my ext4 partitions have been corrupted since I installed the last stable update on 12.01?
Should I maybe restore my Timeshift backup before I update now?
Bad news indeed. I just checked my system and it seems like my (actual) data did not get affected, but I had similar “hidden” data corruption because of a low-level Cryptomator bug a while ago and it was a nasty situation. However, I wonder how often this bug gets triggered on actual systems, considering there have been no reports of corrupted data (or at least not more than usual)?
I wonder if the bug is limited to ext4; I have had some REALLY strange things happening on my btrfs the last few weeks.
Pamac failed to build because a file was just magically removed. Yesterday the directory for a mountpoint was missing. I switched to root in Dolphin to create the directory again, and suddenly it appeared, but NFS still refused to mount, not really giving me any reason why. I do not use the x-mount.mkdir option.
It also lists the directory as last modified on Sep 16, so…
After a reboot, everything was as if nothing had ever happened.
A scrub on the btrfs shows no errors.
I have 6.1 installed but have not used it, only 6.5 and 6.6, so it seems I should be unaffected?
My ext4 partition shows no signs of corruption, but it is also not involved whatsoever with my system other than as a game partition.
Seems like bcachefs will get a running start, considering the recent ZFS and ext4 issues and the general unreliability of btrfs when using RAID levels other than 1.
If you are on the new broken kernel already, probably reboot into a live image, run fsck, check the live data against your backups and restore backups of any lost/broken files.
If you are on the old kernel but have the new broken kernel installed, purge the new broken kernel, prevent it from being installed again (by holding packages back etc.) and wait until the fixed kernel is available, then install it and reboot into it.
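To see which of the two cases applies, the running kernel release can be checked against a list of suspect versions. Below is a minimal Python sketch. The `SUSPECT_RELEASES` set (6.1.64 and 6.1.65) is an assumption based on public reports and not something confirmed in this thread, and the function names are made up for illustration, so adjust as needed.

```python
import platform
import re

# Assumption: 6.1.x releases believed to carry the faulty code.
# Only "6.1.66 has the revert" comes from this thread; edit this set as needed.
SUSPECT_RELEASES = {(6, 1, 64), (6, 1, 65)}


def parse_release(release: str) -> tuple[int, int, int]:
    """Extract (major, minor, patch) from a string like '6.1.64-1-MANJARO'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def is_suspect(release: str, suspects=SUSPECT_RELEASES) -> bool:
    """Return True if the release matches one of the suspect versions."""
    return parse_release(release) in suspects


if __name__ == "__main__":
    running = platform.release()  # same string that `uname -r` prints
    verdict = "is on the suspect list" if is_suspect(running) else "is not on the suspect list"
    print(f"running kernel {running} {verdict}")
```

This only looks at the kernel that is running right now; it says nothing about which kernels were booted earlier.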
I have never done anything with fsck. How can we identify which files are corrupt… do we need the newest live image to compare our updated and possibly corrupted hard drive files against?
According to this mail (by a SUSE employee concerned with this sort of stuff, I hope) the corruption is a pure data corruption. Does it even make sense to fsck in this case? And comparing backups is also problematic, considering files may simply have changed because they were… well, changed. What would be really useful would be a list of popular applications making use of the O_SYNC|O_DIRECT combo in question, but such a thing is obviously not something readily available.
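Lacking such a list, one rough way to get a hint is to look at which processes currently hold a file open with both flags set. The sketch below reads the octal `flags:` line from `/proc/<pid>/fdinfo/<fd>` on Linux. It is only a snapshot of what is open at this moment (it cannot see short-lived opens or past activity), it needs root to see other users' processes, and the function names are made up for illustration.

```python
import os

# Both flags must be set; the values come from the running platform (Linux only).
SYNC_DIRECT = os.O_SYNC | os.O_DIRECT


def has_sync_direct(flags: int) -> bool:
    """True if the open flags contain O_SYNC and O_DIRECT together."""
    return (flags & SYNC_DIRECT) == SYNC_DIRECT


def parse_fdinfo_flags(text: str) -> int:
    """Read the octal 'flags:' field from the contents of an fdinfo file."""
    for line in text.splitlines():
        if line.startswith("flags:"):
            return int(line.split()[1], 8)
    raise ValueError("no flags line found")


def find_sync_direct_fds(proc_root="/proc"):
    """Return (pid, target path) pairs for fds opened with O_SYNC|O_DIRECT."""
    hits = []
    for pid in filter(str.isdigit, os.listdir(proc_root)):
        fdinfo_dir = os.path.join(proc_root, pid, "fdinfo")
        try:
            fds = os.listdir(fdinfo_dir)
        except OSError:
            continue  # process exited or permission denied
        for fd in fds:
            try:
                with open(os.path.join(fdinfo_dir, fd)) as fh:
                    flags = parse_fdinfo_flags(fh.read())
                target = os.readlink(os.path.join(proc_root, pid, "fd", fd))
            except (OSError, ValueError):
                continue  # fd closed in the meantime or unreadable
            if has_sync_direct(flags):
                hits.append((int(pid), target))
    return hits


if __name__ == "__main__":
    for pid, target in find_sync_direct_fds():
        print(pid, target)
```

An empty result does not prove that nothing on the system uses the combo, only that nothing had such a file open while the script ran.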
I'm actually wondering whether it wouldn't be better to roll back to the Timeshift snapshot that I created on 11.30,
skip the faulty stable update released on 12.01 (from which I also have a Timeshift backup, btw.) and just update my system again.
On the other hand, I'm not sure what the result is when I just skip a stable release, and whether that won't lead to a bigger issue at the end of the day.
@Teo, please report back what result you get from this fsck.
I was thinking exactly the same; it's pretty hard to find out where the corrupted files actually are
and how to compare them manually.
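One thing that might at least narrow down the candidates: if the corruption only happens when a file is written (an assumption on my part, going by the O_SYNC|O_DIRECT description), then only files modified while the faulty kernel was running are worth a closer look. A rough Python sketch that lists them by modification time follows. The function name and the example dates are placeholders, and the mtime of a file is not proof of anything, just a filter.

```python
import os
import stat
from datetime import datetime


def modified_between(root, start, end):
    """List regular files under root whose mtime lies within [start, end].

    Returns paths relative to root, sorted. Symlinks are not followed.
    Note that os.walk also descends into other mounted filesystems.
    """
    lo, hi = start.timestamp(), end.timestamp()
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable
            if stat.S_ISREG(st.st_mode) and lo <= st.st_mtime <= hi:
                hits.append(os.path.relpath(path, root))
    return sorted(hits)


if __name__ == "__main__":
    # Placeholder window: replace with the dates you actually ran the kernel.
    window_start = datetime(2023, 12, 1)
    window_end = datetime(2023, 12, 10)
    for rel in modified_between(os.path.expanduser("~"), window_start, window_end):
        print(rel)
```

The resulting list can then be checked by hand or against a backup made before the window, keeping in mind that most files on it will have changed for perfectly normal reasons.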
I'm also wondering why almost no one is talking about this problem right now.
It's fixed now, but the damage is done if we got broken/corrupt files while we were using this kernel… the files aren't magically fixed now. I was using this kernel for 9 full days, and maybe many more people here were too.
We need a solution for this… everyone with ext4 right now is in the same boat, and the problem was there; at least this is what the developer told us: “A Data loss is/was possible in this time”.
I don't even have an idea how exactly these corrupt files were triggered: was it from writing, or could even reading a file lead to this problem… how big are the chances of getting corrupted files… was it 1% or 10% or 50%?
I read later (in the Red Hat forum, I think) that this should not be an issue on 6.5 and above, and since I am currently on 6.6 I do not think there is corruption and have not scanned.
It would be nice if someone could confirm that 6.5 and above are trouble-free.