EXT4/LUKS filesystem error on root partition

marli · 26 July 2024 19:11

I’ve identified a filesystem issue just days after installation, on the root partition. The output of:
tune2fs -l /dev/mapper/id-of-root-partition

Mount count:              3
Maximum mount count:      -1
FS Error count:           11
First error time:         Wed Jul 24 17:45:20 2024
First error function:     ext4_validate_block_bitmap
First error line #:       423
First error err:          EFSCORRUPTED
Last error time:          Thu Jul 25 12:18:09 2024
Last error function:      ext4_free_inode
Last error line #:        362
Last error err:           EFSCORRUPTED
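
For anyone reading along, the check-related fields can be filtered out of that output. This is only a minimal sketch: the helper name fs_check_summary is made up for illustration, and the device path is the same placeholder as above.

```shell
# Hypothetical helper: keep only the fields that matter for fsck scheduling.
# Reads `tune2fs -l` output on stdin.
fs_check_summary() {
    grep -E '^(Filesystem state|Mount count|Maximum mount count|Check interval|Last checked|FS Error count):'
}

# Usage (needs root; the device name is a placeholder):
#   sudo tune2fs -l /dev/mapper/id-of-root-partition | fs_check_summary
```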

This raises the question: why is
Maximum mount count: -1
I suppose this means that the filesystem will never be checked at boot time. Why is it set like that? (It must have been set by the installer.)

The PC was installed a few days ago, with the “encrypt system” option selected during partitioning. I’ve never used encryption on the root partition before. Is that why I have
Maximum mount count: -1
?

Does a filesystem check at boot time cause issues when the root partition is encrypted?

The system was installed from a Manjaro Xfce minimal image on a USB stick.

BG405 · 26 July 2024 19:16

Although I’m not cognizant of encrypted systems, unless you have “unlocked” the filesystem, I’m not sure fsck can handle it, as it can’t “see” the contents. There may be a way to do this using a Live session but I’ll leave it to someone with experience using encrypted root to confirm or otherwise.

I’d suggest not encrypting root unless you really need to; just /home and other partitions used for data and probably swap.

Nachlese · 26 July 2024 19:25

May be easiest to boot from USB to do it.

full inxi output would help us to help you

inxi -Fazy

(ideally off of your system when it’s running - not from USB)

In principle:

sudo cryptsetup open /dev/sdXy encrypted (to open the encrypted container)

see what’s inside and check the / file system

sudo e2fsck -v /dev/mapper/encrypted

marli · 26 July 2024 19:41

Is there a similar command that will list errors in the swap partition?
(similar to tune2fs -l )

I’m thinking that if there are errors there as well, I might have received a faulty SSD with this new laptop.

Nachlese · 26 July 2024 19:46

not that I know of
swap is not a file system - no structure to correct there

but you can simply create a “fresh” swap space - just have to update /etc/fstab, probably
as the UUID will change

swapoff -a
create fresh
swapon -a

but doesn’t make much sense to do that → no file system there that can get corrupted
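
For completeness, the "create fresh" step could look roughly like this. It is a sketch only: recreate_swap is a made-up helper name, /dev/sdXy is a placeholder, and it assumes swap lives on a plain partition rather than a swap file. It works on the device directly instead of using -a, because swapon -a reads /etc/fstab, which may still carry the old UUID.

```shell
# Sketch: recreate the swap area on one device. This DESTROYS whatever is on it.
recreate_swap() {
    dev="$1"                 # placeholder, e.g. /dev/sdXy; check `swapon --show` first
    sudo swapoff "$dev" &&
    sudo mkswap "$dev" &&    # prints the new UUID
    sudo swapon "$dev"
}

# Usage:
#   recreate_swap /dev/sdXy
# Afterwards, compare the UUID printed by mkswap with the swap line in /etc/fstab.
```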

marli · 26 July 2024 19:48

I was wondering whether there is a way to see if there have been read/write errors on the swap.

Are we sure that running e2fsck -v from a USB boot will not cause any issues with the encryption? Does anyone have an idea why Maximum mount count defaults to -1 on new installations?

Nachlese · 26 July 2024 20:01

I have no idea - I would always set that to 10 or 20 mounts
a forced check when an error is present should be default already
but setting that explicitly can’t hurt.
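
Setting it explicitly is a one-liner with tune2fs -c. A minimal sketch, assuming an ext4 filesystem; the helper name enable_periodic_fsck and the device path are placeholders, and 20 is just the value mentioned above:

```shell
# Sketch: force a full check every N mounts on an ext2/3/4 filesystem.
enable_periodic_fsck() {
    dev="$1"             # placeholder, e.g. /dev/mapper/id-of-root-partition
    mounts="${2:-20}"    # check after this many mounts
    sudo tune2fs -c "$mounts" "$dev"
}

# Usage:
#   enable_periodic_fsck /dev/mapper/id-of-root-partition 20
# A time-based check can be set in the same way with `tune2fs -i`, e.g. `-i 1m`.
```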

Yes - the check and repair will not harm anything
it will make sure the file system is o.k.
… in the sense that a corrupted file system could already have resulted in data loss
a file system check will repair the file system - but cannot necessarily guarantee that no data loss occurs

encryption has got nothing to do with it

In your case, the file system is inside an encrypted container
which you must first open to even get to the file system

… presumably that is the reason why you put it in such an encrypted container … so that no one could easily get to the file system (and contents)

marli · 26 July 2024 20:44

I checked the other Manjaro laptop in the house, and it has Maximum mount count set to -1 as well. That laptop does not have encryption enabled.

So it looks like all Manjaro installations come with fsck disabled. Should that call for a bug report?

Edit: I also notice that Check interval defaults to 0 on manjaro installations.

From man tune2fs:

It is strongly recommended that either -c (mount-count-dependent) or -i (time-dependent) checking be enabled to force periodic full e2fsck(8) checking of the filesystem. Failure to do so may lead to filesystem corruption (due to bad disks, cables, memory, or kernel bugs) going unnoticed, ultimately resulting in data loss or corruption.

I think this sounds like a bug in the manjaro installer…

Nachlese · 26 July 2024 21:05

No.

how do you figure? how do you come to that conclusion?

To me, it looks like a file system check is done (and forced) when the file system is detected to be unclean.
A check will be run if that is the case.
Every time.

There is just no forced check based upon the mount count.

It’s not - that is intentional.

I can already hear the complaints from people
why a check is forced after x times when no error was present to require it.

You and everyone else are free to change that
just as the 5% of space reserved for the root user
(which is a huge waste of space accessible to “normal” users on large partitions)
it’s easily done …
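
As an illustration of "easily done": the reserved percentage is changed with tune2fs -m. This is a sketch; set_reserved_percent is a made-up helper name, the device is a placeholder, and 1% is an arbitrary example value, not a recommendation.

```shell
# Sketch: change the percentage of blocks reserved for root on an ext2/3/4 filesystem.
set_reserved_percent() {
    dev="$1"         # placeholder, e.g. /dev/sdXy
    pct="${2:-1}"    # new reserved-blocks percentage
    sudo tune2fs -m "$pct" "$dev"
}

# Usage:
#   set_reserved_percent /dev/sdXy 1
```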

Wrong assumption - just no forced check based upon mount count.

marli · 26 July 2024 22:14

Thanks for clarifying this.

What is the setting called that enables/disables the check on error?

Why does man tune2fs recommend a check based on mount count or time interval if the filesystem is checked on errors anyway?

Nachlese · 26 July 2024 23:05

What is the purpose of this thread?
Making me (or others) your personal educator?

Where did you look to find answers to your many questions?
Where would you guess you would need to look to find the setting you are looking for?

See Table 3.3.3 here:

mkinitcpio - ArchWiki

the fsck HOOK

Note what it says about the Grub configuration
(/etc/default/grub)

It’s a bit cryptic - but to run the check is the default.
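
To see whether the hook is actually enabled on a given system, something like the following should do. A sketch only: has_fsck_hook is a made-up helper, and it assumes the usual HOOKS=(...) line in /etc/mkinitcpio.conf (drop-in files are not considered).

```shell
# Sketch: succeed if an uncommented HOOKS line in the mkinitcpio config lists "fsck".
has_fsck_hook() {
    conf="${1:-/etc/mkinitcpio.conf}"
    grep -Eq '^[[:space:]]*HOOKS=.*[( "]fsck[) "]' "$conf"
}

# Usage:
#   if has_fsck_hook; then echo "fsck hook present"; else echo "fsck hook missing"; fi
```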

Does it?
I don’t think it does.
Looks like another (wrong) assumption to me.

linux-aarhus · 27 July 2024 05:58

I recently learned that one never runs fsck on the physical volume holding the container. If I recall correctly, this will corrupt the container.

Only run fsck inside the unlocked volume, where the filesystem actually lives.

That means if the filesystem inside the container is in need of repair

boot a live ISO
unlock the container
run fsck on the unlocked container
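
The steps above can be sketched as a small shell function, to be run from the live session. Assumptions: the ext4 filesystem sits directly inside the LUKS container (no LVM in between), the filesystem is not mounted, check_luks_fs is a made-up helper name, and /dev/sdXy and the mapper name "encrypted" are placeholders.

```shell
# Sketch: unlock a LUKS container, check the filesystem inside, lock it again.
check_luks_fs() {
    part="$1"                 # partition holding the container, e.g. /dev/sdXy
    name="${2:-encrypted}"    # arbitrary name for the /dev/mapper entry
    sudo cryptsetup open "$part" "$name" || return 1   # asks for the passphrase
    sudo e2fsck -f -v "/dev/mapper/$name"              # check the mapped device, never $part itself
    rc=$?
    sudo cryptsetup close "$name"
    return "$rc"              # pass on e2fsck's exit status
}

# Usage:
#   check_luks_fs /dev/sdXy
```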

marli · 27 July 2024 10:01

Thank you. It’s not easy for a regular user to realize that one should search for mkinitcpio to find documentation for fsck. Hence I ask. Also, I guess the documentation I found (and quoted above) for tune2fs was outdated. That is hard to tell when there is no date printed in the documentation.

marli · 27 July 2024 10:08

It did actually fix itself at reboot. No need to do anything.
But it is really hard to find out from the documentation that fsck is run automatically at boot if the filesystem is flagged with errors.

There is absolutely nothing I can see in the docs saying that the fsck hook will actually run fsck if the filesystem is flagged with errors. The doc could just as well be read to mean that the fsck hook runs fsck only when the day count or max mount count is reached.

Nachlese · 27 July 2024 10:26

and it will do that
not just on “unclean” or on errors, but also on any other trigger like a set max mount count or a time elapsed since the last mount
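
A rough way to see which trigger would apply, based on tune2fs -l output. This is a simplified sketch, not e2fsck's real logic: it only looks at the filesystem state and the mount count and ignores the time interval. The helper name fsck_due is made up.

```shell
# Sketch: guess from `tune2fs -l` output (stdin) whether a full check is due.
fsck_due() {
    awk -F': *' '
        /^Filesystem state:/    { state = $2 }
        /^Mount count:/         { count = $2 + 0 }
        /^Maximum mount count:/ { max = $2 + 0 }
        END {
            if (state != "clean")        { print "due: state is " state; exit }
            if (max > 0 && count >= max) { print "due: mount count reached"; exit }
            print "not due"
        }'
}

# Usage (needs root; the device name is a placeholder):
#   sudo tune2fs -l /dev/mapper/id-of-root-partition | fsck_due
```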