NILFS: A filesystem designed to minimize the likelihood of data loss


#1

What is NILFS?

NILFS is an abbreviation for New Implementation of a Log-structured File System.

It is a file system for Linux, developed in Japan at the NTT Cyber Space Laboratories. The initial developers were Ryusuke KONISHI, Koji SATO, Seiji KIHARA, Yoshiji AMAGAI, Hisashi HIFUMI and Satoshi MORIAI. (A full list of the main contributors can be found here.)

The first publications about NILFS date back to 2005. Unfortunately they were in Japanese, which did not help in promoting NILFS outside the Japanese-speaking community. Nevertheless, NILFS2 has been included in the mainline Linux kernel since version 2.6.30 and is published under the GNU General Public License.

What is it about?

NILFS was designed to minimize the likelihood of data loss caused by filesystem corruption or human error. Therefore the developers chose an incremental, round-robin, log-structured, copy-on-write file system design with a single place of writing, using CRC32 checksums for data and metadata.

(You can find more on this topic in this presentation by Ryusuke Konishi.)

What does that mean?

Log-structured filesystem: The whole filesystem is a chronological journal.

Copy on Write: Nothing gets deleted until the partition runs out of space or the garbage collector is run on purpose. If you edit a file, the original version stays untouched and the changes are saved separately.

Incremental: To gain efficiency, only altered blocks are written.

Round robin: NILFS regards a partition as an infinite circle that consists of a single linear chronological sequence, similar to a circular buffer. The writing process starts at the first block of the partition. As soon as the end of the FS is reached, a garbage collection must be performed. After that the writing process will continue at the first free block.

CRC32: Checksums are generated as a protection against silent data corruption of data and metadata.

Single place of writing: reduces complexity.
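
To make the checksum idea a bit more concrete, here is a minimal sketch in shell. It only mimics the principle: it uses the POSIX tool cksum, whose CRC variant is not necessarily the one NILFS uses, and NILFS computes its checksums inside the kernel. The helper name block_crc and the sample strings are made up for illustration.

```sh
#!/bin/sh
# Illustration only: detect a changed "block" by comparing CRC checksums.
# This mimics the idea, not the NILFS on-disk format.

# block_crc (hypothetical helper): print the CRC of the string given as $1.
block_crc() {
    printf '%s' "$1" | cksum | cut -d ' ' -f 1
}

stored=$(block_crc "hello nilfs")     # checksum recorded at write time
reread=$(block_crc "hello nilfs")     # same data read back later
damaged=$(block_crc "hellp nilfs")    # one character silently changed

[ "$stored" = "$reread" ]   && echo "block intact"
[ "$stored" != "$damaged" ] && echo "corruption detected"
```

A mismatch between the stored and the recomputed checksum tells the reader of the block that the data cannot be trusted, even though the disk reported no error.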

Why would I want that?

From the description above we can conclude that NILFS is actually a sequence of incremental checkpoints that can be flagged as snapshots. Flagging a checkpoint as a snapshot prevents the garbage collector from deleting it. NILFS can thus be reverted to an older state if necessary. As nothing gets deleted as long as the garbage collector is not run, it is possible to undelete files that were deleted accidentally by the user. The structure of NILFS also makes it robust against power outages. There is no delayed allocation (cf. XFS) and no separate journal that could become inconsistent in the case of a power outage. The round-robin approach reduces the likelihood of write hot spots on SSDs.

Availability

NILFS2 is supported by the Linux kernel and by GRUB2. On some distributions (e.g. Linux Mint) the nilfs-tools are not preinstalled. If this is the case, just open your package manager and install them.

Usage

I’m using NILFS2 for my ‘/’ (root), my ‘/home’ and even my ‘/boot’ partition, since it is supported by both GRUB2 and the Linux kernel.

Unfortunately, as of January 2017, Calamares, the graphical Manjaro installer, has a bug that prevents it from installing to a NILFS ‘/’ (root) partition.

But luckily I was able to find a solution.

How to install Manjaro on NILFS2

1.) You need a Manjaro ISO (I recommend USB) + the Manjaro ISO root password, which should be “manjaro”

2.) Boot the ISO

3.) Open Gparted. It will ask for the root password, which should be “manjaro”.

4.) Create a partition scheme like the following one:

Partition  Label        Mountpoint     Filesystem      Recommended Size              Creation order  Recommended UUID command
/dev/sda1  EFI          ('/boot/efi')  FAT16 or FAT32  between 8 and 512 MiB¹        1
/dev/sda2  BOOT         ('/boot')      NILFS2          2048 MiB²                     2               sudo nilfs-tune -U cafebabe-cafe-cafe-cafe-cafebabe0002 /dev/sda2
/dev/sda5  ROOT_nilfs2  ('/')          NILFS2          about 20 GiB                  5               sudo nilfs-tune -U cafebabe-cafe-cafe-cafe-cafebabe0005 /dev/sda5
/dev/sda6  ROOT_ext4    ('/')          EXT4            about 20 GiB                  6               sudo tune2fs    -U cafebabe-cafe-cafe-cafe-cafebabe0006 /dev/sda6
/dev/sda4  HOME         ('/home')      NILFS2          as much space as you have ;)  4               sudo nilfs-tune -U cafebabe-cafe-cafe-cafe-cafebabe0004 /dev/sda4
/dev/sda3  SWAP         ('/swap')      SWAP            the same size as your RAM     3               sudo swaplabel  -U cafebabe-cafe-cafe-cafe-cafebabe0003 /dev/sda3

¹ For a Windows installation on EFI, the EFI partition must be formatted as FAT32 and must not be smaller than 512 MiB.
² 2048 MiB might seem like a lot, but because of the snapshotting nature of NILFS2 and the Linux kernel images that are stored in /boot, this partition should not be too small; otherwise you might run into trouble with the garbage collector.

Hint: Create the ext4 partition last so that it automatically gets the highest number, because in the end you might want to delete it.

5.) Optionally you can change the UUIDs of the partitions. The commands and example UUIDs are already listed in step 4.

6.) Open Calamares and start the install procedure.

7.) When Calamares asks for partitioning, choose “Manual partitioning” and select the mountpoints for the partitions. They are already listed in step 4 of this tutorial. IMPORTANT: For now, choose the ROOT_ext4 partition for ‘/’. DON’T select the ROOT_nilfs2 partition!

8.) Install.

9.) After the install, boot your Manjaro to check that it works.

10.) Boot the ISO again.

11.) Mount the partitions ROOT_nilfs2 and ROOT_ext4.

12.) Copy all the data from ext4 to nilfs2 with the following command (you have to look up the paths of the mounted partitions):

sudo cp -Ra /THE-EXT4-ROOT-PARTITION/* /THE-NILFS2-ROOT-PARTITION

Ignore the journal copying error.

13.) Open /etc/fstab on the NILFS2 root partition as root and replace the entry for the root partition.

It should currently look like this:

# <file system>                           <mount point>  <type>  <options>                  <dump>  <pass>
UUID=550E-7218                            /boot/efi      vfat    defaults,noatime           0       2
UUID=cafebabe-cafe-cafe-cafe-cafebabe0002 /boot          nilfs2  defaults,noatime           0       2
UUID=cafebabe-cafe-cafe-cafe-cafebabe0006 /              ext4    defaults,noatime,discard   0       1
UUID=cafebabe-cafe-cafe-cafe-cafebabe0004 /home          nilfs2  defaults,noatime,discard   0       2
UUID=cafebabe-cafe-cafe-cafe-cafebabe0003 swap           swap    defaults,noatime,discard   0       0
tmpfs                                     /tmp           tmpfs   defaults,noatime,mode=1777 0       0

Change it so that it looks like this:

# <file system>                           <mount point>  <type>  <options>                  <dump>  <pass>
UUID=550E-7218                            /boot/efi      vfat    defaults,noatime           0       2
UUID=cafebabe-cafe-cafe-cafe-cafebabe0002 /boot          nilfs2  defaults,noatime           0       2
UUID=cafebabe-cafe-cafe-cafe-cafebabe0005 /              nilfs2  defaults,noatime,discard   0       1
UUID=cafebabe-cafe-cafe-cafe-cafebabe0004 /home          nilfs2  defaults,noatime,discard   0       2
UUID=cafebabe-cafe-cafe-cafe-cafebabe0003 swap           swap    defaults,noatime,discard   0       0
tmpfs                                     /tmp           tmpfs   defaults,noatime,mode=1777 0       0
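
If you prefer not to edit the file by hand, the change from step 13 can be sketched with sed. This is only a sketch under assumptions: switch_root_entry is a hypothetical helper name, the UUIDs are the example values from step 4, and the helper prints the edited file to stdout instead of changing it, so you can review the result first.

```sh
#!/bin/sh
# Sketch: rewrite the root entry of an fstab file from ext4 to nilfs2.
# switch_root_entry is a hypothetical helper; it prints the edited file to
# stdout and leaves the input file untouched.
switch_root_entry() {
    # $1 = fstab path, $2 = old (ext4) UUID, $3 = new (NILFS2) UUID
    # Only the line that starts with the old UUID is modified.
    sed -e "/^UUID=$2[[:space:]]/{" \
        -e "s/^UUID=$2/UUID=$3/" \
        -e "s/[[:space:]]ext4[[:space:]]/ nilfs2 /" \
        -e "}" "$1"
}

# Demo on a throwaway copy with two of the example entries.
demo=$(mktemp)
cat > "$demo" <<'EOF'
UUID=cafebabe-cafe-cafe-cafe-cafebabe0006 /      ext4    defaults,noatime,discard 0 1
UUID=cafebabe-cafe-cafe-cafe-cafebabe0004 /home  nilfs2  defaults,noatime,discard 0 2
EOF
switch_root_entry "$demo" cafebabe-cafe-cafe-cafe-cafebabe0006 \
                          cafebabe-cafe-cafe-cafe-cafebabe0005
rm -f "$demo"
```

The column alignment shifts slightly, which does not matter because fstab fields are separated by any amount of whitespace. On the real system you would point the helper at the fstab on the mounted NILFS2 root partition, compare the output with the original, and only then write it back as root.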

14.) Mount the boot partition.

15.) Open the /boot/grub/grub.cfg as root.

16.) Find and replace:

cafebabe-cafe-cafe-cafe-cafebabe0006 -> cafebabe-cafe-cafe-cafe-cafebabe0005
gpt6 -> gpt5
ext2 -> nilfs2
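
The three replacements can also be sketched as a single sed call. This is a sketch under assumptions, not a tested installer script: retarget_grub_cfg is a hypothetical helper name, the UUIDs and partition numbers are the example values from step 4, and the blanket replacement of ext2 is only safe if no other partition referenced in grub.cfg uses an ext filesystem.

```sh
#!/bin/sh
# Sketch: apply the three substitutions from step 16 to a grub.cfg and write
# the result to stdout. retarget_grub_cfg is a hypothetical helper name.
retarget_grub_cfg() {
    # $1 = path to grub.cfg (the file itself is not modified)
    sed -e 's/cafebabe-cafe-cafe-cafe-cafebabe0006/cafebabe-cafe-cafe-cafe-cafebabe0005/g' \
        -e 's/gpt6/gpt5/g' \
        -e 's/ext2/nilfs2/g' "$1"
}

# Demo on a throwaway file with typical-looking lines.
demo=$(mktemp)
cat > "$demo" <<'EOF'
insmod ext2
set root='hd0,gpt6'
linux /vmlinuz root=UUID=cafebabe-cafe-cafe-cafe-cafebabe0006 rw
EOF
retarget_grub_cfg "$demo"
rm -f "$demo"
```

Writing the result to a new file and comparing it with the original before copying it over /boot/grub/grub.cfg keeps the change easy to review.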

17.) Save the changes, unmount the partitions and reboot.

18.) Boot will fail.

19.) No problem! Boot the fallback image from the grub menu.

20.) Success? Check with gparted! Yes? Congratulations!!!

21.) Now you need to execute sudo update-grub in order to be able to boot normally.

22.) The “sudo update-grub” didn’t help? Then install another kernel with the Manjaro settings tool. Boot into the new kernel. Remove the old one with the Manjaro settings tool. Then you can reinstall it, if you wish.

OR:

23.) Congratulations! Now you have a Manjaro system running completely on NILFS2.

If you wish, you may delete the ext4 partition now and use the free space for another purpose.

A life with NILFS2

How do I get rid of checkpoints to get more free disk space?

Use the garbage collector:
sudo nilfs-clean /dev/sda2; sudo nilfs-clean /dev/sda5; sudo nilfs-clean /dev/sda4

For further information about converting checkpoints into snapshots, how to mount them and administration in general, please take a look at the manual: http://nilfs.sourceforge.net/en/manual.html
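
As a rough sketch of what the manual describes: the nilfs-utils commands lscp and chcp list checkpoints and change their mode, and a snapshot can be mounted read-only with the cp= mount option. The device /dev/sda4, the checkpoint number 1234, the mount point and the helper names run and mount_snapshot are assumptions for illustration. By default the script only prints the commands instead of running them.

```sh
#!/bin/sh
# Sketch: turn a checkpoint into a snapshot and mount it read-only.
# With DRY_RUN=1 (the default here) the commands are only printed.
DRY_RUN=${DRY_RUN:-1}

# run (hypothetical helper): print the command in dry-run mode, else execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# mount_snapshot (hypothetical helper): $1 = device, $2 = checkpoint number,
# $3 = mount point for the read-only snapshot.
mount_snapshot() {
    dev=$1 cno=$2 mnt=$3
    run sudo chcp ss "$dev" "$cno"     # snapshots are not removed by the cleaner
    run sudo mkdir -p "$mnt"
    run sudo mount -t nilfs2 -r -o cp="$cno" "$dev" "$mnt"
}

run sudo lscp /dev/sda4                # list checkpoints and pick a number
mount_snapshot /dev/sda4 1234 /mnt/home-snapshot
```

Run it with DRY_RUN=0 to execute the commands for real. Files can then be copied back from the snapshot mount point into the live filesystem. After unmounting, sudo chcp cp /dev/sda4 1234 turns the snapshot back into a plain checkpoint so that the garbage collector may reclaim it.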

Recover a partition

Some technical details
How is NILFS integrated into the Linux kernel?
NILFS is a separate kernel module, and the garbage collector (the nilfs_cleanerd daemon, which is controlled with nilfs-clean) runs in userland. This is a very modular approach, which has the big advantage that NILFS2 will stay compatible with the Linux kernel even if the kernel developers decide to make bigger internal changes.

"What is the role of “.nilfs” file? Is it deletable?"
In the top-level directory of your NILFS partitions you will find a file named .nilfs, which is usually empty. This is why:
“The .nilfs file was used for locking between cleanerd and other NILFS utilities. This file is now obsolete, and you can delete it if you are using nilfs-utils 2.1 or later. mkfs.nilfs2 still creates this file to avoid troubles on older nilfs-utils environment.
The .nilfs is a regular file, and the kernel module of NILFS2 does not depend on this file.”
(source: http://nilfs.sourceforge.net/en/faq.html)

More about NILFS2


Why don't we use NILFS2 more often?
Best filesystem for low CPU usage - or are they all the same?
Nilfs
Tuning the NILFS2 file system
NILFS2 installation error (Calamares)
Why do Manjaro Stable updates break the system? Is it bad testing?


#2

Now Manjaro Forum has probably one of the best tutorials on NILFS2 on the web! Thank you for this excellent, monumental guide! :japanese_castle:
It is really clear, laconic and easy to understand!

You might want to add later how to create snapshots.


#3

Thanks!
I have now also added a link about snapshotting :slight_smile:


#4

One more step should be added. After booting the fallback initcpio, the user should regenerate the initcpio so that it will include the NILFS drivers.

sudo mkinitcpio -p linux<version>

For example, with linux 4.4, use  sudo mkinitcpio -p linux44
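
If you are unsure which preset name matches the running kernel, it can be derived from the kernel release string. The following is only a sketch: preset_from_release is a hypothetical helper, not part of mkinitcpio, and it assumes the linux<major><minor> naming shown in the example above. It prints the mkinitcpio command instead of running it.

```sh
#!/bin/sh
# Sketch: derive the preset name (e.g. linux44) from a kernel release string
# such as the output of `uname -r`.
# preset_from_release is a hypothetical helper, not part of mkinitcpio.
preset_from_release() {
    # Keep "major.minor" and drop the dot: 4.4.39-1-MANJARO -> linux44
    echo "$1" | sed 's/^\([0-9][0-9]*\)\.\([0-9][0-9]*\).*/linux\1\2/'
}

preset=$(preset_from_release "$(uname -r)")
echo "sudo mkinitcpio -p $preset"
```

Check that the printed preset exists in /etc/mkinitcpio.d/ before running the command.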


#5

Thanks for the hint! :slight_smile:
This is actually a less complicated solution that replaces step 22.


#6

And what about the ext4 ‘/’ partition?
Can it be merged with /home? Offline, or while the system is running?


#7

After you completed all the steps from the manual, you can delete the ROOT_ext4 partition.
If you like, you can use the freed space by growing the ROOT_nilfs2 partition.

You can merge the ‘/home’ into ‘/’ after you deleted ROOT_ext4 OR you can do the whole thing without even creating a separate partition for ‘/home’ OR you can delete the ROOT_ext4 partition, move the HOME partition left (using the Manjaro ISO) and then grow the HOME partition.


#8

I’m still hoping for a tutorial/script to revert back to a snapshot. But an excellent guide this is!


#9

Excellent, thanks!
I’m thinking of using a NILFS2 partition for my Steam games when I get a new SSD. In that case, it should be enough to just format the partition, create fstab entries and install the Steam library to the new partition?


#10

Never save in a game again?


#11

Savegames generally go to /home.


#12

Indeed. It should work like that.


#13

From what I have now read, it seems that nilfs2 compares to btrfs about like this:

Btrfs

  • saves more space because of compression and because it takes fewer snapshots
  • has better performance in some cases because it does fewer writes?
  • single volume can extend over multiple devices
  • better tools for managing snapshots
  • can exclude subfolders from snapshots (/var)

Nilfs2

  • possibly more resilient against data corruption?
  • takes snapshots of everything all the time
  • possibly better performance in some cases?

#14

Peter Chubb made a comparison of ext4, f2fs and nilfs2 on various SD cards:


His conclusion was:

  • that nilfs2 is very good regarding throughput and lots of small files
  • and that every single model of the SD cards may have highly individual characteristics.

I found some benchmarks including NILFS2 and Btrfs but most of them are very outdated (like from 2009 or 2011).

Regarding Btrfs I must say that I don’t know much about the internals yet. But I guess that the incremental/differential and log-structure based nature of NILFS2 on the one hand and the journaling concept of Btrfs on the other hand cause very different behaviour in certain aspects. In terms of cars: you might have a Wankel engine or the petrol engine invented by Otto. Both do similar things, but some of the basic concepts are very different, which leads to very different behaviour in the details.

I guess that, from a technical point of view, a snapshot in nilfs2 IS ABSOLUTELY NOT the same thing as a snapshot in btrfs. When I have time, I will read some documentation about snapshotting in btrfs, so that I can understand the differences better.


#15

Execution is of course different, but both are differential and cow. So for the user, the greatest difference is the number and management of snapshots. Btrfs can better limit the scope of the snapshots and take snapshots/differential copies of individual files, but nilfs2 makes checkpoints of everything, so it doesn’t require any additional mechanism or scripting for it like btrfs does.

Bottom line: I find that they both are very interesting and each have their own unique advantages.


#16

I agree, but I think there is another question that can be highly relevant for the user:

What happens in case of a dirty shutdown?

1.) Was all the data saved by the user actually written to the SSD/HDD?

In case of ext4 and XFS, this is not necessarily true because of delayed allocation.
How do NILFS2 and Btrfs compare regarding this question?

2.) How about the possibility of inconsistencies of data and metadata?

3.) If there is such a possibility, how about the chance of repairing these inconsistencies without causing additional data loss/corruption?


#17

I can only speak for myself, but I haven’t encountered any data loss with either ext4 or btrfs (both in use since the time they were still deemed “beta”).
That does of course not mean that data loss is impossible!

The ext4 fsck program should be able to repair most of the inconsistencies, but just as any fsck, I guess it’s not perfect. It’s run periodically on my ext4 partition after 30 mounts.

Btrfs fsck is another thing. It was broken for some time, don’t know how it is now. For btrfs data check, I use btrfs scrub, which can detect inconsistencies but not repair them. That’s indeed a problem. Btrfs still is under heavy development.

Regarding delayed allocation, with all its drawbacks, doesn’t it also have advantages, like better performance? I think the commit interval can be adjusted by a mount option (commit=XX)?


#18

Indeed, and that’s the reason why we don’t want to pull the power plug of our production machines too often :wink:

Of course :slight_smile: That is why it was introduced in XFS: to cure XFS’s weakness with small files and to minimize fragmentation. But the downside is obvious. Don’t pull the power plug. :wink:

The standard setup of /etc/fstab runs fsck for root partitions at every startup. But fsck.xfs is a void program that actually does nothing (https://linux.die.net/man/8/fsck.xfs)

And NILFS2 doesn’t even have fsck by design.


#19

AFAIK only if they are marked with an “improper shutdown” flag.
“Clean” partitions aren’t checked. (I’m speaking of ext4, have no experience with xfs)
One can force the check with the fsck.mode=force boot parameter.

Any idea how fsck.btrfs is working now?


#20

As I said, the fsck.xfs is void:

fsck.xfs(8)                 System Manager's Manual                fsck.xfs(8)

NAME
       fsck.xfs - do nothing, successfully

SYNOPSIS
       fsck.xfs [ filesys ... ]

DESCRIPTION
       fsck.xfs  is  called by the generic Linux fsck(8) program at startup to
       check and repair an XFS filesystem.  XFS is a journaling filesystem and
       performs  recovery  at  mount(8)  time if necessary, so fsck.xfs simply
       exits with a zero exit status.

       If you wish to check the consistency of an XFS filesystem, or repair  a
       damaged or corrupt XFS filesystem, see xfs_repair(8).

FILES
       /etc/fstab.

SEE ALSO
       fsck(8), fstab(5), xfs(5), xfs_repair(8).


:slight_smile: