Decrease dirty bytes for more reliable USB transfer

Can someone point out where in the discussion it explains how the “sync” route reduces USB flash-drive/SSD endurance?

Oh sorry I forgot to quote the man page:

sync
All I/O to the filesystem should be done synchronously. In the case of media with a limited number of write cycles (e.g. some flash drives), sync may cause life-cycle shortening.

man mount

With sync, data is written in many short write cycles instead of fewer, longer ones, and therefore it uses up the thumb drive faster than async, where the write cycles are longer.

Funny, my defaults aren’t that different from what Linus suggested.

https://lkml.org/lkml/2013/10/25/39

Yeah, really funny… that was 2013 and it is now 2022, so 9 years later. Is there a solution for all those workloads? No, it is still the same. However… over the years it didn’t bother me and the default was NEVER a problem, since after a while you get used to this behavior (and it is logical that thumb drives are just slower).

Funny enough… “Dave Chinner” … without knowing what he has written years ago, it seems we are on a similar wavelength. :joy:

In more detail, if we simply implement “we have 8 MB of dirty pages
on a single file, write it” we can maximise write throughput by
allocating sequentially on disk for each subsquent write. The
problem with this comes when you are writing multiple files at a
time, and that leads to this pattern on disk:

ABC…ABC…ABC…ABC…

And the result is a) fragmented files b) a large number of seeks
during sequential read operations and c) filesystems that age and
degrade rapidly under workloads that concurrently write files with
different life times (i.e. due to free space fragmention).

LKML: Dave Chinner: Re: Disabling in-memory write cache for x86-64 in Linux II

personally not the first time i’ve mind meddled with the same topic, and to this day it still remains a mind meddle. each time after reading and arriving at the usual tweaks proposed, I also get to read about the other counter-productive issues and pull what little hair i’ve got and get confused and then leave the stock settings and move on…

however kudos to all of you, this has been somewhat enlightening. although it’s been confusing, thanks for keeping it civil.

i think what confuses us is that most of us have no triage for the different issues at play here. most of us visit the topic looking for a one-stop write-cache tweak, when there is a lot more at play here, and the way i see it there is still no one solution that fits all.

issues that need tackling:

  • data transfer reliability
  • performance
  • device longevity (flash/SSD)
  • file fragmentation
  • accurate file manager stats

i think if custom handling of any of these is an absolute must, one should evaluate and use a solution on a per-case basis. everyone else should just let it be as is.

in my case, i’m using an old laptop with rusty USB ports, and i frequently leave flash drives and external HDDs plugged in after bulk file transfers. what bothers me most is when i accidentally knock these plugged-in devices while they are supposedly in a post-transaction state. sometimes that leads to corrupt data; worse, with flash drives it sometimes leads to corrupt devices. so my issue weighs mostly on “reliability”, and i’m willing to sacrifice flash drive longevity and HDD file fragmentation. hence i’m willing to go the “sync” route.

NOTE:
udev has pretty convincing device scanning going on behind the scenes. you can tell your usb HDD apart from your flash drives if need be. just use udevadm monitor to find out all the key-values:

# udevadm monitor --property

If I were in your situation, I would create a script and a service that runs the sync command periodically, or just run it when you do backups… It can be easily integrated into a desktop file (launcher) on the desktop.

Just like:

while true; do sync && notify-send "Write cache has been synced" && sleep 300; done

or

while true; do sync | zenity --progress --pulsate --auto-close --no-cancel --title="Write Cache" --text="Synchronize write cache" && zenity --notification --text="Write cache has been synchronized." && sleep 300; done

So every 300sec → 5min

In a desktop file under Exec=

/usr/bin/bash -c "while true; do sync | zenity --progress --pulsate --auto-close --no-cancel --title='Write Cache' --text='Synchronize write cache' && zenity --notification --text='Write cache has been synchronized.' && sleep 300; done"

Man it is even a GUI…

It doesn’t hurt to run it more frequently. The kernel does it in the background anyway, but this way you would have a notification: “Now the write cache is synchronized.”
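If you would rather see how much data is still waiting to be written instead of only getting a notification, a minimal sketch is to add up the Dirty and Writeback fields of /proc/meminfo. The function name pending_writeback_kib is made up for this example, and the figure covers all devices, not just the USB drive.

```shell
#!/usr/bin/env bash
# Sum the Dirty and Writeback fields (KiB) from /proc/meminfo-formatted text on stdin.
# The trailing colon in the pattern keeps WritebackTmp out of the sum.
pending_writeback_kib() {
    awk '/^(Dirty|Writeback):/ { sum += $2 } END { print sum + 0 }'
}

# Report the current system-wide amount if /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
    echo "Not yet on disk: $(pending_writeback_kib < /proc/meminfo) KiB"
fi
```

Running it before and after sync should show the number falling towards zero once the transfer has really finished.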

Well, if you start concurrent copy jobs to the same block device, then you will get some fragmentation, depending on the size of the files being copied. However, if you do a bulk transfer of multiple files, they’ll be copied one at a time and this doesn’t impact fragmentation. Another thing to consider is that machines have a lot more RAM these days, which allows much larger read caches (for example, I currently have a 10GB read buffer/cache), so fragmentation isn’t as much of a problem as it was in the past (not to mention the existence of SSDs and NVMe drives, which are virtually unaffected by fragmentation).

I think we’ve reached a point where this conversation starts to be unproductive. There’s not much more to debate. Personally, I’d like to see Manjaro change these defaults, but I don’t mind if they don’t because I’m able to do it myself. At least these discussions can be useful to other users. Have fun :wink:

I was unable to copy a 65GB file to a 128GB USB3 pendrive with the default dirty bytes settings… The transfer speed dropped below 2MB/s after 5 minutes… Then I discovered the maxperfwiz script.

By default, these settings are active:
vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 1500
vm.dirty_ratio = 20
vm.dirty_writeback_centisecs = 1500
vm.dirtytime_expire_seconds = 43200

With 32GB ram, maxperfwiz suggested and set these settings:
vm.vfs_cache_pressure=75
vm.dirty_ratio=3
vm.dirty_background_ratio=3
vm.dirty_expire_centisecs=3000
vm.dirty_writeback_centisecs=1500
vm.min_free_kbytes=118812

And this solved the problem! Now I was able to write to the pendrive at the same speed as under Windows.

This maxperfwiz should be the new default.
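For anyone who wants to try the 32GB values quoted above without the script, here is a rough sketch that prints them in sysctl.d syntax. The function name and the file name 99-dirty-pages.conf are only placeholders I picked, and the min_free_kbytes value was computed for that one machine, so treat it as an example rather than a recommendation.

```shell
#!/usr/bin/env bash
# Print the values maxperfwiz suggested for the 32GB machine, in sysctl.d syntax.
print_dirty_sysctl_conf() {
    cat <<'EOF'
vm.vfs_cache_pressure = 75
vm.dirty_ratio = 3
vm.dirty_background_ratio = 3
vm.dirty_expire_centisecs = 3000
vm.dirty_writeback_centisecs = 1500
vm.min_free_kbytes = 118812
EOF
}

print_dirty_sysctl_conf

# To persist them (placeholder file name, needs root):
#   print_dirty_sysctl_conf | sudo tee /etc/sysctl.d/99-dirty-pages.conf
#   sudo sysctl --system
```

Deleting the file and running sysctl --system again (or rebooting) goes back to the distribution defaults.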

@cscs Hey, someone remembers it. :wink:

And these are the suggested values on my other machine with 16GB ram:

vm.vfs_cache_pressure=75
vm.dirty_ratio=3
vm.dirty_background_ratio=3
vm.dirty_expire_centisecs=3000
vm.dirty_writeback_centisecs=1500
vm.min_free_kbytes=59030

Now I can copy files to 2 pendrives at the same time with good speeds.

Seriously, the default settings are not good on a modern PC. Before this maxperfwiz, I had ultra low copy speeds when I tried to move big files between partitions in my PC.

*and I only know about that script thanx to a Google search result.

Looks like the only settings changed here are:

vm.dirty_expire_centisecs = 1500
vm.dirty_writeback_centisecs = 1500
vm.min_free_kbytes=118812

vm.min_free_kbytes = 67584 was the default for that setting.

Wonder what these particular settings do, looks like a different approach for a similar result, perhaps a better approach too?

From https://sysctl-explorer.net/

vm.dirty_expire_centisecs

This tunable is used to define when dirty data is old enough to be eligible for writeout by the kernel flusher threads. It is expressed in 100’ths of a second. Data which has been dirty in-memory for longer than this interval will be written out next time a flusher thread wakes up.

vm.dirty_writeback_centisecs

The kernel flusher threads will periodically wake up and write “old” data out to disk. This tunable expresses the interval between those wakeups, in 100’ths of a second.

Setting this to zero disables periodic writeback altogether.

vm.min_free_kbytes

This is used to force the Linux VM to keep a minimum number of kilobytes free. The VM uses this number to compute a watermark[WMARK_MIN] value for each lowmem zone in the system. Each lowmem zone gets a number of reserved free pages based proportionally on its size.

Some minimal amount of memory is needed to satisfy PF_MEMALLOC allocations; if you set this to lower than 1024KB, your system will become subtly broken, and prone to deadlock under high loads.

Setting this too high will OOM your machine instantly.

vm.dirty_ratio

Contains, as a percentage of total available memory that contains free pages and reclaimable pages, the number of pages at which a process which is generating disk writes will itself start writing out dirty data.

The total available memory is not equal to total system memory.
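Since the two centisecs tunables are expressed in hundredths of a second, it may help to see the values from this thread in plain seconds. A tiny sketch (the helper name is made up for illustration):

```shell
#!/usr/bin/env bash
# Convert a *_centisecs sysctl value (hundredths of a second) to whole seconds.
centisecs_to_seconds() {
    local centisecs="$1"
    echo $(( centisecs / 100 ))
}

centisecs_to_seconds 1500   # prints 15
centisecs_to_seconds 3000   # prints 30
```

So with the maxperfwiz values, the flusher threads wake up every 15 seconds and write out data that has been dirty for more than 30 seconds.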


AFAIK, the default settings come from old systems with little memory. These days, the dirty ratio in particular is problematic with the usual 8-16-32GB of memory… But this is a complex situation, where you have to deal with varying amounts of system memory…
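To make that concrete, here is a back-of-the-envelope sketch of how much dirty data a percentage allows on a 32GB machine, and how long a slow drive would need to write it out. It uses total RAM for simplicity (the kernel works from available memory, which is smaller), and the 20 MiB/s drive speed is purely an assumption for illustration.

```shell
#!/usr/bin/env bash
# Upper bound (MiB) of dirty data allowed by a percentage ratio.
# Simplification: uses total RAM, while the kernel uses available memory.
dirty_limit_mib() {
    local total_mib="$1" ratio="$2"
    echo $(( total_mib * ratio / 100 ))
}

# Whole seconds needed to write that much data at an assumed sustained speed (MiB/s).
drain_seconds() {
    local limit_mib="$1" speed_mib_s="$2"
    echo $(( limit_mib / speed_mib_s ))
}

ram_mib=$(( 32 * 1024 ))   # 32GB machine, as in the posts above
for ratio in 20 3; do
    limit="$(dirty_limit_mib "$ram_mib" "$ratio")"
    echo "dirty_ratio=${ratio}: up to ${limit} MiB dirty, ~$(drain_seconds "$limit" 20) s at 20 MiB/s"
done
```

Under these assumptions the default of 20 lets roughly 6.4 GiB pile up in RAM (over five minutes of writing), while a ratio of 3 keeps it under 1 GiB (well under a minute).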

This is one of those things that should be done by the installer, I believe. Given that the nature of these settings is quite complicated for a novice user, and that there are no “one-size-fits-all” values these days due to the wide variety of RAM capacities on the numerous PCs still running, I think installers like Calamares, Anaconda and so on should deal with this while deploying a new OS installation. Similar to how Ubiquity sets up a swap file for all recent Ubuntu installations (btw, could be a good thing for Manjaro too).

It should run at every boot. Some of the values depend on the amount of system memory, and you can change your memory modules at any time. So the values need to be updated at every boot, I think.
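A boot-time script would first have to find out how much memory is installed. Here is a minimal sketch that reads MemTotal from /proc/meminfo and converts a percentage into a byte count (for example to use vm.dirty_bytes instead of vm.dirty_ratio). The function names are invented for this example, the 3% figure is only borrowed from the maxperfwiz output above, and this is not how maxperfwiz itself calculates its values.

```shell
#!/usr/bin/env bash
# Extract MemTotal (in KiB) from /proc/meminfo-formatted text on stdin.
mem_total_kib() {
    awk '/^MemTotal:/ { print $2; exit }'
}

# Convert a percentage of RAM (given in KiB) into a byte count.
ratio_to_bytes() {
    local total_kib="$1" ratio="$2"
    echo $(( total_kib * 1024 * ratio / 100 ))
}

# Only print; actually applying a value would need root and sysctl -w.
if [ -r /proc/meminfo ]; then
    total_kib="$(mem_total_kib < /proc/meminfo)"
    echo "MemTotal: ${total_kib} KiB"
    echo "3% of RAM in bytes: $(ratio_to_bytes "$total_kib" 3)"
fi
```

Because it recomputes from the current MemTotal on every run, swapping memory modules would be picked up automatically at the next boot.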

I think I’ve come up with a real solution to this problem that can be evaluated to come by default in Manjaro:

That’s why maxperfwiz was a proof-of-concept retort to the hard tools previously used/abused.
In this case the script uses some basic arithmetic to arrive at best-guess values.

I installed your script via AUR - do I have to run it explicitly or is it done just by installing so that it works automatically? If it’s working automatically the question would be how to disable it?

That script is supposed to run automatically as soon as you attach a removable disk to one of your USB ports.
But you need two things in order for it to work:

  • Automount for all removable devices should be enabled in options
  • Kernel >= 6.2. It’s not going to work if you use an older kernel

To disable the script temporarily, you could just rename the udev rule file:

sudo mv /etc/udev/rules.d/60-usb-dirty-pages-udev.rules /etc/udev/rules.d/60-usb-dirty-pages-udev.bak && sudo udevadm control --reload

Ok, thanks a lot!

IDK why there is no such file in my case