I find it a huge disappointment that, by default, Manjaro does not ship with a kernel configuration suited to a desktop operating system. Below are my observations, each with a suggested fix.
Any heavy disk IO operation, especially a prolonged one, makes the system stutter, period. That is to be expected. There are, however, a few tweaks that ameliorate the problem and bring it to a “sane” desktop level.
- One might argue that the user can lower the IO priority of their task. This used to work with the cfq scheduler, but cfq has been removed. Neither kyber nor bfq handles priorities effectively, and mq-deadline makes the situation worse.
- The issue is primarily caused by the write-back cache buffer being incorrectly sized, especially on machines with more than 4GB of RAM. Large amounts of data are held in the buffer before being written to the actual disk medium; the cache can grow as large as 9GB, which is ridiculous. The fix is simple: ship Manjaro with the kernel configured as follows. More information can be found at SUSE’s support/kb/doc/?id=000017857
To solve this, reduce the buffer sizes and ratios.
vm.dirty_background_ratio = 2
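To get a feel for why the ratio matters, the threshold can be estimated by hand. The sketch below is only a rough illustration: the kernel applies the percentage to available (free plus reclaimable) memory rather than total RAM, the 32 GiB machine is hypothetical, and 10 is, as far as I know, the usual upstream default for vm.dirty_background_ratio rather than a value measured on Manjaro. The name `dirty_limit_bytes` is made up for this example.

```python
GIB = 1024 ** 3


def dirty_limit_bytes(mem_bytes: int, ratio_percent: int) -> int:
    """Approximate dirty-page threshold for a vm.dirty_*_ratio value.

    Simplification: uses the memory figure passed in directly, whereas the
    kernel bases the percentage on available memory.
    """
    if not 0 <= ratio_percent <= 100:
        raise ValueError("ratio must be a percentage between 0 and 100")
    return mem_bytes * ratio_percent // 100


if __name__ == "__main__":
    ram = 32 * GIB  # hypothetical machine, not a measurement
    for ratio in (10, 2):  # assumed upstream default vs. the value above
        limit = dirty_limit_bytes(ram, ratio)
        print(f"ratio={ratio}%: background writeback starts near {limit / GIB:.2f} GiB")
```

The smaller the ratio, the sooner background writeback kicks in, so less dirty data piles up before the disk has to absorb it.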
- Swap is used too much, and on spinning disks rather than SSDs the impact can make the system unresponsive; this ties in with the first issue. The VFS cache is also reclaimed far too often, leading to thrashing.
Make the kernel less inclined to reclaim VFS cache memory and less inclined to use swap unless absolutely necessary.
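Before changing anything, it helps to see what the running kernel is currently set to. The sketch below assumes the two knobs in question are vm.swappiness and vm.vfs_cache_pressure (my reading of the description above, since no setting names are given), and reads them from /proc/sys/vm. The helper name `read_vm_settings` is made up for this example, and no target values are suggested here.

```python
from pathlib import Path


def read_vm_settings(names, base="/proc/sys/vm"):
    """Return {name: int or None} for the given vm sysctl names.

    A value of None means the file was missing, unreadable, or not an integer.
    """
    settings = {}
    for name in names:
        try:
            settings[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            settings[name] = None
    return settings


if __name__ == "__main__":
    # Assumed knobs: swappiness (swap tendency), vfs_cache_pressure (VFS reclaim).
    current = read_vm_settings(["swappiness", "vfs_cache_pressure"])
    for name, value in current.items():
        print(f"vm.{name} = {value}")
```

Reading is harmless and needs no root; actually changing the values is a separate step done through sysctl.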
- Audio crackles from ALSA/PulseAudio under high disk IO and high CPU load. The sound/speaker buffers run out of data before the callbacks get the chance to refill them. Again, to be expected. However, there are things that can be done to make this more manageable.
- Change the default priority, niceness and IO niceness of the audio driver/process to RT (realtime).
- The default scheduler that lands on SSDs is none, which is not the same as noop. Actually, forget this issue completely. There is a bigger problem with the kernel understanding user activity. When the disk is under high IO, the scheduler needs to change to the kyber scheduling algorithm to suit the IO requests. On a very heavily stressed machine, starvation occurs more often, and several downloads [over 80] (notably HTTP ones) had an increased risk of failure because responses were slow enough to time out. I also don’t think I need to mention that bfq will suffer if high IO is present: windows freeze, the mouse pointer barely moves, etc.
- After thorough personal testing, bfq proves to be the more desktop-oriented scheduler and is suited as the default for both HDD and SSD…
- Provide an automatic means in the kernel to detect high IO and switch from bfq to kyber. When the heuristic notices lower disk activity, switch back to bfq.
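The switching idea in the last bullet can be sketched in user space to show the decision logic. This is not the in-kernel mechanism being asked for, just an illustration. It assumes the active scheduler is the bracketed entry in /sys/block/<device>/queue/scheduler, and that "busy" is expressed as a fraction between 0 and 1 obtained elsewhere. The thresholds 0.8 and 0.3 and all function names are hypothetical; the gap between the two thresholds is there so the scheduler does not flip back and forth when load hovers around a single cutoff.

```python
from pathlib import Path


def parse_active_scheduler(text: str) -> str:
    """Return the bracketed entry from a line like 'mq-deadline kyber [bfq] none'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError("no active scheduler marked with [brackets]")


def read_active_scheduler(device: str, sys_block: str = "/sys/block") -> str:
    """Read the active scheduler for a block device such as 'sda'."""
    path = Path(sys_block) / device / "queue" / "scheduler"
    return parse_active_scheduler(path.read_text())


def pick_scheduler(current: str, busy_fraction: float,
                   high: float = 0.8, low: float = 0.3) -> str:
    """Hysteresis: kyber at or above `high`, bfq at or below `low`, else no change."""
    if busy_fraction >= high:
        return "kyber"
    if busy_fraction <= low:
        return "bfq"
    return current


if __name__ == "__main__":
    sample = "mq-deadline kyber [bfq] none"  # example line, not read from a real disk
    active = parse_active_scheduler(sample)
    for load in (0.2, 0.5, 0.9, 0.5, 0.1):  # made-up load samples
        active = pick_scheduler(active, load)
        print(f"busy={load:.1f} -> {active}")
```

Applying the choice would mean writing the scheduler name back to the same sysfs file as root, which this sketch deliberately leaves out.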