Zombied programs refuse to close with SIGKILL/SIGCHLD

Operating System: Manjaro Linux
KDE Plasma Version: 6.6.4
KDE Frameworks Version: 6.25.0
Qt Version: 6.11.0
Kernel Version: 7.0.3-1-MANJARO (64-bit)
Graphics Platform: Wayland
Processors: 16 × AMD Ryzen 7 5800X3D 8-Core Processor
Memory: 64 GiB of RAM (62.7 GiB usable)
Graphics Processor: NVIDIA GeForce RTX 3090 Ti

As of late I’ve had programs, particularly Wine/Proton core programs, hang and zombie with no way to close them other than restarting. This has caused issues such as parent programs also hanging (e.g. Steam) or unrelated programs failing to launch. This is somewhat easily reproducible, and I suspect it is due to the new kernel and its NTFS3 changes.

I’m just wondering if there’s a flag I should have enabled or some other obvious change I should make to fix this, other than reformatting to a more Linux-centric filesystem (I have my reasons).

Did you try an older kernel, 6.18 LTS for example?

In that case you may benefit from specifying which version of ntfs you use to mount the device.

Another option is to blacklist the kernel ntfs3 driver.
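
Before changing mount options or blacklisting anything, it may help to confirm which driver is actually backing each NTFS volume. The sketch below parses text in the /proc/mounts format and lists entries whose filesystem type looks NTFS-related. The function name and the set of type names are assumptions for illustration: "ntfs3" is the in-kernel driver discussed here, while "fuseblk" is what FUSE-based mounts such as ntfs-3g usually report (other FUSE filesystems can report it too).

```python
import os


def ntfs_mounts(mounts_text):
    """Return (device, mountpoint, fstype) for mounts that look NTFS-related.

    The set of type names below is an assumption for illustration, and
    "fuseblk" may also match FUSE filesystems that are not NTFS.
    """
    candidates = {"ntfs", "ntfs3", "fuseblk"}
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype = fields[:3]
        if fstype in candidates:
            found.append((device, mountpoint, fstype))
    return found


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as f:
        for device, mountpoint, fstype in ntfs_mounts(f.read()):
            print(f"{device} on {mountpoint} uses {fstype}")
```

If a volume shows up as ntfs3 when a different driver was expected, that at least narrows down which code path is involved.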

This is normal. A zombie process does not take up any CPU cycles or memory. It is a child process whose parent has died, and as such, it is reparented to the init process, which in our case here is systemd. As such, there is no other way to get rid of zombies than to (cleanly) restart the system.

If you are certain of this and if it is reproducible on different hardware, then it might be worth bringing this to the attention of the upstream kernel developers — specifically, whoever is responsible for maintaining the ntfs driver.

@pwx
I was previously on 6.18.26-1 and still have it installed; the issues only started after installing and trying 7.0.1.

I can think of some steps that should make it reproducible. I’ll have to test it on my storage box, which might take me a while to set up.

As for my testing steps, I’ll likely be following:

  1. Format a USB flash drive* to NTFS
  2. Install Steam and Proton 9.0-4 on the main system volume, and Elite: Dangerous** on the NTFS3 volume
  3. Run the game with the system monitor open and wait for Proton’s “reaper” program to zombie

*It doesn’t need to be a flash drive; this is just the most convenient storage medium I have on hand.
**Elite: Dangerous isn’t the only program I’ve seen with this issue; it’s just the quickest and easiest one to trigger it with.

Well, then it’s an issue with that particular third-party program, not with the ntfs driver in the kernel. :man_shrugging:

reaper just happens to be the program that hangs for Elite.
Another example: if Helldivers 2 crashes, srt-bwrap hangs in a disk sleep state.

Well, those are all Windows games, and thus third-party software written for an entirely different operating system.

So the issue is with that alien software, and/or with the translation layer — i.e. steam and/or proton — for running those games on an operating system they were never designed for.

I’ll poke around and see if I can find something native that has a similar issue.

While trying to find a native program that hangs, I’ve managed to get cp and rm to hang on two different drives while moving files around for testing. They both get stuck in a disk sleep state.
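
For anyone wanting to catch these hangs without watching a system monitor, a rough sketch follows. It reads /proc/<pid>/stat for every process and reports those in a given state, where "D" is uninterruptible (disk) sleep and "Z" is zombie. The helper names parse_stat and find_by_state are made up for this example; only the /proc/<pid>/stat layout (pid, command name in parentheses, state, parent pid) is relied on.

```python
import os


def parse_stat(stat_text):
    """Parse the first fields of a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or ')', so the line is split around the last ')'.
    """
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    pid = int(stat_text[:open_paren])
    comm = stat_text[open_paren + 1:close_paren]
    rest = stat_text[close_paren + 1:].split()
    return pid, comm, rest[0], int(rest[1])


def find_by_state(wanted="D", proc_root="/proc"):
    """Return (pid, comm, ppid) for every process currently in state `wanted`."""
    hits = []
    if not os.path.isdir(proc_root):
        return hits  # no procfs available on this system
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state, ppid = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # process exited or was unreadable while scanning
        if state == wanted:
            hits.append((pid, comm, ppid))
    return hits


if __name__ == "__main__":
    for pid, comm, ppid in find_by_state("D"):
        print(f"{comm} (pid {pid}, parent {ppid}) is in disk sleep")
```

Processes pass through the D state briefly during normal I/O, so a single hit means little; the same pid showing up across repeated runs is the interesting case.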

Then there might be something wrong with the drive itself, or possibly — if it is an SSD — with the drive’s firmware.

It’s happening across multiple drives with different manufacturers and interface types:

Crucial MX500 on firmware revision M3CR046
Samsung 970 EVO Plus on firmware revision 4B2QEXM7

I was copying from the Crucial to the Samsung and deleting from the Samsung only. The Crucial was also the original drive I was testing on.

Zombie processes aren’t a problem, their parents are. Have you tried pstree -s -p {PID} on the zombie process? The -s flag shows the process’s ancestors, so it should tell on the bad parent so you can discipline them.

There’s a good video about it here: https://www.youtube.com/watch?v=E-vAUhE29rk
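
As a small, hedged illustration of the parent's role, the sketch below starts a short-lived child, deliberately delays waiting on it, and reads the child's state and parent pid from /proc/<pid>/stat. On a typical Linux system the child shows state "Z" until the parent collects its exit status. The function names are invented for this example, and it assumes a mounted /proc and default SIGCHLD handling.

```python
import os
import subprocess
import sys
import time


def read_state_and_ppid(pid):
    """Return (state, ppid) from /proc/<pid>/stat, or None if the process is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            text = f.read()
    except OSError:
        return None
    # Fields after the last ')' start with the state letter and the parent pid.
    fields = text[text.rindex(")") + 1:].split()
    return fields[0], int(fields[1])


def demo(timeout=5.0):
    """Spawn a child that exits at once and observe it before and after reaping."""
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    deadline = time.monotonic() + timeout
    before = read_state_and_ppid(child.pid)
    # Poll until the exited child shows up as a zombie (or we give up).
    while before is not None and before[0] != "Z" and time.monotonic() < deadline:
        time.sleep(0.05)
        before = read_state_and_ppid(child.pid)
    child.wait()  # the parent collects the exit status here
    after = read_state_and_ppid(child.pid)
    return before, after
```

Calling demo() should return something like (("Z", <this process's pid>), None): the parent pid recorded for the zombie points at the process that has not yet waited on it, and the entry disappears once it does.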

Actually, no, what he’s explaining/demonstrating is not the creation of a zombie process. He’s creating a defunct process, but that’s not what makes it into a zombie.

A zombie process no longer has a parent, or that is to say, it gets reparented to init because its natural parent has died, and the only way to get rid of it is to reboot userspace.


Edit: See this Wikipedia article… :backhand_index_pointing_down:

@Aragorn thanks for the re-education! :+1: I stand corrected and a little more knowledgeable.

At present, Steam uses ext4 for Linux games and Proton (which is close to Wine).
Why use ntfs3 for Steam games in this case?

Latest Proton version: 10.0-4

OK, I found my issue and my fix.
First off, KDE Partition Manager needs to learn a little from GParted. KDEPM told me nothing and showed my drives/partitions as perfectly healthy.
GParted gave me the following warning, which sent me down the right path:

Forced to continue.
$MFTMirr does not match $MFT (record 3).
Failed to mount '/dev/sda1': Input/output error
NTFS is inconsistent. Run chkdsk /f on Windows then reboot it TWICE!
The usage of the /f parameter is very IMPORTANT!
No modification was made to NTFS by this software.

I then followed its instructions for both partitions that were having issues. One was fixed outright (the C: of the Windows install), and for the other I had to take the extra step of running sudo ntfsfix -d -b /dev/sda1, because it was still showing the warning after a chkdsk run while it was still mounted.

From a short amount of testing, this has completely cleared the issues. I’m not sure if this was an issue new to the Linux 7 NTFS kernel changes or if it just so happened to coincide with the update.
