Zombie processes are not cleaned up by the container #30

Open
MJDSys opened this issue Oct 29, 2022 · 1 comment

MJDSys commented Oct 29, 2022

I'm using the fah-gpu-amd container to run the folding@home client on my desktop, because the OS it runs is not supported by the ROCm userspace and a consistent environment is much simpler to maintain.

I've noticed that if the folding@home client kills a subprocess for some reason (pausing, or when some bug is detected), the process ends up as a zombie and the folding@home client never cleans it up. As folding@home is PID 1, there is no other reaper, so the process continues to exist and the client gets wedged, unable to respawn the work unit. Note: this happens with both GPU and CPU work units on the same machine.
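
To confirm the symptom from inside the container, the zombies can be listed by scanning /proc for processes in state Z. The following is a minimal sketch, assuming a Linux /proc filesystem and a Python interpreter being available in the container; `list_zombies` is a hypothetical helper name, not part of the folding@home client or the container image.

```python
import os


def list_zombies(proc_root="/proc"):
    """Return a list of (pid, ppid, comm) for every process in state Z."""
    zombies = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat"),
                      errors="replace") as fh:
                stat = fh.read()
        except OSError:
            continue  # the process vanished while we were scanning
        # Layout: "pid (comm) state ppid ...". comm may itself contain
        # spaces or parentheses, so split around the LAST ')'.
        lparen = stat.find("(")
        rparen = stat.rfind(")")
        fields = stat[rparen + 2:].split()
        state, ppid = fields[0], int(fields[1])
        if state == "Z":
            zombies.append((int(entry), ppid, stat[lparen + 1:rparen]))
    return zombies


if __name__ == "__main__":
    for pid, ppid, comm in list_zombies():
        print(f"zombie pid={pid} ppid={ppid} comm={comm}")
```

A ppid of 1 in the output would indicate that the dead process is waiting on the container's PID 1 to collect it.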

Could the folding@home client be updated to reap these processes? Otherwise, could the containers be updated to use a different PID 1 that reaps these dead children? If a new PID 1 is the preferred route, I could take a look at creating an appropriate PR.
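
For illustration, the "different PID 1" option could look roughly like the sketch below: a tiny init that starts the real command as a child, forwards termination signals to it, and waits on every child so that nothing lingers as a zombie. This is only a sketch under assumptions, not the container's actual entrypoint; `reap_children` and `run_as_init` are hypothetical names, and a real PR would more likely use an existing minimal init. Note that PID 1 only inherits processes whose own parent has died, so this covers orphaned subprocesses; a zombie whose parent is still alive must be waited on by that parent.

```python
import os
import signal
import sys


def reap_children():
    """Collect every child that has already exited, without blocking."""
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append((pid, status))
    return reaped


def run_as_init(argv):
    """Run argv as the main child and reap all children until it exits."""
    main_pid = os.fork()
    if main_pid == 0:
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)  # command could not be executed

    # Forward termination requests to the main child.
    def forward(signum, _frame):
        os.kill(main_pid, signum)

    for sig in (signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, forward)

    while True:
        try:
            pid, status = os.wait()  # also collects reparented orphans
        except ChildProcessError:
            return 1  # nothing left to wait for
        if pid == main_pid:
            if os.WIFEXITED(status):
                return os.WEXITSTATUS(status)
            return 128 + os.WTERMSIG(status)  # shell-style signal code


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(run_as_init(sys.argv[1:]))
```

Saved under a hypothetical name such as `mini_init.py`, it would be placed in front of the existing entrypoint command, so that the client no longer runs as PID 1.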

skandix commented May 11, 2023

I guess this can be fixed by adding --pid-file=/var/run/fahclient.pid on the entrypoint.
