
earlyoom will terminate itself #205

Closed
jcaesar opened this issue Jun 11, 2020 · 19 comments

@jcaesar commented Jun 11, 2020

mem avail:    21 of   950 MiB ( 2.27%), swap free:   81 of 1023 MiB ( 7.95%)
low memory! at or below SIGTERM limits: mem 10.00%, swap 15.00%
sending SIGTERM to process 397 uid 61876 "earlyoom": badness 0, VmRSS 1 MiB

Maybe this should be prevented by default? (Apologies if it already is; I'm running earlyoom 1.6-1 from Arch Linux.)

@rfjakob (Owner) commented Jun 12, 2020

Hmm, how does this happen? Is earlyoom really the process using the most memory on this machine?

@jcaesar (Author) commented Jun 12, 2020

Probably. There's not much else running that isn't on the avoid list.

Config:

EARLYOOM_ARGS="-r 60 -s 15,5 -m 10,5 -p -n --avoid '(^|/)(init|systemd.*|Xorg|sshd)$'"
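
For context, my reading of these flags (paraphrased from earlyoom's help text; double-check against earlyoom --help for your version):

-r 60        print a memory report every 60 seconds
-s 15,5      SIGTERM below 15% free swap, SIGKILL below 5%
-m 10,5      SIGTERM below 10% available memory, SIGKILL below 5%
-p           raise earlyoom's own priority (niceness -20, oom_score_adj -100)
-n           enable d-bus notifications
--avoid ...  never pick a process whose name matches the regex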

More log:

Jun 12 06:38:31: mem avail:   607 of   950 MiB (63.93%), swap free: 1009 of 1023 MiB (98.61%)
Jun 12 06:39:16: mem avail:    19 of   950 MiB ( 2.10%), swap free:  153 of 1023 MiB (14.96%)
Jun 12 06:39:16: low memory! at or below SIGTERM limits: mem 10.00%, swap 15.00%
Jun 12 06:39:16: sending SIGTERM to process 696 uid 1001 "ipfs": badness 757, VmRSS 637 MiB
Jun 12 06:39:26: kill failed: Timer expired
Jun 12 06:39:26: mem avail:    24 of   950 MiB ( 2.58%), swap free:   89 of 1023 MiB ( 8.76%)
Jun 12 06:39:26: low memory! at or below SIGTERM limits: mem 10.00%, swap 15.00%
Jun 12 06:39:26: sending SIGTERM to process 696 uid 1001 "ipfs": badness 799, VmRSS 656 MiB
Jun 12 06:39:26: process exited after 0.1 seconds
Jun 12 06:39:26: mem avail:    23 of   950 MiB ( 2.47%), swap free:   87 of 1023 MiB ( 8.59%)
Jun 12 06:39:26: low memory! at or below SIGTERM limits: mem 10.00%, swap 15.00%
Jun 12 06:39:26: sending SIGTERM to process 324 uid 0 "rngd": badness 1, VmRSS 1 MiB
Jun 12 06:39:27: process exited after 0.5 seconds
Jun 12 06:39:27: mem avail:    23 of   950 MiB ( 2.45%), swap free:   83 of 1023 MiB ( 8.13%)
Jun 12 06:39:27: low memory! at or below SIGTERM limits: mem 10.00%, swap 15.00%
Jun 12 06:39:27: sending SIGTERM to process 396 uid 81 "dbus-daemon": badness 0, VmRSS 2 MiB
Jun 12 06:39:27: process exited after 0.2 seconds
Jun 12 06:39:27: mem avail:    21 of   950 MiB ( 2.27%), swap free:   81 of 1023 MiB ( 7.95%)
Jun 12 06:39:27: low memory! at or below SIGTERM limits: mem 10.00%, swap 15.00%
Jun 12 06:39:27: sending SIGTERM to process 397 uid 61876 "earlyoom": badness 0, VmRSS 1 MiB

@hakavlad (Contributor) commented Jun 12, 2020

Could you provide the output with the -d option, please?

@rfjakob (Owner) commented Jun 12, 2020

I'm looking at the log, and something is going wrong earlier: killing ipfs should have freed 656 megabytes, but the amount of available memory and swap did not change!

@hakavlad (Contributor) commented Jun 12, 2020

It seems that the ipfs process became invisible to earlyoom: earlyoom tries to kill other processes even though ipfs apparently has not fully terminated yet.

Maybe ipfs became a zombie and has not yet freed its memory, but earlyoom already considers it dead.
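
For reference, the per-process state that would show Z for a zombie (and that the -d output below prints as process state:) can be read from /proc/<pid>/stat; a minimal sketch of doing that, my own code rather than earlyoom's:

/* Read the one-character state (R, S, D, Z, T, ...) of a process from
   /proc/<pid>/stat. Minimal error handling; for illustration only. */
#include <stdio.h>

char proc_state(int pid)
{
    char path[64];
    char state = '?';
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%d/stat", pid);
    f = fopen(path, "r");
    if (!f)
        return '?';
    /* Layout: pid (comm) state ... -- scan past the parenthesised comm. */
    if (fscanf(f, "%*d (%*[^)]) %c", &state) != 1)
        state = '?';
    fclose(f);
    return state;
}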

@hakavlad (Contributor)

See #128 (comment)

@rfjakob (Owner) commented Jun 13, 2020

I guess we should wait until we see available memory rise

@jcaesar (Author) commented Jun 13, 2020

> Could you provide the output with the -d option, please?

Unlikely that I can reproduce the problem. I've since worked around it by adding more swap and running gc/compaction once; now the usage doesn't go quite as high any more.
And even before that, this wasn't a "happens every time" thing.

@rfjakob (Owner) commented Jun 14, 2020

I tried hard to reproduce this behavior on my PC, in a VM, and also on a Raspberry Pi 4, but I always get the "process exited after x seconds" AFTER mem avail has returned to normal. Example with -d:

sending SIGTERM to process 6428 uid 1001 "membomb": badness 870, VmRSS 1635 MiB
mem avail:    31 of  1848 MiB ( 1.71%), swap free:   49 of  499 MiB (10.00%)
process state: D
mem avail:    31 of  1848 MiB ( 1.71%), swap free:   49 of  499 MiB ( 9.85%)
process state: D
mem avail:    31 of  1848 MiB ( 1.72%), swap free:   47 of  499 MiB ( 9.50%)
process state: D
mem avail:    31 of  1848 MiB ( 1.69%), swap free:   45 of  499 MiB ( 9.20%)
process state: D
mem avail:    31 of  1848 MiB ( 1.71%), swap free:   44 of  499 MiB ( 8.95%)
process state: R
mem avail:    31 of  1848 MiB ( 1.71%), swap free:   43 of  499 MiB ( 8.80%)
process state: D
mem avail:    31 of  1848 MiB ( 1.71%), swap free:   42 of  499 MiB ( 8.50%)
process state: D
mem avail:    31 of  1848 MiB ( 1.69%), swap free:   40 of  499 MiB ( 8.20%)
process state: D
mem avail:    31 of  1848 MiB ( 1.70%), swap free:   39 of  499 MiB ( 8.00%)
process state: D
mem avail:    31 of  1848 MiB ( 1.73%), swap free:   28 of  499 MiB ( 5.65%)
process state: D
mem avail:    28 of  1848 MiB ( 1.57%), swap free:   24 of  499 MiB ( 4.85%)
escalating to SIGKILL after 1.0 seconds
process state: D
mem avail:    32 of  1848 MiB ( 1.78%), swap free:   21 of  499 MiB ( 4.35%)
process state: D
mem avail:    30 of  1848 MiB ( 1.63%), swap free:   12 of  499 MiB ( 2.45%)
process state: D
mem avail:    30 of  1848 MiB ( 1.63%), swap free:   12 of  499 MiB ( 2.45%)
process state: D
mem avail:    30 of  1848 MiB ( 1.63%), swap free:   12 of  499 MiB ( 2.45%)
process state: D
mem avail:    30 of  1848 MiB ( 1.63%), swap free:   12 of  499 MiB ( 2.45%)
process state: D
mem avail:    32 of  1848 MiB ( 1.77%), swap free:   12 of  499 MiB ( 2.45%)
process state: D
mem avail:    32 of  1848 MiB ( 1.77%), swap free:   12 of  499 MiB ( 2.45%)
process state: D
mem avail:    32 of  1848 MiB ( 1.77%), swap free:   18 of  499 MiB ( 3.75%)
process state: R
mem avail:   208 of  1848 MiB (11.28%), swap free:  403 of  499 MiB (80.75%)
process state: R
mem avail:   984 of  1848 MiB (53.29%), swap free:  403 of  499 MiB (80.75%)
process state: R
mem avail:  1643 of  1848 MiB (88.94%), swap free:  403 of  499 MiB (80.80%)
process exited after 2.1 seconds

@jcaesar (Author) commented Jun 15, 2020

Do you want to give me the source for membomb? I can try to reproduce this on something more like the original environment (an EC2 t3.micro running ami-0fcf6df96bd1453cc, with a swap file on a 50 GB gp2 EBS volume).

@jcaesar (Author) commented Jun 15, 2020

I reported this because I thought it would probably be trivial to add a check against getpid(). It seems you already have one, but it didn't help in this situation. Hmm.

@rfjakob (Owner) commented Jun 15, 2020

membomb is in contrib, but the easiest way is just running

tail /dev/zero
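
For anyone curious, a membomb-style hog is essentially just a loop that keeps allocating and dirtying memory; a minimal sketch (not the actual contrib/membomb source, which may differ):

/* Minimal membomb-style memory hog (a sketch; the real contrib/membomb
   may differ): allocate and dirty memory in 10 MiB steps forever. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const size_t step = 10 * 1024 * 1024;
    size_t total_mib = 0;
    for (;;) {
        char *p = malloc(step);
        if (p == NULL) {
            perror("malloc");
            sleep(1);
            continue;
        }
        memset(p, 'x', step);   /* touch the pages so they really get allocated */
        total_mib += 10;
        printf("allocated %zu MiB\n", total_mib);
        usleep(100 * 1000);     /* slow enough to follow in the logs */
    }
}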

Would be nice if you can reproduce this, and then maybe I'll have to learn how to use AWS :)

The getpid check we have is only for a specific case (hidepid), so it did not trigger here.

My reasoning for not preventing earlyoom from terminating itself is: what if there's a memory leak in earlyoom?

@hakavlad (Contributor) commented Jun 15, 2020

> what if there's a memory leak in earlyoom?

Set in earlyoom.service, for example:

TasksMax=10
MemoryMax=50M
MemorySwapMax=0

#207
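
For reference, these are systemd resource-control directives (MemoryMax/MemorySwapMax need cgroup v2). One way to apply them, assuming the packaged earlyoom.service, is a drop-in created with sudo systemctl edit earlyoom:

[Service]
TasksMax=10
MemoryMax=50M
MemorySwapMax=0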

@jcaesar (Author) commented Jun 17, 2020

It doesn't seem to be reproducible with either membomb or tail /dev/zero. I even tried blocking the membomb process in its SIGTERM handler, by raise(SIGSTOP) or by opening a file on an sshfs that had been frozen with pkill -STOP sftp-server. In that case earlyoom continuously tries to terminate/kill the wrong process, but since it never succeeds, it doesn't switch to killing itself. I'm not sure what exactly go-ipfs does that keeps memory in use after the process has exited. :/
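
For the record, the SIGTERM-handler trick looked roughly like this (my own reconstruction, not the exact test program used):

/* Victim that freezes itself on SIGTERM instead of exiting, so the
   signal appears to have no effect. */
#include <signal.h>
#include <unistd.h>

static void on_sigterm(int sig)
{
    (void)sig;
    raise(SIGSTOP);        /* stop instead of terminating; raise() is async-signal-safe */
}

int main(void)
{
    signal(SIGTERM, on_sigterm);
    /* ...then allocate and dirty memory in a loop, as in the membomb
       sketch above, until earlyoom starts sending signals... */
    for (;;)
        pause();           /* placeholder for the allocation loop */
}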

@jcaesar (Author) commented Aug 19, 2020

Is it possible that this happened because parse_meminfo ran and earlyoom detected OOM, but the large process terminated on its own at exactly that moment, leaving kill_largest_process with no fodder? (I'm not sure how this could possibly happen multiple times on the same machine…)

@rfjakob (Owner) commented Aug 30, 2020

Hm, yes, sounds plausible.

@hakavlad (Contributor) commented Sep 1, 2020

@rfjakob Do you have any ideas how to fix it?

@jcaesar (Author) commented Sep 2, 2020

One could check whether the memory situation is still dire after picking a process.

But still, what earlyoom does is simply not possible to do entirely race-free, right? (With the larger PID space on newer kernels, it's certainly less troubling, though.)
Btw, is the Linux kernel OOM killer race-free? Could this be done race-free if earlyoom were a kernel module? (Just curious; earlyoom actually being a kernel module is probably a bad idea.)

@rfjakob (Owner) commented Sep 2, 2020

> One could check whether the memory situation is still dire after picking a process.

Yes, that's what I was thinking about as well. If we look at the stats, kill_largest_process takes about 400x longer than parse_meminfo (which checks how free memory is doing). That 400x factor is roughly the number of processes you have running, so it will be higher with lots of processes, because earlyoom has to look at each of them.

$ make bench
go test -run=NONE -bench=.
goos: linux
goarch: amd64
pkg: github.com/rfjakob/earlyoom
Benchmark_parse_meminfo-4          	  199910	      6511 ns/op
Benchmark_kill_largest_process-4   	     534	   2468413 ns/op
Benchmark_get_oom_score-4          	  195270	      5567 ns/op
Benchmark_get_oom_score_adj-4      	  228506	      5071 ns/op
Benchmark_get_vm_rss_kib-4         	  211107	      5487 ns/op
Benchmark_get_comm-4               	  162548	      6702 ns/op
PASS
ok  	github.com/rfjakob/earlyoom	7.660s

I think the kernel OOM killer is race-free because it freezes all processes (or freezes all allocations?).
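
That is essentially what the commits below do: split victim selection from killing and re-check memory in between. A simplified sketch of the idea (the function names mirror the real ones, but the types, the is_low_memory helper and the thresholds are placeholders, not earlyoom's actual code):

#include <signal.h>

/* Stand-ins for the real functions (see "make bench" above). */
typedef struct { double mem_avail_percent; double swap_free_percent; } meminfo_t;
typedef struct { int pid; } procinfo_t;
meminfo_t parse_meminfo(void);           /* cheap: ~6 us, reads /proc/meminfo */
procinfo_t find_largest_process(void);   /* slow: ~2.5 ms, walks all of /proc */
void kill_process(procinfo_t victim, int sig);

/* SIGTERM limits from the reporter's config: -m 10 -s 15. */
static int is_low_memory(meminfo_t m)
{
    return m.mem_avail_percent <= 10.0 && m.swap_free_percent <= 15.0;
}

void poll_once(void)
{
    meminfo_t m = parse_meminfo();
    if (!is_low_memory(m))
        return;
    procinfo_t victim = find_largest_process();
    /* The search is ~400x slower than the meminfo check, so the hog may
       have exited on its own in the meantime -- check again before killing. */
    m = parse_meminfo();
    if (is_low_memory(m))
        kill_process(victim, SIGTERM);
    /* else: memory recovered while we were searching; kill nothing. */
}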

rfjakob added a commit that referenced this issue Sep 5, 2020
Split kill_largest_process into find_largest_process + kill_process.

#205
rfjakob closed this as completed in 2d0b8d2 on Sep 5, 2020
rfjakob added a commit that referenced this issue Sep 5, 2020
The run time of find_largest_process is proportional to the number
of processes, and takes 2.5ms on my box with a running Gnome desktop (try "make bench").
This is long enough that the situation may have changed in the meantime,
so we double-check if we still need to kill anything.
The run time of parse_meminfo is only 6us on my box and independent of the number
of processes (try "make bench").

Fixes #205