test: memory leak in test_cluster_rpunit #3338
While testing, I ran the test. There wasn't a lot of load on my laptop while I was running it, but there was some.
Suggestion: re-run with lots of load and run overnight.
Check John's stat script; this looks like it hasn't happened in 1/1000 runs.
179 Unstable test: ctest.test_cluster_rpunit (1/1306 runs failed). Unfortunately this one still looks like it is around.
Suggestion: enhance the addr2lines logic in cmake_test.py to also resolve additional stack traces for ASAN, then wait for a reproduction.
The most recent failure (https://buildkite.com/redpanda/redpanda/builds/6926#24fa4c6e-d33e-4262-8aab-9592908d28ae) looks like leaks of seastar internals, very similar to what was seen here: #3626 in a ducktape test. It seems like there is a very rare case where seastar shutdown triggers LeakSanitizer, even outside of the unit test fixtures.
No reports of this in several months. ACTION ITEM: ASAN in ducktape seems to be broken, and the latest examples of this leak don't have backtraces decoded. Gotta fix that.

Read on for interesting stuff related to other LeakSan findings. I've discovered in the past few days that seastar has a category of memory leak on shutdown related to the interaction between timers and sleeps. It's fairly easy to recreate: use a future timeout in combination with a sleep, and the freeing of the sleep's control structure appears to be skipped (either the futures are dangling or the then block doesn't fire because an exception is delivered). OK, the traces posted in this ticket don't look like the same thing, but the category of problem may be the same: weird stuff happens on shutdown.
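For concreteness, here is a minimal sketch of that recipe, assuming upstream Seastar's `with_timeout()` and `sleep()` utilities; this is my reading of the description above, not the exact reproducer:

```cpp
// Minimal sketch of the timer/sleep shutdown-leak recipe (an assumed shape,
// not a verified reproducer): the timeout fires before the sleep completes,
// a timed_out_error unwinds the chain, and the sleep's control structure may
// be left for the reactor to tear down at exit, where LeakSanitizer can
// flag it.
#include <seastar/core/app-template.hh>
#include <seastar/core/lowres_clock.hh>
#include <seastar/core/sleep.hh>
#include <seastar/core/with_timeout.hh>
#include <chrono>

int main(int argc, char** argv) {
    seastar::app_template app;
    return app.run(argc, argv, [] {
        using namespace std::chrono_literals;
        // race a 100ms timeout against a 10s sleep
        return seastar::with_timeout(
                   seastar::lowres_clock::now() + 100ms,
                   seastar::sleep(10s))
            .handle_exception_type([](seastar::timed_out_error&) {
                // swallow the timeout; the interesting part is what the
                // reactor does (or fails to do) with the sleep on shutdown
            });
    });
}
```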
The latest reports do not show any backtrace decoding, which makes things impossible without having the binaries. And we don't seem to package up the test binaries in buildkite. Currently trying to figure out why backtrace decoding is not working.
Two things helping here:
Next time we hit this, I think we should try to hammer whichever smaller test (from the split-up test suite) reproduces it, to see if we can reproduce it more easily. In addition, we could send an updated backtrace over to seastar-dev, because it does look like it is originating from seastar.
I think I've observed this in a PR build here: https://buildkite.com/redpanda/vtools/builds/3791#0183f014-2b8b-40f7-b8fc-cb4684173ca2. Notably, now that the cluster tests have been broken up, the exact test that was run can be narrowed down:
Seems like the file in question is
A couple years ago a similar leak was reported to seastar in this issue: scylladb/seastar#738. I've added a simple reproducer test in a comment on that upstream seastar issue scylladb/seastar#738 (comment).

What is happening?

In the simplest form of a net server we create a seastar server socket on core 0 and call accept on the socket to handle new connections. In reality we want to be able to spread connection handling across cores, so we can ask seastar to accept connections on different cores based on various policies. However, under the hood there is still a single socket and seastar virtualizes some of the handling. There are advanced uses, but in the case of redpanda all new connections are accepted internally on core 0, and if the core that is supposed to handle a connection is different, then internally seastar puts the connection onto a thread-local list on the target core. Then, when the target core calls accept (or abort_accept) it removes the connection from the list. It's like a rendezvous data structure. It's this thing:
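As a simplified, hand-written model of that structure (Seastar's actual posix-stack types differ in detail, and the names here are illustrative):

```cpp
// Simplified model of the cross-core rendezvous: each core keeps a
// thread-local container of connections that core 0 steered to it before it
// called accept(). Names and types are illustrative, not verbatim Seastar.
#include <string>
#include <unordered_map>

struct connection {
    int fd;            // the accepted file descriptor
    std::string peer;  // peer address (placeholder type)
    // Seastar also stores a connection-tracking handle here, whose
    // destructor is the salient piece -- see the next snippet.
};

// Keyed by listening address. `thread_local` is the crux: the container is
// destroyed when the reactor thread exits, draining whatever entries remain.
thread_local std::unordered_multimap<std::string, connection> conn_q;
```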
The details of what all these things are aren't really important. The important part is that thread-local data is destroyed when a thread exits. In the case of this data structure the salient piece is this:
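The shape of that piece, paraphrased from Seastar's connection-tracking handle (simplified, not the verbatim upstream code):

```cpp
// Paraphrase of the connection-tracking handle's destructor: tearing down a
// conn_q entry schedules a cross-core future to update load-balancer
// bookkeeping. The future's result is discarded, so if this runs from the
// thread-local destructor at reactor exit, nothing ever waits on it.
#include <seastar/core/shared_ptr.hh>
#include <seastar/core/smp.hh>

struct load_balancer {
    void closed_cpu(unsigned) { /* bookkeeping elided */ }
};

class handle {
    unsigned _cpu = 0;
    seastar::lw_shared_ptr<load_balancer> _lb;
public:
    ~handle() {
        if (!_lb) {
            return;
        }
        // discarded future: scheduled near thread exit, never cleaned up
        (void)seastar::smp::submit_to(_cpu, [cpu = _cpu, lb = std::move(_lb)] {
            lb->closed_cpu(cpu);
        });
    }
};
```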
So when a thread exits, anything left in that conn_q data structure will cause a future to be scheduled. What is this thread? It's a reactor thread. You might see where this is starting to get weird (this is also visible from the tls_dtor call in the memleak backtrace in the linked seastar issue). Needless to say, this scheduled continuation being created near the exit of the reactor doesn't get cleaned up / waited on, and results in a memory leak. So how do we keep this conn_q structure clean?

Redpanda solution
The one race condition that appears to exist is as follows:
Then a new conn_q entry could be enqueued by core 0 immediately prior to shutting down its own socket, and after the target core (say core N) shut down its socket. This would lead to conn_q having an orphaned entry. On DEBUG builds futures are scheduled in a shuffled order, so this may be more likely, in addition to races being more common. Solution: if that race is the cause, then the issue can be slightly mitigated by shutting down core 0's socket first and waiting on that before the other cores shut down (see the sketch below).
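A minimal sketch of that ordering, under the assumption that the mitigation is to stop core 0's accepting socket first (the `listener` type and `ordered_shutdown` helper are invented for illustration):

```cpp
// Sketch of the ordering mitigation (assumed, not Redpanda's actual
// shutdown code): abort the accepting socket on core 0 first, since that is
// where all connections are accepted internally. Once core 0 has stopped,
// it can no longer enqueue new conn_q entries onto target cores, so the
// other shards can shut down without racing.
#include <seastar/core/future.hh>
#include <seastar/core/sharded.hh>
#include <seastar/core/smp.hh>
#include <seastar/net/api.hh>

struct listener {
    seastar::server_socket socket;  // hypothetical per-shard listening socket
};

seastar::future<> ordered_shutdown(seastar::sharded<listener>& listeners) {
    return listeners.invoke_on(0, [](listener& l0) {
        l0.socket.abort_accept();  // stop core 0 first, and wait for it
    }).then([&listeners] {
        return listeners.invoke_on_all([](listener& l) {
            if (seastar::this_shard_id() != 0) {
                l.socket.abort_accept();  // now the other shards can stop
            }
        });
    });
}
```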
Finally, there are some new things like
which I don't fully understand. Like, are there scenarios where input and abort sources aren't shut down?

Conclusion

My opinion is that the race is fundamental and needs to be fixed in seastar. For example, the mitigation described above would only work if waiting on core 0 to shut down guaranteed that the scheduled cross-core future which updates the conn_q container had also run, and that cannot be guaranteed. The property that we want is that something like calling abort_accept would prevent future conn_q additions. Unfortunately, it doesn't really seem to be connected to a specific socket, so this might be hard. It may just mean that the TLS destructor or something should disable the scheduling of the background future.
Potential fix: scylladb/seastar#1265
I think we're seeing this in
It has a similar trace pointing at posix-stack.
Yes, this is the same leak, @andrwng. One option is that we can merge my "fix". Seastar doesn't want to merge it because it doesn't really fix the leak; it just makes ASAN happy by freeing the data, if it leaked, before Seastar exits.
A memory leak pops up in our CI from time to time (redpanda-data#3338). It is caused by a very narrow race within Seastar when incoming connections arrive while a socket server is being shut down. I've posted a "fix" for this in upstream Seastar (link below) which is effectively rejected because it doesn't really fix the leak, but merely cleans up on exit to make LSAN happy. AFAICT this leak can only happen when a socket server is being shut down, and in Redpanda, this only happens when Redpanda is shutting down. In this case it is harmless, but annoying because it will occasionally fail our CI. Adding an LSAN suppressions file allows us to instruct LSAN to ignore this particular leak when it occasionally does trigger.

Fixes: redpanda-data#3338
See scylladb/seastar#1265 for more discussion and a link to an issue with a reproducer.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
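For reference, an LSAN suppressions file is plain text with one `leak:<pattern>` rule per line; the pattern below is a guess at the relevant Seastar frame and the file name is illustrative (the actual rule merged into Redpanda may differ):

```
# lsan_suppressions.txt -- hedged example, not necessarily Redpanda's rule.
# Ignore leaks whose stack contains Seastar's posix-stack rendezvous code.
leak:seastar::net::posix_ap_server_socket_impl
```

LSAN picks the file up via its options environment variable, e.g. `LSAN_OPTIONS=suppressions=lsan_suppressions.txt ./test_binary`.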
/backport v22.3.x |
(cherry picked from commit 778704b)
Splitting this specific case out of #2909.
Examples
Three cases of the test_cluster_rpunit leak, all on debug builds.
Last reported case: Nov 10.