-
Notifications
You must be signed in to change notification settings - Fork 577
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
net: avoid this
capture in lambda coroutine
#6769
Conversation
- default the ctor - explicitly delete copy ctor - don't move trivial type (address) - mark move assignment as noexcept Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
There is some non-trivial things happening via the units destructor in conn_quota::put. the dtor might already be implicitly noexcept, but lets make that explicit just in case. Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
It still is not clear that capturing `this` in a lambda coroutine is safe. The accept handler in the net server captures `this`. Additionally, we've been dealing for some time with this crash: ``` TRACE 2022-10-13 08:55:06,465 [shard 0] rpc - conn_quota.cc:118 - put(172.16.16.27) TRACE 2022-10-13 08:55:06,465 [shard 0] rpc - conn_quota.cc:130 - leaving put(172.16.16.27) AddressSanitizer:DEADLYSIGNAL ================================================================= ==16388==ERROR: AddressSanitizer: SEGV on unknown address (pc 0x7fa76bfb218f bp 0x7ffddbd1a140 sp 0x7ffddbd19ea0 T0) ==16388==The signal is caused by a READ memory access. ==16388==Hint: this fault was caused by a dereference of a high value address (see register values below). Disassemble the provided pc to learn which register was used. AddressSanitizer:DEADLYSIGNAL AddressSanitizer: nested bug in the same thread, aborting. ``` The only place where: ``` TRACE 2022-10-13 08:55:06,465 [shard 0] rpc - conn_quota.cc:130 - leaving put(172.16.16.27) ``` This is happening is in and around this lambda. However, nothing else seems obviously problematic which might lead to dereferencing a junk pointer. So this change is not definitely going to fix it as we don't really have proof that the problem is `this` capture being clobbered. Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, although struggling to see how the this
capture would be problematic, because server
is a long lived object & istr the failures happening not during shutdown. Still coroutines lambdas are subtle and quick to anger.
Because the pointer to the long lived object is not a long lived object, technically it goes out of scope at the first I thought that a read somewhere that capturing this in a coroutine lambda on clang was ok, but we should not depend on chance, it may have changed, it may change in the future. |
Ben is right. The lambda transformation in cppinsights shows that So, this might not fix the issue, unfortunately. |
failure was #6997 |
Cover letter
A shot in the dark for #5575
This needs to bake in the tree for a while to be certain because of how rare we see this crash. If we can conclude at some point that the issue is gone, then we might consider backporting / closing the issue.
Also, if we don't find that the issue is resolved, then we might consider looking more closely at the connection_rate_limit feature that is applied in the net connection accept handler.
Backport Required
Release notes