Fix readers cache truncation deadlock #5681
Conversation
Fixed the move constructor and move assignment operator of `readers_cache::range_lock_holder`. Previously, moving the `range_lock_holder` would result in the lock being released. Signed-off-by: Michal Maslanka <michal@redpanda.com>
_locked_offset_ranges.end(),
[reader_base_offset, reader_end_offset](const offset_range& range) {
    return reader_base_offset <= range.second
           && reader_end_offset >= range.first;
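The predicate in this diff is the standard closed-interval overlap test: `[base, end]` intersects `[range.first, range.second]` exactly when neither interval lies entirely before the other. A self-contained sketch (the helper name is illustrative, not the actual redpanda function):

```cpp
#include <cassert>
#include <utility>

// Closed-interval overlap: [base, end] intersects [first, second]
// iff base is not past the range's end AND end is not before the
// range's start. Boundary-touching intervals count as overlapping.
using offset_range = std::pair<long, long>;

inline bool overlaps(long base, long end, const offset_range& range) {
    return base <= range.second && end >= range.first;
}
```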
Story time: when I finished grad school and interviewed at my first big job, this was the last coding problem in a long day of interviews. This part in particular is error-prone. Also, the guy who gave me this problem is now making autonomous robots that kill things (weeds) with lasers.
_locked_offset_ranges.begin(),
_locked_offset_ranges.end(),
[reader_base_offset, reader_end_offset](const offset_range& range) {
    return reader_base_offset <= range.second
Good find. This condition looks correct to me. Previous one didn't make any sense to me. Good call putting it in a helper function.
} // namespace storage | ||
using namespace storage; | ||
|
||
FIXTURE_TEST(test_range_is_correctly_locked, readers_cache_test_fixture) { |
Thanks for adding a unit test.
    });
});

produce.get();
I have some questions about this integration test; nothing that should block this from being merged, IMO. We can discuss separately.
+1 (pending CI): The new range overlap condition makes much more sense than before. Thanks for adding tests. For "continuous improvement", let's remind each other during review that all new APIs / modules should come with a unit test.
auto lock_it = std::find_if(
  _locked_offset_ranges.begin(),
  _locked_offset_ranges.end(),
  [&reader](const offset_range& range) {
      return reader->lease_range_base_offset() > range.second
             || reader->lease_range_end_offset() < range.first;
  });
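The old predicate above matches ranges the reader does NOT overlap, so the `find_if` result answers the opposite question: with a single locked range, an overlapping reader is judged unlocked (and cached, causing the deadlock), while a disjoint reader is judged locked. A sketch contrasting both forms, with illustrative names and a plain value interface instead of the actual reader type:

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

using offset_range = std::pair<long, long>;

// Old (buggy) check: the lambda matches ranges the reader does NOT
// overlap, so "found something" does not mean "reader is locked".
bool intersects_buggy(long base, long end,
                      const std::vector<offset_range>& locked) {
    auto it = std::find_if(
      locked.begin(), locked.end(), [=](const offset_range& r) {
          return base > r.second || end < r.first; // "does not overlap"
      });
    return it != locked.end();
}

// Fixed check: match ranges the reader DOES overlap.
bool intersects_fixed(long base, long end,
                      const std::vector<offset_range>& locked) {
    auto it = std::find_if(
      locked.begin(), locked.end(), [=](const offset_range& r) {
          return base <= r.second && end >= r.first; // "overlaps"
      });
    return it != locked.end();
}
```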
Nice catch! The fix seems reasonable, though I'm curious how else this bug may manifest itself. Is the worst that can happen a minute-long deadlock? Or is there any way this can result in a segfault or segments being deleted incorrectly?
A one-minute deadlock is how it manifests; it cannot lead to any other, more serious issue, as read/write/truncate concurrency control is separate from the readers cache.
LGTM pending the CI failure audit.
Nice catch, that's a great find.
std::erase(_cache->_locked_offset_ranges, _range);
if (_range) {
    std::erase(_cache->_locked_offset_ranges, _range.value());
}
Gosh, nice catch!
auto holder = cache.evict_prefix_truncate(model::offset(10)).get();

// [0,1] is in [0,10]
test_within_range(0, 1, true);
nice
Force-pushed from 9a9f31a to 1c31f91
Cover letter
Fixed the condition checking whether a range is locked. The incorrect check resulted in situations in which log truncation was blocked by pending readers.

Log truncation is a multi-step process that ends with grabbing the necessary segment write locks and deleting data. In order for truncation to grab the write lock, all readers which own read lock units from the range being truncated must be evicted from `readers_cache`. Since log truncation contains multiple scheduling points, it may interleave with another fiber creating a reader for the log that is currently being truncated. This reader MUST NOT be cached, as truncation would need to wait for it to be evicted. Additionally, no new readers can be created, as the truncation-related write lock request is waiting in the waiters queue of the semaphore underlying the `read_write_lock`.

In order to prevent readers requesting a truncated range from being cached, the readers cache maintains a list of locked ranges, i.e. ranges for which readers cannot be cached.

Previously, an incorrect condition checking whether a reader belongs to a locked range allowed it to be cached, preventing the `truncate` action from continuing. This stopped all other writes and truncations for 60 seconds; after this duration the reader was evicted from the cache, its lease was released, and truncation was able to finish.

Fixed the incorrect condition checking whether a reader is within the locked range.

Fixes: #5510
Release notes