Test BadLogLines failures with uncaught raft::offset_monitor::wait_aborted (KgoVerifierWithSiTestLargeSegments.test_si_with_timeboxed, PartitionBalancerTest.test_fuzz_admin_ops) #5154
This uncaught exception is still in the code. I'm currently seeing it around the same time as I start a bunch of clients doing idempotent writes. I have a mixture of caught wait_aborted exceptions coming from the id_allocator machinery, and then some uncaught ones making it up to the RPC handler that's logging these as ERROR:
FAIL test: PartitionBalancerTest.test_fuzz_admin_ops (2/37 runs)
Add a "raft::offset_monitor::wait_aborted" message to allow list redpanda-data#5154 is fixed
Relevant discussion about the |
Aborts should be propagated as the standard ss::abort_requested_exception type, which handlers understand and ignore silently because it occurs during normal shutdown. Timeouts remain a specific exception type in offset_monitor. In locations that used to catch and swallow both aborts and timeouts, timeouts are logged at WARN severity: they are not necessarily indicative of a fault, but may indicate a system not operating at its best. Fixes: redpanda-data#5154
Add a "raft::offset_monitor::wait_aborted" message to allow list redpanda-data#5154 is fixed (cherry picked from commit db0ded6)
Aborts should be propagated as the standard ss::abort_requested_exception type, which handlers understand and ignore silently because it occurs during normal shutdown. Timeouts remain a specific exception type in offset_monitor. In locations that used to catch and swallow both aborts and timeouts, timeouts are logged at WARN severity: they are not necessarily indicative of a fault, but may indicate a system not operating at its best. Fixes: redpanda-data#5154 (cherry picked from commit 927ea66)
This is similar to #4489; both cases have the "offset monitor wait aborted" exception. Reproduced in CDT here.