Failure in test_cluster_rpunit - abandoned failed future #4807

Closed
r-vasquez opened this issue May 18, 2022 · 10 comments · Fixed by #4876 or #4895

r-vasquez commented May 18, 2022

Build: https://buildkite.com/redpanda/redpanda/builds/10292#da32a2bb-c4ab-40b6-8c91-27b60adf6f11

Error:

DEBUG 2022-05-18 20:52:12,752 [shard 0] cluster - min by % 1610612736, min bytes 1073741824, disk.free 1073740800 -> alert true
--
  | ERROR 2022-05-18 20:52:12,752 [shard 0] cluster - storage space alert: free space at 3.333% on cluster_test.zB5y8O.2: 30.000GiB total, 1023.999MiB free, min. free 1024.000MiB. Please adjust retention policies as needed to avoid running out of space.
  | INFO  2022-05-18 20:52:12,752 [shard 0] cluster - Updated local state.
  | INFO  2022-05-18 20:52:12,752 [shard 0] cluster - ~local_monitor_fixture: destroy
  |  
  | *** No errors detected
  | *** 1 abandoned failed future(s) detected
  | Failing the test because fail was requested by --fail-on-abandoned-failed-futures
  | Test Exit code 3

<...>
--
  | Total Test time (real) = 570.60 sec
  |  
  | The following tests FAILED:
  | 39 - test_cluster_rpunit (Failed)
  | Errors while running CTest
r-vasquez added the kind/bug and ci-failure labels May 18, 2022
@r-vasquez
Contributor Author

The build log doesn't contain the line

==1624==ERROR: LeakSanitizer: detected memory leaks

that appears in the failures in #3338.


dotnwat commented May 22, 2022

This looks like it originates from the metrics reporter. The background work that runs under spawn_with_gate does a lot that can throw, but the default spawn helper only ignores a few common exception types.

DEBUG 2022-05-21 20:30:26,260 [shard 5] raft - [group_id:17, {kafka/test-2/4}] consensus.cc:1997 - Replicating group configuration {current: {voters: {}, learners: {{id: {3}, revision: {75}}}}, old:{{voters: {{id: {2}, revision: {18}}}, learners: {}}}, revision: 75, brokers: {{id: 2, kafka_advertised_listeners: {{:{host: 127.0.0.1, port: 9094}}}, rpc_address: {host: 127.0.0.1, port: 11002}, rack: {{i-am-rack}}, properties: {cores 48, mem_available 173, disk_available 93}, membership_state: active}, {id: 3, kafka_advertised_listeners: {{:{host: 127.0.0.1, port: 9095}}}, rpc_address: {host: 127.0.0.1, port: 11003}, rack: {{i-am-rack}}, properties: {cores 48, mem_available 173, disk_available 93}, membership_state: active}}}
WARN  2022-05-21 20:30:26,248 [shard 0] seastar - Entering Exceptional future ignored: std::exception (std::exception), backtrace: 0x4487e24 0x41769fe 0x36e75e4 0x42099ef 0x420ce47 0x420a275 0x41590bc 0x4156da0 0x1d10ec7 0x41b93df /lib64/libpthread.so.0+0x9298 /lib64/libc.so.6+0x1006a2                                                            
   --------                                                                                                                                                                                                                                                                                                                                               
   seastar::continuation<seastar::internal::promise_base_with_type<void>,                                                                                                                                                                                                                                                                                 
   seastar::future<void> seastar::future<void>::handle_exception_type<auto                                                                                                                                                                                                                                                                                
   ssx::spawn_with_gate_then<cluster::metrics_reporter::report_metrics()::$_2>(seastar::gate&,                                                                                                                                                                                                                                                            
   cluster::metrics_reporter::report_metrics()::$_2&&)::'lambda'(seastar::broken_condition_variable                                                                                                                                                                                                                                                       
   const&)>(cluster::metrics_reporter::report_metrics()::$_2&&)::'lambda'(cluster::metrics_reporter::report_metrics()::$_2&&),                                                                                                                                                                                                                            
   seastar::futurize<cluster::metrics_reporter::report_metrics()::$_2>::type                                                                                                                                                                                                                                                                              
   seastar::future<void>::then_wrapped_nrvo<seastar::future<void>,                                                                                                                                                                                                                                                                                        
   seastar::future<void> seastar::future<void>::handle_exception_type<auto                                                                                                                                                                                                                                                                                
   ssx::spawn_with_gate_then<cluster::metrics_reporter::report_metrics()::$_2>(seastar::gate&,                                                                                                                                                                                                                                                            
   cluster::metrics_reporter::report_metrics()::$_2&&)::'lambda'(seastar::broken_condition_variable                                                                                                                                                                                                                                                       
   const&)>(cluster::metrics_reporter::report_metrics()::$_2&&)::'lambda'(cluster::metrics_reporter::report_metrics()::$_2&&)>(seastar::future<void>                                                                                                                                                                                                      
   seastar::future<void>::handle_exception_type<auto                                                                                                                                                                                                                                                                                                      
   ssx::spawn_with_gate_then<cluster::metrics_reporter::report_metrics()::$_2>(seastar::gate&,                                                                                                                                                                                                                                                            
   cluster::metrics_reporter::report_metrics()::$_2&&)::'lambda'(seastar::broken_condition_variable                                                                                                                                                                                                                                                       
   const&)>(cluster::metrics_reporter::report_metrics()::$_2&&)::'lambda'(cluster::metrics_reporter::report_metrics()::$_2&&)&&)::'lambda'(seastar::internal::promise_base_with_type<void>&&,                                                                                                                                                             
   seastar::future<void> seastar::future<void>::handle_exception_type<auto                                                                                                                                                                                                                                                                                
   ssx::spawn_with_gate_then<cluster::metrics_reporter::report_metrics()::$_2>(seastar::gate&,                                                                                                                                                                                                                                                            
   cluster::metrics_reporter::report_metrics()::$_2&&)::'lambda'(seastar::broken_condition_variable                                                                                                                                                                                                                                                       
   const&)>(cluster::metrics_reporter::report_metrics()::$_2&&)::'lambda'(cluster::metrics_reporter::report_metrics()::$_2&&)&,                                                                                                                                                                                                                           
   seastar::future_state<seastar::internal::monostate>&&), void>                                                                                                  
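
To illustrate the point about the spawn helper, here is a minimal standard-library sketch of a helper that swallows only an allow-list of shutdown exceptions. It is not the ssx::spawn_with_gate implementation: the exception types and the function name run_ignoring_shutdown_errors are hypothetical stand-ins, and the real helper works on Seastar futures rather than plain callables. The sketch only shows why a bare std::exception gets past such a filter.

```cpp
#include <exception>
#include <functional>

// Hypothetical stand-ins for shutdown-related exception types that a spawn
// helper might treat as benign. These are NOT the Seastar or ssx types.
struct gate_closed_error : std::exception {};
struct abort_requested_error : std::exception {};
struct broken_condvar_error : std::exception {};

// Runs `task` and swallows only the allow-listed shutdown exceptions.
// Returns a null exception_ptr if the task succeeded or threw an ignored
// type; otherwise returns the exception that escaped the filter.
std::exception_ptr
run_ignoring_shutdown_errors(const std::function<void()>& task) {
    try {
        task();
    } catch (const gate_closed_error&) {
        // expected during shutdown: ignore
    } catch (const abort_requested_error&) {
        // expected during shutdown: ignore
    } catch (const broken_condvar_error&) {
        // expected during shutdown: ignore
    } catch (...) {
        // anything else, including a bare std::exception, is not ignored
        return std::current_exception();
    }
    return nullptr;
}
```

With a filter like this, a task that throws gate_closed_error is silently dropped, while a task that throws std::exception{} still produces a failure that some caller has to consume. If nobody does, the result is the kind of "ignored exceptional future" warning shown in the log above.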


dotnwat commented May 22, 2022

This appears to be a bare std::exception originating from ss::future<> metrics_reporter::propagate_cluster_id() { .. but I haven't been able to get more info.

Two recent changes are related to this: @jcsp's metrics reporter changes and @mmaslankaprv's stop_signal changes in test fixtures.

jcsp added a commit to jcsp/redpanda that referenced this issue May 23, 2022
This can happen during shutdown, for the exception types
that ssx::spawn_with_gate doesn't already handle.  Rather
rare in real life but much more frequent in tests like
test_cluster_rpunit, resulting in an "ignored exceptional
future" error.

Fixes redpanda-data#4807

dotnwat commented May 23, 2022

Backtracking the throw in PR #4888.

this smells like a destructor throwing :(


  • WARN 2022-05-23 16:24:03,024 [shard 0] cluster - metrics_reporter.cc:310 - XXX do_patch: std::exception (std::exception)

This exception is thrown here... digging deeper

    try {
        result = co_await cfe->do_patch(
          config_update_request{.upsert = {{"cluster_id", _cluster_uuid}}},
          model::timeout_clock::now() + 5s);
    } catch (...) {
        vlog(clusterlog.warn, "XXX do_patch 2: {}", std::current_exception());
        throw;
    }
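
For context on the destructor suspicion above, here is a small standalone sketch of how a throwing destructor makes an exception appear at the end of a scope with no visible throw expression. It is only an illustration: throwing_cleanup and scope_exit_throws are hypothetical names, and nothing here is taken from the metrics reporter or config frontend code.

```cpp
#include <exception>

// Hypothetical resource whose destructor reports failure by throwing.
// Destructors are implicitly noexcept since C++11, so noexcept(false) is
// required here; without it the throw would call std::terminate.
struct throwing_cleanup {
    ~throwing_cleanup() noexcept(false) {
        // A bare std::exception carries no message of its own.
        throw std::exception{};
    }
};

// Returns true if leaving the inner scope raised an exception.
bool scope_exit_throws() {
    try {
        throwing_cleanup c;
        // No throw expression in this block: the exception is raised when
        // `c` is destroyed at the closing brace.
    } catch (const std::exception&) {
        return true;
    }
    return false;
}
```

The catch block sees an exception even though nothing inside the try body throws, which is one reason a log-and-rethrow wrapper like the one above can report an exception without pointing at its source.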


dotnwat commented May 23, 2022

Adding try {} blocks deeper down to narrow down the source results in a heisenbug.


twmb commented May 25, 2022
