raft: Failures in test_cluster_rpunit #2175

Closed
dotnwat opened this issue Aug 26, 2021 · 7 comments · Fixed by #2641
Labels: area/raft, ci-failure, kind/bug (Something isn't working)

Comments

@dotnwat
Member

dotnwat commented Aug 26, 2021

https://buildkite.com/vectorized/redpanda/builds/1812#b6cadff8-325a-4965-9b99-4568d0b8d9d0

TRACE 2021-08-26 17:59:05,998 [shard 0] raft - [group_id:0, {redpanda/controller/0}] consensus.cc:1396 - Append entries request: {raft_group:{0}, commit_index:{11}, term:{1}, prev_log_index:{11}, prev_log_term:{1}}
--
  | /vectorized/llvm/bin/../include/c++/v1/iterator:1488:42: runtime error: reference binding to null pointer of type 'seastar::promise<raft::append_entries_reply>'
  | SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /vectorized/llvm/bin/../include/c++/v1/iterator:1488:42 in
  | ../../../src/v/raft/append_entries_buffer.cc:139:24: runtime error: member call on null pointer of type 'seastar::promise<raft::append_entries_reply>'
  | SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../src/v/raft/append_entries_buffer.cc:139:24 in
  | ERROR 2021-08-26 17:59:05,999 [shard 41] rpc - Error dispatching client reads: seastar::broken_promise (broken promise)
  | /vectorized/include/seastar/core/future.hh:1013:10: runtime error: member call on null pointer of type 'seastar::promise<raft::append_entries_reply> *'
  | SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /vectorized/include/seastar/core/future.hh:1013:10 in
  | /vectorized/include/seastar/core/future.hh:923:10: runtime error: member call on null pointer of type 'seastar::internal::promise_base_with_type<raft::append_entries_reply> *'
  | SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /vectorized/include/seastar/core/future.hh:923:10 in
  | /vectorized/include/seastar/core/future.hh:896:19: runtime error: member call on null pointer of type 'seastar::internal::promise_base_with_type<raft::append_entries_reply> *'
  | SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /vectorized/include/seastar/core/future.hh:896:19 in
  | AddressSanitizer:DEADLYSIGNAL
  | =================================================================
  | DEBUG 2021-08-26 17:59:05,999 [shard 41] rpc - could not parse header from client: 127.0.0.1:53609
  | ==1421==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000008 (pc 0x56146ddfdbac bp 0x7f6589ab0420 sp 0x7f6589ab0400 T1)
  | ==1421==The signal is caused by a READ memory access.
  | ==1421==Hint: address points to the zero page.
  | DEBUG 2021-08-26 17:59:05,999 [shard 41] rpc - server.cc:94 - vectorized internal rpc protocol Closing: 127.0.0.1:53609
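
For context on the `seastar::broken_promise` line in the log above: a promise that is destroyed before it is given a value leaves its future in a failed "broken promise" state. The sketch below is only an analogy using `std::promise` from the C++ standard library, not Seastar or Redpanda code, and the names `reply_is_broken` and `pending` are hypothetical.

```cpp
#include <future>
#include <system_error>

// Returns true if dropping an unfulfilled promise surfaces on the
// consumer side as std::future_errc::broken_promise.
bool reply_is_broken() {
    std::future<int> reply;
    {
        std::promise<int> pending;      // hypothetical pending reply
        reply = pending.get_future();
    }   // promise destroyed here without set_value()

    try {
        reply.get();                    // rethrows the stored std::future_error
    } catch (const std::future_error& e) {
        return e.code() == std::future_errc::broken_promise;
    }
    return false;                       // only reached if no error was stored
}
```

Calling `reply_is_broken()` from a `main` function is enough to try it. The standard library reports this case as an exception; the null-pointer reports in the log are a separate symptom that this sketch does not reproduce.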
@jcsp jcsp added area/raft kind/bug Something isn't working labels Aug 27, 2021
@jcsp jcsp changed the title raft: AddressSanitizer: SEGV on unknown address raft: test_cluster_rpunit AddressSanitizer: SEGV on unknown address Sep 23, 2021
@jcsp
Contributor

jcsp commented Sep 28, 2021

@jcsp jcsp changed the title raft: test_cluster_rpunit AddressSanitizer: SEGV on unknown address raft: Failures in test_cluster_rpunit Sep 28, 2021
jcsp added a commit to jcsp/redpanda that referenced this issue Sep 29, 2021
Pending redpanda-data#2175

Signed-off-by: John Spray <jcs@vectorized.io>
@mmaslankaprv
Member

I think we have fixed all of the sanitizer and UAF (use-after-free) issues. The most recent run failed because of an ignored exceptional future:


WARN  2021-09-28 07:09:57,229 [shard 0] seastar - Exceptional future ignored: std::__1::system_error (error system:104, Connection reset by peer), backtrace: 0x2fa1e3b 0x2cef953 0x15987af 0x2d7458b 0x2d76f9f 0x2d74dcf 0x2cd36b3 0x2cd1d8f 0x150045f 0x2d2f3ab /lib64/libpthread.so.0+0x7ff7 /lib64/libc.so.6+0xdb35b

   --------

   seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::future<void>::finally_body<auto seastar::internal::invoke_func_with_gate<coproc::wasm::event_listener::start()::$_0>(seastar::gate&, coproc::wasm::event_listener::start()::$_0&&)::'lambda'(), false>, seastar::futurize<coproc::wasm::event_listener::start()::$_0>::type seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::future<void>::finally_body<auto seastar::internal::invoke_func_with_gate<coproc::wasm::event_listener::start()::$_0>(seastar::gate&, coproc::wasm::event_listener::start()::$_0&&)::'lambda'(), false> >(seastar::future<void>::finally_body<auto seastar::internal::invoke_func_with_gate<coproc::wasm::event_listener::start()::$_0>(seastar::gate&, coproc::wasm::event_listener::start()::$_0&&)::'lambda'(), false>&&)::'lambda'(seastar::internal::promise_base_with_type<void>&&, seastar::future<void>::finally_body<auto seastar::internal::invoke_func_with_gate<coproc::wasm::event_listener::start()::$_0>(seastar::gate&, coproc::wasm::event_listener::start()::$_0&&)::'lambda'(), false>&, seastar::future_state<seastar::internal::monostate>&&), void>
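
The "Exceptional future ignored" warning above is logged when a future that holds an exception is destroyed without anyone having inspected it. A minimal standard-library sketch of that idea follows. It is an illustration under stated assumptions, not Seastar's implementation, and `failed_result`, `ignored_count`, and the two helper functions are hypothetical names.

```cpp
#include <exception>
#include <iostream>
#include <stdexcept>
#include <string>
#include <utility>

// Counts failures that were dropped without being handled (illustrative only).
int ignored_count = 0;

// Hypothetical stand-in for a future that completed with an exception.
class failed_result {
public:
    explicit failed_result(std::exception_ptr ex) : _ex(std::move(ex)) {}
    failed_result(const failed_result&) = delete;
    failed_result& operator=(const failed_result&) = delete;

    // Inspecting the failure marks it as handled.
    std::string handle() {
        _handled = true;
        try {
            std::rethrow_exception(_ex);
        } catch (const std::exception& e) {
            return e.what();
        } catch (...) {
            return "unknown error";
        }
    }

    // Warn if the failure was never looked at.
    ~failed_result() {
        if (!_handled) {
            ++ignored_count;
            std::cerr << "Exceptional result ignored\n";
        }
    }

private:
    std::exception_ptr _ex;
    bool _handled = false;
};

// Returns how many warnings were raised when the failure is simply dropped.
int drop_without_handling() {
    int before = ignored_count;
    {
        failed_result r(std::make_exception_ptr(
            std::runtime_error("Connection reset by peer")));
    }
    return ignored_count - before;
}

// Returns how many warnings were raised when the failure is handled first.
int drop_after_handling() {
    int before = ignored_count;
    {
        failed_result r(std::make_exception_ptr(
            std::runtime_error("Connection reset by peer")));
        r.handle();
    }
    return ignored_count - before;
}
```

In this sketch the warning goes away once the failure is inspected before the object is destroyed, which is the general shape of a fix for this kind of warning.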

jcsp added a commit to jcsp/redpanda that referenced this issue Sep 30, 2021
Pending redpanda-data#2175

Signed-off-by: John Spray <jcs@vectorized.io>
@jcsp
Contributor

jcsp commented Sep 30, 2021

Yes, I looked at two recent cases and both were "Exceptional future ignored" warnings inside create_single_topic_test_at_current_broker.

@jcsp
Contributor

jcsp commented Sep 30, 2021

Looks like there is at least one other issue here. On a branch where I had autocreate_tests.cc commented out of the test binary build, I got a sanitizer failure in test_creating_partitions: https://buildkite.com/vectorized/redpanda/builds/2763#99b0172f-bff0-410f-8d89-119866215569

jcsp added a commit to jcsp/redpanda that referenced this issue Sep 30, 2021
Pending redpanda-data#2175

Signed-off-by: John Spray <jcs@vectorized.io>
jcsp added a commit to jcsp/redpanda that referenced this issue Oct 1, 2021
Pending redpanda-data#2175

Signed-off-by: John Spray <jcs@vectorized.io>
@jcsp
Contributor

jcsp commented Oct 1, 2021

Another variant here: the sanitizer reports memory leaks.
https://buildkite.com/vectorized/redpanda/builds/2825#21d82e06-9846-41a9-9584-0c2c86e9f525

@mmaslankaprv
Member

I will close this for now, since it has been some time without any failures in test_cluster_rpunit.

@jcsp
Contributor

jcsp commented Oct 14, 2021

This one had a couple of source files commented out of the CMakeLists for cluster/tests.

Those lines are un-commented on test-staging, and test_cluster_rpunit hasn't failed there in the last week, so I think we're good to reinstate them. @mmaslankaprv could you open a PR for that, please?

This issue was closed.
3 participants