pandaproxy: Fix a crash when publishing multiple protobuf schemas #3596

Merged

Conversation

BenPope
Member

@BenPope BenPope commented Jan 24, 2022

Cover letter

Fix #3559

This is a logical continuation of work done in #3214 and addresses a similar bug.

Address some C++ Core Guidelines (CP.51 and CP.53).

Fix the bug:

  • Made a copy of a schema to prevent a crash.

Notes for reviewer

CP.51 was fixed by copying captures to the coroutine frame.

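It's probably worth pointing out that this trick works because Seastar's coroutine promise type declares std::suspend_never initial_suspend() noexcept, so the coroutine body starts running immediately, while the lambda and its captures are still alive.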

And for posterity: But what if I told you there was an easier way, where you can have your capturing lambda be a coroutine?

Changes in force-push

  • Add a smattering of std::moves to the sub captures.

Release notes

Improvements

@BenPope BenPope added kind/bug Something isn't working area/schema-registry Schema Registry service within Redpanda need-backport labels Jan 24, 2022
@BenPope BenPope requested a review from ivotron as a code owner January 24, 2022 19:27
@BenPope BenPope self-assigned this Jan 24, 2022
src/v/pandaproxy/schema_registry/sharded_store.cc (outdated)
Comment on lines -105 to +107
-        co_return co_await make_protobuf_schema_definition(
-          *this, std::move(schema));
+        co_return co_await make_protobuf_schema_definition(*this, schema);
Member

What's the undefined behavior? This looks OK. Unless I'm missing something, doesn't the fact that this change fixes the issue suggest that a different issue is being masked?

Member Author

I made all the CP.51 and CP.53 changes to rule those out before assuming this is a miscompile. I've probably missed some, but this change appears to reliably fix the crash, for now. It is very similar to the issue discussed here: #3214 (comment)

Member Author

@BenPope BenPope Jan 24, 2022

I can reliably create the crash:

  • With the test cases.
  • Manually running a curl 10 times.
  • In the Materialize test suite.

This one change fixes them all.

Member Author

Typically what happens is that the schema type() appears to be lost. The variant inside valid_schema_def then assumes its first type, avro, so a valid protobuf schema is effectively reinterpret_cast to a valid avro schema, and it all goes very wrong from there, usually deep in the memory allocator.
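To illustrate why avro acts as the fallback: a default-constructed `std::variant` holds its first alternative. The sketch below uses hypothetical stand-in types (`avro_def`, `protobuf_def`, `schema_variant`), not the real definitions in types.h.

```cpp
#include <string>
#include <variant>

// Hypothetical stand-ins for the avro and protobuf schema definition types.
struct avro_def {
    std::string json;
};
struct protobuf_def {
    std::string proto;
};

// avro is listed first, so it is the default alternative.
using schema_variant = std::variant<avro_def, protobuf_def>;

// Name of the alternative that the variant currently holds.
inline const char* held_type(const schema_variant& v) {
    return std::holds_alternative<avro_def>(v) ? "avro" : "protobuf";
}
```

A `schema_variant{}` reports "avro" here, so code that loses track of the real type and falls back to the default would run the avro destructor on protobuf data.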

It's a Heisenbug though, it only occurs in release mode, and adding instrumentation often fixes it. It does reproduce in the debugger, but lots of stuff has been inlined, so it's hard to figure out what went wrong.

@BenPope BenPope force-pushed the pandaproxy-coroutine-copy-capture branch from 6633940 to 913f19d Compare January 24, 2022 23:08
@BenPope BenPope requested a review from dotnwat January 24, 2022 23:45
@dotnwat
Member

dotnwat commented Jan 26, 2022

> With the test cases.

@BenPope which test will trigger the bug?

@BenPope
Member Author

BenPope commented Jan 26, 2022

> > With the test cases.
>
> @BenPope which test will trigger the bug?

Yeah, pretty reliably.

I usually just run a couple of iterations of this:

curl -X POST -H "Content-Type: application/json" --data '{"schema":"syntax = \"proto3\"; ","schemaType":"PROTOBUF"}' http://localhost:8081/subjects/empty.proto/versions
curl -X POST -H "Content-Type: application/json" --data '{"schema":"syntax = \"proto3\"; package google.protobuf; message Timestamp { int64 seconds = 1;  int32 nanos = 2; }","schemaType":"PROTOBUF"}' "http://localhost:8081/subjects/google%2Fprotobuf%2Ftimestamp.proto/versions"
curl -X POST -H "Content-Type: application/json" --data '{"schema":"syntax = \"proto3\"; import \"google/protobuf/timestamp.proto\"; message Importee1 { bool b = 1; }; message Importee2 {google.protobuf.Timestamp ts = 3;}", "references": [{"name": "google/protobuf/timestamp.proto", "subject": "google/protobuf/timestamp.proto", "version": 1}],"schemaType":"PROTOBUF"}' "http://localhost:8081/subjects/importee.proto/versions"
curl -X POST -H "Content-Type: application/json" --data '{"schema":"syntax = \"proto3\"; import \"empty.proto\"; import \"importee.proto\"; message Importer { Importee1 importee1 = 1; Importee2 importee2 = 2;}", "references": [{"name": "empty.proto", "subject": "empty.proto", "version": 1},{"name": "importee.proto", "subject": "importee.proto", "version": 1}],"schemaType":"PROTOBUF"}' "http://localhost:8081/subjects/importer.proto/versions"

@dotnwat
Member

dotnwat commented Jan 26, 2022

@BenPope

I built this PR in release mode and applied this patch

nwatkins@gordon:~/src/redpanda/vbuild/release/clang$ git diff
diff --git a/src/v/pandaproxy/schema_registry/sharded_store.cc b/src/v/pandaproxy/schema_registry/sharded_store.cc
index ccb6df891..b1e85126e 100644
--- a/src/v/pandaproxy/schema_registry/sharded_store.cc
+++ b/src/v/pandaproxy/schema_registry/sharded_store.cc
@@ -102,7 +102,7 @@ sharded_store::make_valid_schema(canonical_schema schema) {
     case schema_type::avro:
         co_return make_avro_schema_definition(schema.def().raw()()).value();
     case schema_type::protobuf:
-        co_return co_await make_protobuf_schema_definition(*this, schema);
+        co_return co_await make_protobuf_schema_definition(*this, std::move(schema));
     case schema_type::json:
         throw as_exception(invalid_schema_type(schema.type()));
     }

then I ran this a bunch of times but things seemed to be ok

curl -X POST -H "Content-Type: application/json" --data '{"schema":"syntax = \"proto3\"; ","schemaType":"PROTOBUF"}' http://localhost:8081/subjects/empty.proto/versions
curl -X POST -H "Content-Type: application/json" --data '{"schema":"syntax = \"proto3\"; package google.protobuf; message Timestamp { int64 seconds = 1;  int32 nanos = 2; }","schemaType":"PROTOBUF"}' "http://localhost:8081/subjects/google%2Fprotobuf%2Ftimestamp.proto/versions"
curl -X POST -H "Content-Type: application/json" --data '{"schema":"syntax = \"proto3\"; import \"google/protobuf/timestamp.proto\"; message Importee1 { bool b = 1; }; message Importee2 {google.protobuf.Timestamp ts = 3;}", "references": [{"name": "google/protobuf/timestamp.proto", "subject": "google/protobuf/timestamp.proto", "version": 1}],"schemaType":"PROTOBUF"}' "http://localhost:8081/subjects/importee.proto/versions"
curl -X POST -H "Content-Type: application/json" --data '{"schema":"syntax = \"proto3\"; import \"empty.proto\"; import \"importee.proto\"; message Importer { Importee1 importee1 = 1; Importee2 importee2 = 2;}", "references": [{"name": "empty.proto", "subject": "empty.proto", "version": 1},{"name": "importee.proto", "subject": "importee.proto", "version": 1}],"schemaType":"PROTOBUF"}' "http://localhost:8081/subjects/importer.proto/versions"

is there more to get it to reproduce?

@BenPope
Member Author

BenPope commented Jan 26, 2022

> is there more to get it to reproduce?

No, that was pretty reliable for me.

@BenPope
Member Author

BenPope commented Jan 26, 2022

Backtrace:

void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/release/clang/v_deps_build/seastar-prefix/src/seastar/include/seastar/util/backtrace.hh:59
 (inlined by) seastar::backtrace_buffer::append_backtrace() at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/release/clang/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:760
 (inlined by) seastar::print_with_backtrace(seastar::backtrace_buffer&, bool) at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/release/clang/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:790
 (inlined by) seastar::print_with_backtrace(char const*, bool) at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/release/clang/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:802
 (inlined by) seastar::sigsegv_action() at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/release/clang/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:3716
 (inlined by) operator() at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/release/clang/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:3702
 (inlined by) __invoke at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/release/clang/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:3698
/lib/x86_64-linux-gnu/libc.so.6: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=b8037b6260865346802321dd2256b8ad1d857e63, for GNU/Linux 3.2.0, stripped

addr2line: DWARF error: section .debug_info is larger than its filesize! (0x5b43cc vs 0x429a58)
__sigaction at ??:?
pandaproxy::schema_registry::sharded_store::make_valid_schema(pandaproxy::schema_registry::typed_schema<pandaproxy::schema_registry::canonical_schema_defnition_tag>) at ??:?
 (inlined by) ~ValidSchema at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/release/clang/rp_deps_install/include/avro/ValidSchema.hh:40
 (inlined by) ~avro_schema_definition at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/release/clang/../../../src/v/pandaproxy/schema_registry/types.h:114
 (inlined by) ~value_storage_nontrivial at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/release/clang/rp_deps_install/include/boost/outcome/policy/../detail/value_storage.hpp:903
 (inlined by) ~basic_result_storage at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/release/clang/rp_deps_install/include/boost/outcome/detail/basic_result_storage.hpp:113
 (inlined by) pandaproxy::schema_registry::sharded_store::make_valid_schema(pandaproxy::schema_registry::typed_schema<pandaproxy::schema_registry::canonical_schema_defnition_tag>) at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/release/clang/../../../src/v/pandaproxy/schema_registry/sharded_store.cc:109
pandaproxy::schema_registry::sharded_store::is_compatible(detail::base_named_type<int, pandaproxy::schema_registry::schema_version_tag, std::__1::integral_constant<bool, true> >, pandaproxy::schema_registry::typed_schema<pandaproxy::schema_registry::canonical_schema_defnition_tag>) at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/release/clang/../../../src/v/pandaproxy/schema_registry/sharded_store.cc:515
pandaproxy::schema_registry::sharded_store::project_ids(pandaproxy::schema_registry::typed_schema<pandaproxy::schema_registry::canonical_schema_defnition_tag>) at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/release/clang/../../../src/v/pandaproxy/schema_registry/sharded_store.cc:128
operator() at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/release/clang/../../../src/v/pandaproxy/schema_registry/seq_writer.cc:143
seastar::future<boost::outcome_v2::basic_outcome<detail::base_named_type<int, pandaproxy::schema_registry::schema_id_tag, std::__1::integral_constant<bool, true> >, std::__1::error_code, std::exception_ptr, std::__1::conditional<false, boost::outcome_v2::policy::terminate, std::__1::conditional<true, boost::outcome_v2::policy::error_code_throw_as_system_error<detail::base_named_type<int, pandaproxy::schema_registry::schema_id_tag, std::__1::integral_constant<bool, true> >, std::__1::error_code, std::exception_ptr>, std::__1::conditional<true, boost::outcome_v2::policy::exception_ptr_rethrow<detail::base_named_type<int, pandaproxy::schema_registry::schema_id_tag, std::__1::integral_constant<bool, true> >, std::__1::error_code, std::exception_ptr>, boost::outcome_v2::policy::fail_to_compile_observers>::type>::type>::type> > pandaproxy::schema_registry::seq_writer::sequenced_write_inner<pandaproxy::schema_registry::seq_writer::write_subject_version(pandaproxy::schema_registry::typed_schema<pandaproxy::schema_registry::canonical_schema_defnition_tag>)::$_0, detail::base_named_type<int, pandaproxy::schema_registry::schema_id_tag, std::__1::integral_constant<bool, true> > >(pandaproxy::schema_registry::seq_writer::write_subject_version(pandaproxy::schema_registry::typed_schema<pandaproxy::schema_registry::canonical_schema_defnition_tag>)::$_0) at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/release/clang/../../../src/v/pandaproxy/schema_registry/seq_writer.h:115
std::experimental::coroutines_v1::coroutine_handle<void>::resume() at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/llvm/install/bin/../include/c++/v1/experimental/coroutine:121
 (inlined by) seastar::internal::coroutine_traits_base<boost::outcome_v2::basic_outcome<detail::base_named_type<int, pandaproxy::schema_registry::schema_id_tag, std::__1::integral_constant<bool, true> >, std::__1::error_code, std::exception_ptr, boost::outcome_v2::policy::error_code_throw_as_system_error<detail::base_named_type<int, pandaproxy::schema_registry::schema_id_tag, std::__1::integral_constant<bool, true> >, std::__1::error_code, std::exception_ptr> > >::promise_type::run_and_dispose() at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/release/clang/rp_deps_install/include/seastar/core/coroutine.hh:74
seastar::reactor::run_tasks(seastar::reactor::task_queue&) at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/release/clang/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:2378
 (inlined by) seastar::reactor::run_some_tasks() at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/release/clang/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:2787
seastar::reactor::do_run() at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/release/clang/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:2956
seastar::reactor::run() at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/release/clang/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:2839
seastar::app_template::run_deprecated(int, char**, std::__1::function<void ()>&&) at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/release/clang/v_deps_build/seastar-prefix/src/seastar/src/core/app-template.cc:228
seastar::app_template::run(int, char**, std::__1::function<seastar::future<int> ()>&&) at /home/ben/development/src/github.com/BenPope/redpanda2/vbuild/release/clang/v_deps_build/seastar-prefix/src/seastar/src/core/app-template.cc:128

This shows that it's destructing an avro_schema_definition, which is highly unexpected since we're dealing with protobuf. I added a whole bunch of asserts and traced the code; all observations suggest that it's a protobuf type. It's a bit of a Heisenbug, though: it won't reproduce in my debugger or in a debug build, for example.

Worthy of note is that the avro type is the first one in the variant inside valid_schema, so it's a default of sorts, as well as avro being the default type on any typed_schema.

@BenPope
Member Author

BenPope commented Jan 26, 2022

I can't reproduce with clang-13.

Only broken on release build with clang-12.0.1

  1. I can fix it like this:
--- a/src/v/pandaproxy/schema_registry/sharded_store.cc
+++ b/src/v/pandaproxy/schema_registry/sharded_store.cc
@@ -105,9 +105,9 @@ sharded_store::make_valid_schema(canonical_schema schema) {
         co_return co_await make_protobuf_schema_definition(
           *this, std::move(schema));
     case schema_type::json:
-        throw as_exception(invalid_schema_type(schema.type()));
+        break;
     }
-    __builtin_unreachable();
+    throw as_exception(invalid_schema_type(schema.type()));
 }
  2. Or like this:
--- a/src/v/pandaproxy/schema_registry/sharded_store.cc
+++ b/src/v/pandaproxy/schema_registry/sharded_store.cc
@@ -104,10 +104,9 @@ sharded_store::make_valid_schema(canonical_schema schema) {
     case schema_type::protobuf:
         co_return co_await make_protobuf_schema_definition(
           *this, std::move(schema));
-    case schema_type::json:
+    default:
         throw as_exception(invalid_schema_type(schema.type()));
     }
-    __builtin_unreachable();
 }
  3. Or like this:
--- a/src/v/pandaproxy/schema_registry/sharded_store.cc
+++ b/src/v/pandaproxy/schema_registry/sharded_store.cc
@@ -99,8 +99,9 @@ ss::future<> sharded_store::validate_schema(canonical_schema schema) {
ss::future<valid_schema>
sharded_store::make_valid_schema(canonical_schema schema) {
    switch (schema.type()) {
-    case schema_type::avro:
+    case schema_type::avro: {
        co_return make_avro_schema_definition(schema.def().raw()()).value();
+    }
    case schema_type::protobuf:
        co_return co_await make_protobuf_schema_definition(
          *this, std::move(schema));
  4. Or like this:
--- a/src/v/pandaproxy/schema_registry/sharded_store.cc
+++ b/src/v/pandaproxy/schema_registry/sharded_store.cc
@@ -102,8 +102,7 @@ sharded_store::make_valid_schema(canonical_schema schema) {
     case schema_type::avro:
         co_return make_avro_schema_definition(schema.def().raw()()).value();
     case schema_type::protobuf:
-        co_return co_await make_protobuf_schema_definition(
-          *this, std::move(schema));
+        co_return co_await make_protobuf_schema_definition(*this, schema);
     case schema_type::json:
         throw as_exception(invalid_schema_type(schema.type()));
     }
  5. Or like this:
--- a/src/v/pandaproxy/schema_registry/sharded_store.cc
+++ b/src/v/pandaproxy/schema_registry/sharded_store.cc
@@ -96,7 +96,7 @@ ss::future<> sharded_store::validate_schema(canonical_schema schema) {
     __builtin_unreachable();
 }
 
-ss::future<valid_schema>
+__attribute__((optnone)) ss::future<valid_schema>
 sharded_store::make_valid_schema(canonical_schema schema) {
     switch (schema.type()) {
     case schema_type::avro:
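For reference, the shape of the first workaround (break out of the unsupported case and throw after the switch, rather than ending the function with `__builtin_unreachable()`) looks roughly like this outside of a coroutine. This is only a sketch with a hypothetical `describe` function and a plain `std::invalid_argument`; it does not reproduce the miscompile.

```cpp
#include <stdexcept>
#include <string>

enum class schema_type { avro, protobuf, json };

// Every unsupported value leaves the switch and reaches the throw below,
// so no path depends on __builtin_unreachable().
inline std::string describe(schema_type t) {
    switch (t) {
    case schema_type::avro:
        return "avro";
    case schema_type::protobuf:
        return "protobuf";
    case schema_type::json:
        break;
    }
    throw std::invalid_argument("invalid schema type");
}
```

With this layout, a value outside the handled cases results in an exception rather than undefined behavior, which gives the optimizer less room to assume the tail of the function is never reached.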

@BenPope
Member Author

BenPope commented Feb 1, 2022

I've failed to meaningfully reduce this. Not sure what to do next.

The previous commits address Core CPP guideline CP.51 and CP.53,
but there is still a crash when producing schema in quick succession.

There seems to be a miscompilation in sharded_store::make_valid_schema.
See redpanda-data#3596 for details. Copying the schema is a workaround.

Fix redpanda-data#3559

Signed-off-by: Ben Pope <ben@vectorized.io>
@BenPope BenPope force-pushed the pandaproxy-coroutine-copy-capture branch from 913f19d to 5b5b9cf Compare February 2, 2022 19:51
BenPope added a commit to BenPope/redpanda that referenced this pull request Feb 2, 2022
(cherry picked from commit 5b5b9cf)
Member

@dotnwat dotnwat left a comment

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃                          ___                       ___          ┃
┃                         /  /\          ___        /__/\         ┃
┃                        /  /:/_        /  /\      |  |::\        ┃
┃        ___     ___    /  /:/ /\      /  /:/      |  |:|:\       ┃
┃       /__/\   /  /\  /  /:/_/::\    /  /:/     __|__|:|\:\      ┃
┃       \  \:\ /  /:/ /__/:/__\/\:\  /  /::\    /__/::::| \:\     ┃
┃        \  \:\  /:/  \  \:\ /~~/:/ /__/:/\:\   \  \:\~~\__\/     ┃
┃         \  \:\/:/    \  \:\  /:/  \__\/  \:\   \  \:\           ┃
┃          \  \::/      \  \:\/:/        \  \:\   \  \:\          ┃
┃           \__\/        \  \::/          \__\/    \  \:\         ┃
┃                         \__\/                     \__\/         ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

Comment on lines +310 to +312
shard_for(sub),
_smp_opts,
[marker, sub{std::move(sub)}, permanent](store& s) {
Member

use after move?

Member

let's change this in a separate PR
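The concern is presumably that function arguments are evaluated in an unspecified order, so `sub{std::move(sub)}` in the lambda capture may run before `shard_for(sub)` reads `sub`. One way to avoid that, sketched here with hypothetical stand-ins (`shard_for`, `run_on`, and `delete_subject` below are illustrative, not the real sharded_store API), is to compute the shard in its own statement first:

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for shard_for(): maps a subject name to a shard index.
inline std::size_t shard_for(const std::string& sub) {
    return std::hash<std::string>{}(sub) % 4;
}

// Hypothetical stand-in for an invoke-on-shard call: runs `fn` and returns
// the shard together with the function's result.
template <typename Fn>
std::pair<std::size_t, std::string> run_on(std::size_t shard, Fn fn) {
    return {shard, fn()};
}

inline std::pair<std::size_t, std::string> delete_subject(std::string sub) {
    // Compute the shard in its own statement, before `sub` is moved from.
    const std::size_t shard = shard_for(sub);
    return run_on(shard, [sub{std::move(sub)}] { return sub; });
}
```

Because the shard is fully evaluated before the call expression begins, the result no longer depends on the order in which the compiler evaluates the arguments.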

@BenPope BenPope merged commit b099a6e into redpanda-data:dev Feb 3, 2022
BenPope added a commit to BenPope/redpanda that referenced this pull request Feb 3, 2022
Using a belt-and-braces approach to convince clang not
to miscompile this function.

* Introduce an artificial scope for co_await blocks
* Avoid `__builtin_unreachable()`

The backtrace on arm is:
```
void std::__1::__libcpp_operator_delete<void*>(void*) at /vectorized/llvm/bin/../include/c++/v1/new:245
 (inlined by) void std::__1::__do_deallocate_handle_size<>(void*, unsigned long) at /vectorized/llvm/bin/../include/c++/v1/new:269
 (inlined by) std::__1::__libcpp_deallocate(void*, unsigned long, unsigned long) at /vectorized/llvm/bin/../include/c++/v1/new:285
 (inlined by) std::__1::allocator<char>::deallocate(char*, unsigned long) at /vectorized/llvm/bin/../include/c++/v1/memory:874
 (inlined by) std::__1::allocator_traits<std::__1::allocator<char> >::deallocate(std::__1::allocator<char>&, char*, unsigned long) at /vectorized/llvm/bin/../include/c++/v1/__memory/allocator_traits.h:280
 (inlined by) ~basic_string at /vectorized/llvm/bin/../include/c++/v1/string:2237
 (inlined by) ~error_info at /var/lib/buildkite-agent/builds/buildkite-arm64-builders-i-058d4ead6b38020f3-1/vectorized/redpanda/vbuild/release/clang/../../../src/v/pandaproxy/schema_registry/errors.h:25
 (inlined by) ~basic_result_storage at /vectorized/include/boost/outcome/detail/basic_result_storage.hpp:113
 (inlined by) pandaproxy::schema_registry::sharded_store::make_valid_schema(pandaproxy::schema_registry::typed_schema<pandaproxy::schema_registry::canonical_schema_defnition_tag>) at /var/lib/buildkite-agent/builds/buildkite-arm64-builders-i-058d4ead6b38020f3-1/vectorized/redpanda/vbuild/release/clang/../../../src/v/pandaproxy/schema_registry/sharded_store.cc:110
```

Which implies it's destructing the frame for the avro
coroutine, not the protobuf one.

See redpanda-data#3596 for details

Signed-off-by: Ben Pope <ben@vectorized.io>
ajfabbri pushed a commit to ajfabbri/redpanda that referenced this pull request Feb 3, 2022
BenPope added a commit to BenPope/redpanda that referenced this pull request Feb 11, 2022
(cherry picked from commit 231febc)
This pull request was closed.
Labels
area/redpanda area/schema-registry Schema Registry service within Redpanda kind/bug Something isn't working
Development

Successfully merging this pull request may close these issues.

Crash while publishing to the schema registry
2 participants