Crash while publishing to the schema registry #3559

Closed
zimbatm opened this issue Jan 20, 2022 · 4 comments · Fixed by #3596
Labels: area/schema-registry, community, kind/bug

@zimbatm
Contributor

zimbatm commented Jan 20, 2022

Version & Environment

Redpanda version: vectorized/redpanda:v21.11.2 (docker image)

What went wrong?

The service crashed while publishing a new schema to the registry.

What should have happened instead?

Not crash? :)

How to reproduce the issue?

It seems to only happen when writing a lot of schemas to the registry in quick succession. I will update the issue once I get a better repro.

Additional information

Please attach any relevant logs, backtraces, or metric charts.

Backtrace:

redpanda                | redpanda: /vectorized/include/boost/container/vector.hpp:1622: boost::container::vector::reference boost::container::vector<std::unique_ptr<seastar::reactor::task_queue>, boost::container::dtl::static_storage_allocator<std::unique_ptr<seastar::reactor::task_queue>, 16, 0, true>>::operator[](boost::container::vector::size_type) [T = std::unique_ptr<seastar::reactor::task_queue>, A = boost::container::dtl::static_storage_allocator<std::unique_ptr<seastar::reactor::task_queue>, 16, 0, true>, Options = void]: Assertion `this->m_holder.m_size > n' failed.
redpanda                | Aborting on shard 0.
redpanda                | Backtrace:
redpanda                |   0x380b3e8
redpanda                |   0x2a552ba699ff
redpanda                |   /opt/redpanda/lib/libc.so.6+0x3d291
redpanda                |   /opt/redpanda/lib/libc.so.6+0x268a3
redpanda                |   /opt/redpanda/lib/libc.so.6+0x26788
redpanda                |   /opt/redpanda/lib/libc.so.6+0x35a15
redpanda                |   0x37b964b
redpanda                |   0x1c11b15
redpanda                |   0x37ba9f4
redpanda                |   0x37bdeb7
redpanda                |   0x37bb295
redpanda                |   0x370d0b9
redpanda                |   0x370b150
redpanda                |   0x116ee94
redpanda                |   0x3b01d3c
redpanda                |   /opt/redpanda/lib/libc.so.6+0x27b74
redpanda                |   0x116bced
redpanda                | Segmentation fault on shard 0.
redpanda                | Backtrace:
redpanda                |   0x380acb8
redpanda                |   0x2a552ba699ff
redpanda                |   /opt/redpanda/lib/libc.so.6+0x26962
redpanda                |   /opt/redpanda/lib/libc.so.6+0x26788
redpanda                |   /opt/redpanda/lib/libc.so.6+0x35a15
redpanda                |   0x37b964b
redpanda                |   0x1c11b15
redpanda                |   0x37ba9f4
redpanda                |   0x37bdeb7
redpanda                |   0x37bb295
redpanda                |   0x370d0b9
redpanda                |   0x370b150
redpanda                |   0x116ee94
redpanda                |   0x3b01d3c
redpanda                |   /opt/redpanda/lib/libc.so.6+0x27b74
redpanda                |   0x116bced
@zimbatm zimbatm added the kind/bug Something isn't working label Jan 20, 2022
@BenPope
Member

BenPope commented Jan 20, 2022

If you're publishing protobuf schemas, then I'm 95% sure I have a reproducer for this. I'm working on a fix as top priority.

@BenPope BenPope self-assigned this Jan 20, 2022
@BenPope BenPope added the area/schema-registry Schema Registry service within Redpanda label Jan 20, 2022
@flokli
Contributor

flokli commented Jan 20, 2022

Yes, these are protobuf schemas.

@zimbatm
Contributor Author

zimbatm commented Jan 20, 2022

I wasn't able to build a minimal repro so far but can reproduce this locally very easily.

What I was trying to do is dump and restore the schema registry so it can be migrated from cp-schema-registry to redpanda. Here are the two scripts that I am using: https://gist.github.com/b235775c6191abfee075ac4c4fd3d9af

There are currently 39 files generated by the sc-dump.sh script, each with different IDs and at version 1. Then I run the sc-restore.sh script and it fails randomly. If I add a sleep between each curl call the script doesn't fail (so far). Note that since I'm loading the same schema, I don't think it's generating new versions either.
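
For reference, the "sleep between each call" workaround can be sketched roughly as below. This is not the script from the gist. It assumes each dumped file has been loaded into a dict with `subject`, `schema` and optionally `schemaType` keys (a hypothetical layout), and that the registry accepts the Confluent-compatible `POST /subjects/{subject}/versions` request. The helper names, the 0.5 s delay and the `PROTOBUF` default are illustrative choices, not values taken from the report.

```python
import json
import time
import urllib.parse
import urllib.request

CONTENT_TYPE = "application/vnd.schemaregistry.v1+json"


def build_request(base_url, entry):
    """Build the POST that registers one dumped schema under its subject."""
    # Subjects may contain characters such as "/", so percent-encode them.
    subject = urllib.parse.quote(entry["subject"], safe="")
    body = {
        "schema": entry["schema"],
        # Assumption: default to PROTOBUF because the thread is about protobuf.
        "schemaType": entry.get("schemaType", "PROTOBUF"),
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/subjects/{subject}/versions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": CONTENT_TYPE},
        method="POST",
    )


def send(request):
    """Send one request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def restore_all(base_url, entries, delay_s=0.5, send=send, sleep=time.sleep):
    """Register entries one at a time, pausing between consecutive requests."""
    results = []
    for index, entry in enumerate(entries):
        if index > 0:
            sleep(delay_s)  # the pause that avoided the crash in this report
        results.append(send(build_request(base_url, entry)))
    return results
```

Something like `restore_all("http://localhost:8081", entries)` would then post the schemas sequentially (the URL is an assumption about the local setup). Preserving the original schema IDs needs extra steps that this sketch does not cover.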

@BenPope
Member

BenPope commented Jan 21, 2022

@zimbatm As a workaround, you may be able to dump and restore the raw records on the _schemas topic (perhaps with something like MirrorMaker). The "other" binary format should be supported for common message types, but there are some unsupported messages, such as those for the /mode endpoints. I'd be interested to hear if you encounter any issues doing it that way.

@BenPope BenPope added this to the v21.11.4 milestone Jan 22, 2022
BenPope added a commit to BenPope/redpanda that referenced this issue Jan 24, 2022
The previous commits address Core CPP guideline CP.51 and CP.53,
but there is still a crash when producing schema in quick succession.

Copy the schema to prevent UB.

Fix redpanda-data#3559

Signed-off-by: Ben Pope <ben@vectorized.io>
BenPope added a commit to BenPope/redpanda that referenced this issue Feb 2, 2022
The previous commits address Core CPP guideline CP.51 and CP.53,
but there is still a crash when producing schema in quick succession.

There seems to be a miscompilation in sharded_store::make_valid_schema.
See redpanda-data#3596 for details. Copying the schema is a workaround.

Fix redpanda-data#3559

Signed-off-by: Ben Pope <ben@vectorized.io>
BenPope added a commit to BenPope/redpanda that referenced this issue Feb 2, 2022
(cherry picked from commit 5b5b9cf)
ajfabbri pushed a commit to ajfabbri/redpanda that referenced this issue Feb 3, 2022
This issue was closed.