RpkException in PartitionBalancerTest.test_maintenance_mode.kill_same_node=True #6352
The crash:
Of note:
So there was a cross-core operation for partition 26, and the crash occurred on the originating core.
@BenPope: has this been hit again? Are there logs or anything else we want to add here for the next time this is hit? If not, I wonder what options we have as a next step.
I can only realistically search the dev branch. Of all the SEGV failures in the last 35 days, this is the only one I found that's the same:
It also contains no backtrace. Almost all of the other ones are confined to a small window of the first 3 days of October and related to SI. Issue #6391 mentions the same problem, and it has been closed as a duplicate of #5575. But #5575 does appear to be different from this one. I also think coredumps are the only reasonable way forward.
Probably a duplicate of #6973
I have been seeing a similar issue (may be something different) while working on #7060 - it is reliably reproduced on CDT when running the debug build and a newly added test listed in that issue. I have not been able to capture a core dump yet from the node where redpanda has this error but will try to capture it.
I reran the test in CDT after rebasing on dev, hoping that the recent lifetime changes would improve the situation, but I still see this error, e.g.
In the last 30d we have not seen another case of this. I don't think we have any clues to work with here, and there have been fixes for general lifetime issues in coroutine lambdas (via #6973) since, so closing.
saw in: https://buildkite.com/redpanda/redpanda/builds/15168#01832300-bb99-4544-82aa-c885c3f12780
Similar to #6033 but it's a different method & error message.