Failure of test_coproc_delete_topic
unit test
#3384
Comments
I found an older issue #2613 that seems to have described the same failure. |
Another occurrence of the same failure with the same git commit f0a443d: https://buildkite.com/vectorized/redpanda/builds/5857#8893589b-556e-44dd-8a78-18b0330356e4/6-6014
There were also green builds with the same git commit: |
Unfortunately ctest isn't that helpful when it comes to debugging what exact test within the suite of tests failed. It will mention that the failure is in the file
Looks like this is the cause. I actually have a PR open for this now that never managed to go in before the winter break: #3340 |
Closing as was resolved by #3340 |
Just hit a very similar CI failure here. |
Double-checked I have the commit from #3340:
|
Ok leaving this PR open |
This is still one of the more frequent failures (https://buildkite.com/redpanda/redpanda/builds/7908#fc5ec8e2-c5e2-4208-be8e-77220257956e last night). @graphcareful do you have a sense of what is going wrong here? |
I have investigated and have found the culprit is a single test within the I believe the reason the test is failing is due to not accounting for a particular edge case where coproc will attempt to re-create a deleted log. |
https://buildkite.com/redpanda/redpanda/builds/8249#89788d01-31d9-4859-add0-258ccaa221c9/6-6195
|
@gousteris this seems like a new issue, a memory leak with the test |
Filed here #4053 |
@graphcareful did you mean to close this issue? I only see a PR disabling the test. |
I filed #4053 as discussion in this issue pertains to a different, already resolved, fix |
I get that #4053 is for find_coordinator_for_non_replicatable_topic, but this ticket's last activity related to test_coproc_delete_topic. That test is still disabled, so unless there's another ticket elsewhere for test_coproc_delete_topic, this ticket is still live. |
Ok, good points. To avoid confusion I will re-open this and change the title of the issue. |
- These tests all attempt to remove a materialized topic while the coprocessor is still running.
- However, due to the initial design of the system, coproc will attempt to recreate the topic and re-populate it up to the previous high watermark. If this were not performed, there would be an inconsistency between the coprocessor's defined metadata and random commands sent to the cluster by the user.
- The tests have been beneficial in understanding that this type of concurrent delete can occur without any crashes.
- If a user wants to truly delete a materialized topic, he/she must shut down the coprocessor first.
- Fixes: redpanda-data#3384

(cherry picked from commit 964bb41)
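For readers unfamiliar with the behaviour this commit message describes, here is a toy model of it in Python. It is only a sketch of the idea (a running coprocessor re-creating a deleted materialized topic and re-populating it up to its previous high watermark), not Redpanda's implementation. Every name in it (`ToyCluster`, `ToyCoprocessor`, `tick`, the topic names, the upper-casing transform) is hypothetical and chosen purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Hypothetical stand-in for a cluster: topic name -> list of records."""
    topics: dict[str, list[str]] = field(default_factory=dict)

    def delete_topic(self, name: str) -> None:
        # Deleting a topic that does not exist is a no-op in this toy model.
        self.topics.pop(name, None)


@dataclass
class ToyCoprocessor:
    """Hypothetical coprocessor mapping a source topic to a materialized one."""
    cluster: ToyCluster
    source: str
    materialized: str
    high_watermark: int = 0  # number of source records already processed
    running: bool = True

    def tick(self) -> None:
        """Run one processing pass over the source topic."""
        if not self.running:
            return
        src = self.cluster.topics[self.source]
        out = self.cluster.topics.get(self.materialized)
        if out is None:
            # The materialized topic is missing (never created, or deleted
            # by a user): re-create it and re-populate it up to the
            # previous high watermark.
            out = [r.upper() for r in src[: self.high_watermark]]
            self.cluster.topics[self.materialized] = out
        # Process any source records past the high watermark.
        out.extend(r.upper() for r in src[self.high_watermark:])
        self.high_watermark = len(src)


# Deleting while the coprocessor is running: the topic comes back.
cluster = ToyCluster({"src": ["a", "b", "c"]})
coproc = ToyCoprocessor(cluster, source="src", materialized="src.upper")
coproc.tick()
cluster.delete_topic("src.upper")
coproc.tick()
recreated = list(cluster.topics["src.upper"])

# Shutting the coprocessor down first: the delete sticks.
coproc.running = False
cluster.delete_topic("src.upper")
coproc.tick()
still_deleted = "src.upper" not in cluster.topics
```

In this model the only way to make a delete stick is to stop the coprocessor before deleting, which mirrors the last point of the commit message.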
Version & Environment
Nightly runs of tests using code from dev branch.
What went wrong?
Buildkite jobs are red and logs indicate failure in coproc_fixture_rpunit test.
What should have happened instead?
Awesomeness.
How to reproduce the issue?
Since I've seen Buildkite jobs run the same tests with the same git commit and pass, this looks like a flaky test failure to me.
Additional information
Example buildkite job log of failure with git commit f0a443d:
https://buildkite.com/vectorized/redpanda/builds/5850#a4eeb029-c210-4436-87d9-88ac956ec94f/6-6019
Earlier in the same buildkite job log, it says: