Temporarily disable coproc on ci #4557
- Many obscure wasm test failures can be traced to one root issue: a previous wasm engine still exists and is listening on port 43189.
- When this occurs, only some of the deployed wasm engines are started, and the test output confusingly reports that not enough data was read from redpanda.
- Killing the wasm_engine should be part of the clean_node() routine.
- Fixes: redpanda-data#4038
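To illustrate the failure mode described in the commit message above, here is a minimal sketch of a pre-flight check for a stale listener on port 43189. The helper name `port_in_use` and the constant `WASM_ENGINE_PORT` are hypothetical and are not part of the redpanda test suite; the actual fix lives in clean_node() and may work differently.

```python
import socket

# Port taken from the commit message; the constant name is an assumption.
WASM_ENGINE_PORT = 43189


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return s.connect_ex((host, port)) == 0


if __name__ == "__main__":
    if port_in_use(WASM_ENGINE_PORT):
        print(f"stale listener on port {WASM_ENGINE_PORT}; kill it before starting a new engine")
    else:
        print(f"port {WASM_ENGINE_PORT} is free")
```

A check like this only detects the leftover process; it does not kill it.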
- These tests all attempt to remove a materialized topic while the coprocessor is still running.
- However, due to the initial design of the system, coproc will attempt to recreate the topic and repopulate it up to the previous high watermark. If it did not, there would be an inconsistency between the coprocessor's defined metadata and arbitrary commands sent to the cluster by the user.
- The tests have been useful in showing that this type of concurrent delete can occur without any crashes.
- A user who truly wants to delete a materialized topic must shut down the coprocessor first.
- Fixes: redpanda-data#3384
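The ordering described in the commit message above can be sketched with a toy model. `ToyCluster` and all of its methods are hypothetical and exist only for illustration; this is not redpanda's API, and it ignores data repopulation and high watermarks entirely.

```python
class ToyCluster:
    """Toy model: a running coprocessor re-creates its materialized topics."""

    def __init__(self) -> None:
        self.topics: set[str] = set()
        self.materialized: set[str] = set()
        self.coproc_running = False

    def deploy_coproc(self, materialized_topic: str) -> None:
        # Deploying registers the output topic in the coprocessor's metadata.
        self.materialized.add(materialized_topic)
        self.topics.add(materialized_topic)
        self.coproc_running = True

    def stop_coproc(self) -> None:
        self.coproc_running = False

    def delete_topic(self, name: str) -> None:
        self.topics.discard(name)

    def coproc_tick(self) -> None:
        # While running, coproc restores any materialized topic that is missing.
        if self.coproc_running:
            self.topics |= self.materialized


cluster = ToyCluster()
cluster.deploy_coproc("output")

# Deleting while the coprocessor runs: the topic comes back.
cluster.delete_topic("output")
cluster.coproc_tick()
recreated = "output" in cluster.topics

# Shutting the coprocessor down first: the delete sticks.
cluster.stop_coproc()
cluster.delete_topic("output")
cluster.coproc_tick()
deleted_for_good = "output" not in cluster.topics
```

In this model the delete only persists once the coprocessor has been stopped, which matches the behavior the removed tests were exercising.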
- Disabling temporarily due to a difficult-to-debug memory leak detected by ASAN on rare occasions.
- Marking issue for tracking here: redpanda-data#4053
    @ok_to_fail  # https://github.com/redpanda-data/redpanda/issues/3745
    @cluster(num_nodes=4)
    def verify_materialized_topics_test(self):
I'm concerned about the gap left in test coverage after this commit is merged. Should we document somewhere that we need a WasmDeleteTopics test in the future?
I left the explanation in the commit message. Is there a better place to put it? Maybe a link in the source?
Fair point.
I think when we pick the WASM workstream back up, we will want to look at which tests are required. That said, I wonder whether we should leave such tests commented out rather than delete them?
I specifically deleted it, as stated in the commit, because it was working as expected. Deleting materialized topics while coprocessors are still processing inputs may re-create those topics.
- tests/wasm_topics_test.py
- tests/wasm_identity_test.py
- tests/wasm_partition_movement_test.py
- tests/wasm_redpanda_failure_recovery_test.py
Why aren't we using @ignore?
I didn't want to clutter the CI output with more coproc stuff.
Attempting to backport this to clear up CI issues affecting 22.1.4 backport PRs, e.g. #5012
/backport v22.1.x
Although this PR solves some outstanding issues, a few remain, notably:
- WasmRPMeshFailureRecoveryTest.verify_materialized_topics_test #4286
- rptest.tests.wasm_partition_movement_test.WasmPartitionMovementTest.test_dynamic_with_failure #4052