Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix flaky test MasterServiceTests.testClusterStateUpdateLoggingWithDebugEnabled #15204

Merged
merged 5 commits into from
Sep 3, 2024

Conversation

sumitasr
Copy link
Member

@sumitasr sumitasr commented Aug 12, 2024

Description

Fix flaky test MasterServiceTests.testClusterStateUpdateLoggingWithDebugEnabled

  • Fixed assertions.

Related Issues

Resolves #15117

Check List

  • [NA] Functionality includes testing.
  • [NA] API changes companion pull request created, if applicable.
  • [Yes] Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@sumitasr sumitasr changed the title Fix test MasterServiceTests.testClusterStateUpdateLoggingWithDebugEnabled Fix flaky test MasterServiceTests.testClusterStateUpdateLoggingWithDebugEnabled Aug 12, 2024
Copy link
Contributor

✅ Gradle check result for 338a7f6: SUCCESS

Copy link

codecov bot commented Aug 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 71.97%. Comparing base (e5fadba) to head (54625f3).
Report is 31 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #15204      +/-   ##
============================================
+ Coverage     71.89%   71.97%   +0.08%     
- Complexity    63496    63523      +27     
============================================
  Files          5244     5244              
  Lines        296836   296836              
  Branches      42860    42860              
============================================
+ Hits         213401   213652     +251     
+ Misses        65966    65650     -316     
- Partials      17469    17534      +65     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@andrross
Copy link
Member

@sumitasr I don't think we saw this test failing in the past. What changed? I would feel better about removing the assertions here if we fully understand what changed to make this test intermittently fail when in the past it never failed.

@sumitasr
Copy link
Member Author

sumitasr commented Aug 29, 2024

@sumitasr I don't think we saw this test failing in the past. What changed? I would feel better about removing the assertions here if we fully understand what changed to make this test intermittently fail when in the past it never failed.

These 2 assertions were flaky in the past as well in a different test reference - #10899 and engineer moved the assertions from different test into logging verification test assuming it will be stable here. And recently we created 2 tests to verify trace and debug logging behavior independently. This test started getting tagged as flaky.

I ran assertions offline to verify before assertion both getUpdateSuccess and getUpdateFailed is reported as 0 during flaky behavior. The ClusterStateStats is updated in async path and can be delayed. Added assertsBusy to make sure we get consistent behavior in the test which was not added earlier.

Ran the test 5000 times.

Copy link
Contributor

❌ Gradle check result for 98ffaa7: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@sumitasr sumitasr marked this pull request as draft August 29, 2024 06:58
…bled

Signed-off-by: Sumit Bansal <sumitsb@amazon.com>
@sumitasr
Copy link
Member Author

sumitasr commented Aug 29, 2024

❌ Gradle check result for 98ffaa7: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Fixed. Incorrect assert was added.

Copy link
Contributor

❌ Gradle check result for afa9b91: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@sumitasr
Copy link
Member Author

❌ Gradle check result for afa9b91: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Test Result (no failures)

Looks like need to rerun again

Signed-off-by: Sumit Bansal <sumitsb@amazon.com>
Copy link
Contributor

❌ Gradle check result for 30e379a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@sumitasr
Copy link
Member Author

❌ Gradle check result for 30e379a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Following is the failure log in build. Looks like issue with test run infra setup.

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':distribution:bwc:minor:buildBwcLinuxTar'.
> JAVA17_HOME required

Signed-off-by: Sumit Bansal <sumitsb@amazon.com>
Copy link
Contributor

❌ Gradle check result for 2819a90: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❕ Gradle check result for a32ae3a: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@sumitasr
Copy link
Member Author

❕ Gradle check result for a32ae3a: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Failed due to Test Result (2 failures / +2)

[org.opensearch.search.SearchTimeoutIT.testSimpleTimeout {p0={"search.concurrent_segment_search.enabled":"false"}}](https://build.ci.opensearch.org/job/gradle-check/45765/testReport/junit/org.opensearch.search/SearchTimeoutIT/testSimpleTimeout__p0___search_concurrent_segment_search_enabled___false___/)
[org.opensearch.search.SearchTimeoutIT.testSimpleTimeout {p0={"search.concurrent_segment_search.enabled":"true"}}](https://build.ci.opensearch.org/job/gradle-check/45765/testReport/junit/org.opensearch.search/SearchTimeoutIT/testSimpleTimeout__p0___search_concurrent_segment_search_enabled___true___/)

Need to retry build

Signed-off-by: Sumit Bansal <sumitsb@amazon.com>
Copy link
Contributor

✅ Gradle check result for 54625f3: SUCCESS

@shwetathareja shwetathareja added skip-changelog backport 2.x Backport to 2.x branch labels Sep 3, 2024
@shwetathareja shwetathareja merged commit b23c9de into opensearch-project:main Sep 3, 2024
43 of 45 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Sep 3, 2024
…bugEnabled (#15204)

* Fix test MasterServiceTests.testClusterStateUpdateLoggingWithDebugEnabled

Signed-off-by: Sumit Bansal <sumitsb@amazon.com>
(cherry picked from commit b23c9de)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
dblock pushed a commit that referenced this pull request Sep 3, 2024
…bugEnabled (#15204) (#15597)

* Fix test MasterServiceTests.testClusterStateUpdateLoggingWithDebugEnabled


(cherry picked from commit b23c9de)

Signed-off-by: Sumit Bansal <sumitsb@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
akolarkunnu pushed a commit to akolarkunnu/OpenSearch that referenced this pull request Sep 10, 2024
…bugEnabled (opensearch-project#15204)

* Fix test MasterServiceTests.testClusterStateUpdateLoggingWithDebugEnabled

Signed-off-by: Sumit Bansal <sumitsb@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch skip-changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants