DiskThresholdDeciderIT.testIndexCreateBlockWithAReadOnlyBlock is flaky #5956

Closed
andrross opened this issue Jan 20, 2023 · 6 comments
Labels: bug, distributed framework, flaky-test

Comments

@andrross
Member

See https://build.ci.opensearch.org/job/gradle-check/9659/

REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT.testIndexCreateBlockWithAReadOnlyBlock" -Dtests.seed=47FA74E827DEE212 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=th-TH -Dtests.timezone=Africa/Djibouti -Druntime.java=19

org.opensearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT > testIndexCreateBlockWithAReadOnlyBlock FAILED
    java.lang.AssertionError
        at __randomizedtesting.SeedInfo.seed([47FA74E827DEE212:57BCB87090067887]:0)
        at org.junit.Assert.fail(Assert.java:87)
        at org.junit.Assert.assertTrue(Assert.java:42)
        at org.junit.Assert.assertTrue(Assert.java:53)
        at org.opensearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT.lambda$testIndexCreateBlockWithAReadOnlyBlock$11(DiskThresholdDeciderIT.java:297)
        at org.opensearch.test.OpenSearchTestCase.assertBusy(OpenSearchTestCase.java:1049)
        at org.opensearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT.testIndexCreateBlockWithAReadOnlyBlock(DiskThresholdDeciderIT.java:295)

This was introduced in #5852. Maybe this is what #5952 is trying to fix? @RS146BIJAY

@andrross added the bug, untriaged, flaky-test, and distributed framework labels and removed the untriaged label on Jan 20, 2023
@RS146BIJAY
Contributor

Yes. The test cases still appear to be flaky. I am working on fixing them as part of #5952.

@sachinpkale
Member

Another build failure due to these tests: https://build.ci.opensearch.org/job/gradle-check/9749/

@andrross
Member Author

andrross commented Feb 1, 2023

Seems like this isn't completely fixed: https://build.ci.opensearch.org/job/gradle-check/10405/

/cc @RS146BIJAY

@RS146BIJAY
Contributor

I am unable to reproduce this failure locally. I am investigating what else may be causing the race condition that makes these test cases fail.

@RS146BIJAY
Contributor

It seems the race condition is between the data populated on the cluster and the actual free space reported by the file system. Removing the indexing call and explicitly setting the free and total space, as done inside MockDiskUsagesIT, seems to fix this race condition.
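
To illustrate the idea only (this is not the code from the actual fix, and it does not use OpenSearch classes): a rough, self-contained sketch of why explicitly set disk values avoid the race. The names `DiskUsageSource`, `FixedDiskUsage`, and `exceedsFloodStage`, and the 95% threshold, are all hypothetical and chosen for the example. With a fixed source, the threshold check depends only on values the test sets, not on when the file system happens to report the indexed data.

```java
public class FixedDiskUsageSketch {

    /** Hypothetical stand-in for a disk usage source; not an OpenSearch API. */
    interface DiskUsageSource {
        long totalBytes();
        long freeBytes();
    }

    /** Test double whose values change only when the test sets them explicitly. */
    static final class FixedDiskUsage implements DiskUsageSource {
        private long total;
        private long free;

        FixedDiskUsage(long total, long free) {
            set(total, free);
        }

        void set(long total, long free) {
            if (total <= 0 || free < 0 || free > total) {
                throw new IllegalArgumentException("invalid disk usage values");
            }
            this.total = total;
            this.free = free;
        }

        @Override
        public long totalBytes() {
            return total;
        }

        @Override
        public long freeBytes() {
            return free;
        }
    }

    /** Returns true when the used-space percentage is at or above the given threshold. */
    static boolean exceedsFloodStage(DiskUsageSource usage, double floodStagePercent) {
        double usedPercent = 100.0 * (usage.totalBytes() - usage.freeBytes()) / usage.totalBytes();
        return usedPercent >= floodStagePercent;
    }

    public static void main(String[] args) {
        // 50% used: below the illustrative 95% threshold.
        FixedDiskUsage usage = new FixedDiskUsage(100, 50);
        System.out.println(exceedsFloodStage(usage, 95.0));

        // The test sets the values directly instead of indexing data and
        // waiting for the file system to reflect it: 97% used.
        usage.set(100, 3);
        System.out.println(exceedsFloodStage(usage, 95.0));
    }
}
```

Because the second check runs against values the test just set, there is no window in which the assertion can observe stale free-space numbers.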

@andrross
Member Author

I believe this was fixed by #6277. I will reopen this issue if the failure recurs.
