[BUG] ClusterRerouteIT.testDelayWithALargeAmountOfShards (random test failure) #1561

Closed
CEHENKLE opened this issue Nov 16, 2021 · 6 comments
Labels: bug, flaky-test

Comments

@CEHENKLE
Member

CEHENKLE commented Nov 16, 2021

Describe the bug
Random test failure. :( Please dig in and figure out what went wrong :(

https://ci.opensearch.org/logs/ci/workflow/OpenSearch_CI/PR_Checks/Gradle_Check/gradle_check_992_reports.zip

CEHENKLE added the bug, untriaged, and flaky-test labels and removed the untriaged label on Nov 16, 2021
@tlfeng
Collaborator

tlfeng commented Mar 13, 2022

Adding more information:

https://ci.opensearch.org/logs/ci/workflow/OpenSearch_CI/PR_Checks/Gradle_Check/gradle_check_992.log

REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.cluster.allocation.ClusterRerouteIT.testDelayWithALargeAmountOfShards" -Dtests.seed=3ADB1F144D29EAD6 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=en -Dtests.timezone=America/Lower_Princes -Druntime.java=17

org.opensearch.cluster.allocation.ClusterRerouteIT > testDelayWithALargeAmountOfShards FAILED
    java.lang.AssertionError: AcknowledgedResponse failed - not acked
    Expected: <true>
         but: was <false>
        at __randomizedtesting.SeedInfo.seed([3ADB1F144D29EAD6:16AA15F55C37C034]:0)
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
        at org.opensearch.test.hamcrest.OpenSearchAssertions.assertAcked(OpenSearchAssertions.java:128)
        at org.opensearch.test.hamcrest.OpenSearchAssertions.assertAcked(OpenSearchAssertions.java:132)
        at org.opensearch.test.TestCluster.wipeIndices(TestCluster.java:172)
        at org.opensearch.test.TestCluster.wipe(TestCluster.java:95)
        at org.opensearch.test.OpenSearchIntegTestCase.afterInternal(OpenSearchIntegTestCase.java:634)
        at org.opensearch.test.OpenSearchIntegTestCase.cleanUpCluster(OpenSearchIntegTestCase.java:2377)
        at jdk.internal.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:568)
        at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
        at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
        at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
        at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
        at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
        at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
        at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
        at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
        at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
        at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
        at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
        at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
        at java.base/java.lang.Thread.run(Thread.java:833)

@owaiskazi19
Member

owaiskazi19 commented Dec 23, 2022

Getting "Suite timeout exceeded (>= 1200000 msec)" after running the test in isolation 100 times:

REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.cluster.allocation.ClusterRerouteIT.testDelayWithALargeAmountOfShards {seed=[3ADB1F144D29EAD6:EF02520AD8D95D8A]}" -Dtests.seed=3ADB1F144D29EAD6 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=en -Dtests.timezone=America/Lower_Princes -Druntime.java=17
  2> java.lang.Exception: Test abandoned because suite timeout was reached.
        at __randomizedtesting.SeedInfo.seed([3ADB1F144D29EAD6]:0)
  2> Dec 22, 2022 9:17:59 PM com.carrotsearch.randomizedtesting.ThreadLeakControl checkThreadLeaks
  2> WARNING: Will linger awaiting termination of 116 leaked thread(s).
  1> [2022-12-22T21:17:59,558][INFO ][o.o.c.r.a.AllocationService] [node_t1] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[test0][3]]]).
  1> [2022-12-22T21:17:59,590][INFO ][o.o.c.a.ClusterRerouteIT ] [testDelayWithALargeAmountOfShards] [ClusterRerouteIT#testDelayWithALargeAmountOfShards {seed=[3ADB1F144D29EAD6:EF02520AD8D95D8A]}]: cleaning up after test
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test0/VepohJQqRvarqGLqSwBYFw] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test9/SxH8hc6zSESXnPeMYJp-6Q] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test6/HA4_mZpiROiLYzqx9VC7Kg] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test7/pM1V0JX6SuucR5DNEkTjFQ] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test3/Z1tvBXl7TySIeFhAbXb8ww] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test20/kbGlnkDBQ7KaYoS31YW02w] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test5/ebgd0T27ThS_ajBG5UsgOQ] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test8/WKjzotVySRKgGq1SyKTomA] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test22/sYPtZ_dMRRaW0RZCNXHpVQ] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test14/Dvhjy7g1TACjA-QzLR_0Aw] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test16/QmG0tCMvSaWlzJ0KnAO4Tw] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test15/pJDKm-gaSqWGaSZSg41HsQ] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test10/ALLIikLBT76tlKiteIcNhQ] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test13/ApnsFVhnQduGaS8f5vILew] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test12/Es2Xhfn0Rp6goCKE7vnWJw] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test11/sGeGcFb6SLSAQqmDBePdkQ] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test17/zdq7ZqCrTseUS0Y8tfRluA] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test24/Iysq8r6jQnGXUZXZIaSOQA] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test4/MsWmOzxuREO30_C6qKlppg] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test18/UjkmZnKyRPyyS4iRKI5ZyQ] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test21/HKIcHxuMREmSUHm2t5j0xg] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test1/22jzj3yzTdCd3pZefwktng] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test19/zDN65v3uTuKT6piXiR_Q9w] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test2/3W4Cf0AtQdaEuvZwww02Pg] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test23/Zy6NpNInQ5CdCK95yJIHpA] deleting index
  1> [2022-12-22T21:18:00,678][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] stopping ...
  1> [2022-12-22T21:18:00,679][INFO ][o.o.c.c.FollowersChecker ] [testDelayWithALargeAmountOfShards] FollowerChecker{discoveryNode={node_t3}{DKp-hs3cSj6Fx1RyPWWiQA}{i4ISKMSyTk-_OvUueJwlIQ}{127.0.0.1}{127.0.0.1:39795}{dimr}{shard_indexing_pressure_enabled=true}, failureCountSinceLastSuccess=0, [cluster.fault_detection.follower_check.retry_count]=3} disconnected
  1> [2022-12-22T21:18:00,679][INFO ][o.o.c.c.FollowersChecker ] [testDelayWithALargeAmountOfShards] FollowerChecker{discoveryNode={node_t2}{rKCaSyUcTtGiHjro4I-ivQ}{e6gAXyAPSqqjfWfH1dTPqg}{127.0.0.1}{127.0.0.1:34401}{dimr}{shard_indexing_pressure_enabled=true}, failureCountSinceLastSuccess=0, [cluster.fault_detection.follower_check.retry_count]=3} disconnected
  1> [2022-12-22T21:18:00,679][INFO ][o.o.c.c.FollowersChecker ] [node_t1] FollowerChecker{discoveryNode={node_t3}{DKp-hs3cSj6Fx1RyPWWiQA}{i4ISKMSyTk-_OvUueJwlIQ}{127.0.0.1}{127.0.0.1:39795}{dimr}{shard_indexing_pressure_enabled=true}, failureCountSinceLastSuccess=0, [cluster.fault_detection.follower_check.retry_count]=3} marking node as faulty
  1> [2022-12-22T21:18:00,679][INFO ][o.o.c.c.FollowersChecker ] [node_t1] FollowerChecker{discoveryNode={node_t2}{rKCaSyUcTtGiHjro4I-ivQ}{e6gAXyAPSqqjfWfH1dTPqg}{127.0.0.1}{127.0.0.1:34401}{dimr}{shard_indexing_pressure_enabled=true}, failureCountSinceLastSuccess=0, [cluster.fault_detection.follower_check.retry_count]=3} marking node as faulty
  1> [2022-12-22T21:18:00,680][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] stopped
  1> [2022-12-22T21:18:00,680][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] closing ...
  1> [2022-12-22T21:18:00,679][INFO ][o.o.c.c.Coordinator      ] [node_t3] cluster-manager node [{node_t1}{1RONYZA5QUikxcI2qRVQdA}{Ask-JuqIROS9KYTN3m_ikw}{127.0.0.1}{127.0.0.1:46329}{dimr}{shard_indexing_pressure_enabled=true}] failed, restarting discovery
  1> org.opensearch.transport.NodeDisconnectedException: [node_t1][127.0.0.1:46329][disconnected] disconnected
  1> [2022-12-22T21:18:00,679][INFO ][o.o.c.c.Coordinator      ] [node_t2] cluster-manager node [{node_t1}{1RONYZA5QUikxcI2qRVQdA}{Ask-JuqIROS9KYTN3m_ikw}{127.0.0.1}{127.0.0.1:46329}{dimr}{shard_indexing_pressure_enabled=true}] failed, restarting discovery
  1> org.opensearch.transport.NodeDisconnectedException: [node_t1][127.0.0.1:46329][disconnected] disconnected
  1> [2022-12-22T21:18:00,680][INFO ][o.o.c.s.ClusterApplierService] [node_t2] cluster-manager node changed {previous [{node_t1}{1RONYZA5QUikxcI2qRVQdA}{Ask-JuqIROS9KYTN3m_ikw}{127.0.0.1}{127.0.0.1:46329}{dimr}{shard_indexing_pressure_enabled=true}], current []}, term: 3, version: 334, reason: becoming candidate: onLeaderFailure
  1> [2022-12-22T21:18:00,680][INFO ][o.o.c.s.ClusterApplierService] [node_t3] cluster-manager node changed {previous [{node_t1}{1RONYZA5QUikxcI2qRVQdA}{Ask-JuqIROS9KYTN3m_ikw}{127.0.0.1}{127.0.0.1:46329}{dimr}{shard_indexing_pressure_enabled=true}], current []}, term: 3, version: 334, reason: becoming candidate: onLeaderFailure
  1> [2022-12-22T21:18:00,680][WARN ][o.o.c.NodeConnectionsService] [node_t3] failed to connect to {node_t1}{1RONYZA5QUikxcI2qRVQdA}{Ask-JuqIROS9KYTN3m_ikw}{127.0.0.1}{127.0.0.1:46329}{dimr}{shard_indexing_pressure_enabled=true} (tried [1] times)
  1> org.opensearch.transport.ConnectTransportException: [node_t1][127.0.0.1:46329] connect_exception
  1>    at org.opensearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:1076) ~[main/:?]
  1>    at org.opensearch.action.ActionListener.lambda$toBiConsumer$2(ActionListener.java:215) ~[main/:?]
  1>    at org.opensearch.common.concurrent.CompletableContext.lambda$addListener$0(CompletableContext.java:55) ~[opensearch-core-3.0.0-SNAPSHOT.jar:?]
  1>    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863) ~[?:?]
  1>    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841) ~[?:?]
  1>    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
  1>    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162) ~[?:?]
  1>    at org.opensearch.common.concurrent.CompletableContext.completeExceptionally(CompletableContext.java:70) ~[opensearch-core-3.0.0-SNAPSHOT.jar:?]
  1>    at org.opensearch.nio.SocketChannelContext.connect(SocketChannelContext.java:160) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.EventHandler.handleConnect(EventHandler.java:130) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.transport.nio.TestEventHandler.handleConnect(TestEventHandler.java:139) ~[framework-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.attemptConnect(NioSelector.java:446) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.registerChannel(NioSelector.java:469) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.setUpNewChannels(NioSelector.java:458) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.preSelect(NioSelector.java:279) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.singleLoop(NioSelector.java:172) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.runLoop(NioSelector.java:148) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at java.lang.Thread.run(Thread.java:833) [?:?]
  1> Caused by: java.net.ConnectException: Connection refused
  1>    at sun.nio.ch.Net.pollConnect(Native Method) ~[?:?]
  1>    at sun.nio.ch.Net.pollConnectNow(Net.java:672) ~[?:?]
  1>    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:946) ~[?:?]
  1>    at org.opensearch.nio.SocketChannelContext.connect(SocketChannelContext.java:157) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    ... 9 more
  1> [2022-12-22T21:18:00,680][WARN ][o.o.c.NodeConnectionsService] [node_t2] failed to connect to {node_t1}{1RONYZA5QUikxcI2qRVQdA}{Ask-JuqIROS9KYTN3m_ikw}{127.0.0.1}{127.0.0.1:46329}{dimr}{shard_indexing_pressure_enabled=true} (tried [1] times)
  1> org.opensearch.transport.ConnectTransportException: [node_t1][127.0.0.1:46329] connect_exception
  1>    at org.opensearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:1076) ~[main/:?]
  1>    at org.opensearch.action.ActionListener.lambda$toBiConsumer$2(ActionListener.java:215) ~[main/:?]
  1>    at org.opensearch.common.concurrent.CompletableContext.lambda$addListener$0(CompletableContext.java:55) ~[opensearch-core-3.0.0-SNAPSHOT.jar:?]
  1>    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863) ~[?:?]
  1>    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841) ~[?:?]
  1>    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
  1>    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162) ~[?:?]
  1>    at org.opensearch.common.concurrent.CompletableContext.completeExceptionally(CompletableContext.java:70) ~[opensearch-core-3.0.0-SNAPSHOT.jar:?]
  1>    at org.opensearch.nio.SocketChannelContext.connect(SocketChannelContext.java:160) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.EventHandler.handleConnect(EventHandler.java:130) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.transport.nio.TestEventHandler.handleConnect(TestEventHandler.java:139) ~[framework-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.attemptConnect(NioSelector.java:446) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.registerChannel(NioSelector.java:469) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.setUpNewChannels(NioSelector.java:458) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.preSelect(NioSelector.java:279) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.singleLoop(NioSelector.java:172) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.runLoop(NioSelector.java:148) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at java.lang.Thread.run(Thread.java:833) [?:?]
  1> Caused by: java.net.ConnectException: Connection refused
  1>    at sun.nio.ch.Net.pollConnect(Native Method) ~[?:?]
  1>    at sun.nio.ch.Net.pollConnectNow(Net.java:672) ~[?:?]
  1>    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:946) ~[?:?]
  1>    at org.opensearch.nio.SocketChannelContext.connect(SocketChannelContext.java:157) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    ... 9 more
  1> [2022-12-22T21:18:00,681][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] closed
  1> [2022-12-22T21:18:00,682][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] stopping ...
  1> [2022-12-22T21:18:00,682][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] stopped
  1> [2022-12-22T21:18:00,682][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] closing ...
  1> [2022-12-22T21:18:00,684][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] closed
  1> [2022-12-22T21:18:00,684][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] stopping ...
  1> [2022-12-22T21:18:00,685][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] stopped
  1> [2022-12-22T21:18:00,685][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] closing ...
  1> [2022-12-22T21:18:00,686][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] closed
  1> [2022-12-22T21:18:00,686][INFO ][o.o.c.a.ClusterRerouteIT ] [testDelayWithALargeAmountOfShards] [ClusterRerouteIT#testDelayWithALargeAmountOfShards {seed=[3ADB1F144D29EAD6:EF02520AD8D95D8A]}]: cleaned up after test
  1> [2022-12-22T21:18:00,687][INFO ][o.o.c.a.ClusterRerouteIT ] [testDelayWithALargeAmountOfShards] [seed=[3ADB1F144D29EAD6:EF02520AD8D95D8A]] after test
  2> Dec 23, 2022 1:18:04 AM com.carrotsearch.randomizedtesting.ThreadLeakControl checkThreadLeaks
  2> SEVERE: 1 thread leaked from SUITE scope at org.opensearch.cluster.allocation.ClusterRerouteIT: 
  2>    1) Thread[id=34, name=SUITE-ClusterRerouteIT-seed#[3ADB1F144D29EAD6]-worker, state=RUNNABLE, group=TGRP-ClusterRerouteIT]
  2>         at java.base@17.0.1/sun.nio.fs.UnixPath.getName(UnixPath.java:43)
  2>         at java.base@17.0.1/java.io.FilePermission.containsPath(FilePermission.java:744)
  2>         at java.base@17.0.1/java.io.FilePermission.impliesIgnoreMask(FilePermission.java:611)
  2>         at java.base@17.0.1/java.io.FilePermissionCollection.implies(FilePermission.java:1202)
  2>         at java.base@17.0.1/java.security.Permissions.implies(Permissions.java:177)
  2>         at app//org.opensearch.bootstrap.OpenSearchPolicy.implies(OpenSearchPolicy.java:133)
  2>         at app//org.opensearch.bootstrap.BootstrapForTesting$1.implies(BootstrapForTesting.java:167)
  2>         at java.base@17.0.1/java.security.ProtectionDomain.implies(ProtectionDomain.java:325)
  2>         at java.base@17.0.1/java.security.ProtectionDomain.impliesWithAltFilePerm(ProtectionDomain.java:357)
  2>         at java.base@17.0.1/java.security.AccessControlContext.checkPermission(AccessControlContext.java:463)
  2>         at java.base@17.0.1/java.security.AccessController.checkPermission(AccessController.java:1068)
  2>         at java.base@17.0.1/java.lang.SecurityManager.checkPermission(SecurityManager.java:416)
  2>         at java.base@17.0.1/java.lang.SecurityManager.checkRead(SecurityManager.java:756)
  2>         at java.base@17.0.1/sun.nio.fs.UnixPath.checkRead(UnixPath.java:780)
  2>         at java.base@17.0.1/sun.nio.fs.UnixFileSystemProvider.newDirectoryStream(UnixFileSystemProvider.java:408)
  2>         at app//org.apache.lucene.tests.mockfile.FilterFileSystemProvider.newDirectoryStream(FilterFileSystemProvider.java:234)
  2>         at app//org.apache.lucene.tests.mockfile.FilterFileSystemProvider.newDirectoryStream(FilterFileSystemProvider.java:234)
  2>         at app//org.apache.lucene.tests.mockfile.ShuffleFS.newDirectoryStream(ShuffleFS.java:48)
  2>         at app//org.apache.lucene.tests.mockfile.HandleTrackingFS.newDirectoryStream(HandleTrackingFS.java:299)
  2>         at app//org.apache.lucene.tests.mockfile.HandleTrackingFS.newDirectoryStream(HandleTrackingFS.java:299)
  2>         at app//org.apache.lucene.tests.mockfile.FilterFileSystemProvider.newDirectoryStream(FilterFileSystemProvider.java:234)
  2>         at java.base@17.0.1/java.nio.file.Files.newDirectoryStream(Files.java:482)
  2>         at java.base@17.0.1/java.nio.file.FileTreeWalker.visit(FileTreeWalker.java:301)
  2>         at java.base@17.0.1/java.nio.file.FileTreeWalker.next(FileTreeWalker.java:374)
  2>         at java.base@17.0.1/java.nio.file.Files.walkFileTree(Files.java:2845)
  2>         at java.base@17.0.1/java.nio.file.Files.walkFileTree(Files.java:2882)
  2>         at app//org.apache.lucene.util.IOUtils.rm(IOUtils.java:352)
  2>         at app//org.apache.lucene.util.IOUtils.rm(IOUtils.java:330)
  2>         at app//org.apache.lucene.tests.util.TestRuleTemporaryFilesCleanup.afterAlways(TestRuleTemporaryFilesCleanup.java:209)
  2>         at app//com.carrotsearch.randomizedtesting.rules.TestRuleAdapter$1.afterAlways(TestRuleAdapter.java:31)
  2>         at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:43)
  2>         at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  2>         at app//org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  2>         at app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  2>         at app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  2>         at app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  2>         at app//org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
  2>         at app//org.junit.rules.RunRules.evaluate(RunRules.java:20)
  2>         at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  2>         at app//com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
  2>         at java.base@17.0.1/java.lang.Thread.run(Thread.java:833)
  2> Dec 23, 2022 1:18:04 AM com.carrotsearch.randomizedtesting.ThreadLeakControl tryToInterruptAll
  2> INFO: Starting to interrupt leaked threads:
  2>    1) Thread[id=34, name=SUITE-ClusterRerouteIT-seed#[3ADB1F144D29EAD6]-worker, state=RUNNABLE, group=TGRP-ClusterRerouteIT]
  2> Dec 23, 2022 1:18:07 AM com.carrotsearch.randomizedtesting.ThreadLeakControl tryToInterruptAll
  2> SEVERE: There are still zombie threads that couldn't be terminated:
  2>    1) Thread[id=34, name=SUITE-ClusterRerouteIT-seed#[3ADB1F144D29EAD6]-worker, state=RUNNABLE, group=TGRP-ClusterRerouteIT]
  2>         at java.base@17.0.1/sun.nio.fs.UnixPath.getName(UnixPath.java:43)
  2>         at java.base@17.0.1/java.io.FilePermission.containsPath(FilePermission.java:744)
  2>         at java.base@17.0.1/java.io.FilePermission.impliesIgnoreMask(FilePermission.java:611)
  2>         at java.base@17.0.1/java.io.FilePermissionCollection.implies(FilePermission.java:1202)
  2>         at java.base@17.0.1/java.security.Permissions.implies(Permissions.java:177)
  2>         at app//org.opensearch.bootstrap.OpenSearchPolicy.implies(OpenSearchPolicy.java:133)
  2>         at app//org.opensearch.bootstrap.BootstrapForTesting$1.implies(BootstrapForTesting.java:167)
  2>         at java.base@17.0.1/java.security.ProtectionDomain.implies(ProtectionDomain.java:325)
  2>         at java.base@17.0.1/java.security.ProtectionDomain.impliesWithAltFilePerm(ProtectionDomain.java:357)
  2>         at java.base@17.0.1/java.security.AccessControlContext.checkPermission(AccessControlContext.java:463)
  2>         at java.base@17.0.1/java.security.AccessController.checkPermission(AccessController.java:1068)
  2>         at java.base@17.0.1/java.lang.SecurityManager.checkPermission(SecurityManager.java:416)
  2>         at java.base@17.0.1/java.lang.SecurityManager.checkRead(SecurityManager.java:756)
  2>         at java.base@17.0.1/sun.nio.fs.UnixPath.checkRead(UnixPath.java:780)
  2>         at java.base@17.0.1/sun.nio.fs.UnixFileSystemProvider.newDirectoryStream(UnixFileSystemProvider.java:408)
  2>         at app//org.apache.lucene.tests.mockfile.FilterFileSystemProvider.newDirectoryStream(FilterFileSystemProvider.java:234)
  2>         at app//org.apache.lucene.tests.mockfile.FilterFileSystemProvider.newDirectoryStream(FilterFileSystemProvider.java:234)
  2>         at app//org.apache.lucene.tests.mockfile.ShuffleFS.newDirectoryStream(ShuffleFS.java:48)
  2>         at app//org.apache.lucene.tests.mockfile.HandleTrackingFS.newDirectoryStream(HandleTrackingFS.java:299)
  2>         at app//org.apache.lucene.tests.mockfile.HandleTrackingFS.newDirectoryStream(HandleTrackingFS.java:299)
  2>         at app//org.apache.lucene.tests.mockfile.FilterFileSystemProvider.newDirectoryStream(FilterFileSystemProvider.java:234)
  2>         at java.base@17.0.1/java.nio.file.Files.newDirectoryStream(Files.java:482)
  2>         at java.base@17.0.1/java.nio.file.FileTreeWalker.visit(FileTreeWalker.java:301)
  2>         at java.base@17.0.1/java.nio.file.FileTreeWalker.next(FileTreeWalker.java:374)
  2>         at java.base@17.0.1/java.nio.file.Files.walkFileTree(Files.java:2845)
  2>         at java.base@17.0.1/java.nio.file.Files.walkFileTree(Files.java:2882)
  2>         at app//org.apache.lucene.util.IOUtils.rm(IOUtils.java:352)
  2>         at app//org.apache.lucene.util.IOUtils.rm(IOUtils.java:330)
  2>         at app//org.apache.lucene.tests.util.TestRuleTemporaryFilesCleanup.afterAlways(TestRuleTemporaryFilesCleanup.java:209)
  2>         at app//com.carrotsearch.randomizedtesting.rules.TestRuleAdapter$1.afterAlways(TestRuleAdapter.java:31)
  2>         at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:43)
  2>         at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  2>         at app//org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  2>         at app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  2>         at app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  2>         at app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  2>         at app//org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
  2>         at app//org.junit.rules.RunRules.evaluate(RunRules.java:20)
  2>         at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  2>         at app//com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
  2>         at java.base@17.0.1/java.lang.Thread.run(Thread.java:833)
  2> java.lang.Exception: Suite timeout exceeded (>= 1200000 msec).
        at __randomizedtesting.SeedInfo.seed([3ADB1F144D29EAD6]:0)
  2> REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.cluster.allocation.ClusterRerouteIT" -Dtests.seed=3ADB1F144D29EAD6 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=en -Dtests.timezone=Etc/UTC -Druntime.java=17
  2> NOTE: test params are: codec=Asserting(Lucene95): {index_uuid=Lucene90, type=Lucene90}, docValues:{}, maxPointsInLeafNode=312, maxMBSortInHeap=7.363112477301868, sim=Asserting(RandomSimilarity(queryNorm=false): {}), locale=en, timezone=America/Lower_Princes
  2> NOTE: Linux 5.11.0-1020-aws amd64/Oracle Corporation 17.0.1 (64-bit)/cpus=48,threads=2,free=415785912,total=2650800128
  2> NOTE: All tests run in this JVM: [ClusterRerouteIT]

Tests with failures:
 - org.opensearch.cluster.allocation.ClusterRerouteIT.testDelayWithALargeAmountOfShards {seed=[3ADB1F144D29EAD6:EF02520AD8D95D8A]}
 - org.opensearch.cluster.allocation.ClusterRerouteIT.classMethod

89 tests completed, 2 failed

> Task :server:internalClusterTest FAILED

FAILURE: Build failed with an exception.
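
For anyone who wants to repeat the 100 isolated runs, here is a minimal sketch of how they could be scripted. It assumes a POSIX shell and that the command is run from the repository root; the helper name `repeat_until_failure` is hypothetical and not part of the OpenSearch build. It stops at the first failing run so that run's output is the last thing in the terminal.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to N times, stopping at the first failure.
# Usage: repeat_until_failure <max_runs> <command> [args...]
repeat_until_failure() {
  max_runs="$1"
  shift
  i=1
  while [ "$i" -le "$max_runs" ]; do
    echo "run ${i}/${max_runs}"
    # "$@" expands to the command and its arguments exactly as passed in.
    if ! "$@"; then
      echo "failed on run ${i}" >&2
      return 1
    fi
    i=$((i + 1))
  done
  echo "all ${max_runs} runs passed"
}

# Example, with the test filter and seed copied from the REPRODUCE WITH line:
# repeat_until_failure 100 ./gradlew ':server:internalClusterTest' \
#   --tests "org.opensearch.cluster.allocation.ClusterRerouteIT.testDelayWithALargeAmountOfShards" \
#   -Dtests.seed=3ADB1F144D29EAD6
```

The remaining `-D` flags from the REPRODUCE WITH line can be appended to the example unchanged. Pinning `-Dtests.seed` keeps the randomized parameters identical across runs, so a failure that only shows up occasionally is more likely timing-related than seed-related.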

@dreamer-89
Member

dreamer-89 commented Dec 24, 2022

Getting "Suite timeout exceeded (>= 1200000 msec)" after running the test in isolation 100 times:

REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.cluster.allocation.ClusterRerouteIT.testDelayWithALargeAmountOfShards {seed=[3ADB1F144D29EAD6:EF02520AD8D95D8A]}" -Dtests.seed=3ADB1F144D29EAD6 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=en -Dtests.timezone=America/Lower_Princes -Druntime.java=17
  2> java.lang.Exception: Test abandoned because suite timeout was reached.
        at __randomizedtesting.SeedInfo.seed([3ADB1F144D29EAD6]:0)
  2> Dec 22, 2022 9:17:59 PM com.carrotsearch.randomizedtesting.ThreadLeakControl checkThreadLeaks
  2> WARNING: Will linger awaiting termination of 116 leaked thread(s).
  1> [2022-12-22T21:17:59,558][INFO ][o.o.c.r.a.AllocationService] [node_t1] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[test0][3]]]).
  1> [2022-12-22T21:17:59,590][INFO ][o.o.c.a.ClusterRerouteIT ] [testDelayWithALargeAmountOfShards] [ClusterRerouteIT#testDelayWithALargeAmountOfShards {seed=[3ADB1F144D29EAD6:EF02520AD8D95D8A]}]: cleaning up after test
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test0/VepohJQqRvarqGLqSwBYFw] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test9/SxH8hc6zSESXnPeMYJp-6Q] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test6/HA4_mZpiROiLYzqx9VC7Kg] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test7/pM1V0JX6SuucR5DNEkTjFQ] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test3/Z1tvBXl7TySIeFhAbXb8ww] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test20/kbGlnkDBQ7KaYoS31YW02w] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test5/ebgd0T27ThS_ajBG5UsgOQ] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test8/WKjzotVySRKgGq1SyKTomA] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test22/sYPtZ_dMRRaW0RZCNXHpVQ] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test14/Dvhjy7g1TACjA-QzLR_0Aw] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test16/QmG0tCMvSaWlzJ0KnAO4Tw] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test15/pJDKm-gaSqWGaSZSg41HsQ] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test10/ALLIikLBT76tlKiteIcNhQ] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test13/ApnsFVhnQduGaS8f5vILew] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test12/Es2Xhfn0Rp6goCKE7vnWJw] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test11/sGeGcFb6SLSAQqmDBePdkQ] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test17/zdq7ZqCrTseUS0Y8tfRluA] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test24/Iysq8r6jQnGXUZXZIaSOQA] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test4/MsWmOzxuREO30_C6qKlppg] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test18/UjkmZnKyRPyyS4iRKI5ZyQ] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test21/HKIcHxuMREmSUHm2t5j0xg] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test1/22jzj3yzTdCd3pZefwktng] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test19/zDN65v3uTuKT6piXiR_Q9w] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test2/3W4Cf0AtQdaEuvZwww02Pg] deleting index
  1> [2022-12-22T21:17:59,655][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_t1] [test23/Zy6NpNInQ5CdCK95yJIHpA] deleting index
  1> [2022-12-22T21:18:00,678][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] stopping ...
  1> [2022-12-22T21:18:00,679][INFO ][o.o.c.c.FollowersChecker ] [testDelayWithALargeAmountOfShards] FollowerChecker{discoveryNode={node_t3}{DKp-hs3cSj6Fx1RyPWWiQA}{i4ISKMSyTk-_OvUueJwlIQ}{127.0.0.1}{127.0.0.1:39795}{dimr}{shard_indexing_pressure_enabled=true}, failureCountSinceLastSuccess=0, [cluster.fault_detection.follower_check.retry_count]=3} disconnected
  1> [2022-12-22T21:18:00,679][INFO ][o.o.c.c.FollowersChecker ] [testDelayWithALargeAmountOfShards] FollowerChecker{discoveryNode={node_t2}{rKCaSyUcTtGiHjro4I-ivQ}{e6gAXyAPSqqjfWfH1dTPqg}{127.0.0.1}{127.0.0.1:34401}{dimr}{shard_indexing_pressure_enabled=true}, failureCountSinceLastSuccess=0, [cluster.fault_detection.follower_check.retry_count]=3} disconnected
  1> [2022-12-22T21:18:00,679][INFO ][o.o.c.c.FollowersChecker ] [node_t1] FollowerChecker{discoveryNode={node_t3}{DKp-hs3cSj6Fx1RyPWWiQA}{i4ISKMSyTk-_OvUueJwlIQ}{127.0.0.1}{127.0.0.1:39795}{dimr}{shard_indexing_pressure_enabled=true}, failureCountSinceLastSuccess=0, [cluster.fault_detection.follower_check.retry_count]=3} marking node as faulty
  1> [2022-12-22T21:18:00,679][INFO ][o.o.c.c.FollowersChecker ] [node_t1] FollowerChecker{discoveryNode={node_t2}{rKCaSyUcTtGiHjro4I-ivQ}{e6gAXyAPSqqjfWfH1dTPqg}{127.0.0.1}{127.0.0.1:34401}{dimr}{shard_indexing_pressure_enabled=true}, failureCountSinceLastSuccess=0, [cluster.fault_detection.follower_check.retry_count]=3} marking node as faulty
  1> [2022-12-22T21:18:00,680][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] stopped
  1> [2022-12-22T21:18:00,680][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] closing ...
  1> [2022-12-22T21:18:00,679][INFO ][o.o.c.c.Coordinator      ] [node_t3] cluster-manager node [{node_t1}{1RONYZA5QUikxcI2qRVQdA}{Ask-JuqIROS9KYTN3m_ikw}{127.0.0.1}{127.0.0.1:46329}{dimr}{shard_indexing_pressure_enabled=true}] failed, restarting discovery
  1> org.opensearch.transport.NodeDisconnectedException: [node_t1][127.0.0.1:46329][disconnected] disconnected
  1> [2022-12-22T21:18:00,679][INFO ][o.o.c.c.Coordinator      ] [node_t2] cluster-manager node [{node_t1}{1RONYZA5QUikxcI2qRVQdA}{Ask-JuqIROS9KYTN3m_ikw}{127.0.0.1}{127.0.0.1:46329}{dimr}{shard_indexing_pressure_enabled=true}] failed, restarting discovery
  1> org.opensearch.transport.NodeDisconnectedException: [node_t1][127.0.0.1:46329][disconnected] disconnected
  1> [2022-12-22T21:18:00,680][INFO ][o.o.c.s.ClusterApplierService] [node_t2] cluster-manager node changed {previous [{node_t1}{1RONYZA5QUikxcI2qRVQdA}{Ask-JuqIROS9KYTN3m_ikw}{127.0.0.1}{127.0.0.1:46329}{dimr}{shard_indexing_pressure_enabled=true}], current []}, term: 3, version: 334, reason: becoming candidate: onLeaderFailure
  1> [2022-12-22T21:18:00,680][INFO ][o.o.c.s.ClusterApplierService] [node_t3] cluster-manager node changed {previous [{node_t1}{1RONYZA5QUikxcI2qRVQdA}{Ask-JuqIROS9KYTN3m_ikw}{127.0.0.1}{127.0.0.1:46329}{dimr}{shard_indexing_pressure_enabled=true}], current []}, term: 3, version: 334, reason: becoming candidate: onLeaderFailure
  1> [2022-12-22T21:18:00,680][WARN ][o.o.c.NodeConnectionsService] [node_t3] failed to connect to {node_t1}{1RONYZA5QUikxcI2qRVQdA}{Ask-JuqIROS9KYTN3m_ikw}{127.0.0.1}{127.0.0.1:46329}{dimr}{shard_indexing_pressure_enabled=true} (tried [1] times)
  1> org.opensearch.transport.ConnectTransportException: [node_t1][127.0.0.1:46329] connect_exception
  1>    at org.opensearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:1076) ~[main/:?]
  1>    at org.opensearch.action.ActionListener.lambda$toBiConsumer$2(ActionListener.java:215) ~[main/:?]
  1>    at org.opensearch.common.concurrent.CompletableContext.lambda$addListener$0(CompletableContext.java:55) ~[opensearch-core-3.0.0-SNAPSHOT.jar:?]
  1>    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863) ~[?:?]
  1>    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841) ~[?:?]
  1>    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
  1>    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162) ~[?:?]
  1>    at org.opensearch.common.concurrent.CompletableContext.completeExceptionally(CompletableContext.java:70) ~[opensearch-core-3.0.0-SNAPSHOT.jar:?]
  1>    at org.opensearch.nio.SocketChannelContext.connect(SocketChannelContext.java:160) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.EventHandler.handleConnect(EventHandler.java:130) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.transport.nio.TestEventHandler.handleConnect(TestEventHandler.java:139) ~[framework-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.attemptConnect(NioSelector.java:446) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.registerChannel(NioSelector.java:469) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.setUpNewChannels(NioSelector.java:458) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.preSelect(NioSelector.java:279) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.singleLoop(NioSelector.java:172) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.runLoop(NioSelector.java:148) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at java.lang.Thread.run(Thread.java:833) [?:?]
  1> Caused by: java.net.ConnectException: Connection refused
  1>    at sun.nio.ch.Net.pollConnect(Native Method) ~[?:?]
  1>    at sun.nio.ch.Net.pollConnectNow(Net.java:672) ~[?:?]
  1>    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:946) ~[?:?]
  1>    at org.opensearch.nio.SocketChannelContext.connect(SocketChannelContext.java:157) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    ... 9 more
  1> [2022-12-22T21:18:00,680][WARN ][o.o.c.NodeConnectionsService] [node_t2] failed to connect to {node_t1}{1RONYZA5QUikxcI2qRVQdA}{Ask-JuqIROS9KYTN3m_ikw}{127.0.0.1}{127.0.0.1:46329}{dimr}{shard_indexing_pressure_enabled=true} (tried [1] times)
  1> org.opensearch.transport.ConnectTransportException: [node_t1][127.0.0.1:46329] connect_exception
  1>    at org.opensearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:1076) ~[main/:?]
  1>    at org.opensearch.action.ActionListener.lambda$toBiConsumer$2(ActionListener.java:215) ~[main/:?]
  1>    at org.opensearch.common.concurrent.CompletableContext.lambda$addListener$0(CompletableContext.java:55) ~[opensearch-core-3.0.0-SNAPSHOT.jar:?]
  1>    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863) ~[?:?]
  1>    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841) ~[?:?]
  1>    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
  1>    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162) ~[?:?]
  1>    at org.opensearch.common.concurrent.CompletableContext.completeExceptionally(CompletableContext.java:70) ~[opensearch-core-3.0.0-SNAPSHOT.jar:?]
  1>    at org.opensearch.nio.SocketChannelContext.connect(SocketChannelContext.java:160) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.EventHandler.handleConnect(EventHandler.java:130) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.transport.nio.TestEventHandler.handleConnect(TestEventHandler.java:139) ~[framework-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.attemptConnect(NioSelector.java:446) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.registerChannel(NioSelector.java:469) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.setUpNewChannels(NioSelector.java:458) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.preSelect(NioSelector.java:279) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.singleLoop(NioSelector.java:172) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at org.opensearch.nio.NioSelector.runLoop(NioSelector.java:148) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    at java.lang.Thread.run(Thread.java:833) [?:?]
  1> Caused by: java.net.ConnectException: Connection refused
  1>    at sun.nio.ch.Net.pollConnect(Native Method) ~[?:?]
  1>    at sun.nio.ch.Net.pollConnectNow(Net.java:672) ~[?:?]
  1>    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:946) ~[?:?]
  1>    at org.opensearch.nio.SocketChannelContext.connect(SocketChannelContext.java:157) ~[opensearch-nio-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
  1>    ... 9 more
  1> [2022-12-22T21:18:00,681][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] closed
  1> [2022-12-22T21:18:00,682][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] stopping ...
  1> [2022-12-22T21:18:00,682][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] stopped
  1> [2022-12-22T21:18:00,682][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] closing ...
  1> [2022-12-22T21:18:00,684][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] closed
  1> [2022-12-22T21:18:00,684][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] stopping ...
  1> [2022-12-22T21:18:00,685][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] stopped
  1> [2022-12-22T21:18:00,685][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] closing ...
  1> [2022-12-22T21:18:00,686][INFO ][o.o.n.Node               ] [testDelayWithALargeAmountOfShards] closed
  1> [2022-12-22T21:18:00,686][INFO ][o.o.c.a.ClusterRerouteIT ] [testDelayWithALargeAmountOfShards] [ClusterRerouteIT#testDelayWithALargeAmountOfShards {seed=[3ADB1F144D29EAD6:EF02520AD8D95D8A]}]: cleaned up after test
  1> [2022-12-22T21:18:00,687][INFO ][o.o.c.a.ClusterRerouteIT ] [testDelayWithALargeAmountOfShards] [seed=[3ADB1F144D29EAD6:EF02520AD8D95D8A]] after test
  2> Dec 23, 2022 1:18:04 AM com.carrotsearch.randomizedtesting.ThreadLeakControl checkThreadLeaks
  2> SEVERE: 1 thread leaked from SUITE scope at org.opensearch.cluster.allocation.ClusterRerouteIT: 
  2>    1) Thread[id=34, name=SUITE-ClusterRerouteIT-seed#[3ADB1F144D29EAD6]-worker, state=RUNNABLE, group=TGRP-ClusterRerouteIT]
  2>         at java.base@17.0.1/sun.nio.fs.UnixPath.getName(UnixPath.java:43)
  2>         at java.base@17.0.1/java.io.FilePermission.containsPath(FilePermission.java:744)
  2>         at java.base@17.0.1/java.io.FilePermission.impliesIgnoreMask(FilePermission.java:611)
  2>         at java.base@17.0.1/java.io.FilePermissionCollection.implies(FilePermission.java:1202)
  2>         at java.base@17.0.1/java.security.Permissions.implies(Permissions.java:177)
  2>         at app//org.opensearch.bootstrap.OpenSearchPolicy.implies(OpenSearchPolicy.java:133)
  2>         at app//org.opensearch.bootstrap.BootstrapForTesting$1.implies(BootstrapForTesting.java:167)
  2>         at java.base@17.0.1/java.security.ProtectionDomain.implies(ProtectionDomain.java:325)
  2>         at java.base@17.0.1/java.security.ProtectionDomain.impliesWithAltFilePerm(ProtectionDomain.java:357)
  2>         at java.base@17.0.1/java.security.AccessControlContext.checkPermission(AccessControlContext.java:463)
  2>         at java.base@17.0.1/java.security.AccessController.checkPermission(AccessController.java:1068)
  2>         at java.base@17.0.1/java.lang.SecurityManager.checkPermission(SecurityManager.java:416)
  2>         at java.base@17.0.1/java.lang.SecurityManager.checkRead(SecurityManager.java:756)
  2>         at java.base@17.0.1/sun.nio.fs.UnixPath.checkRead(UnixPath.java:780)
  2>         at java.base@17.0.1/sun.nio.fs.UnixFileSystemProvider.newDirectoryStream(UnixFileSystemProvider.java:408)
  2>         at app//org.apache.lucene.tests.mockfile.FilterFileSystemProvider.newDirectoryStream(FilterFileSystemProvider.java:234)
  2>         at app//org.apache.lucene.tests.mockfile.FilterFileSystemProvider.newDirectoryStream(FilterFileSystemProvider.java:234)
  2>         at app//org.apache.lucene.tests.mockfile.ShuffleFS.newDirectoryStream(ShuffleFS.java:48)
  2>         at app//org.apache.lucene.tests.mockfile.HandleTrackingFS.newDirectoryStream(HandleTrackingFS.java:299)
  2>         at app//org.apache.lucene.tests.mockfile.HandleTrackingFS.newDirectoryStream(HandleTrackingFS.java:299)
  2>         at app//org.apache.lucene.tests.mockfile.FilterFileSystemProvider.newDirectoryStream(FilterFileSystemProvider.java:234)
  2>         at java.base@17.0.1/java.nio.file.Files.newDirectoryStream(Files.java:482)
  2>         at java.base@17.0.1/java.nio.file.FileTreeWalker.visit(FileTreeWalker.java:301)
  2>         at java.base@17.0.1/java.nio.file.FileTreeWalker.next(FileTreeWalker.java:374)
  2>         at java.base@17.0.1/java.nio.file.Files.walkFileTree(Files.java:2845)
  2>         at java.base@17.0.1/java.nio.file.Files.walkFileTree(Files.java:2882)
  2>         at app//org.apache.lucene.util.IOUtils.rm(IOUtils.java:352)
  2>         at app//org.apache.lucene.util.IOUtils.rm(IOUtils.java:330)
  2>         at app//org.apache.lucene.tests.util.TestRuleTemporaryFilesCleanup.afterAlways(TestRuleTemporaryFilesCleanup.java:209)
  2>         at app//com.carrotsearch.randomizedtesting.rules.TestRuleAdapter$1.afterAlways(TestRuleAdapter.java:31)
  2>         at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:43)
  2>         at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  2>         at app//org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  2>         at app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  2>         at app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  2>         at app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  2>         at app//org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
  2>         at app//org.junit.rules.RunRules.evaluate(RunRules.java:20)
  2>         at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  2>         at app//com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
  2>         at java.base@17.0.1/java.lang.Thread.run(Thread.java:833)
  2> Dec 23, 2022 1:18:04 AM com.carrotsearch.randomizedtesting.ThreadLeakControl tryToInterruptAll
  2> INFO: Starting to interrupt leaked threads:
  2>    1) Thread[id=34, name=SUITE-ClusterRerouteIT-seed#[3ADB1F144D29EAD6]-worker, state=RUNNABLE, group=TGRP-ClusterRerouteIT]
  2> Dec 23, 2022 1:18:07 AM com.carrotsearch.randomizedtesting.ThreadLeakControl tryToInterruptAll
  2> SEVERE: There are still zombie threads that couldn't be terminated:
  2>    1) Thread[id=34, name=SUITE-ClusterRerouteIT-seed#[3ADB1F144D29EAD6]-worker, state=RUNNABLE, group=TGRP-ClusterRerouteIT]
  2>         at java.base@17.0.1/sun.nio.fs.UnixPath.getName(UnixPath.java:43)
  2>         at java.base@17.0.1/java.io.FilePermission.containsPath(FilePermission.java:744)
  2>         at java.base@17.0.1/java.io.FilePermission.impliesIgnoreMask(FilePermission.java:611)
  2>         at java.base@17.0.1/java.io.FilePermissionCollection.implies(FilePermission.java:1202)
  2>         at java.base@17.0.1/java.security.Permissions.implies(Permissions.java:177)
  2>         at app//org.opensearch.bootstrap.OpenSearchPolicy.implies(OpenSearchPolicy.java:133)
  2>         at app//org.opensearch.bootstrap.BootstrapForTesting$1.implies(BootstrapForTesting.java:167)
  2>         at java.base@17.0.1/java.security.ProtectionDomain.implies(ProtectionDomain.java:325)
  2>         at java.base@17.0.1/java.security.ProtectionDomain.impliesWithAltFilePerm(ProtectionDomain.java:357)
  2>         at java.base@17.0.1/java.security.AccessControlContext.checkPermission(AccessControlContext.java:463)
  2>         at java.base@17.0.1/java.security.AccessController.checkPermission(AccessController.java:1068)
  2>         at java.base@17.0.1/java.lang.SecurityManager.checkPermission(SecurityManager.java:416)
  2>         at java.base@17.0.1/java.lang.SecurityManager.checkRead(SecurityManager.java:756)
  2>         at java.base@17.0.1/sun.nio.fs.UnixPath.checkRead(UnixPath.java:780)
  2>         at java.base@17.0.1/sun.nio.fs.UnixFileSystemProvider.newDirectoryStream(UnixFileSystemProvider.java:408)
  2>         at app//org.apache.lucene.tests.mockfile.FilterFileSystemProvider.newDirectoryStream(FilterFileSystemProvider.java:234)
  2>         at app//org.apache.lucene.tests.mockfile.FilterFileSystemProvider.newDirectoryStream(FilterFileSystemProvider.java:234)
  2>         at app//org.apache.lucene.tests.mockfile.ShuffleFS.newDirectoryStream(ShuffleFS.java:48)
  2>         at app//org.apache.lucene.tests.mockfile.HandleTrackingFS.newDirectoryStream(HandleTrackingFS.java:299)
  2>         at app//org.apache.lucene.tests.mockfile.HandleTrackingFS.newDirectoryStream(HandleTrackingFS.java:299)
  2>         at app//org.apache.lucene.tests.mockfile.FilterFileSystemProvider.newDirectoryStream(FilterFileSystemProvider.java:234)
  2>         at java.base@17.0.1/java.nio.file.Files.newDirectoryStream(Files.java:482)
  2>         at java.base@17.0.1/java.nio.file.FileTreeWalker.visit(FileTreeWalker.java:301)
  2>         at java.base@17.0.1/java.nio.file.FileTreeWalker.next(FileTreeWalker.java:374)
  2>         at java.base@17.0.1/java.nio.file.Files.walkFileTree(Files.java:2845)
  2>         at java.base@17.0.1/java.nio.file.Files.walkFileTree(Files.java:2882)
  2>         at app//org.apache.lucene.util.IOUtils.rm(IOUtils.java:352)
  2>         at app//org.apache.lucene.util.IOUtils.rm(IOUtils.java:330)
  2>         at app//org.apache.lucene.tests.util.TestRuleTemporaryFilesCleanup.afterAlways(TestRuleTemporaryFilesCleanup.java:209)
  2>         at app//com.carrotsearch.randomizedtesting.rules.TestRuleAdapter$1.afterAlways(TestRuleAdapter.java:31)
  2>         at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:43)
  2>         at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  2>         at app//org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  2>         at app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  2>         at app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  2>         at app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  2>         at app//org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
  2>         at app//org.junit.rules.RunRules.evaluate(RunRules.java:20)
  2>         at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  2>         at app//com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
  2>         at java.base@17.0.1/java.lang.Thread.run(Thread.java:833)
  2> java.lang.Exception: Suite timeout exceeded (>= 1200000 msec).
        at __randomizedtesting.SeedInfo.seed([3ADB1F144D29EAD6]:0)
  2> REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.cluster.allocation.ClusterRerouteIT" -Dtests.seed=3ADB1F144D29EAD6 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=en -Dtests.timezone=Etc/UTC -Druntime.java=17
  2> NOTE: test params are: codec=Asserting(Lucene95): {index_uuid=Lucene90, type=Lucene90}, docValues:{}, maxPointsInLeafNode=312, maxMBSortInHeap=7.363112477301868, sim=Asserting(RandomSimilarity(queryNorm=false): {}), locale=en, timezone=America/Lower_Princes
  2> NOTE: Linux 5.11.0-1020-aws amd64/Oracle Corporation 17.0.1 (64-bit)/cpus=48,threads=2,free=415785912,total=2650800128
  2> NOTE: All tests run in this JVM: [ClusterRerouteIT]

Tests with failures:
 - org.opensearch.cluster.allocation.ClusterRerouteIT.testDelayWithALargeAmountOfShards {seed=[3ADB1F144D29EAD6:EF02520AD8D95D8A]}
 - org.opensearch.cluster.allocation.ClusterRerouteIT.classMethod

89 tests completed, 2 failed

> Task :server:internalClusterTest FAILED

FAILURE: Build failed with an exception.

@owaiskazi19: With a suite timeout of 1200000 ms (20 mins) shared across 100 iterations, each iteration gets only about 12 sec to finish, which is too little. If you want to run the iterations back to back, I suggest raising the timeout to a higher value.

Agree this is problematic: the test timeout should take the iteration count into account and should have 100 * 120000 ms as the total run timeout.
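For reference, a minimal sketch of the budget arithmetic above. The class and method names are hypothetical and not part of the OpenSearch test framework; the only inputs taken from this thread are the 1200000 ms suite timeout and the 100 iterations.

```java
import java.time.Duration;

// Hypothetical helper, for illustration only: not part of OpenSearch or randomizedtesting.
public class SuiteTimeoutBudget {

    // Time available to each iteration when one suite timeout is shared evenly.
    static Duration perIterationBudget(Duration suiteTimeout, int iterations) {
        return suiteTimeout.dividedBy(iterations);
    }

    // Suite timeout needed so that every iteration gets the given budget.
    static Duration requiredSuiteTimeout(Duration perIteration, int iterations) {
        return perIteration.multipliedBy(iterations);
    }

    public static void main(String[] args) {
        Duration suiteTimeout = Duration.ofMillis(1_200_000); // value from the log
        int iterations = 100;                                 // -Dtests.iters=100

        // 1200000 ms / 100 iterations = 12000 ms per iteration
        System.out.println(perIterationBudget(suiteTimeout, iterations).toMillis() + " ms per iteration");

        // Assumed example: an 18 s per-iteration budget needs a 30 min suite timeout
        System.out.println(requiredSuiteTimeout(Duration.ofSeconds(18), iterations).toMinutes() + " min suite timeout");
    }
}
```

The point is only that the suite timeout is a fixed budget for the whole run, so the per-iteration share shrinks as the iteration count grows.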

@owaiskazi19
Member

Increasing the test timeout to 30 mins worked. Thanks @dreamer-89 for the input.
Ran the test 100 times in isolation and wasn't able to reproduce the flaky failure.

./gradlew ':server:internalClusterTest' --tests "org.opensearch.cluster.allocation.ClusterRerouteIT.testDelayWithALargeAmountOfShards" -Dtests.seed=3ADB1F144D29EAD6 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=en -Dtests.timezone=America/Lower_Princes -Druntime.java=17 -Dtests.iters=100

> Configure project :qa:os
Cannot add task 'destructiveDistroTest.docker' as a task with that name already exists.
=======================================
OpenSearch Build Hamster says Hello!
  Gradle Version        : 7.6
  OS Info               : Linux 5.11.0-1020-aws (amd64)
  Runtime JDK Version   : 17 (Oracle JDK)
  Runtime java.home     : /usr/lib/jvm/java-17/jdk-17.0.1
  Gradle JDK Version    : 17 (Oracle JDK)
  Gradle java.home      : /usr/lib/jvm/java-17/jdk-17.0.1
  Random Testing Seed   : 3ADB1F144D29EAD6
  In FIPS 140 mode      : false
=======================================

> Task :test:framework:compileJava
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

> Task :server:compileTestJava
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

> Task :server:compileInternalClusterTestJava
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

> Task :server:internalClusterTest
WARNING: A terminally deprecated method in java.lang.System has been called
WARNING: System::setSecurityManager has been called by org.opensearch.bootstrap.BootstrapForTesting (file:/home/ubuntu/OpenSearch/test/framework/build/distributions/framework-3.0.0-SNAPSHOT.jar)
WARNING: Please consider reporting this to the maintainers of org.opensearch.bootstrap.BootstrapForTesting
WARNING: System::setSecurityManager will be removed in a future release
WARNING: A terminally deprecated method in java.lang.System has been called
WARNING: System::setSecurityManager has been called by org.gradle.api.internal.tasks.testing.worker.TestWorker (file:/home/ubuntu/.gradle/wrapper/dists/gradle-7.6-all/9f832ih6bniajn45pbmqhk2cw/gradle-7.6/lib/plugins/gradle-testing-base-7.6.jar)
WARNING: Please consider reporting this to the maintainers of org.gradle.api.internal.tasks.testing.worker.TestWorker
WARNING: System::setSecurityManager will be removed in a future release

BUILD SUCCESSFUL in 24m 14s
43 actionable tasks: 5 executed, 38 up-to-date
ubuntu@ip-172-31-56-214:~/OpenSearch$ 

@gaobinlong
Collaborator

@andrross
Member

Tracked by autocut: #14298
