[AUTOCUT] Gradle Check Flaky Test Report for RemoteStoreClusterStateRestoreIT #14326

opensearch-ci-bot · 2024-06-13T21:37:20Z

Flaky Test Report for `RemoteStoreClusterStateRestoreIT`

Noticed that `RemoteStoreClusterStateRestoreIT` has flaky tests that failed during post-merge actions.

Details

| Git Reference | Merged Pull Request | Build Details | Test Name |
|---|---|---|---|
| `ccf5289` | 14221 | 40884 | `org.opensearch.remotestore.RemoteStoreClusterStateRestoreIT.testFullClusterRestoreGlobalMetadata` |

The other pull requests, besides those involved in post-merge actions, that contain failing tests with the RemoteStoreClusterStateRestoreIT class are:

For more details on the failed tests refer to OpenSearch Gradle Check Metrics dashboard.


shiv0408 · 2024-06-17T13:09:46Z

This test should not be flaky now; the fix was merged in #14230.

shiv0408 · 2024-06-17T13:27:18Z

@prudhvigodithi I wanted to confirm: if a test becomes flaky again after the issue has been resolved, does our workflow create a new issue or re-open the previously closed one?

prudhvigodithi · 2024-06-17T15:50:48Z

Hey @shiv0408, yes: the automation will re-open the issue if it was closed within the last 3 days (this window is configurable); otherwise it will create a new issue. In both cases the issue body will have the latest flaky test information.
Thanks
@getsaurabh02
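
The rule described above can be sketched roughly as follows. This is a minimal illustration, not the actual automation code: the function name `decide_action`, the constant `REOPEN_WINDOW`, and the returned labels are hypothetical, and treating "within 3 days" as an inclusive bound is an assumption. Only the configurable 3-day window comes from the comment.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: 3 days is the default mentioned in the thread; it is described as configurable.
REOPEN_WINDOW = timedelta(days=3)


def decide_action(closed_at: Optional[datetime], now: datetime,
                  window: timedelta = REOPEN_WINDOW) -> str:
    """Pick an action for a flaky test class that already has an issue.

    closed_at is None when the existing issue is still open.
    The labels "update", "reopen" and "create" are hypothetical.
    """
    if closed_at is None:
        return "update"   # issue still open: only refresh the issue body
    if now - closed_at <= window:
        return "reopen"   # closed recently: re-open the same issue
    return "create"       # closed longer ago than the window: create a new issue


if __name__ == "__main__":
    now = datetime(2024, 6, 17, 21, 0, tzinfo=timezone.utc)
    # An issue closed 8 hours earlier falls inside the 3-day window.
    print(decide_action(now - timedelta(hours=8), now))
```

In every branch the issue body would still be refreshed with the latest flaky test information, as the comment notes.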

shiv0408 · 2024-06-17T21:22:09Z

@andrross Do you know how this issue got re-opened? I closed it because the fix was merged. Do we need to keep the issue open?

andrross · 2024-06-17T21:36:11Z

@shiv0408 This is the PR that caused it to reopen: #14345

I think @prudhvigodithi is looking into a similar case. It might be that a PR is open that hasn't rebased with your fix, and that causes it to reopen if the test fails.

shiv0408 · 2024-06-17T21:51:16Z

This PR was merged 3 days ago, but the CI bot opened it a couple of hours ago.

prudhvigodithi · 2024-06-17T21:55:10Z

The automation looks at the last 30 days of build data from post-merge actions and, if it finds a `RemoteStoreClusterStateRestoreIT` failure, re-opens or creates the issue. So even though the fix was pushed, `RemoteStoreClusterStateRestoreIT` was found failing within the past 30 days, hence the existing issue was re-opened.

How about we allow the automation to close the issue? Now that issues have been created for all the flaky tests, we could narrow the window for identifying flaky tests to the last 15 days and auto-close an issue if its test has not failed in that time. That way the user need not worry about closing the issue. WDYT @andrross @shiv0408
Adding @dblock @reta @getsaurabh02
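
A minimal sketch of the auto-close proposal above, assuming the automation has a list of post-merge failure timestamps for the test class. The function name `should_auto_close` and its parameters are hypothetical; only the 15-day window is taken from the comment.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable


def should_auto_close(failure_times: Iterable[datetime], now: datetime,
                      lookback: timedelta = timedelta(days=15)) -> bool:
    """Return True when no post-merge failure falls inside the lookback window."""
    cutoff = now - lookback
    # Any failure at or after the cutoff means the test is still considered flaky.
    return not any(t >= cutoff for t in failure_times)


if __name__ == "__main__":
    now = datetime(2024, 7, 1, tzinfo=timezone.utc)
    last_failure = datetime(2024, 6, 13, tzinfo=timezone.utc)
    # The last failure is 18 days old, so it is outside the 15-day window.
    print(should_auto_close([last_failure], now))
```

With a 30-day lookback instead, the same failure would still count, which matches the re-opening behaviour described in this comment.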

shiv0408 · 2024-06-18T12:49:08Z

> How about we allow the automation to close the issue?

This might give the incorrect impression that the test is still flaky when it may already have been resolved. Anyway, we are going to re-open the issue if the test turns out to be flaky again.

andrross · 2024-06-18T15:26:51Z

@prudhvigodithi If an issue exists and it is closed, and none of the failures are newer than the close date, then can we keep it closed?

prudhvigodithi · 2024-06-18T17:29:39Z

Sure, what we can do is the following:

1. The automation goes back 30 days, checks for failures in post-merge actions, flags the tests, and creates a detailed issue report. The issue body is updated whenever newer information is indexed from the Gradle Check builds. This is the current state.
2. If the issue is closed (assuming the user has fixed the flaky test), the automation should not re-open it unless the data differs from what is shown in the issue body; if anything in the issue body would differ after the close, it should re-open the issue. The data to compare is the markdown table and not the linked PRs, because failures during PR creation can sometimes be genuine. So re-open when a new failure (with a different post-merge commit) is seen after the issue is closed. This should also cover the case where we think a flaky test is fixed but it recurs: the recurrence re-opens the issue with the new data.
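
The proposed re-open rule could look roughly like this. It is a sketch under stated assumptions, not the automation's code: the function names are hypothetical, and it assumes the post-merge commit references can be extracted from the issue body's markdown table and from the latest build data as plain strings.

```python
from typing import Iterable, Set


def new_failure_commits(issue_body_commits: Iterable[str],
                        latest_commits: Iterable[str]) -> Set[str]:
    """Post-merge commits with failures that the issue body does not list yet."""
    return set(latest_commits) - set(issue_body_commits)


def should_reopen(is_closed: bool,
                  issue_body_commits: Iterable[str],
                  latest_commits: Iterable[str]) -> bool:
    """Re-open a closed issue only when a new post-merge failure appears."""
    if not is_closed:
        return False  # an open issue only needs its body refreshed
    return bool(new_failure_commits(issue_body_commits, latest_commits))


if __name__ == "__main__":
    # `ccf5289` is the commit from this issue's table; `abc1234` is a made-up example.
    print(should_reopen(True, ["ccf5289"], ["ccf5289"]))
    print(should_reopen(True, ["ccf5289"], ["ccf5289", "abc1234"]))
```

Comparing commit sets rather than linked PRs follows the point above that failures on open PRs can be genuine and should not trigger a re-open.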

peterzhuamazon · 2024-06-18T17:37:17Z

Can we compare the base of the PRs to determine whether such an issue is legitimate, or whether we should inform the user to rebase?

It is possible that an issue can re-surface due to regression.

At least if we see the PR has an updated base, we can re-open the old issue if needed. Otherwise, inform the user to rebase their branch. Thanks.

shiv0408 · 2024-06-18T18:35:20Z

@prudhvigodithi we should create an issue to add your second point as an enhancement on our current state.

dblock · 2024-07-01T16:34:26Z

[Catch All Triage - Attendees 1, 2, 3, 4, 5]

opensearch-ci-bot added >test-failure Test failure from CI, local build, etc. autocut untriaged labels Jun 13, 2024

prudhvigodithi mentioned this issue Jun 13, 2024

Add additional details on Gradle Check failures autocut issues #13950

Closed

prudhvigodithi added the flaky-test Random test failure that succeeds on second run label Jun 14, 2024

prudhvigodithi mentioned this issue Jun 14, 2024

[BUG] org.opensearch.remotestore.RemoteStoreClusterStateRestoreIT.testFullClusterRestoreGlobalMetadata is flaky #14275

Closed

shiv0408 closed this as completed Jun 17, 2024

opensearch-ci-bot reopened this Jun 17, 2024

andrross added the ClusterManager:RemoteState label Jun 17, 2024

reta mentioned this issue Jun 19, 2024

[BUG] org.opensearch.remotestore.RemoteStoreClusterStateRestoreIT.testDataStreamPostRemoteStateRestore is flaky #11483

Closed

prudhvigodithi mentioned this issue Jun 20, 2024

[Automation Enhancement] Mechanism to close the created Gradle Check AUTOCUT flaky test issues. #14475

Closed

dblock removed the untriaged label Jul 1, 2024
