Fix restart HCAD detector bug #460
To avoid repeatedly cold starting a model when data is sparse, HCAD keeps a cache that remembers which models have already gone through a cold start. A second cold start attempt for the same model must wait 60 detector intervals. Previously, when stopping a detector, I forgot to clear the cache, so after a restart the cache still remembered the model and cold start was not retried for some time. This PR fixes the bug by clearing the cache when a detector is stopped.
Testing done:
1. Added unit and integration tests.
2. Manually reproduced the issue and verified the fix.
Signed-off-by: Kaituo Li <kaituo@amazon.com>
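The behavior described above can be illustrated with a minimal sketch. This is not the plugin's actual code: the class `ColdStartCacheSketch` and the methods `tryColdStart` and `clear` are hypothetical names, and time is modeled as a plain interval counter. It only shows the idea of a per-detector record of cold start attempts that is dropped when the detector stops.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; names and structure are assumptions, not the plugin's classes.
class ColdStartCacheSketch {
    // Number of detector intervals to wait before a second cold start attempt.
    private final long waitIntervals;
    // detectorId -> (modelId -> interval index of the last cold start attempt)
    private final Map<String, Map<String, Long>> lastAttempt = new HashMap<>();

    ColdStartCacheSketch(long waitIntervals) {
        this.waitIntervals = waitIntervals;
    }

    /** Returns true if a cold start may run now, and records the attempt. */
    synchronized boolean tryColdStart(String detectorId, String modelId, long currentInterval) {
        Map<String, Long> models = lastAttempt.computeIfAbsent(detectorId, k -> new HashMap<>());
        Long last = models.get(modelId);
        if (last != null && currentInterval - last < waitIntervals) {
            // A recent attempt is still remembered, so skip this one.
            return false;
        }
        models.put(modelId, currentInterval);
        return true;
    }

    /** Forget every model of a detector; meant to be called when the detector stops. */
    synchronized void clear(String detectorId) {
        lastAttempt.remove(detectorId);
    }
}
```

With a wait of 60 intervals, a second `tryColdStart` for the same model shortly after the first returns false. Calling `clear` for the detector before restarting it lets the next attempt go through immediately, which is the effect this PR is after.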
waitAllSyncheticDataIngested(data.size(), datasetName, client);
}

private void waitAllSyncheticDataIngested(int expectedSize, String datasetName, RestClient client) throws Exception {
Not for this PR, but could we do something similar for historical tests that are flaky?
yeah, we could
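For reference, one common way to make ingestion-dependent tests less flaky is to poll until the expected number of documents is visible instead of sleeping for a fixed time. The sketch below is an assumption about the general shape of such a helper, not the implementation in this PR: `IngestWaitSketch`, `waitUntilCount`, and its parameters are hypothetical, and the document count is supplied by a callback rather than a real `RestClient` query.

```java
import java.util.function.IntSupplier;

// Hypothetical sketch of a polling wait; not the helper added in this PR.
class IngestWaitSketch {
    /**
     * Polls currentCount until it reaches expectedSize or maxAttempts is exhausted.
     * Returns true if the expected size was observed, false otherwise.
     */
    static boolean waitUntilCount(IntSupplier currentCount, int expectedSize, int maxAttempts, long sleepMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (currentCount.getAsInt() >= expectedSize) {
                return true;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

In a real test the callback would run a count query against the dataset index, and the test would fail if the helper returns false.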
src/test/java/org/opensearch/ad/e2e/DetectionResultEvalutationIT.java
Thanks for fixing this and adding tests
Signed-off-by: Kaituo Li <kaituo@amazon.com>
Codecov Report
@@             Coverage Diff              @@
##               main     #460      +/-   ##
============================================
+ Coverage     78.33%   78.35%   +0.02%
- Complexity     4172     4176       +4
============================================
  Files           296      296
  Lines         17657    17661       +4
  Branches       1879     1879
============================================
+ Hits          13832    13839       +7
+ Misses         2945     2940       -5
- Partials        880      882       +2
* Fix restart HCAD detector bug (cherry picked from commit 9dd9718)
The backport to
To backport manually, run these commands in your terminal:
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-1.2 1.2
# Navigate to the new working tree
cd .worktrees/backport-1.2
# Create a new branch
git switch --create backport/backport-460-to-1.2
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 9dd9718748cd8d6917b10d66f56ca9e8ed117d7e
# Push it to GitHub
git push --set-upstream origin backport/backport-460-to-1.2
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-1.2
Then, create a pull request where the
The backport to
To backport manually, run these commands in your terminal:
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-1.1 1.1
# Navigate to the new working tree
cd .worktrees/backport-1.1
# Create a new branch
git switch --create backport/backport-460-to-1.1
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 9dd9718748cd8d6917b10d66f56ca9e8ed117d7e
# Push it to GitHub
git push --set-upstream origin backport/backport-460-to-1.1
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-1.1
Then, create a pull request where the
The backport to
To backport manually, run these commands in your terminal:
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-1.0 1.0
# Navigate to the new working tree
cd .worktrees/backport-1.0
# Create a new branch
git switch --create backport/backport-460-to-1.0
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 9dd9718748cd8d6917b10d66f56ca9e8ed117d7e
# Push it to GitHub
git push --set-upstream origin backport/backport-460-to-1.0
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-1.0
Then, create a pull request where the
The backport to
To backport manually, run these commands in your terminal:
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-1.x 1.x
# Navigate to the new working tree
cd .worktrees/backport-1.x
# Create a new branch
git switch --create backport/backport-460-to-1.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 9dd9718748cd8d6917b10d66f56ca9e8ed117d7e
# Push it to GitHub
git push --set-upstream origin backport/backport-460-to-1.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-1.x
Then, create a pull request where the
* Fix restart HCAD detector bug (#460)
* Fix restart HCAD detector bug
* Adding test-retry plugin (#456)
* backport cve fix and improve restart IT
Issues Resolved
#400