[BUG] ShardNotFoundException during IndicesRequestCache clean up #14190

Closed
sgup432 opened this issue Jun 11, 2024 · 0 comments · Fixed by #14219
Assignees
Labels
bug Something isn't working v2.15.0 Issues and PRs related to version 2.15.0

Comments

@sgup432
Contributor

sgup432 commented Jun 11, 2024

Describe the bug

We observed a bug in 2.13/2.14 where the following stack trace appears during RequestCache cleanup:

Exception during periodic indices request cache cleanup:
<shard_name> ShardNotFoundException[no such shard]
        at org.opensearch.index.IndexService.getShard(IndexService.java:351)
        at org.opensearch.indices.IndicesService.lambda$new$0(IndicesService.java:431)
        at org.opensearch.indices.IndicesRequestCache$IndicesRequestCacheCleanupManager.cleanCache(IndicesRequestCache.java:658)
        at org.opensearch.indices.IndicesRequestCache$IndicesRequestCacheCleanupManager.cleanCache(IndicesRequestCache.java:609)
        at org.opensearch.indices.IndicesRequestCache$IndicesRequestCacheCleanupManager$IndicesRequestCacheCleaner.run(IndicesRequestCache.java:737)
        at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:863)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at java.base/java.lang.Thread.run(Thread.java:840)

This happens when we try to clean up the cached entries for an index shard that has been allocated to another node or deleted from this node. The exception above is thrown from here: https://github.com/opensearch-project/OpenSearch/blame/main/server/src/main/java/org/opensearch/indices/IndicesService.java#L407

Related component

Search:Performance

To Reproduce

  • Cache entries in the request cache for indexShard A on node-1
  • Move indexShard A to another node, node-2
  • Try clearing the cache entries on node-1
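
The three steps above can be modelled outside OpenSearch with a small sketch. Everything in it is a hypothetical stand-in (`CleanupRepro`, `localShards`, `staleKeys`, and the `IllegalStateException` used in place of `ShardNotFoundException`); it only illustrates how a strict shard lookup inside the cleanup loop can abort the pass and leave the stale entry behind.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the scenario; none of these are OpenSearch classes.
final class CleanupRepro {
    /** Shards currently allocated to this node. */
    final Set<String> localShards = new HashSet<>();
    /** Cached entries: cache key -> name of the shard that owns the entry. */
    final Map<String, String> cache = new HashMap<>();
    /** Keys that the cleanup pass is supposed to remove. */
    final Set<String> staleKeys = new HashSet<>();

    /** Strict lookup: throws when the shard is no longer on this node. */
    String getShard(String shard) {
        if (!localShards.contains(shard)) {
            throw new IllegalStateException("no such shard: " + shard);
        }
        return shard;
    }

    /** Cleanup pass that resolves the owning shard before touching an entry. */
    void cleanCache() {
        Iterator<Map.Entry<String, String>> it = cache.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, String> entry = it.next();
            getShard(entry.getValue()); // throws for a relocated shard
            if (staleKeys.contains(entry.getKey())) {
                it.remove();
            }
        }
    }

    /** Runs the three steps and returns how many entries are left on node-1. */
    static int reproduce() {
        CleanupRepro node1 = new CleanupRepro();
        node1.localShards.add("A");
        node1.cache.put("query-1", "A");   // step 1: cache an entry for shard A
        node1.localShards.remove("A");     // step 2: shard A moves to node-2
        node1.staleKeys.add("query-1");    // the entry is now stale on node-1
        try {
            node1.cleanCache();            // step 3: cleanup on node-1
        } catch (IllegalStateException e) {
            System.out.println("cleanup failed: " + e.getMessage());
        }
        return node1.cache.size();
    }

    public static void main(String[] args) {
        System.out.println("entries left after cleanup: " + reproduce());
    }
}
```

In this model the lookup throws before the stale entry can be removed, so the entry is still in the cache after the pass.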

Expected behavior

We should not see these exceptions during cache cleanup. When they occur, the cleanup fails to clear the stale entries from the cache, which prevents new entries from being cached and indirectly hurts performance.
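
One way to get this behavior is to make the shard lookup tolerant, so that a missing shard marks the entry as stale instead of raising. The sketch below shows that idea only; it is not the change made in #14219, and `TolerantCleanup`, `findShard`, and the string-keyed maps are assumptions made for illustration rather than OpenSearch APIs.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch of a tolerant cleanup; not OpenSearch code.
final class TolerantCleanup {
    /** Lookup that reports a missing shard as empty instead of throwing. */
    static Optional<String> findShard(Set<String> localShards, String shard) {
        return localShards.contains(shard) ? Optional.of(shard) : Optional.empty();
    }

    /**
     * Removes every entry whose owning shard is no longer on this node.
     * The cache maps a cache key to the name of its owning shard.
     * Returns the number of entries removed.
     */
    static int cleanCache(Map<String, String> cache, Set<String> localShards) {
        int before = cache.size();
        // Removing through the values view also removes the map entry.
        cache.values().removeIf(owner -> !findShard(localShards, owner).isPresent());
        return before - cache.size();
    }

    public static void main(String[] args) {
        Map<String, String> cache = new HashMap<>();
        cache.put("query-1", "A"); // shard A has moved to another node
        cache.put("query-2", "B"); // shard B is still local
        Set<String> localShards = new HashSet<>();
        localShards.add("B");

        int removed = cleanCache(cache, localShards);
        System.out.println("removed " + removed + ", remaining keys " + cache.keySet());
    }
}
```

With this shape the cleanup pass completes even when a shard has moved, and the entries belonging to the missing shard are the ones that get dropped.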


Projects
Status: Done