Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem with Snapshots using SLM when index name has a "_" #62164

Open
kuba2915 opened this issue Sep 9, 2020 · 12 comments
Open

Problem with Snapshots using SLM when index name has a "_" #62164

kuba2915 opened this issue Sep 9, 2020 · 12 comments
Labels
>bug :Data Management/ILM+SLM Index and Snapshot lifecycle management Team:Data Management Meta label for data/management team

Comments

@kuba2915
Copy link

kuba2915 commented Sep 9, 2020

Elasticsearch version (bin/elasticsearch --version): 7.9.0

Plugins installed: [] -

JVM version (java -version): docker container

OS version (uname -a if on a Unix-like system): docker container | RHEL7

Description of the problem including expected versus actual behavior: unable to create snapshot of index with "" in index name. When index has a name with "", expressions for date of snapped index doesn't working

Steps to reproduce: Create SLM policy for index with "_"

Please include a minimal but complete recreation of the problem,
including (e.g.) index creation, mappings, settings, query etc. The easier
you make for us to reproduce it, the more likely that somebody will take the
time to look at it.

  1. Have index which name have "_"
  2. Create snapshot policy using name of index, and day expression

Provide logs (if relevant):

<nginx%5Ftst-{now/d-1d}> In this case date expressions works properly

{
"type": "index_not_found_exception",
"reason": "no such index [nginx%5Ftst-2020.09.08]",
"resource.type": "index_or_alias",
"resource.id": " nginx%5Ftst-2020.09.08",
"index_uuid": "na",
"index": " nginx%5F tst-2020.09.08",
"stack_trace": "[nginx%5Ftst-2020.09.08] IndexNotFoundException[no such index [nginx%5Ftst-2020.09.08]]\n\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.indexNotFoundException(IndexNameExpressionResolver.java:882)\n\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.innerResolve(IndexNameExpressionResolver.java:819)\n\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.resolve(IndexNameExpressionResolver.java:775)\n\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:191)\n\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:171)\n\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:76)\n\tat org.elasticsearch.snapshots.SnapshotsService$2.execute(SnapshotsService.java:402)\n\tat org.elasticsearch.repositories.blobstore.BlobStoreRepository$1.execute(BlobStoreRepository.java:381)\n\tat org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:47)\n\tat org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:702)\n\tat org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:324)\n\tat org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:219)\n\tat org.elasticsearch.cluster.service.MasterService.access$000(MasterService.java:73)\n\tat org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:151)\n\tat org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150)\n\tat org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:651)\n\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)\n\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)\n\tat java.base/java.lang.Thread.run(Thread.java:832)\n"
}

<nginx_tst-{now/d-1d}>
{
“type": "index_not_found_exception",
"reason": "no such index [<nginx_tst-{now/d-1d}>]",
"resource.type": "index_or_alias",
"resource.id": "<nginx_tst-{now/d-1d}>",
"index_uuid": "na",
"index": "<nginx_tst-{now/d-1d}>",
"stack_trace": "[<nginx_tst-{now/d-1d}>] IndexNotFoundException[no such index [<nginx_tst-{now/d-1d}>]]\n\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.indexNotFoundException(IndexNameExpressionResolver.java:882)\n\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.innerResolve(IndexNameExpressionResolver.java:819)\n\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.resolve(IndexNameExpressionResolver.java:775)\n\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.dataStreamNames(IndexNameExpressionResolver.java:122)\n\tat org.elasticsearch.snapshots.SnapshotsService$2.execute(SnapshotsService.java:405)\n\tat org.elasticsearch.repositories.blobstore.BlobStoreRepository$1.execute(BlobStoreRepository.java:381)\n\tat org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:47)\n\tat org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:702)\n\tat org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:324)\n\tat org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:219)\n\tat org.elasticsearch.cluster.service.MasterService.access$000(MasterService.java:73)\n\tat org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:151)\n\tat org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150)\n\tat org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:651)\n\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)\n\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)\n\tat java.base/java.lang.Thread.run(Thread.java:832)\n"
}

@kuba2915 kuba2915 added >bug needs:triage Requires assignment of a team area label labels Sep 9, 2020
@dakrone
Copy link
Member

dakrone commented Sep 9, 2020

@kuba2915

  1. Create snapshot policy using name of index, and day expression

Can you provide the SLM policy you are using? It's difficult to tell from your sentence where you are using the day expression.

@kuba2915
Copy link
Author

kuba2915 commented Sep 9, 2020

{ "with_" : { "version" : 4, "modified_date_millis" : 1599681631964, "policy" : { "name" : "with_", "schedule" : "0 30 1 * * ?", "repository" : "nginx_test", "config" : { "indices" : [ "<nginx_tst-{now/d-1d}>" ] } }, "last_success" : { "snapshot_name" : "with_-2y2ftmxprmgdyrbcjpli5w", "time" : 1599681022586 }, "last_failure" : { "snapshot_name" : "with_-psxrru54rksiac5ztkcdna", "time" : 1599681635703, "details" : """{"type":"index_not_found_exception","reason":"no such index [<nginx_tst-{now/d-1d}>]","resource.type":"index_or_alias","resource.id":"<nginx_tst-{now/d-1d}>","index_uuid":"_na_","index":"<nginx_tst-{now/d-1d}>","stack_trace":"[<nginx_tst-{now/d-1d}>] IndexNotFoundException[no such index [<nginx_tst-{now/d-1d}>]]\n\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.indexNotFoundException(IndexNameExpressionResolver.java:882)\n\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.innerResolve(IndexNameExpressionResolver.java:819)\n\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.resolve(IndexNameExpressionResolver.java:775)\n\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.dataStreamNames(IndexNameExpressionResolver.java:122)\n\tat org.elasticsearch.snapshots.SnapshotsService$2.execute(SnapshotsService.java:405)\n\tat org.elasticsearch.repositories.blobstore.BlobStoreRepository$1.execute(BlobStoreRepository.java:381)\n\tat org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:47)\n\tat org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:702)\n\tat org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:324)\n\tat org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:219)\n\tat org.elasticsearch.cluster.service.MasterService.access$000(MasterService.java:73)\n\tat org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:151)\n\tat org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150)\n\tat org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:651)\n\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)\n\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)\n\tat java.base/java.lang.Thread.run(Thread.java:832)\n"}""" }, "next_execution_millis" : 1599701400000, "stats" : { "policy" : "with_", "snapshots_taken" : 0, "snapshots_failed" : 8, "snapshots_deleted" : 0, "snapshot_deletion_failures" : 0 } } }

@dakrone
Copy link
Member

dakrone commented Sep 9, 2020

@kuba2915 that is the problem, this line in your configuration:

"indices" : [ "<nginx_tst-{now/d-1d}>" ]

Snapshots do not support specifying the indices to be snapshotted using the date math format, only the snapshot_name field in SLM policies supports date math.

@kuba2915
Copy link
Author

kuba2915 commented Sep 9, 2020

SLM documentation says you can use Date math expression

indices
(Optional, array of strings) Array of data streams and indices to include in snapshots. Date math and wildcard (*) expressions are supported.

According to this site i use <nginx_tst-{now/d-1d}> as a snapshoted index name, and it should resolve into snapshot of index ngnix_tst-2020.09.08

But it doesn't work and I get the log as above.

In version 7.8.0 it worked as it should.

@kuba2915
Copy link
Author

kuba2915 commented Sep 9, 2020

Additionally, when the index does not exist, the function correctly generates the name of the snapped index just as in log above

{ "with_" : { "version" : 4, "modified_date_millis" : 1599681631964, "policy" : { "name" : "with_", "schedule" : "0 30 1 * * ?", "repository" : "nginx_test", "config" : { "indices" : [ "<nginx_tst-{now/d-1d}>" ] } }, "last_success" : { "snapshot_name" : "with_-2y2ftmxprmgdyrbcjpli5w", "time" : 1599681022586 }, "last_failure" : { "snapshot_name" : "with_-o36ly-cor4erwy0dr_kxva", "time" : 1599683078046, "details" : """{"type":"index_not_found_exception","reason":"no such index [nginx_tst-2020.09.08]","resource.type":"index_or_alias","resource.id":"nginx_tst-2020.09.08","index_uuid":"_na_","index":"nginx_tst-2020.09.08","stack_trace":"[nginx_tst-2020.09.08] IndexNotFoundException[no such index [nginx_tst-2020.09.08]]\n\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.indexNotFoundException(IndexNameExpressionResolver.java:882)\n\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.innerResolve(IndexNameExpressionResolver.java:819)\n\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.resolve(IndexNameExpressionResolver.java:775)\n\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:191)\n\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:171)\n\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:76)\n\tat org.elasticsearch.snapshots.SnapshotsService$2.execute(SnapshotsService.java:402)\n\tat org.elasticsearch.repositories.blobstore.BlobStoreRepository$1.execute(BlobStoreRepository.java:381)\n\tat org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:47)\n\tat org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:702)\n\tat org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:324)\n\tat org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:219)\n\tat org.elasticsearch.cluster.service.MasterService.access$000(MasterService.java:73)\n\tat org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:151)\n\tat org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150)\n\tat org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188)\n\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:651)\n\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)\n\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)\n\tat java.base/java.lang.Thread.run(Thread.java:832)\n"}""" }, "next_execution_millis" : 1599701400000, "stats" : { "policy" : "with_", "snapshots_taken" : 0, "snapshots_failed" : 8, "snapshots_deleted" : 0, "snapshot_deletion_failures" : 0 } } }

@dakrone
Copy link
Member

dakrone commented Sep 9, 2020

You are correct, I wasn't aware that we supported that.

I did just try this with 7.9.0 however (to reproduce it), and found that it worked correctly. Do you have an index that matches the <nginx_tst-{now/d-1d}> pattern? It sounds like it has failed because the index doesn't exist and "ignore_unavailable" is not set to true.

@kuba2915
Copy link
Author

kuba2915 commented Sep 9, 2020

This also applies to version 7.9.1, I have the same problem there. I also tested the behavior when the index matching the pattern exists, and when it is removed from the cluster

@matriv matriv added the :Data Management/ILM+SLM Index and Snapshot lifecycle management label Sep 10, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-features (:Core/Features/ILM+SLM)

@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Sep 10, 2020
@ajoliveira
Copy link

This doesn't appear to be specific to the "_" character because I can reproduce the below behavior in both 7.9 and 7.9.1. It looks like it's more related to the date math pattern and ignore_unavailable setting, the odd thing is:

Testing using any date math pattern, with "ignore_unavailable": false, if the index does not exist, the error indicates that the date math substitution works (but of course the index does not exist so the snapshot should fail)?

[2020-09-10T19:35:00,005][WARN ][org.elasticsearch.snapshots.SnapshotsService] [instance-0000000000] [found-snapshots][nginx_tst-2020.09.09-6j854r97sa6fvdymxtq_dw] failed to create snapshot
org.elasticsearch.index.IndexNotFoundException: no such index [nginx-tst-2020.09.09]
	at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.indexNotFoundException(IndexNameExpressionResolver.java:882) ~[elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.innerResolve(IndexNameExpressionResolver.java:819) ~[elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.resolve(IndexNameExpressionResolver.java:775) ~[elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:191) ~[elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:171) ~[elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:76) ~[elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.snapshots.SnapshotsService$2.execute(SnapshotsService.java:402) ~[elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.repositories.blobstore.BlobStoreRepository$1.execute(BlobStoreRepository.java:381) ~[elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:47) ~[elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:702) ~[elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:324) ~[elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:219) [elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.service.MasterService.access$000(MasterService.java:73) [elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:151) [elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) [elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) [elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:651) [elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252) [elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215) [elasticsearch-7.9.0.jar:7.9.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
	at java.lang.Thread.run(Thread.java:832) [?:?]

If the index does exist, the error appears to indicate that the date math substitution is not working but the snapshot is still not taken?

[2020-09-10T19:30:00,006][WARN ][org.elasticsearch.snapshots.SnapshotsService] [instance-0000000000] [found-snapshots][nginx_tst-2020.09.09-g4tzdljbstkrawee50iiig] failed to create snapshot
org.elasticsearch.index.IndexNotFoundException: no such index [<nginx-tst-{now/d-1d}>]
	at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.indexNotFoundException(IndexNameExpressionResolver.java:882) ~[elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.innerResolve(IndexNameExpressionResolver.java:819) ~[elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.resolve(IndexNameExpressionResolver.java:775) ~[elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.dataStreamNames(IndexNameExpressionResolver.java:122) ~[elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.snapshots.SnapshotsService$2.execute(SnapshotsService.java:405) ~[elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.repositories.blobstore.BlobStoreRepository$1.execute(BlobStoreRepository.java:381) ~[elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:47) ~[elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:702) ~[elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:324) ~[elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:219) [elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.service.MasterService.access$000(MasterService.java:73) [elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:151) [elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) [elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) [elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:651) [elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252) [elasticsearch-7.9.0.jar:7.9.0]
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215) [elasticsearch-7.9.0.jar:7.9.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
	at java.lang.Thread.run(Thread.java:832) [?:?]

@kuba2915
Copy link
Author

If the index does exist, the error appears to indicate that the date math substitution is not working but the snapshot is still not taken?
Still not taken, but with "ignore_unavailable": true snapshot is performed.

Now I don't think this is related to the underlining sign. More like issue with date math or something else.

Either the snapshot function has an incorrect condition because it also checks the data streams in new release?

https://www.elastic.co/guide/en/elasticsearch/reference/7.8/create-snapshot-api.html

ignore_unavailable
(Optional, boolean) If false, the request returns an error for any index that is missing or closed. Defaults to false.
If true, the request ignores and indices in indices that are missing or closed.

https://www.elastic.co/guide/en/elasticsearch/reference/7.9/create-snapshot-api.html

ignore_unavailable
(Optional, boolean) If false, the request returns an error for any data stream or index that is missing or closed. Defaults to false.
If true, the request ignores data streams and indices in indices that are missing or closed.

@jimczi jimczi removed the needs:triage Requires assignment of a team area label label Sep 17, 2020
@asabourin
Copy link

asabourin commented Sep 23, 2020

Can confirm we've noticed a similar issue after upgrading to 7.9 from 7.8, without any other change the following SLM policy is was then failing:

{
  "daily-snapshots" : {
    "version" : 12,
    "modified_date_millis" : 1600902333840,
    "policy" : {
      "name" : "<events-{now/d{yyyyMMdd}}>",
      "schedule" : "0 30 1 * * ?",
      "repository" : "gcs",
      "config" : {
        "include_global_state" : false,
        "ignore_unavailable" : false,
        "indices" : [
          "<events-{now/d{yyyyMMdd}}>"
        ]
      },
      "retention" : { }
    }
}
{"type":"index_not_found_exception","reason":"no such index [<events-{now/d{yyyyMMdd}}>]","resource.type":"index_or_alias","resource.id":"<events-{now/d{yyyyMMdd}}>","index_uuid":"_na_","index":"<events-{now/d{yyyyMMdd}}>","stack_trace":"[<events-{now/d{yyyyMMdd}}>] IndexNotFoundException[no such index [<events-{now/d{yyyyMMdd}}>]]\n\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.indexNotFoundException(IndexNameExpressionResolver.java:882)\n\tat

We had to set ignore_unavailable to true to make it work.

@jctello
Copy link

jctello commented Oct 14, 2020

I can confirm this has been broken by 7.9.x
The exact steps to reproduce both the working behavior and the wrong behavior on a clean RPM-based machine are:

ES_version="7.9.2"

yum install https://artifacts.elastic.co/packages/7.x/yum/$ES_version/elasticsearch-$ES_version-x86_64.rpm -y
mkdir -p /mount/elasticsearch_backup
chown elasticsearch: /mount/elasticsearch_backup
echo 'path.repo: ["/mount/elasticsearch_backup"]' >> /etc/elasticsearch/elasticsearch.yml 
systemctl restart elasticsearch

curl -XPUT localhost:9200/_snapshot/elasticsearch_backup -H "Content-Type: application/json" -d'
{
  "type": "fs",
  "settings": {
    "delegate_type": "fs",
    "location": "/mount/elasticsearch_backup",
    "compress": true
  }
}
'
curl -XPUT "http://localhost:9200/%3Cmy-index-%7Bnow%2Fd-1d%7D%3E"
curl -XPUT localhost:9200/_slm/policy/nightly-snapshots -H "Content-Type: application/json" -d'
{
  "schedule": "0 30 1 * * ?", 
  "name": "<nightly-snap-{now/d-1d}>", 
  "repository": "elasticsearch_backup", 
  "config": { 
    "indices": ["<my-index-{now/d-1d}>"],
    "ignore_unavailable": true
  },
  "retention": { 
    "expire_after": "366d", 
    "min_count": 5, 
    "max_count": 370 
  }
}
'
curl -XPUT "http://localhost:9200/_slm/policy/nightly-snapshots/_execute"
curl -XGET "http://localhost:9200/_slm/policy/nightly-snapshots/?pretty"

Result for 7.9.x

{
  "nightly-snapshots" : {
    "version" : 1,
    "modified_date_millis" : 1602702532898,
    "policy" : {
      "name" : "<nightly-snap-{now/d-1d}>",
      "schedule" : "0 30 1 * * ?",
      "repository" : "elasticsearch_backup",
      "config" : {
        "indices" : [
          "<my-index-{now/d-1d}>"
        ]
      },
      "retention" : {
        "expire_after" : "366d",
        "min_count" : 5,
        "max_count" : 370
      }
    },
    "last_failure" : {
      "snapshot_name" : "nightly-snap-2020.10.13-v6tsuwumqyiukkq_mtforq",
      "time" : 1602702538132,
      "details" : "{\"type\":\"index_not_found_exception\",\"reason\":\"no such index [<my-index-{now/d-1d}>]\",\"resource.type\":\"index_or_alias\",\"resource.id\":\"<my-index-{now/d-1d}>\",\"index_uuid\":\"_na_\",\"index\":\"<my-index-{now/d-1d}>\",\"stack_trace\":\"[<my-index-{now/d-1d}>] IndexNotFoundException[no such index [<my-index-{now/d-1d}>]]\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.indexNotFoundException(IndexNameExpressionResolver.java:882)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.innerResolve(IndexNameExpressionResolver.java:819)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.resolve(IndexNameExpressionResolver.java:775)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.dataStreamNames(IndexNameExpressionResolver.java:122)\\n\\tat org.elasticsearch.snapshots.SnapshotsService$2.execute(SnapshotsService.java:405)\\n\\tat org.elasticsearch.repositories.blobstore.BlobStoreRepository$1.execute(BlobStoreRepository.java:381)\\n\\tat org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:47)\\n\\tat org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:702)\\n\\tat org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:324)\\n\\tat org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:219)\\n\\tat org.elasticsearch.cluster.service.MasterService.access$000(MasterService.java:73)\\n\\tat org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:151)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188)\\n\\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:651)\\n\\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)\\n\\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)\\n\\tat java.base/java.lang.Thread.run(Thread.java:832)\\n\"}"
    },
    "next_execution_millis" : 1602725400000,
    "stats" : {
      "policy" : "nightly-snapshots",
      "snapshots_taken" : 0,
      "snapshots_failed" : 1,
      "snapshots_deleted" : 0,
      "snapshot_deletion_failures" : 0
    }
  }
}

Result for 7.8.1:

{
  "nightly-snapshots" : {
    "version" : 1,
    "modified_date_millis" : 1602703933918,
    "policy" : {
      "name" : "<nightly-snap-{now/d-1d}>",
      "schedule" : "0 30 1 * * ?",
      "repository" : "elasticsearch_backup",
      "config" : {
        "indices" : [
          "<my-index-{now/d-1d}>"
        ],
        "ignore_unavailable" : true
      },
      "retention" : {
        "expire_after" : "366d",
        "min_count" : 5,
        "max_count" : 370
      }
    },
    "last_success" : {
      "snapshot_name" : "nightly-snap-2020.10.13-_g0plwbvquqgoikfe8ocbw",
      "time" : 1602703934123
    },
    "next_execution_millis" : 1602725400000,
    "stats" : {
      "policy" : "nightly-snapshots",
      "snapshots_taken" : 1,
      "snapshots_failed" : 0,
      "snapshots_deleted" : 0,
      "snapshot_deletion_failures" : 0
    }
  }
}

Judging from the commits done to the slm folder before 7.9.0 and after 7.8.1, this may have been broken by #57629
Maybe @dakrone has better insight into the way this works and hopefully working towards a fix.

Thank you @ajoliveira and @asabourin for pointing me towards using ignore_unavailable.
It seems that setting this to true enables the policy to work without issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Data Management/ILM+SLM Index and Snapshot lifecycle management Team:Data Management Meta label for data/management team
Projects
None yet
Development

No branches or pull requests

8 participants