Adds search backpressure documentation #1790

kolchfa-aws · 2022-11-02T15:17:49Z

Fixes #795

Checklist

[ x] By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and subject to the Developers Certificate of Origin.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

ketanv3 · 2022-11-07T11:45:19Z

_opensearch/search-backpressure.md

+{
+  "error" : {
+    "root_cause" : [
+      {
+        "type" : "task_cancelled_exception",
+        "reason" : "Task is cancelled due to high resource consumption"
+      }
+    ],
+    "type" : "search_phase_execution_exception",
+    "reason" : "all shards failed",
+    "phase" : "query",
+    "grouped" : true,
+    "failed_shards" : [
+      {
+        "shard" : 0,
+        "index" : "nyc_taxis",
+        "node" : "MGkMkg9wREW3IVewZ7U_jw",
+        "reason" : {
+          "type" : "task_cancelled_exception",
+          "reason" : "Task is cancelled due to high resource consumption"
+        }
+      }
+    ]
+  },
+  "status" : 500
+}


Sharing an up-to-date sample cancellation response. Can you please update?

{ "error": { "root_cause": [ { "type": "task_cancelled_exception", "reason": "cancelled task with reason: cpu usage exceeded [17.9ms >= 15ms], elapsed time exceeded [1.1s >= 300ms]" }, { "type": "task_cancelled_exception", "reason": "cancelled task with reason: elapsed time exceeded [1.1s >= 300ms]" } ], "type": "search_phase_execution_exception", "reason": "all shards failed", "phase": "query", "grouped": true, "failed_shards": [ { "shard": 0, "index": "foobar", "node": "7yIqOeMfRyWW1rHs2S4byw", "reason": { "type": "task_cancelled_exception", "reason": "cancelled task with reason: cpu usage exceeded [17.9ms >= 15ms], elapsed time exceeded [1.1s >= 300ms]" } }, { "shard": 1, "index": "foobar", "node": "7yIqOeMfRyWW1rHs2S4byw", "reason": { "type": "task_cancelled_exception", "reason": "cancelled task with reason: elapsed time exceeded [1.1s >= 300ms]" } } ] }, "status": 500 }

ketanv3 · 2022-11-07T11:56:33Z

_opensearch/search-backpressure.md

+
+An observer thread tracks the resource consumption of each task thread. It measures the resource consumption at several checkpoints during the query phase of a shard search request. If the node is determined to be under duress based on the JVM memory pressure and CPU utilization, the server examines the resource consumption for each search task. It determines if the CPU usage and elapsed time are within their fixed thresholds, and it compares the heap usage against the rolling average of the heap usage of the 100 most recent tasks. If the task is among the most resource-intensive based on these criteria, the task in canceled. 
+
+Every minute OpenSearch can cancel at most 1% of the number of currently running search shard tasks. Once a task is canceled, OpenSearch monitors the node for the next two seconds to determine if it is still under duress.


I suggest re-wording this slightly.

OpenSearch limits the number of cancellations as a fraction of successful task completions and cancellations per unit time. It continues to monitor and cancel tasks until the node is out of duress.

ketanv3 · 2022-11-07T12:11:11Z

_opensearch/search-backpressure.md

+- Heap usage
+- Elapsed time 
+
+An observer thread tracks the resource consumption of each task thread. It measures the resource consumption at several checkpoints during the query phase of a shard search request. If the node is determined to be under duress based on the JVM memory pressure and CPU utilization, the server examines the resource consumption for each search task. It determines if the CPU usage and elapsed time are within their fixed thresholds, and it compares the heap usage against the rolling average of the heap usage of the 100 most recent tasks. If the task is among the most resource-intensive based on these criteria, the task in canceled. 


I suggest re-wording this slightly.

An observer thread periodically measures the resource usage of the node. If the node is determined to be under duress, then the resource usage of each search shard task is examined and compared against some tunable thresholds. CPU usage, heap usage and elapsed time are considered to give each task a cancellation score which is then used to cancel the most resource-intensive tasks.

ketanv3 · 2022-11-07T12:12:33Z

_opensearch/search-backpressure.md

+
+## Canceled queries
+
+If a query is canceled, instead of receiving search results you receive an error from the server similar to the error below:


Depending on what all shards failed, it may be possible that OpenSearch returns partial results. Can we re-word this slightly?

ketanv3 · 2022-11-07T12:15:51Z

_opensearch/search-backpressure.md

+To retrieve the stats, use the following request:
+
+```json
+GET _nodes/stats/search_backpressure


nit: The response below is with human-readable fields enabled. Can you update this to:

GET _nodes/stats/search_backpressure?human=true

ketanv3 · 2022-11-07T12:21:09Z

_opensearch/search-backpressure.md

+cancellation_count | Integer | The number of tasks canceled because of excessive heap usage since the node last restarted.
+current_max_bytes | Integer | The maximum heap usage for all tasks currently running on the node, in bytes.
+current_avg_bytes | Integer | The average heap usage for all tasks currently running on the node, in bytes.
+rolling_avg_bytes | Integer | The rolling average heap usage for the 100 most recent tasks, in bytes.


nit: Rolling average is not hard-coded to work with just 100 most recent tasks.

This is a configurable setting defined by search_backpressure.search_shard_task.heap_moving_average_window_size. 100 is just the default value for it.

ketanv3 · 2022-11-07T12:51:36Z

_opensearch/search-backpressure.md

+## Search backpressure settings
+
+Search backpressure adds several settings to the standard OpenSearch cluster settings. These settings are dynamic, so you can change the default behavior of this feature without restarting your cluster.
+
+Setting | Default | Description
+:--- | :--- | :---
+search_backpressure.<br>&nbsp;&nbsp;&nbsp;&nbsp;mode | `monitor_only` | The [mode](#search-backpressure-modes) for search backpressure. Valid values are `monitor_only`, `enforced`, or `disabled`.
+search_backpressure.<br>&nbsp;&nbsp;&nbsp;&nbsp;interval | 1 second | The interval at which the observer thread measures the resource consumption and cancels tasks.
+search_backpressure.<br>&nbsp;&nbsp;&nbsp;&nbsp;cancellation_ratio | 10% | The maximum percentage of tasks to cancel out of the number of successful task completions.
+search_backpressure.<br>&nbsp;&nbsp;&nbsp;&nbsp;cancellation_rate | 0.003 | The maximum number of tasks to cancel per millisecond.
+search_backpressure.<br>&nbsp;&nbsp;&nbsp;&nbsp;cancellation_burst | 10 | The maximum number of tasks that can be canceled before no further cancellations are made.
+search_backpressure.<br>&nbsp;&nbsp;&nbsp;&nbsp;node_duress.<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;num_consecutive_breaches | 3 | The number of consecutive limit breaches after which the node is marked in duress.
+search_backpressure.<br>&nbsp;&nbsp;&nbsp;&nbsp;node_duress.<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;cpu_threshold | 90% | The CPU usage threshold (in percentage) for a node to be considered in duress.
+search_backpressure.<br>&nbsp;&nbsp;&nbsp;&nbsp;node_duress.<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;heap_threshold | 70% | The heap usage threshold (in percentage) for a node to be considered in duress.
+search_backpressure.<br>&nbsp;&nbsp;&nbsp;&nbsp;search_heap_threshold | 5% | The heap usage threshold (in percentage) for the sum of heap usages across all search tasks before server-side cancellation is applied.
+search_backpressure.<br>&nbsp;&nbsp;&nbsp;&nbsp;search_task_heap_threshold | 0.5% | The heap usage threshold (in percentage) for one task before it is considered for cancellation.
+search_backpressure.<br>&nbsp;&nbsp;&nbsp;&nbsp;search_task_heap_variance | 2 | The heap usage variance for one task before it is considered for cancellation. A task is considered for cancellation when `taskHeapUsage` is greater than or equal to `heapUsageMovingAverage` &middot; `variance`.
+search_backpressure.<br>&nbsp;&nbsp;&nbsp;&nbsp;search_task_cpu_time_threshold | 15 seconds | The CPU usage threshold (in milliseconds) for one task before it is considered for cancellation.
+search_backpressure.<br>&nbsp;&nbsp;&nbsp;&nbsp;search_task_elapsed_time_threshold | 30 seconds | The elapsed time threshold (in milliseconds) for one task before it is considered for cancellation.


Some of these settings have been renamed. Sharing the latest ones along with simplified descriptions.

search_backpressure.mode (default monitor_only)
The search backpressure mode. Valid values are monitor_only, enforced, or disabled.

search_backpressure.interval_millis (default 1 second)
The interval at which the observer thread measures the resource usage and cancels tasks.

search_backpressure.cancellation_ratio (default 10%)
The maximum number of tasks to cancel as a fraction of successful task completions.

search_backpressure.cancellation_rate (default 0.003)
The maximum number of tasks to cancel per millisecond of elapsed time.

search_backpressure.cancellation_burst (default 10)
The maximum number of tasks to cancel in a single iteration of the observer thread.

search_backpressure.node_duress.num_successive_breaches (default 3)
The number of of successive limit breaches after which the node is considered under duress.

search_backpressure.node_duress.cpu_threshold (default 90%)
The CPU usage threshold (in percentage) for a node to be considered in duress.

search_backpressure.node_duress.heap_threshold (default 70%)
The heap usage threshold (in percentage) for a node to be considered in duress.

search_backpressure.search_shard_task.total_heap_percent_threshold (default 5%)
The heap usage threshold (in percentage) for the sum of all search shard tasks before cancellation is applied.

search_backpressure.search_shard_task.heap_percent_threshold (default 0.5%)
The heap usage threshold (in percentage) for a single search shard task before it is considered for cancellation.

search_backpressure.search_shard_task.heap_variance (default 2.0)
The minimum variance of a single search shard task's heap usage usage compared to the rolling average of previously completed tasks before it is considered for cancellation.

search_backpressure.search_shard_task.heap_moving_average_window_size (default 100)
The number of previously completed search shard tasks to consider when calculating the moving average of heap usage.

search_backpressure.search_shard_task.cpu_time_millis_threshold (default 15 seconds)
The CPU usage threshold (in milliseconds) for a single search shard task before it is considered for cancellation.

search_backpressure.search_shard_task.elapsed_time_millis_threshold (default 30 seconds)
The elapsed time threshold (in milliseconds) for a single search shard task before it is considered for cancellation.

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

ketanv3 · 2022-11-07T14:04:06Z

Thanks for these changes. LGTM!

nssuresh2007

Looks good to me!

vagimeli

Minimal changes. LGTM.

vagimeli · 2022-11-07T21:12:14Z

_opensearch/search-backpressure.md

+- Heap usage
+- Elapsed time 
+
+An observer thread periodically measures the resource usage of the node. If the node is determined to be under duress, OpenSearch examines the resource usage of each search shard task and compares it against configurable thresholds. OpenSearch considers CPU usage, heap usage and elapsed time and assigns each task a cancellation score that is then used to cancel the most resource-intensive tasks.


Third sentence: insert comma after "heap usage."

vagimeli · 2022-11-07T21:19:33Z

_opensearch/search-backpressure.md

+
+## Canceled queries
+
+If a query is canceled, OpenSearch may return partial results in case some shards failed. If all shards failed, OpenSearch returns an error from the server similar to the error below:


To clarify the second part of this sentence, I bolded my suggestion: "If a query is canceled, OpenSearch may return partial results if some shards failed."

vagimeli · 2022-11-07T21:24:43Z

_opensearch/search-backpressure.md

+search_backpressure.<br>&nbsp;&nbsp;&nbsp;&nbsp;node_duress.<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;heap_threshold | 70% | The heap usage threshold (in percentage) for a node to be considered in duress.
+search_backpressure.<br>&nbsp;&nbsp;&nbsp;&nbsp;search_shard_task.<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;total_heap_percent_threshold | 5% | The heap usage threshold (in percentage) for the sum of heap usages of all search shard tasks before cancellation is applied.
+search_backpressure.<br>&nbsp;&nbsp;&nbsp;&nbsp;search_shard_task.<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;heap_percent_threshold | 0.5% | The heap usage threshold (in percentage) for a single search shard task before it is considered for cancellation.
+search_backpressure.<br>&nbsp;&nbsp;&nbsp;&nbsp;search_shard_task.<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;heap_variance | 2.0 | The minimum variance of a single search shard task's heap usage usage compared to the rolling average of previously completed tasks before it is considered for cancellation.


Delete second "usage" after "heap usage."

vagimeli · 2022-11-07T21:38:36Z

_opensearch/search-backpressure.md

+
+Field Name | Data Type | Description
+:--- | :--- | :---
+search_backpressure | Object | Contains statistics about search backpressure.


For consistency, either delete the verb "contains" in line 173 and 174 or add a verb to each description in lines 175-177.

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

ariamarble

LGTM

natebower

@kolchfa-aws Please see my comments and changes and let me know if you have any questions. Thanks!

_opensearch/search-backpressure.md

natebower · 2022-11-08T18:28:16Z

_opensearch/search-backpressure.md

+- Heap usage
+- Elapsed time 
+
+An observer thread periodically measures the resource usage of the node. If the node is determined to be under duress, OpenSearch examines the resource usage of each search shard task and compares it against configurable thresholds. OpenSearch considers CPU usage, heap usage, and elapsed time and assigns each task a cancellation score that is then used to cancel the most resource-intensive tasks.


Is the node being "determined to be under duress" standard language in this context? It reads a bit strangely to me, as though we're anthropomorphizing the node. Do we mean something like "If the node continues to be under duress"?

It's not "continues", it is whether it becomes under duress. So, we're checking periodically for the node health. Normally, it's not under duress. If we determine that it is under duress, then we remediate.

natebower · 2022-11-08T18:29:25Z

_opensearch/search-backpressure.md

+
+An observer thread periodically measures the resource usage of the node. If the node is determined to be under duress, OpenSearch examines the resource usage of each search shard task and compares it against configurable thresholds. OpenSearch considers CPU usage, heap usage, and elapsed time and assigns each task a cancellation score that is then used to cancel the most resource-intensive tasks.
+
+OpenSearch limits the number of cancellations as a fraction of successful task completions and cancellations per unit time. It continues to monitor and cancel tasks until the node is no longer under duress.


Suggested change

OpenSearch limits the number of cancellations as a fraction of successful task completions and cancellations per unit time. It continues to monitor and cancel tasks until the node is no longer under duress.

OpenSearch limits the number of cancellations to a fraction of successful task completions and cancellations per unit time. It continues to monitor and cancel tasks until the node is no longer under duress.

Reworded for clarity.

_opensearch/search-backpressure.md

Co-authored-by: Nate Bower <nbower@amazon.com>

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

Adds search backpressure documentation

c338f85

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

kolchfa-aws self-assigned this Nov 2, 2022

kolchfa-aws added 3 commits November 2, 2022 13:28

Updated response

41ee97a

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

Removed last cancelled task

3d198ad

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

Fixed typo

45e496d

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

kolchfa-aws added 3 - Tech review PR: Tech review in progress v2.4.0 'Issues and PRs related to version v2.4.0' labels Nov 5, 2022

ketanv3 suggested changes Nov 7, 2022

View reviewed changes

ketanv3 reviewed Nov 7, 2022

View reviewed changes

Incorporated tech review feedback

c5fded6

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

ketanv3 approved these changes Nov 7, 2022

View reviewed changes

kolchfa-aws added 4 - Doc review PR: Doc review in progress and removed 3 - Tech review PR: Tech review in progress labels Nov 7, 2022

kolchfa-aws marked this pull request as ready for review November 7, 2022 14:31

kolchfa-aws requested a review from a team as a code owner November 7, 2022 14:31

nssuresh2007 approved these changes Nov 7, 2022

View reviewed changes

Naarcha-AWS self-requested a review November 7, 2022 20:33

vagimeli approved these changes Nov 7, 2022

View reviewed changes

Incorporated doc review feedback

aed77cb

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

ariamarble approved these changes Nov 8, 2022

View reviewed changes

natebower reviewed Nov 8, 2022

View reviewed changes

kolchfa-aws and others added 4 commits November 10, 2022 09:23

Update _opensearch/search-backpressure.md

3f9ac08

Co-authored-by: Nate Bower <nbower@amazon.com>

Apply suggestions from code review

e15b8bd

Co-authored-by: Nate Bower <nbower@amazon.com>

Rewording

37502a2

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

Minor rewording for clarity

678cd57

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

kolchfa-aws merged commit 91caf0c into main Nov 10, 2022

Naarcha-AWS deleted the server-side-cancellation branch December 13, 2022 19:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adds search backpressure documentation #1790

Adds search backpressure documentation #1790

kolchfa-aws commented Nov 2, 2022

ketanv3 Nov 7, 2022

ketanv3 Nov 7, 2022

ketanv3 Nov 7, 2022

ketanv3 Nov 7, 2022

ketanv3 Nov 7, 2022

ketanv3 Nov 7, 2022

ketanv3 Nov 7, 2022

ketanv3 commented Nov 7, 2022

nssuresh2007 left a comment

vagimeli left a comment

vagimeli Nov 7, 2022

vagimeli Nov 7, 2022

vagimeli Nov 7, 2022

vagimeli Nov 7, 2022

ariamarble left a comment

natebower left a comment

natebower Nov 8, 2022

kolchfa-aws Nov 9, 2022

natebower Nov 8, 2022

kolchfa-aws Nov 10, 2022


		An observer thread tracks the resource consumption of each task thread. It measures the resource consumption at several checkpoints during the query phase of a shard search request. If the node is determined to be under duress based on the JVM memory pressure and CPU utilization, the server examines the resource consumption for each search task. It determines if the CPU usage and elapsed time are within their fixed thresholds, and it compares the heap usage against the rolling average of the heap usage of the 100 most recent tasks. If the task is among the most resource-intensive based on these criteria, the task in canceled.

		Every minute OpenSearch can cancel at most 1% of the number of currently running search shard tasks. Once a task is canceled, OpenSearch monitors the node for the next two seconds to determine if it is still under duress.


		## Canceled queries

		If a query is canceled, instead of receiving search results you receive an error from the server similar to the error below:


		## Canceled queries

		If a query is canceled, OpenSearch may return partial results in case some shards failed. If all shards failed, OpenSearch returns an error from the server similar to the error below:


		An observer thread periodically measures the resource usage of the node. If the node is determined to be under duress, OpenSearch examines the resource usage of each search shard task and compares it against configurable thresholds. OpenSearch considers CPU usage, heap usage, and elapsed time and assigns each task a cancellation score that is then used to cancel the most resource-intensive tasks.

		OpenSearch limits the number of cancellations as a fraction of successful task completions and cancellations per unit time. It continues to monitor and cancel tasks until the node is no longer under duress.

Adds search backpressure documentation #1790

Adds search backpressure documentation #1790

Conversation

kolchfa-aws commented Nov 2, 2022

Checklist

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ketanv3 commented Nov 7, 2022

nssuresh2007 left a comment

Choose a reason for hiding this comment

vagimeli left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ariamarble left a comment

Choose a reason for hiding this comment

natebower left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment