Request level latency tracking #10351

Merged
msfroh merged 1 commit into opensearch-project:main from request-latency-final
Oct 13, 2023

@dzane17 dzane17 commented Oct 4, 2023

Description

Per-request tracking: this change introduces a new field (phase_took) in the search response that gives clients more visibility into the overall time taken by each search phase (query, fetch, canMatch, etc.). The time is tracked on the coordinator node, not on the shards.

Sample search response with phase_took field enabled:

% curl -XGET 'localhost:9200/_search?pretty&phase_took' -H 'Content-Type: application/json' -d'
{                                                
 "query": { "query_string": { "query": "value1" } }
}'
{
  "took" : 3439,
  "phase_took" : {
    "dfs_pre_query" : 0,
    "query" : 69,
    "fetch" : 22,
    "dfs_query" : 0,
    "expand" : 0,
    "can_match" : 0
  },
  "timed_out" : false,
  "_shards" : {
    "total" : 10,
    "successful" : 10,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "test2",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "field1" : "value1"
        }
      }
    ]
  }
}
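A minimal sketch of how a client might read the new field, using only the timing fields from the sample response above. The helper name summarize_phase_took is hypothetical, and the sketch assumes that took and the phase_took values are reported in the same unit; the remainder it computes is simply the time not attributed to any listed phase.

```python
import json

# Trimmed copy of the sample response above; only the timing fields are kept.
SAMPLE_RESPONSE = """
{
  "took": 3439,
  "phase_took": {
    "dfs_pre_query": 0,
    "query": 69,
    "fetch": 22,
    "dfs_query": 0,
    "expand": 0,
    "can_match": 0
  },
  "timed_out": false
}
"""


def summarize_phase_took(response: dict) -> dict:
    """Hypothetical helper: total the phases, find the slowest one, and
    compute how much of `took` is not attributed to any listed phase.
    Assumes `took` and `phase_took` values share one unit."""
    phases = response.get("phase_took", {})
    total = sum(phases.values())
    slowest = max(phases, key=phases.get) if phases else None
    return {
        "phase_total": total,
        "slowest_phase": slowest,
        "unattributed": response["took"] - total,
    }


summary = summarize_phase_took(json.loads(SAMPLE_RESPONSE))
print(summary)
```

For the sample above, the phases sum to 91 while took is 3439, so most of the request time falls outside the listed phases.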

Related Issues

Resolves #9650
Documentation PR: opensearch-project/documentation-website#5154

Testing

  1. Launch local OpenSearch server with debugger
    https://github.com/opensearch-project/OpenSearch/blob/main/TESTING.md#launching-and-debugging-from-an-ide
./gradlew run --debug-jvm
  2. Create an index and ingest some data
curl -X PUT "localhost:9200/test2?pretty" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "number_of_shards": 10
  }
}'

curl -X POST "localhost:9200/_bulk?pretty" -H 'Content-Type: application/json' -d'
{ "index" : { "_index" : "test2", "_id" : "1" } }
{ "field1" : "value1" }
{ "index" : { "_index" : "test2", "_id" : "2" } }
{ "field2" : "value2" }
{ "index" : { "_index" : "test2", "_id" : "3" } }
{ "field3" : "value3" }
{ "index" : { "_index" : "test2", "_id" : "4" } }
{ "field4" : "value4" }
'
  3. Search request with phase_took query parameter
curl -XGET 'localhost:9200/_search?pretty&phase_took' -H 'Content-Type: application/json' -d'
{
 "query": { "query_string": { "query": "abc" } }
}'
  4. Update and verify cluster setting
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d '{"transient" : {"search.phase_took_enabled" : true}}'
curl -X GET "localhost:9200/_cluster/settings?pretty&flat_settings"
  5. Search request without query parameter
curl -XGET 'localhost:9200/_search?pretty' -H 'Content-Type: application/json' -d'
{
 "query": { "query_string": { "query": "abc" } }
}'
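The two search steps above can be checked by looking for the phase_took object in each response. The sketch below is one way to do that on an already-parsed response; check_phase_took is a hypothetical helper, and the expected key set is copied from the sample response in the description, so treating it as fixed is an assumption.

```python
# Phase names copied from the sample response in the description.
# Treating this as the complete, fixed key set is an assumption.
EXPECTED_PHASES = {
    "dfs_pre_query", "query", "fetch", "dfs_query", "expand", "can_match",
}


def check_phase_took(response: dict, expect_present: bool) -> bool:
    """Hypothetical helper: return True when the parsed search response
    matches the expectation for the phase_took field."""
    phases = response.get("phase_took")
    if not expect_present:
        # The field should be absent when tracking is not requested.
        return phases is None
    if not isinstance(phases, dict):
        return False
    # Every expected phase must be reported as a non-negative integer.
    return all(
        isinstance(phases.get(name), int) and phases[name] >= 0
        for name in EXPECTED_PHASES
    )


with_field = {
    "took": 3439,
    "phase_took": {
        "dfs_pre_query": 0, "query": 69, "fetch": 22,
        "dfs_query": 0, "expand": 0, "can_match": 0,
    },
}
without_field = {"took": 12}

print(check_phase_took(with_field, expect_present=True))
print(check_phase_took(without_field, expect_present=False))
```

Both the search with the phase_took parameter and the search issued after enabling search.phase_took_enabled would be checked with expect_present=True.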

Performance Tests

No significant impact to P50, P90, P99 latency observed.

Procedure:

  1. Spin up localhost OpenSearch cluster
./gradlew run
  2. Enable phase_took cluster setting
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d '{"transient" : {"search.phase_took_enabled" : true}}'
curl -X GET "localhost:9200/_cluster/settings?pretty&flat_settings"
  3. Run opensearch-benchmark nyc_taxis workload
opensearch-benchmark execute-test --workload-path=./nyc_taxis --pipeline=benchmark-only --client-options="basic_auth_user:'admin',basic_auth_password:'Admin_123'" --results-file=/tmp/osb_test.txt

Results

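| Control | 50th percentile latency | 90th percentile latency | 99th percentile latency |
|---------|-------------------------|-------------------------|-------------------------|
| 1       | 5.82006                 | 7.16735                 | 13.9845                 |
| 2       | 5.78643                 | 8.14282                 | 9.72852                 |
| 3       | 6.80698                 | 8.11144                 | 11.3102                 |
| 4       | 7.16427                 | 8.25471                 | 11.1239                 |
| 5       | 6.95713                 | 8.30096                 | 11.2593                 |
| 6       | 7.01127                 | 8.34041                 | 12.5661                 |
| Mean    | 6.59102                 | 8.05295                 | 11.66209                |
| St Dev  | 0.62091                 | 0.44286                 | 1.45088                 |

| Test    | 50th percentile latency | 90th percentile latency | 99th percentile latency |
|---------|-------------------------|-------------------------|-------------------------|
| 1       | 5.43762                 | 8.08449                 | 10.0825                 |
| 2       | 6.59225                 | 8.27539                 | 9.56204                 |
| 3       | 7.0098                  | 8.20397                 | 10.0894                 |
| 4       | 6.13892                 | 8.01568                 | 9.74957                 |
| 5       | 6.23295                 | 8.3008                  | 14.9393                 |
| 6       | 6.63027                 | 7.28241                 | 13.3747                 |
| 7       | 6.82119                 | 7.69677                 | 13.5151                 |
| 8       | 5.91673                 | 7.38082                 | 11.895                  |
| 9       | 6.14394                 | 7.83413                 | 15.9442                 |
| Mean    | 6.32485                 | 7.89716                 | 12.12798                |
| St Dev  | 0.52614                 | 0.36954                 | 2.24085                 |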
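The Mean and St Dev rows can be recomputed from the individual runs listed above. The sketch below does so with Python's statistics module; it assumes St Dev is the sample (n-1) standard deviation, which matches the reported Control 50th percentile figure. The dictionary names and the summarize helper are hypothetical.

```python
from statistics import mean, stdev

# Per-run latencies copied from the Results section (control: 6 runs, test: 9 runs).
CONTROL = {
    "p50": [5.82006, 5.78643, 6.80698, 7.16427, 6.95713, 7.01127],
    "p90": [7.16735, 8.14282, 8.11144, 8.25471, 8.30096, 8.34041],
    "p99": [13.9845, 9.72852, 11.3102, 11.1239, 11.2593, 12.5661],
}
TEST = {
    "p50": [5.43762, 6.59225, 7.0098, 6.13892, 6.23295, 6.63027, 6.82119, 5.91673, 6.14394],
    "p90": [8.08449, 8.27539, 8.20397, 8.01568, 8.3008, 7.28241, 7.69677, 7.38082, 7.83413],
    "p99": [10.0825, 9.56204, 10.0894, 9.74957, 14.9393, 13.3747, 13.5151, 11.895, 15.9442],
}


def summarize(runs: dict) -> dict:
    """Return {percentile: (mean, sample standard deviation)} for each column."""
    return {name: (mean(values), stdev(values)) for name, values in runs.items()}


control_stats = summarize(CONTROL)
test_stats = summarize(TEST)

for name in CONTROL:
    c_mean, c_sd = control_stats[name]
    t_mean, t_sd = test_stats[name]
    # A difference smaller than the run-to-run spread suggests no clear regression.
    print(
        f"{name}: control {c_mean:.5f} +/- {c_sd:.5f}, "
        f"test {t_mean:.5f} +/- {t_sd:.5f}, diff {t_mean - c_mean:+.5f}"
    )
```

Comparing each difference in means against the standard deviations is a rough check only; it is not a formal significance test.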

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • GitHub issue/PR created in OpenSearch documentation repo for the required public documentation changes (#[Issue/PR number])

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

github-actions bot commented Oct 4, 2023

Gradle Check (Jenkins) Run Completed with:

github-actions bot commented Oct 4, 2023

Compatibility status:

Checks if related components are compatible with change 551e828

Incompatible components

Incompatible components: [https://github.com/opensearch-project/performance-analyzer.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/reporting.git]

dzane17 commented Oct 4, 2023

@msfroh Can you take a look?

@github-actions github-actions bot added the enhancement (Enhancement or improvement to existing feature or request) and Search (Search query, autocomplete ...etc) labels Oct 4, 2023
@github-actions

Gradle Check (Jenkins) Run Completed with:

  • RESULT:
  • URL:
  • CommitID: 551e828
    Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green.
    Is the failure a flaky test unrelated to your change?

@github-actions

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT.testPrimaryStopped_ReplicaPromoted
      1 org.opensearch.cluster.allocation.ClusterRerouteIT.testDelayWithALargeAmountOfShards

@msfroh msfroh merged commit daf1350 into opensearch-project:main Oct 13, 2023
15 of 16 checks passed
@dzane17 dzane17 changed the title Request level latency tracking https://github.com/opensearch-project/OpenSearch/issues/9650 Oct 16, 2023
@dzane17 dzane17 changed the title https://github.com/opensearch-project/OpenSearch/issues/9650 Request level latency tracking Oct 16, 2023
@dzane17 dzane17 deleted the request-latency-final branch October 16, 2023 20:38
@kkhatua kkhatua added the backport 2.x (Backport to 2.x branch) label Oct 18, 2023
@opensearch-trigger-bot

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-10351-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 daf1350888f878868748172f576a0cdb3dc64b33
# Push it to GitHub
git push --set-upstream origin backport/backport-10351-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-10351-to-2.x.

deshsidd pushed a commit to deshsidd/OpenSearch that referenced this pull request Oct 18, 2023
Signed-off-by: David Zane <davizane@amazon.com>
deshsidd pushed a commit to deshsidd/OpenSearch that referenced this pull request Oct 19, 2023
Signed-off-by: David Zane <davizane@amazon.com>
deshsidd pushed a commit to deshsidd/OpenSearch that referenced this pull request Oct 19, 2023
Signed-off-by: David Zane <davizane@amazon.com>
Signed-off-by: Siddhant Deshmukh <deshsid@amazon.com>
dzane17 added a commit to dzane17/OpenSearch that referenced this pull request Oct 20, 2023
Signed-off-by: David Zane <davizane@amazon.com>
(cherry picked from commit daf1350)
austintlee pushed a commit to austintlee/OpenSearch that referenced this pull request Oct 23, 2023
Signed-off-by: David Zane <davizane@amazon.com>
austintlee pushed a commit to austintlee/OpenSearch that referenced this pull request Oct 23, 2023
Signed-off-by: David Zane <davizane@amazon.com>
msfroh pushed a commit that referenced this pull request Oct 23, 2023
* Per request phase latency (#10351)

Signed-off-by: David Zane <davizane@amazon.com>
(cherry picked from commit daf1350)

* Update SearchRequest version check

Signed-off-by: David Zane <davizane@amazon.com>

---------

Signed-off-by: David Z <38449481+dzane17@users.noreply.github.com>
Signed-off-by: David Zane <davizane@amazon.com>
@dzane17 dzane17 mentioned this pull request Oct 26, 2023
8 tasks
austintlee pushed a commit to austintlee/OpenSearch that referenced this pull request Dec 13, 2023
Signed-off-by: David Zane <davizane@amazon.com>
austintlee pushed a commit to austintlee/OpenSearch that referenced this pull request Jan 19, 2024
Signed-off-by: David Zane <davizane@amazon.com>
austintlee pushed a commit to austintlee/OpenSearch that referenced this pull request Jan 19, 2024
Signed-off-by: David Zane <davizane@amazon.com>
austintlee pushed a commit to austintlee/OpenSearch that referenced this pull request Feb 6, 2024
Signed-off-by: David Zane <davizane@amazon.com>
austintlee pushed a commit to austintlee/OpenSearch that referenced this pull request Feb 6, 2024
Signed-off-by: David Zane <davizane@amazon.com>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
Signed-off-by: David Zane <davizane@amazon.com>
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Labels
backport 2.x (Backport to 2.x branch), backport-failed, enhancement (Enhancement or improvement to existing feature or request), Search (Search query, autocomplete ...etc), v2.12.0 (Issues and PRs related to version 2.12.0)
Development

Successfully merging this pull request may close these issues.

Search Latency Tracking - Per Request Phase Took Time
4 participants