[jaeger-v2] add elasticsearch & opensearch e2e integration test #5345
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
Currently, the traces are stored correctly in the database, but the test is not able to read them back.
Elasticsearch logs:
{"took":768,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":1105,"relation":"eq"},"max_score":1.0,"hits":[{"_index":"jaeger-main-jaeger-span-write","_type":"_doc","_id":"NBW1x44Bj8WH9-LVviap","_score":1.0,"_source":{"traceID":"0000000000000512","spanID":"0000000000000005","operationName":"query23-operation","references":[],"startTime":1712681191639875,"startTimeMillis":1712681191639,"duration":1,"tags":[{"key":"sameplacetag1","type":"string","value":"same*"}],"logs":[{"timestamp":1712681191639875,"fields":null},{"timestamp":1712681191639875,"fields":null}],"process":{"serviceName":"query23-service","tags":[]}}},{"_index":"jaeger-main-jaeger-span-write","_type":"_doc","_id":"NRW1x44Bj8WH9-LVvib_","_score":1.0,"_source":{"traceID":"0000000000000a12","spanID":"0000000000000004","operationName":"","references":[],"startTime":1712681191639875,"startTimeMillis":1712681191639,"duration":2,"tags":[{"key":"sameplacetag1","type":"string","value":"sameplacevalue1"}],"logs":[],"process":{"serviceName":"query24-service","tags":[]}}},{"_index":"jaeger-main-jaeger-span-write","_type":"_doc","_id":"NhW1x44Bj8WH9-LVvyYJ","_score":1.0,"_source":{"traceID":"0000000000001212","spanID":"0000000000000004","operationName":"","references":[],"startTime":1712681191639875,"startTimeMillis":1712681191639,"duration":2,"tags":[{"key":"sameplacetag1","type":"string","value":"sameplacevalue2"}],"logs":[],"process":{"serviceName":"query24-service","tags":[]}}},{"_index":"jaeger-main-jaeger-span-write","_type":"_doc","_id":"NxW1x44Bj8WH9-LVvyYi","_score":1.0,"_source":{"traceID":"0000000000000001","spanID":"0000000000000002","operationName":"some-operation","references":[],"startTime":1712681191639875,"startTimeMillis":1712681191639,"duration":7,"tags":[{"key":"sameplacetag1","type":"string","value":"sameplacevalue"},{"key":"sameplacetag2","type":"int64","value":"123"},{"key":"sameplacetag4","type":"bool","value":"true"},{"key":"sameplacetag3","type":"float64","value":"72.5"}]
Error in test
Do you have a plan for how to solve it?
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
At first, I thought there might be an error in the RPC connection between the query service and the Elasticsearch storage. After investigating, I learned that the problem is in receiving spans from the stream. As we know, in other tests (e.g., grpc_test) we get a similar error for some early iterations, and after that we get our spans. But here, even after increasing the iteration count to 1000, it's not working. Currently I have no clues, but I am trying to figure out how to solve it.
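For readers unfamiliar with the "early iterations" behavior mentioned above, here is a minimal sketch of a polling helper of the kind such integration tests typically use. The name `waitForCondition`, its parameters, and the simulated condition are assumptions for illustration only, not the exact code in the test suite.

```go
package main

import (
	"fmt"
	"time"
)

// waitForCondition (hypothetical helper) polls cond up to attempts times,
// sleeping interval between tries, and reports whether cond ever returned true.
func waitForCondition(attempts int, interval time.Duration, cond func() bool) bool {
	for i := 0; i < attempts; i++ {
		if cond() {
			return true
		}
		time.Sleep(interval)
	}
	return false
}

func main() {
	calls := 0
	found := waitForCondition(10, 10*time.Millisecond, func() bool {
		calls++
		// Pretend the spans only become visible on the third read.
		return calls >= 3
	})
	fmt.Println(found, calls)
}
```

Polling like this only covers a delay between write and visibility. If the read path never sees the data at all, raising the attempt count cannot help, which is consistent with 1000 iterations making no difference here.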
The ES client has a TraceLogger, which we do not initialize today. I suggest implementing a flag that initializes this logger, so that we can see which queries ES Storage is making to the ES backend (this would be a separate PR).
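As an illustration of what such request tracing gives you, the sketch below uses only the Go standard library rather than the ES client's own TraceLogger. It wraps an `http.RoundTripper` and dumps each request and response. The type `loggingTransport` and the helper `traceOneRequest` are hypothetical names, and the test server merely stands in for Elasticsearch.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"strings"
)

// loggingTransport (hypothetical) wraps another RoundTripper and writes every
// outgoing request and incoming response, including bodies, to out.
type loggingTransport struct {
	base http.RoundTripper
	out  io.Writer
}

func (t *loggingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	if dump, err := httputil.DumpRequestOut(req, true); err == nil {
		fmt.Fprintf(t.out, "%s\n", dump)
	}
	resp, err := t.base.RoundTrip(req)
	if err != nil {
		return nil, err
	}
	if dump, err := httputil.DumpResponse(resp, true); err == nil {
		fmt.Fprintf(t.out, "%s\n", dump)
	}
	return resp, nil
}

// traceOneRequest sends one bulk-style request to a local test server
// (a stand-in for Elasticsearch) and returns everything that was logged.
func traceOneRequest() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		io.WriteString(w, `{"errors":false}`)
	}))
	defer srv.Close()

	var logged bytes.Buffer
	client := &http.Client{Transport: &loggingTransport{base: http.DefaultTransport, out: &logged}}
	resp, err := client.Post(srv.URL+"/_bulk", "application/x-ndjson", strings.NewReader("{}\n"))
	if err != nil {
		return "", err
	}
	resp.Body.Close()
	return logged.String(), nil
}

func main() {
	out, err := traceOneRequest()
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

The dumped text has the same general shape as the `POST /_bulk HTTP/1.1 ...` and `HTTP/1.1 200 OK ...` entries in the logs in this thread, which is what makes the index names used by reads and writes easy to compare.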
{"level":"info","ts":1712921641.4783256,"caller":"zapgrpc/zapgrpc.go:128","msg":"HTTP/1.1 200 OK\r\nContent-Type: application/json; charset=UTF-8\r\nX-Elastic-Product: Elasticsearch\r\n\r\n{\"took\":12,\"errors\":false,\"items\":[{\"index\":{\"_index\":\"jaeger-main-jaeger-service-write\",\"_type\":\"_doc\",\"_id\":\"5c78c55a6b1b86a1\",\"_version\":1,\"result\":\"created\",\"_shards\":{\"total\":2,\"successful\":1,\"failed\":0},\"_seq_no\":3,\"_primary_term\":1,\"status\":201}}]}\n"}
{"level":"info","ts":1712921641.4787142,"caller":"zapgrpc/zapgrpc.go:128","msg":"POST /_bulk HTTP/1.1\r\nHost: localhost:9200\r\nUser-Agent: elastic/6.2.37 (linux-amd64)\r\nContent-Length: 459\r\nAccept: application/json\r\nContent-Type: application/x-ndjson\r\nAccept-Encoding: gzip\r\n\r\n{\"index\":{\"_index\":\"jaeger-main-jaeger-span-write\"}}\n{\"traceID\":\"0000000000000011\",\"spanID\":\"0000000000000006\",\"operationName\":\"example-operation-3\",\"references\":[],\"startTime\":1712853991639875,\"startTimeMillis\":1712853991639,\"duration\":100,\"tags\":[{\"key\":\"span.kind\",\"type\":\"string\",\"value\":\"server\"}],\"logs\":[{\"timestamp\":1712853991639875,\"fields\":null},{\"timestamp\":1712853991639875,\"fields\":null}],\"process\":{\"serviceName\":\"example-service-1\",\"tags\":[]}}\n\n"}
integration.go:276: trace not found
{"level":"info","ts":1712921641.5270474,"caller":"zapgrpc/zapgrpc.go:128","msg":"GET /_msearch?rest_total_hits_as_int=true HTTP/1.1\r\nHost: localhost:9200\r\nUser-Agent: elastic/6.2.37 (linux-amd64)\r\nContent-Length: 403\r\nAccept: application/json\r\nContent-Type: application/json\r\nAccept-Encoding: gzip\r\n\r\n{\"ignore_unavailable\":true,\"index\":\"jaeger-main-jaeger-span-read\"}\n{\"query\":{\"bool\":{\"must\":[{\"bool\":{\"should\":[{\"term\":{\"traceID\":{\"boost\":2,\"value\":\"0000000000000011\"}}},{\"term\":{\"traceID\":\"11\"}}]}},{\"range\":{\"startTimeMillis\":{\"from\":136035241526,\"include_lower\":true,\"include_upper\":true,\"to\":1713008041526}}}]}},\"search_after\":[136118041526656],\"size\":10000,\"sort\":[{\"startTime\":{\"order\":\"asc\"}}]}\n\n"}
{"level":"info","ts":1712921641.5347877,"caller":"zapgrpc/zapgrpc.go:128","msg":"HTTP/1.1 200 OK\r\nContent-Type: application/json; charset=UTF-8\r\nX-Elastic-Product: Elasticsearch\r\n\r\n{\"took\":0,\"responses\":[{\"took\":0,\"timed_out\":false,\"_shards\":{\"total\":0,\"successful\":0,\"skipped\":0,\"failed\":0},\"hits\":{\"total\":0,\"max_score\":0.0,\"hits\":[]},\"status\":200}]}\n"}
2024-04-12T17:04:01.534+0530 error app/grpc_handler.go:105 trace not found {"kind": "extension", "name": "jaeger_query", "error": "trace not found"}
health status index                            uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .geoip_databases                 0z0sUHE3TaedeUallyGkMA   1   0         35            0     32.4mb         32.4mb
yellow open   jaeger-main-jaeger-span-write    2AEUrJFkSEKX9SWzgv8wKg   1   1         63            0      152kb          152kb
yellow open   jaeger-main-jaeger-service-write m8pO7MKwSXuOgtv9R5Tvlw   1   1          7            0     13.8kb         13.8kb
Yes! Can you make a PR to support an option to turn this on?
Yep, this seems to be the root cause. Why doesn't it exist? The read/write naming sounds like the aliasing solution with ILM enabled; maybe it's incorrectly enabled for the tests?
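In the logs above, the search is sent to `jaeger-main-jaeger-span-read`, while the index listing only shows `...-write` indices. One way to confirm that a read target is missing is a `HEAD /<target>` request, which Elasticsearch answers with 200 or 404. The sketch below assumes that behavior. `targetExists` and `newFakeBackend` are hypothetical helpers, and the fake server only imitates the status codes.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// targetExists (hypothetical) reports whether an index or alias name resolves
// on the backend, based on the status code of HEAD /<name>.
func targetExists(client *http.Client, baseURL, name string) (bool, error) {
	resp, err := client.Head(baseURL + "/" + url.PathEscape(name))
	if err != nil {
		return false, err
	}
	resp.Body.Close()
	switch resp.StatusCode {
	case http.StatusOK:
		return true, nil
	case http.StatusNotFound:
		return false, nil
	default:
		return false, fmt.Errorf("HEAD %s: unexpected status %s", name, resp.Status)
	}
}

// newFakeBackend starts a test server that answers HEAD requests with 200 for
// the given names and 404 for everything else. It is a stand-in, not Elasticsearch.
func newFakeBackend(existing ...string) *httptest.Server {
	known := make(map[string]bool)
	for _, n := range existing {
		known["/"+n] = true
	}
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method == http.MethodHead && known[r.URL.Path] {
			w.WriteHeader(http.StatusOK)
			return
		}
		w.WriteHeader(http.StatusNotFound)
	}))
}

func main() {
	srv := newFakeBackend("jaeger-main-jaeger-span-write")
	defer srv.Close()
	for _, name := range []string{"jaeger-main-jaeger-span-write", "jaeger-main-jaeger-span-read"} {
		ok, err := targetExists(srv.Client(), srv.URL, name)
		if err != nil {
			panic(err)
		}
		fmt.Println(name, ok)
	}
}
```

Running a check like this against the real test cluster before the read assertions would separate "data not yet visible" from "read name does not exist".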
Yes, I can add it. But I see; we can already enable this by setting log_level to
I will take a look into it.
If it can already be enabled via existing flags, then don't change it.
One more interesting finding:
Made some progress:
--- ❌ FAIL: TestESStorage (202.01s)
--- ✅ PASS: TestESStorage/GetServices (2.71s)
--- ❌ FAIL: TestESStorage/GetOperations (101.32s)
--- ✅ PASS: TestESStorage/GetTrace (3.27s)
--- ✅ PASS: TestESStorage/GetTrace/NotFound_error (0.01s)
--- ✅ PASS: TestESStorage/GetLargeSpans (89.15s)
--- ❌ FAIL: TestESStorage/FindTraces (4.44s)
--- ❌ FAIL: TestESStorage/FindTraces/Tag_escaped_operator_+_Operation_name_+_max_Duration (1.53s)
--- ❌ FAIL: TestESStorage/FindTraces/Tag_wildcard_regex (0.03s)
--- ❌ FAIL: TestESStorage/FindTraces/Tags_in_one_spot_-_Tags (0.02s)
--- ❌ FAIL: TestESStorage/FindTraces/Tags_in_one_spot_-_Logs (0.03s)
--- ✅ PASS: TestESStorage/FindTraces/Tags_in_one_spot_-_Process (1.64s)
--- ✅ PASS: TestESStorage/FindTraces/Tags_in_different_spots (0.06s)
--- ✅ PASS: TestESStorage/FindTraces/Trace_spans_over_multiple_indices (0.05s)
--- ✅ PASS: TestESStorage/FindTraces/Operation_name (0.03s)
--- ✅ PASS: TestESStorage/FindTraces/Operation_name_+_max_Duration (0.05s)
--- ✅ PASS: TestESStorage/FindTraces/Operation_name_+_Duration_range (0.04s)
--- ✅ PASS: TestESStorage/FindTraces/Duration_range (0.02s)
--- ✅ PASS: TestESStorage/FindTraces/max_Duration (0.04s)
--- ✅ PASS: TestESStorage/FindTraces/default (0.02s)
--- ✅ PASS: TestESStorage/FindTraces/Tags_+_Operation_name (0.05s)
--- ✅ PASS: TestESStorage/FindTraces/Tags_+_Operation_name_+_max_Duration (0.04s)
--- ✅ PASS: TestESStorage/FindTraces/Tags_+_Operation_name_+_Duration_range (0.03s)
--- ✅ PASS: TestESStorage/FindTraces/Tags_+_Duration_range (0.06s)
--- ✅ PASS: TestESStorage/FindTraces/Tags_+_max_Duration (0.12s)
--- ✅ PASS: TestESStorage/FindTraces/Multi-spot_Tags_+_Operation_name (0.03s)
--- ✅ PASS: TestESStorage/FindTraces/Multi-spot_Tags_+_Operation_name_+_max_Duration (0.05s)
--- ✅ PASS: TestESStorage/FindTraces/Multi-spot_Tags_+_Operation_name_+_Duration_range (0.04s)
--- ✅ PASS: TestESStorage/FindTraces/Multi-spot_Tags_+_Duration_range (0.03s)
--- ✅ PASS: TestESStorage/FindTraces/Multi-spot_Tags_+_max_Duration (0.03s)
--- ✅ PASS: TestESStorage/FindTraces/Multiple_Traces (0.02s)
@james-ryans I am getting the same issue here as well.
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
I am still not able to fix this.
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
That sounds like you either have data pollution across tests, or the opposite: a dependency of one test on another's input data (although in that case the tests would not succeed individually).
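One common way to rule out pollution is to wipe the backend between test cases. The following is a minimal sketch under assumptions, not the Purge method added in this work: `deleteIndices` and `fakeBackend` are hypothetical names, and it relies on Elasticsearch accepting `DELETE /<index>` for a concrete index name.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"sync"
)

// deleteIndices (hypothetical) issues DELETE /<name> for every index in names
// and stops at the first failure. A 404 is tolerated so the purge is idempotent.
func deleteIndices(client *http.Client, baseURL string, names []string) error {
	for _, name := range names {
		req, err := http.NewRequest(http.MethodDelete, baseURL+"/"+url.PathEscape(name), nil)
		if err != nil {
			return err
		}
		resp, err := client.Do(req)
		if err != nil {
			return err
		}
		resp.Body.Close()
		if resp.StatusCode != http.StatusOK && resp.StatusCode != http.StatusNotFound {
			return fmt.Errorf("delete %s: unexpected status %s", name, resp.Status)
		}
	}
	return nil
}

// fakeBackend is a stand-in for Elasticsearch that records deleted paths.
type fakeBackend struct {
	*httptest.Server
	mu      sync.Mutex
	deleted []string
}

func newFakeBackend() *fakeBackend {
	f := &fakeBackend{}
	f.Server = httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodDelete {
			w.WriteHeader(http.StatusMethodNotAllowed)
			return
		}
		f.mu.Lock()
		f.deleted = append(f.deleted, r.URL.Path)
		f.mu.Unlock()
		w.WriteHeader(http.StatusOK)
	}))
	return f
}

// Deleted returns a copy of the paths deleted so far.
func (f *fakeBackend) Deleted() []string {
	f.mu.Lock()
	defer f.mu.Unlock()
	return append([]string(nil), f.deleted...)
}

func main() {
	backend := newFakeBackend()
	defer backend.Close()
	names := []string{"jaeger-main-jaeger-span-write", "jaeger-main-jaeger-service-write"}
	if err := deleteIndices(backend.Client(), backend.URL, names); err != nil {
		panic(err)
	}
	fmt.Println(backend.Deleted())
}
```

Calling such a purge at the start of each subtest makes every case start from an empty store, so a test that only passes because of leftovers from an earlier one would fail immediately.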
This might be interesting to find out. I still cannot find which line of code outputs this.
@james-ryans which line are you referring to?
This happens sometimes; please try to run the test again.
I am not able to clear the cache yet.
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files
@@ Coverage Diff @@
## main #5345 +/- ##
==========================================
- Coverage 95.19% 94.55% -0.65%
==========================================
Files 346 346
Lines 16916 16945 +29
==========================================
- Hits 16104 16022 -82
- Misses 610 722 +112
+ Partials 202 201 -1
## Which problem is this PR solving?
- part of #5345

## Description of the changes
- Added Purge method for ES/OS
- optimized integration test for es/os storage

## How was this change tested?
- `STORAGE=elasticsearch make storage-integration-test`

## Checklist
- [x] I have read https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
- [x] I have signed all commits
- [x] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully
  - for `jaeger`: `make lint test`
  - for `jaeger-ui`: `yarn lint` and `yarn test`

---------

Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
🎉
…ertracing#5345)

## Which problem is this PR solving?
- part of jaegertracing#5254

## Description of the changes
- Utilizing existing `StorageIntegration` to test the jaeger-v2 OTel Collector and gRPC storage backend with the provided config file at `cmd/jaeger/config-elasticsearch.yaml`.

## How was this change tested?
- Start an Elasticsearch or OpenSearch Docker instance.
- Run `STORAGE=elasticsearch SPAN_STORAGE_TYPE=elasticsearch make jaeger-v2-storage-integration-test`

## Checklist
- [x] I have read https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
- [x] I have signed all commits
- [x] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully
  - for `jaeger`: `make lint test`
  - for `jaeger-ui`: `yarn lint` and `yarn test`

---------

Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>