Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement anonymizer's main program #2621

Merged
merged 22 commits into from
Nov 12, 2020
Merged

Implement anonymizer's main program #2621

merged 22 commits into from
Nov 12, 2020

Conversation

Ashmita152
Copy link
Contributor

@Ashmita152 Ashmita152 commented Nov 8, 2020

Which problem is this PR solving?

Implement anonymizer's main program, per #2556

Short description of the changes

  • add option to specify query service gRPC address to load the trace
  • add option to specify trace ID to load from gRPC query service
  • add option to specify output directory to save the file
  • add options for Anonymizer configuration parameters
  • when saving anonymized trace to a file, it first must be converted to the UI model (see writer package)

Validation

  • the trace should be loadable by Jaeger UI (Load JSON tab in the Search panel)

Validation step is not working because the difference in the expected and actual json.

https://gist.github.com/Ashmita152/e9b421819571098d3c9de965b6ae689c

I am unable to figure out why this difference is happening and what should be the right way to fix.

Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
@Ashmita152 Ashmita152 requested a review from a team as a code owner November 8, 2020 15:20
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by "Validation step is not working because the difference in the expected and actual json"? Which validation?

cmd/anonymizer/app/anonymizer/anonymizer.go Outdated Show resolved Hide resolved
cmd/anonymizer/app/flags.go Outdated Show resolved Hide resolved
cmd/anonymizer/app/flags.go Outdated Show resolved Hide resolved
cmd/anonymizer/app/writer/writer.go Outdated Show resolved Hide resolved
cmd/anonymizer/main.go Outdated Show resolved Hide resolved
cmd/opentelemetry/go.sum Outdated Show resolved Hide resolved
@Ashmita152
Copy link
Contributor Author

Hi @yurishkuro

Thank you so much for the review. I will fix them. By saying "validation step is not working", I meant I am unable to load the anonymised json from the json tab in the search panel due to the difference in the produced json. It keeps rotating while loading the json from the file. After manually modifying the json to the expected.json (https://gist.github.com/Ashmita152/e9b421819571098d3c9de965b6ae689c) it works as expected.

@yurishkuro
Copy link
Member

It keeps rotating while loading the json from the file

I assume you're talking about the UI. Is there an error in the Javascript console?

Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
@Ashmita152
Copy link
Contributor Author

Hi @yurishkuro

I have updated the PR based on your feedbacks. Those were great suggestions. Hopefully I didn't miss any of those. It's ready for review again.

I assume you're talking about the UI. Is there an error in the Javascript console?

There is no error in javascript console. No error logs in the all-in-one docker image too.

Thank you once again for the help.

Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
cmd/anonymizer/app/anonymizer/anonymizer.go Outdated Show resolved Hide resolved
cmd/anonymizer/app/anonymizer/anonymizer_test.go Outdated Show resolved Hide resolved
cmd/anonymizer/app/flags.go Outdated Show resolved Hide resolved
cmd/anonymizer/app/options.go Outdated Show resolved Hide resolved
cmd/anonymizer/app/options.go Outdated Show resolved Hide resolved
cmd/anonymizer/app/query/query.go Outdated Show resolved Hide resolved
cmd/anonymizer/app/query/query.go Show resolved Hide resolved
cmd/anonymizer/app/query/query.go Outdated Show resolved Hide resolved
cmd/anonymizer/app/writer/writer.go Outdated Show resolved Hide resolved
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
@Ashmita152
Copy link
Contributor Author

Hi @yurishkuro

Thank you for the review again. Those definitely made the PR lot better. I have tried my best to fix those and it would be great to have your review again. Regards.

Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
@codecov
Copy link

codecov bot commented Nov 10, 2020

Codecov Report

Merging #2621 (854ad35) into master (4e7935b) will decrease coverage by 0.00%.
The diff coverage is 97.82%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2621      +/-   ##
==========================================
- Coverage   95.07%   95.06%   -0.01%     
==========================================
  Files         209      210       +1     
  Lines        9368     9410      +42     
==========================================
+ Hits         8907     8946      +39     
- Misses        385      388       +3     
  Partials       76       76              
Impacted Files Coverage Δ
cmd/anonymizer/app/anonymizer/anonymizer.go 60.49% <80.00%> (-0.76%) ⬇️
cmd/anonymizer/app/flags.go 100.00% <100.00%> (ø)
cmd/query/app/server.go 88.52% <0.00%> (-1.64%) ⬇️
...lugin/sampling/strategystore/adaptive/processor.go 99.07% <0.00%> (-0.93%) ⬇️
plugin/storage/badger/spanstore/reader.go 96.08% <0.00%> (+0.71%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4e7935b...854ad35. Read the comment docs.

Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
cmd/anonymizer/app/anonymizer/anonymizer.go Show resolved Hide resolved
cmd/anonymizer/app/query/query.go Outdated Show resolved Hide resolved
cmd/anonymizer/app/query/query.go Outdated Show resolved Hide resolved
cmd/anonymizer/app/writer/writer.go Outdated Show resolved Hide resolved
cmd/anonymizer/main.go Outdated Show resolved Hide resolved
cmd/anonymizer/main.go Outdated Show resolved Hide resolved
cmd/anonymizer/main.go Show resolved Hide resolved
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
…nto anonymizer-2

Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
@Ashmita152
Copy link
Contributor Author

Hi @yurishkuro

Thank you for the feedbacks. I tried my best to update the PR based on my understanding. May I ask for your review again please. Regards.


// Close closes the captured and anonymized files.
func (w *Writer) Close() {
w.capturedFile.Close()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're missing things like w.capturedFile.WriteString("\n]\n"). See L132 - that code should be moved into Close.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how did you test it? are you able to load the resulting fie in the UI?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @yurishkuro

This is the output from a sample run: https://gist.github.com/Ashmita152/68904c2ffe2dc40bf17984ffc981cace
It doesn't load in the UI since the json isn't as expected I guess, no errors in js console: https://gist.github.com/Ashmita152/e9b421819571098d3c9de965b6ae689c
I am unable to figure out the reason about this inconsistency but my guess is that UI accepts a complete trace json while we are uploading one particular span json.

I have a question: why do we write span by span while writing to file, shouldn't we be first constructing a trace out of those spans and then write to file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I see the confusion. The original purpose of the Writer was to use it while tailing a long stream of spans (in Kafka) to share them with researchers. It does not have the notion of a trace because spans are not ordered/grouped by trace ID in a stream. We had another internal binary that used these packages as a library.

For the purpose of this ticket, we know that we're dealing with a single trace. However, the Writer does not have the logic to write that trace in exactly the UI format, so you may need to add that. You can download a sample trace from the UI (via JSON button in the trace view) to see where the formatting is different.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, that clears my doubt. Sure I will work on implementing that in writer. Do we need to keep the functionality backward compatible ie. allowing current tailing stream of spans too ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, please keep backward compatibility.

Copy link
Contributor Author

@Ashmita152 Ashmita152 Nov 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused about how to implement the logic to write the trace in the UI format. To me, it seems like we can't use WriteSpan function. I am thinking to do:

  • Implement a WriteTrace function in writer.go
  • Change QueryTrace in query.go which is now returning []model.Span. I am thinking to modify it to return model.Trace.
  • In main.go instead of calling writer.writeSpan, call writer.writerTrace.

I would like to get your opinion before start implementing this way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, the writer is not exactly compatible with saving a single trace. Let's merge this PR, since it achieved the basic anonymization of a trace, just doesn't output UI-compatible file, which can be added separately.

TBH, our internal program worked differently, in two stages - first we'd run it on a stream extracting lots of spans into an anonymized file. Then we had a separate command in the binary that would take a trace ID, scan the previously save file, pick the spans for the given trace ID, and output just that trace in the UI format. It would be weird to do it the same way here because the first run is already taking a trace ID as input (unlike our case where input was undifferentiated stream of spans from many traces). However, we could still implement it the same way, just invoke the second step automatically on the file already generated by your current code.

Let me do it as a separate PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cmd/anonymizer/app/query/query.go Outdated Show resolved Hide resolved
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>
@yurishkuro yurishkuro merged commit 1d22aaf into jaegertracing:master Nov 12, 2020
quinniup pushed a commit to k8battleship/jaeger that referenced this pull request Nov 23, 2020
quinniup added a commit to k8battleship/jaeger that referenced this pull request Nov 23, 2020
* Bump opentelemetry-collector to v0.14.0 (jaegertracing#2617)

* bump opentelemetry-collector to v0.14.0

Signed-off-by: Pavel Kositsyn <kositsyn.pa@phystech.edu>

* initialize traceid and spanid explicitly

Signed-off-by: Pavel Kositsyn <kositsyn.pa@phystech.edu>

* fix comments and empty parent span check

Signed-off-by: Pavel Kositsyn <kositsyn.pa@phystech.edu>

* fasten convert traceID/spanID

Signed-off-by: Pavel Kositsyn <kositsyn.pa@phystech.edu>

* fix convertTraceID + fix tests

Signed-off-by: Pavel Kositsyn <kositsyn.pa@phystech.edu>

* Update CodeQL to latest best practices (jaegertracing#2615)

This will parallelize your analysis and speed things up a bunch.

Signed-off-by: jhutchings1 <jhutchings1@users.noreply.github.com>

Co-authored-by: Juraci Paixão Kröhling <juraci@kroehling.de>

* Fix flaky TestReload (jaegertracing#2624)

Signed-off-by: albertteoh <albert.teoh@logz.io>

* Update x/text to v0.3.4 (jaegertracing#2625)

Signed-off-by: Gary Brown <gary@brownuk.com>

* Bump to latest UI for snapshot builds (jaegertracing#2626)

Signed-off-by: Yuri Shkuro <github@ysh.us>

* Implement anonymizer's main program (jaegertracing#2621)

* Preparing release 1.21.0 (jaegertracing#2630)

* updated changelog

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Added ui changelog

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Fixed UI changelog to point to 1.12.0

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Updated jaeger-ui to v1.12.0

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Resolving concerns

Signed-off-by: Joe Elliott <number101010@gmail.com>

* [anonymizer] Save trace in UI format (jaegertracing#2629)

* Use fossa-contrib/fossa-action instead (jaegertracing#2571)

* Use fossa-contrib/fossa-action instead

Signed-off-by: Sora Morimoto <sora@morimoto.io>

* Make step name clearer

Signed-off-by: Sora Morimoto <sora@morimoto.io>

Co-authored-by: Yuri Shkuro <yurishkuro@users.noreply.github.com>

* Update Makefile and Dockerfile for anonymizer (jaegertracing#2632)

Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>

* Fix listen IP in unit test (jaegertracing#2636)

Signed-off-by: zouyu <zouy.fnst@cn.fujitsu.com>

* Bump opentelemetry to v0.15.0 (jaegertracing#2634)

* Bump opentelemetry to v0.15.0

Signed-off-by: Pavel Kositsyn <kositsyn.pa@phystech.edu>

* add default value instead of nil value for jaegerreceiver config

Signed-off-by: Pavel Kositsyn <kositsyn.pa@phystech.edu>

* make lint

Signed-off-by: Pavel Kositsyn <kositsyn.pa@phystech.edu>

Co-authored-by: Kositsyn Pavel <vemmy124@gmail.com>
Co-authored-by: Justin Hutchings <jhutchings1@users.noreply.github.com>
Co-authored-by: Juraci Paixão Kröhling <juraci@kroehling.de>
Co-authored-by: Albert <26584478+albertteoh@users.noreply.github.com>
Co-authored-by: Gary Brown <gary.pi4tech@gmail.com>
Co-authored-by: Yuri Shkuro <yurishkuro@users.noreply.github.com>
Co-authored-by: Ashmita <ashmita.bohara152@gmail.com>
Co-authored-by: Joe Elliott <number101010@gmail.com>
Co-authored-by: Sora Morimoto <sora@morimoto.io>
Co-authored-by: ZouYu <zouy.fnst@cn.fujitsu.com>
Co-authored-by: Kositsyn Pavel <kositsyn.pa@phystech.edu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants