Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add gRPC plugin proto changes #1323

Merged
merged 32 commits into from
Apr 3, 2019
Merged

Add gRPC plugin proto changes #1323

merged 32 commits into from
Apr 3, 2019

Conversation

chvck
Copy link
Contributor

@chvck chvck commented Feb 8, 2019

Which problem is this PR solving?

Short description of the changes

  • Split proto and Makefile changes out from previous review to enable further progress there.

@@ -0,0 +1,199 @@
syntax = "proto3";

package jaeger.api_v2;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsure that this is correct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think jaeger.storage.v1

rpc GetTrace(GetTraceRequest) returns (GetTraceResponse);
rpc GetServices(GetServicesRequest) returns (GetServicesResponse);
rpc GetOperations(GetOperationsRequest) returns (GetOperationsResponse);
rpc FindTraces(FindTracesRequest) returns (stream Trace);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning a stream here solves the issue surfaced in the other PR where grpc message limits were being hit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, not sure it completely solves those issues unless we allow multiple Trace entries in the batch for the same trace. A single trace can still easily be larger than the msg limit.

Since this is a storage API, I would rather go with returning stream FindTracesResponseChunk where each chunk contains []Span

Copy link
Contributor Author

@chvck chvck Feb 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity what sort of size (and number of spans) would you expect to see the largest of traces reach? As traces contain more than just spans I think I'll have to give this a bit more thought.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have seen traces in production with 100,000 spans. At average span size of around 500 bytes, that's 50M. Even 10k spans would add up to 5Mb.

I think eventually we'd want to make the chunk size configurable, but I would start with 1000 or even 100 spans.

@@ -0,0 +1,199 @@
syntax = "proto3";

package jaeger.api_v2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think jaeger.storage.v1


package jaeger.api_v2;

option go_package = "proto";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps proto -> api_v1, so that Go package is api_v1, similar to model/proto/api_v2.proto

google.protobuf.Timestamp end_timestamp = 1 [
(gogoproto.stdtime) = true,
(gogoproto.nullable) = false
];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you use some auto-formatter that creates these indentations? in the model.proto we have it more readable

message Log {
  google.protobuf.Timestamp timestamp = 1 [
    (gogoproto.stdtime) = true,
    (gogoproto.nullable) = false
  ];
  repeated KeyValue fields = 2 [
    (gogoproto.nullable) = false
  ];
}

rpc GetTrace(GetTraceRequest) returns (GetTraceResponse);
rpc GetServices(GetServicesRequest) returns (GetServicesResponse);
rpc GetOperations(GetOperationsRequest) returns (GetOperationsResponse);
rpc FindTraces(FindTracesRequest) returns (stream Trace);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, not sure it completely solves those issues unless we allow multiple Trace entries in the batch for the same trace. A single trace can still easily be larger than the msg limit.

Since this is a storage API, I would rather go with returning stream FindTracesResponseChunk where each chunk contains []Span

message FindTraceIDsResponse {
oneof response {
FindTraceIDsSuccess success = 1;
GrpcStoragePluginError error = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couldn't we instead use grpc Status? It already includes an error message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something to note here is that the gogo plugin of Status is go 1.11+ (https://github.com/gogo/status)

// rpc InsertProbabilitiesAndQPS(InsertProbabilitiesAndQPSRequest) returns (InsertProbabilitiesAndQPSResponse);
// rpc GetThroughput(GetThroughputRequest) returns (GetThroughputResponse);
// rpc GetProbabilitiesAndQPS(GetProbabilitiesAndQPSRequest) returns (GetProbabilitiesAndQPSResponse);
// rpc GetLatestProbabilities(GetLatestProbabilitiesRequest) returns (GetLatestProbabilitiesResponse);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's remove this

rpc WriteSpan(WriteSpanRequest) returns (WriteSpanResponse);
}

service GrpcSpanReaderStoragePlugin {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure we need Grpc and StoragePlugin prefix/suffix, they will be obvious from the package name.

rpc FindTraceIDs(FindTraceIDsRequest) returns (FindTraceIDsResponse);
}

service GrpcDependenciesStoragePlugin {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason we don't want this split into reader and writer, similar to span storage?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to do that. Do we consider dependency store writer as out of scope for this changeset?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can skip it for now.

@yurishkuro
Copy link
Member

to resolve Gopkg.lock conflict you may want to merge master, git checkout --theirs Gopkg.lock, then rerun dep ensure --update

import "google/protobuf/timestamp.proto";
import "google/protobuf/duration.proto";
import "google/protobuf/empty.proto";
import "google/rpc/status.proto";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a go 1.11+ requirement (https://github.com/gogo/status)

message WriteSpanResponse {
oneof response {
google.protobuf.Empty success = 1;
google.rpc.Status error = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this is not what I meant. This Status object is transported by gRPC natively, we don't need to use it in our response payloads.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying to basically remove the *Response messages and simply return *Success from the rpc functions? I've just had a play doing this and it makes the jaeger plugin code that talks to the generated proto code much cleaner.

];
}

message GetDependenciesSuccess {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetDependenciesResponse ? Here and in all other responses.

message FindTracesResponseChunk {
repeated bytes spans = 1 [
(gogoproto.nullable) = false,
(gogoproto.customtype) = "github.com/jaegertracing/jaeger/model.Span"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is the right dependency. The type should be Span from model.proto, the fact that it is backed by some specific Go type is irrelevant. Otherwise how could one implement a plugin in Java?

option (gogoproto.goproto_registration) = true;

message GetDependenciesRequest {
google.protobuf.Timestamp end_timestamp = 1 [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor, but I think we should clean up the API and use start_time and end_time (don't need to call them timestamp, it's just the type of the value, not the meaning of it).


service SpanWriterPlugin {
// spanstore/Writer
rpc WriteSpan(WriteSpanRequest) returns (google.protobuf.Empty);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's define an empty WriteSpanResponse instead of google.protobuf.Empty, it is better for extensibility.

@yurishkuro
Copy link
Member

looks good overall

@yurishkuro
Copy link
Member

please resolve conflicts, the DependencyLink type has been extended in 1.10

@@ -22,7 +22,7 @@ option (gogoproto.sizer_all) = true;
option (gogoproto.goproto_registration) = true;

message GetDependenciesRequest {
google.protobuf.Timestamp end_timestamp = 1 [
google.protobuf.Timestamp end_time = 1 [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant replace lookback with start_time (and swap the order). The Query Service API in the future will be converted to similar gRPC API, but it will need to work over JSON. It's less cognitive load to have two timestamp-formatted fields in JSON than to have a lookback in nanoseconds.

@@ -32,7 +32,7 @@ message GetDependenciesRequest {
];
}

message GetDependenciesSuccess {
message GetDependenciesResponse {
repeated bytes dependencies = 1 [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DependencyLink?

rpc GetTrace(GetTraceRequest) returns (GetTraceSuccess);
rpc GetServices(GetServicesRequest) returns (GetServicesSuccess);
rpc GetOperations(GetOperationsRequest) returns (GetOperationsSuccess);
rpc GetTrace(GetTraceRequest) returns (GetTraceResponse);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest implementing this similar to FindTraces, as a stream of chunks, because even a single trace could be larger than the default message limit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something that I'm unsure about and may need consideration is that having a Trace as a stream of Span means that the ProcessMap and Warnings fields are lost on the Trace.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, but these two concepts only apply to the current JSON API used by the UI, they don't apply to the storage API (even though the storage API returns model.Trace, it never actually populates ProcessMap or warnings, that's done post-retrieval).


package jaeger.storage.v1;

option go_package = "api_v1";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe storage_v1 would be better?

Makefile Outdated
protoc \
$(PROTO_INCLUDES) \
-I plugin/storage/grpc/proto \
--gogo_out=plugins=grpc,$(PROTO_GOGO_MAPPINGS):$(PWD)/plugin/storage/grpc/proto \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer folder name to match the package, e.g. plugin/storage/grpc/proto/storage_v1

}

message FindTraceIDsResponse {
repeated bytes trace_id = 1 [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trace_ids

string child = 2;
uint64 call_count = 3;
string source = 4 [
(gogoproto.customtype) = "DependencyLinkSource",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that this would be the best of both worlds re: the conversation above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, if we do this, I would recommend adding a simple marshaling/unmarshaling test, similar to model/ids_test.go

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, not just for this, but also for importing model.Span. I found it not particularly useful just adding gogoproto annotations without testing that serialization works as expected

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not following what you mean by "importing model.Span". In the plugin proto file? I don't believe that we actually use gogoproto annotations for importing Span there?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean this:

message GetTraceRequest {
    bytes trace_id = 1 [
      (gogoproto.nullable) = false,
      (gogoproto.customtype) = "github.com/jaegertracing/jaeger/model.TraceID",
      (gogoproto.customname) = "TraceID"
    ];
}

From my previous experience with gogoproto, even though protoc may compile it, you won't know if it will serialize correctly until you try it in unit tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and similar for spans

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been looking at this every which way, and it just bothers me that we're introducing this complication of custom type without any benefits whatsoever. The field should be just a string, so that an external tool (e.g. a service mesh) could submit graph links with another value of Source field. Having strong type in the model doesn't do anything useful. The only thing we need is the constant JaegerDependencyLinkSource = "jaeger" to use as default.

@ghost ghost assigned yurishkuro Mar 14, 2019
@ghost ghost added the review label Mar 14, 2019
chvck added 10 commits March 22, 2019 16:08
Signed-off-by: Charles Dixon <chvckd@gmail.com>
Signed-off-by: Charles Dixon <chvckd@gmail.com>
Signed-off-by: Charles Dixon <chvckd@gmail.com>
Signed-off-by: Charles Dixon <chvckd@gmail.com>
Signed-off-by: Charles Dixon <chvckd@gmail.com>
Signed-off-by: Charles Dixon <chvckd@gmail.com>
Signed-off-by: Charles Dixon <chvckd@gmail.com>
Signed-off-by: Charles Dixon <chvckd@gmail.com>
Signed-off-by: Charles Dixon <chvckd@gmail.com>
Signed-off-by: Charles Dixon <chvckd@gmail.com>
@yurishkuro
Copy link
Member

You probably don't need to rebase, just squash & merge master. One other option could be to squash only commits starting from the one where the sign-off was missed, e.g. git reset --soft C where C is the commit before the one where you missed the sign-off. Then you can commit all subsequent commits as one.

@chvck
Copy link
Contributor Author

chvck commented Mar 26, 2019

I'm unsure why make proto is changing files in proto-gen, it doesn't seem to do that locally on my machine.

@yurishkuro
Copy link
Member

Are you using the same protocol? Did you run "make proto-install" locally before generating?

@chvck
Copy link
Contributor Author

chvck commented Mar 26, 2019

Are you using the same protocol? Did you run "make proto-install" locally before generating?

Thanks, that was the step I was missing. The plugin proto generated files are generating without the license header until make fmt is run which is why the build is flagging those as they've been included in the codebase with the license header. As these are generated files is it safe to leave the header off them?

@yurishkuro
Copy link
Member

We avoid changing generated files, even with tools (eg https://github.com/jaegertracing/jaeger/blob/master/thrift-gen/jaeger/constants.go). We need to add them to exclusion lists in the Makefile so that go fmt and lint don't run on them.

Signed-off-by: Charles Dixon <chvckd@gmail.com>
Signed-off-by: Charles Dixon <chvckd@gmail.com>
Signed-off-by: Charles Dixon <chvckd@gmail.com>
@chvck
Copy link
Contributor Author

chvck commented Apr 2, 2019

Hi @yurishkuro before I fix this up by merging master in, are there any changes here that you're waiting on me for?

Makefile Outdated
$(PROTOC) \
$(PROTO_INCLUDES) \
-I plugin/storage/grpc/proto \
--gogo_out=plugins=grpc,$(PROTO_GOGO_MAPPINGS):$(PWD)/plugin/storage/grpc/proto/storage_v1 \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in order to avoid code coverage drop, we should place the generated files under proto-gen/storage_v1/

Makefile Outdated
$(PROTO_INCLUDES) \
-I plugin/storage/grpc/proto \
--gogo_out=plugins=grpc,$(PROTO_GOGO_MAPPINGS):$(PWD)/plugin/storage/grpc/proto/storage_v1 \
plugin/storage/grpc/proto/storage.proto
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please fix indentation in these two lines to align with L395

Signed-off-by: Charles Dixon <chvckd@gmail.com>
Signed-off-by: Charles Dixon <chvckd@gmail.com>
Signed-off-by: Charles Dixon <chvckd@gmail.com>
Signed-off-by: Charles Dixon <chvckd@gmail.com>
Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥇 🎉

@yurishkuro yurishkuro merged commit c770640 into jaegertracing:master Apr 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants