Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Map Error Status to error tag in Zipkin & Jaeger #1257

Merged
merged 16 commits into from
Dec 10, 2020
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions spec-compliance-matrix.md
Original file line number Diff line number Diff line change
Expand Up @@ -158,6 +158,7 @@ status of the feature is not known.
|Boolean attributes | + | + | + | + | | + | + | + | | + |
|Array attributes | + | + | + | [-](https://github.com/open-telemetry/opentelemetry-python/issues/1110) | | + | + | + | | + |
|Status mapping | + | + | + | + | | + | + | + | | + |
|Error Status mapping | | | | | | | | | | + |
|Event attributes mapping to Annotations | + | + | + | + | | + | + | + | | + |
|Integer microseconds in timestamps | | + | | + | | | | | | + |
|[Jaeger](specification/trace/sdk_exporters/jaeger.md)|
Expand All @@ -167,6 +168,8 @@ status of the feature is not known.
|Service name mapping | | | | | | | | | | + |
|Resource to Process mapping | | | | | | | | | | + |
|InstrumentationLibrary mapping | | | | | | | | | | + |
|Status mapping | | | | | | | | | | + |
|Error Status mapping | | | | | | | | | | + |
|Events converted to Logs | | | | | | | | | | + |
|OpenCensus|
|TBD|
Expand Down
16 changes: 15 additions & 1 deletion specification/trace/sdk_exporters/jaeger.md
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,21 @@ TBD

### Status

TBD
Span `Status` MUST be reported as a key-value pair in `tags` to Jaeger, unless it is `UNSET`.
In the latter case it MUST NOT be reported.

The following table defines the OpenTelemetry `Status` to Jaeger `tags` mapping.

| Status|Tag Key| Tag Value |
|--|--|--|
|Code | `otel.status_code` | Name of the code, either `OK` or `ERROR`. MUST NOT be set if the code is `UNSET`. |
|Message *(optional)* | `otel.status_description` | `{message}` |

### Error flag

When Span `Status` is set to `ERROR` an `error` tag SHOULD be added with the
CodeBlanch marked this conversation as resolved.
Show resolved Hide resolved
`BOOL` value (`vBool`) set to `true`. The added `error` tag MAY override any
CodeBlanch marked this conversation as resolved.
Show resolved Hide resolved
previous value.

### Events

Expand Down
6 changes: 6 additions & 0 deletions specification/trace/sdk_exporters/zipkin.md
Original file line number Diff line number Diff line change
Expand Up @@ -148,6 +148,12 @@ The following table defines the OpenTelemetry `Status` to Zipkin `tags` mapping.
|Code | `otel.status_code` | Name of the code, either `OK` or `ERROR`. MUST NOT be set if the code is `UNSET`. |
|Message *(optional)* | `otel.status_description` | `{message}` |

### Error flag

When Span `Status` is set to `ERROR` an `error` tag SHOULD be added with an
empty string value (eg. `{"error":""}`). The added `error` tag MAY override any
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused on the "override previous value" - does it mean that if the user set an error tag themselves, this may unset it if Status is UNSET? Or if a user sets error=cats, we override to error=? Just want to have an idea (maybe we can add some clarification)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to say that exporters can simply add the error tag on the end of the tag list which will effectively override anything already there (with the same error key). My motivation for that was 1) I didn't want to leave it undefined and 2) checking for an existing tag is particularly expensive (O(n)) in the .NET implementation so I was hoping to avoid needing to do that. Open to suggestions for change or improvement.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we setting an error tag with empty string value and additionally add otel.status_description instead of just setting the error tag with a string value equal to Status message?

Doesn't Status message fit nicely in the error tag?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tigrannajaryan The only issue with that is error tag must ONLY be sent if there was an actual error. So what we would need to do is set error instead of otel.status_description when otel.status_code=ERROR otherwise send otel.status_description (if it has a value for OK or UNSET cases). I don't have strong feelings either way. I will say that as-is is easier to implement (always sending otel.status_description when it has a value and then adding the error tag (empty) for otel.status_code=ERROR) and it's non-breaking.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe Status.message is not supposed to be set for OK or UNSET cases. We should explicitly call it out in the spec. This means error tag will only be set equal to Spatus.message when Status.status==ERROR and for all other statuses there is no error message to record.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tigrannajaryan I added some text on Status to clarify Description should only be used with Error and then updated Zipkin & Jaeger based on that.

Copy link

@ChrisJBurns ChrisJBurns Dec 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just verifying that the otel-java code follows the specification and I've given the above a read and may have noticed a slight difference between the jaeger and zipkin exporter specs regarding the error Status.

In the jaeger specification it says

otel.status_description : Description of the Status if it has a value otherwise not set.

So this means that for jaeger it sets the otel.status_description if the status description is set - it does not mandate that it is ONLY set if there is an error. Additionally, if the status is an error, it shall add an error tag.

However for zipkin it says the following:

error : Description of the Status. MUST be set if the code is ERROR, use an empty string if Description has no value. MUST NOT be set for OK and UNSET codes.

This describes that the otel.status_description tag isn't set but instead the error tag is the only thing that is set providing that there is an ERROR code and not OK or UNSET.

Is there a reason why for jaeger we provide the otel.status_code, otel.status_description AND the error tag but for zipkin we only provide otel.status_code and the error tag.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ChrisJBurns Slight differences in the implementations, basically. Zipkin error is a string where you can put the description. Jaeger error is a bool (true/false) so it has otel.status_description for the description. The Zipkin text is a bit more specific because if you send something like error=OK or error=false to Zipkin, it's still going to treat that span as failed.

previous value.

### Events

OpenTelemetry `Event` has an optional `Attribute`(s) which is not supported by
Expand Down