Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error Flagging with Status Codes #966

Merged
merged 9 commits into from
Sep 24, 2020

Conversation

tedsuo
Copy link
Contributor

@tedsuo tedsuo commented Sep 17, 2020

Changes

  • Changed StatusCodes to Unset, Error, and Ok
  • Added StatusSource
  • Updated HTTP status code mapping

Related issues #
#965

Related oteps #

Copy link
Member

@bogdandrutu bogdandrutu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to ensure that we understand the implications of adding some things right now before GA, and the fact that this may cost us a lot of burden to maintain.

specification/trace/api.md Outdated Show resolved Hide resolved
@tigrannajaryan
Copy link
Member

tigrannajaryan commented Sep 17, 2020

@bogdandrutu I believe what you are saying is that changes to the Status code look good to you, however the addition of the StatusSource is not so clear. Do you suggest that we make changes to the Status code list now but leave out the StatusSource? We can add StatusSource later since it is an additive change.

I think that sounds reasonable to me, but there were some objections in the past that Status code alone is not enough. I am not sure I agree with the objections, but would like to understand them better.

@tedsuo
Copy link
Contributor Author

tedsuo commented Sep 18, 2020

@tigrannajaryan reposting my thoughts from the thread above:

My main request is that instrumentation libraries continue to produce errors. So I am also fine leaving the source portion of this for later. By changing the default from OK to UNSET, OK is now available to be used for error suppression, which is the feature everyone found to be the most useful.

But I agree that if others have a use case right now, we could add it. Otherwise we can come back to status source post-ga, and see if the new status codes are sufficient for now.

specification/trace/api.md Outdated Show resolved Hide resolved
@tedsuo
Copy link
Contributor Author

tedsuo commented Sep 18, 2020

Based on feedback, I've pushed a new version which removes span status source. Let me know what you think.

If anyone would still like to read the version with span status, it can be found here:
https://github.com/open-telemetry/opentelemetry-specification/blob/2dff27e893ce1584ef00df5a812b8367b4092e3e/specification/trace/api.md#statussource

@iNikem
Copy link
Contributor

iNikem commented Sep 18, 2020

Don't scare us, @tedsuo ! You removed span status source, not span status itself :)

@tigrannajaryan
Copy link
Member

One thing that is not defined in OTEP 136 is that Traces Protobufs in OTLP are already declared stable. Changing the list of status codes is a breaking change and is not allowed for "stable" maturity. We will need to introduce this change in a backwards compatible manner (from protocol perspective). I am not completely sure how it needs to be done, some thought needs to be put into this.

@Oberon00
Copy link
Member

Oberon00 commented Sep 21, 2020

I think for the protocol, unset and ok should both map to the old ok, error should map to unknown_error. The new status and the source boolean will be new fields (new fields should be backwards compatible in protobuf). This means that there will be a (significant) loss of information for old receivers, but at least they still get a binary ok/error discrimination which should be the most important thing.

Copy link
Member

@Oberon00 Oberon00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All in all this is what I expected after the OTEP 👍. The status source can be added later in a non-breaking way, leaving the default at "instrumentation".

@@ -650,9 +627,9 @@ Returns the `StatusCanonicalCode` of this `Status`.
Returns the description of this `Status`.
Languages should follow their usual conventions on whether to return `null` or an empty string here if no description was given.

### GetIsOk
### GetIsError
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should remove all getters from the status API, we don't have getters anywhere else in the API. Although that should be done outside this PR. But I would remove this getter, or keep the name GetIsOK and return true unless set to Error. Otherwise I expect coding mistakes could result from inverting the meaning.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Completely agree.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Oberon00 @arminru can we also remove GetCanonicalCode and GetDescription? (In a separate PR)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can indeed, in a separate PR.
Thank's for resolving the original issue in this comment ✔️

specification/trace/semantic_conventions/http.md Outdated Show resolved Hide resolved
specification/trace/api.md Outdated Show resolved Hide resolved
Copy link
Member

@bogdandrutu bogdandrutu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix other outstanding comments

@SergeyKanzhelev
Copy link
Member

@tedsuo can you please fix the build and remove the gRPC mention so this PR can be merged?

@SergeyKanzhelev
Copy link
Member

@tedsuo please fix linter issues for the merge

@tedsuo
Copy link
Contributor Author

tedsuo commented Sep 24, 2020

Ok, I believe this is ready to merge.

@SergeyKanzhelev
Copy link
Member

no significant semantics changes for two days, and enough approvals. Merging

@SergeyKanzhelev SergeyKanzhelev merged commit 30eb18a into open-telemetry:master Sep 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants