Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove restriction that SpanContext must be a final class, specify creation. #969

Merged
merged 12 commits into from
Sep 26, 2020

Conversation

anuraaga
Copy link
Contributor

@anuraaga anuraaga commented Sep 18, 2020

I know there's an OTEP that was rejected about this but I'm resurrecting this topic. The OTEP contends that it is important to allow extensibility to SpanContext to enable certain use cases such as lazy ID generation.

I'd like to propose that even outside of use-case concerns, this is too strict of a spec since it brings language concerns (class inheritance) into the cross-language spec. Languages may have a language-specific reason to allow a non-sealed SpanContext. For example, in Java, we have a need to hide the implementation of SpanContext from a user app to prevent compatibility issues between auto instrumentation and the app which without special care, can share classpath and have collisions. This can be done efficiently if SpanContext is an interface - two interfaces, one in the app, the other in the agent, but one implementation generated by the agent. Otherwise, we are stuck with two implementations, that copy between each other. In some sense, the fact that we rewrite the usage of SpanContext by a user through severe bytecode manipulation already seems to be incompliant with the spirit of this spec of wanting only one implementation of `SpanContext. This is an issue specific to Java instrumentation - I'd recommend the spec not be a blocker in creating solutions to such issues.

Additionally, I would be surprised if this can even be achieved in every language. Can JS or Ruby have sealed classes? I'd be surprised if we have very many compliant implementations - at first glance, I notice that JS actually uses an interface for SpanContext :) The spec should avoid digging too much into implementation details IMO because they often rely on a language's feature set and this should be left to the language authors to provide a good experience - the sealed nature of a class does not affect inter-language compatibility in any way.

Related issues #open-telemetry/opentelemetry-java#1543

Related OTEPS: open-telemetry/oteps#58

Fixes #970

@Oberon00
Copy link
Member

Can JS or Ruby have sealed classes?

They can rely in their usages of SpanContext that it is not inherited. E.g. you can write (Python):

return SpanContext(parent.trace_id, generate_span_id(), parent.trace_flags, parent.trace_state)

without having to worry about parent having a different class and the implications thereof.

@Oberon00
Copy link
Member

I recommend when removing this to think about ways to create SpanContexts, because this is closely related (maybe this was even what the spec wanted to say with this). Right now, you can just say "OK, I will just have a public constructor in my non-abstract SpanContext class". But after this, if there are potentially inherited classes, you will need to specify with which means new span contexts can be created. This is needed at least for propagators' extract which need to somehow fill in a new span context into the Context (although maybe we can resolve #949 in a way that they can create Spans instead).

After #875 (or even before) we could go one step further and say that SpanContext does not need to exist as its own type, but it's methods could instead be exposed directly on the Span, e.g. instead of span.getSpanContext().getSpanId(), you would have span.getSpanId(), etc. I think SpanContext is never used anywhere in the API except for the getSpanContext method of the Span.

@Oberon00 Oberon00 added area:api Cross language API specification issue spec:trace Related to the specification/trace directory labels Sep 18, 2020
@Oberon00
Copy link
Member

@anuraaga Please create an issue so this can be tracked.

Copy link
Member

@Oberon00 Oberon00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I am definitely in favor of removing this restriciton, I think we need to specify how to create a SpanContext at the same time.

@anuraaga
Copy link
Contributor Author

without having to worry about parent having a different class and the implications thereof.

I don't quite understand the example - can't parent just be some random object which has those properties, or even implement trace_id as a @property with arbitrary logic? Since it's all duck typed, I wouldn't expect any way to enforce any sort of interface / sealing guarantee except with expensive runtime reflection checking of the concrete types. I suspect this is the same for any scripting language.

I think we need to specify how to create a SpanContext at the same time.

I agree this isn't well explained in the spec and could use some details. But is it really so related to this PR? This PR's main motivation is the current text burdens language authors in a way that I suspect most of the languages actually won't even follow this requirement. So it seems like that is enough motivation to proceed unrelated to general issues of clarity regards to SpanContext.

@Oberon00
Copy link
Member

Oberon00 commented Sep 18, 2020

I wouldn't expect any way to enforce any sort of interface / sealing guarantee except with expensive runtime reflection checking of the concrete types.

I am not worried about (not) enforcing it. I am worried that you may accidentally break the trace through. If a vendor implements a custom SpanContext with an additional property X and the user then wants to manipulate the SpanContext in some way, e.g. by creating a new SpanContext with the changed TraceState, you lose X. You could
solve this e.g. by providing a clone (or cloneWith(spanId, traceflags, tracestate)) method on the SpanContext.

EDIT: In C++, this problem is called "object slicing" https://en.wikipedia.org/wiki/Object_slicing

Probably/maybe we do not want to support this at this point. We need some spec wording to clarify what is expected to (not) work.

I agree this isn't well explained in the spec and could use some details. But is it really so related to this PR?

Yes! If SpanContext is a final class, it does not matter how creation is implemented, as its a simple value class. But this PR seems to open the door to custom implementations. If we clearly specify that an API-implemented API to create a SpanContext MUST exist, it is clear that the use of derived SpanContext types is very limited. Otherwise, one may come up with the idea that a SpanContext can only be created with a Tracer (which would make a lot of sense IMHO, but would be a larger change from the current spec)

@anuraaga anuraaga changed the title Remove restriction that SpanContext must be a final class Remove restriction that SpanContext must be a final class... Sep 19, 2020
@anuraaga
Copy link
Contributor Author

@Oberon00 Ok - I tried adding some description of creating SpanContext hope it's what you're thinking of.

Copy link
Member

@Oberon00 Oberon00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anuraaga

I tried adding some description of creating SpanContext hope it's what you're thinking of.

I was less concerned with the particular parameters but to where the creation must be implemented: In the API or SDK. I added two alternative suggestions, see #969 (comment)

Comment on lines 182 to 183
The API must provide methods to create `SpanContext`. There are three ways to create
a `SpanContext`
Copy link
Member

@Oberon00 Oberon00 Sep 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The API must provide methods to create `SpanContext`. There are three ways to create
a `SpanContext`
The API MUST implement methods to create `SpanContext` which MUST NOT require any other API/SDK object
(i.e. they SHOULD be provided as free functions, or as static functions/constructors in the ` SpanContext` interface/class).
The produced SpanContext implementation when no API implementation is available
MUST be interchangeable with the SpanContext produced after installing any SDK.
There are three ways to create a `SpanContext`:

or, what I would prefer but it's probably too late for that rather large change:

Suggested change
The API must provide methods to create `SpanContext`. There are three ways to create
a `SpanContext`
The API MUST provide methods to create a `SpanContext` object on the `Tracer`.
This MUST be the only way to create a `SpanContext`.
For example, there must not be publicly available constructors on the `SpanContext`.
This could, for example, be achieved by having the `SpanContext` type be an interface or abstract base class in the API.
There are three ways to create a `SpanContext`:

The API must provide methods to create `SpanContext`. There are three ways to create
a `SpanContext`

- By generating a new `SpanId` and `TraceId` when there is no in-process or remote parent.
Copy link
Member

@Oberon00 Oberon00 Sep 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be a way to provide TraceFlags and TraceState for root spans too. Actually at least TraceFlags should be a required parameter (sampled vs unsampled)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - yeah I added the point about TraceFlags, I don't think we currently support TraceState. I'd like to keep this PR to reflect what we currently do if it's ok. I'd also like to avoid getting into too much detail about implementation, for example, SamplingContext.newForSampler(Sampler) and generate all IDs / trace flags and then call SpanContext.of(...) are equivalent IMO as far as the spec is concerned. I have found a general theme of over-specifying implementation detail in the spec even though it's too different based on language idioms.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we currently support TraceState

We absolutely do. @iNikem added some descriptions how a tracestate can be "modifified" recently (the actual object instances are immutable but you can create a modified copy). Of course there must be some way of associating a tracestate with a spancontext too.

a `SpanContext`

- By generating a new `SpanId` and `TraceId` when there is no in-process or remote parent.
- By generating a new `SpanId` and copying the `TraceId`, `TraceFlags`, and `TraceState` from
Copy link
Member

@Oberon00 Oberon00 Sep 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There must definitely be a way to change the TraceFlags (sampled) at this point. I think there should also be a way to change the TraceState. Maybe we could cover this by adding an additional mode of creation that accepts only a TraceState + original SpanContext and leaves everything else (including the "remoteness") unchanged.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup makes sense. I think #988 addresses being able to mutate TraceState, if it's ok want to keep this reflecting the current state.

Copy link
Member

@Oberon00 Oberon00 Sep 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#998 would allow the sampler to choose a tracestate, but the SDK would then need to construct a new SpanContext with that tracestate, which will only work if there is an option to create a SpanContext with a tracestate. I also want to keep reflecting the current state, which is, for example in Java, that you have the option (actually the obligation) to pass a tracestate, no matter whether the spancontext is remote or local.

@Oberon00 Oberon00 changed the title Remove restriction that SpanContext must be a final class... Remove restriction that SpanContext must be a final class, describe how it is created. Sep 19, 2020
@Oberon00
Copy link
Member

I updated the PR title to fit in the title, hope that is OK.

@Oberon00 Oberon00 changed the title Remove restriction that SpanContext must be a final class, describe how it is created. Remove restriction that SpanContext must be a final class, specify creation. Sep 19, 2020
@jmacd
Copy link
Contributor

jmacd commented Sep 22, 2020

I agree with this change @anuraaga and also with @Oberon00's comments. Thanks!

@carlosalberto
Copy link
Contributor

Additionally, I would be surprised if this can even be achieved in every language. Can JS or Ruby have sealed classes?

I think this is intended more, for such languages, as 'recommendation' in order to motivate people to not extend it ;) I suggest focusing on the reasons why it makes sense to extend SpanContext. So far I've seen:

  1. Lazy id generation (could this be added to the base implementations perhaps?)
  2. (Auto) Instrumentation is forced to move data around because of this (mind elaborating here?)

Other cases perhaps?

Copy link
Contributor Author

@anuraaga anuraaga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@carlosalberto My main point is in the description, "I'd like to propose that even outside of use-case concerns, this is too strict of a spec since it brings language concerns". The OTEP explored use cases but they were too unclear to pursue at the time. We can add more detail about Java use case (@trask volunteered :) ), but I wouldn't think too much on it because there is a higher level concern on reducing the ability to adopt language idioms.

In a very related note, and possibly the trigger for another PR in the same vein, I found our immutable recommendation for SpanContext to also cause issues when implemented in Python - open-telemetry/opentelemetry-python#1134. Different languages have different idioms on how to make things safe, or to just not do so - shouldn't the spec define functionality but leave details like that to the languages?

The API must provide methods to create `SpanContext`. There are three ways to create
a `SpanContext`

- By generating a new `SpanId` and `TraceId` when there is no in-process or remote parent.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - yeah I added the point about TraceFlags, I don't think we currently support TraceState. I'd like to keep this PR to reflect what we currently do if it's ok. I'd also like to avoid getting into too much detail about implementation, for example, SamplingContext.newForSampler(Sampler) and generate all IDs / trace flags and then call SpanContext.of(...) are equivalent IMO as far as the spec is concerned. I have found a general theme of over-specifying implementation detail in the spec even though it's too different based on language idioms.

a `SpanContext`

- By generating a new `SpanId` and `TraceId` when there is no in-process or remote parent.
- By generating a new `SpanId` and copying the `TraceId`, `TraceFlags`, and `TraceState` from
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup makes sense. I think #988 addresses being able to mutate TraceState, if it's ok want to keep this reflecting the current state.

@trask
Copy link
Member

trask commented Sep 23, 2020

For Java auto-instrumentation, we would really like for SpanContext to be an interface, because then our bridge can be optimized by backing both the user's SpanContext and our internal SpanContext by the same implementation which will avoid having to convert it back and forth.

@anuraaga
Copy link
Contributor Author

whether one is expected to work with custom implementations or not is a very fundamental design decision.

I think this can be left up to the language too? With Java 8+, it's considered good practice to exposes interfaces instead of classes, it keeps the API separate from implementation details to prevent leakage of concerns, other languages may prefer sealed classes. Users will still interact with the OTel API, but intripid users may find a use case for extending the interface and I see no reason to block this - it's their app, let them try out new things. We're just providing an API, and this issue is talking about functionality provided to apps, not to backends. None of the tracing functionality is affected by whether this is a sealed class or not. I'd expect the OTel API to generally provide specs on what is collected and how it's exported, but the actual API presented should be idiomatic for each language and details like sealing a class, or even making it immutable, seem overly restrictive to letting a language implementer provide a good experience to users. MAY could be fine as suggestions, not MUST though.

To be honest this seems to be appearing too late in the game

It's just removing a restriction that is probably not followed in most languages anyways. On the flip side, it can be considered that late in the game is precisely when to be doing it to reduce work on languages having to follow the spec to be compliant :P

@Oberon00
Copy link
Member

Oberon00 commented Sep 24, 2020

To be honest this seems to be appearing too late in the game

It's just removing a restriction that is probably not followed in most languages anyways. On the flip side, it can be considered that late in the game is precisely when to be doing it to reduce work on languages having to follow the spec to be compliant :P

I mostly agree with @anuraaga on that part. However, I think we need to word this carefully to not accidentally create a requirement / expectation for APIs to have a customizable SpanContext (like Span is customizable by SDKs).

@Oberon00
Copy link
Member

whether one is expected to work with custom implementations or not is a very fundamental design decision.

I think this can be left up to the language too?

I disagree. Using that logic you could also leave it up to the language to implement Span fully and not
overridable in the API. Furthermore, when we add no clarification here, I think we are implicitly creating a requirement for it to be overridable and not implemented in the API but the SDK (like everything else).

@anuraaga
Copy link
Contributor Author

@Oberon00 I see - I thought it being API-specific was clear from specifying the API has the methods but I realize we have other patterns where both API and SDK define them. I added a note that it should be API-only, how does that sound?

@Oberon00
Copy link
Member

This is still not addressing the issue at hand. We write the same for span creation. And yet we all agree that span creation is to be defined in SDKs. Vendors are expected to have their own span types if they do their own API implementation at all. Are vendors implementing the SDK expected to provide their own implementation of a SpanCotnext interface as well? I don't think that is the status quo, and we should specify that.

Copy link
Member

@Oberon00 Oberon00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend adding a sentence similar to #994: "This functionality MUST be fully implemented in the API, and SHOULD NOT be overridable." (or #969 (comment) ofc)

specification/trace/api.md Outdated Show resolved Hide resolved
specification/trace/api.md Outdated Show resolved Hide resolved
Anuraag Agrawal and others added 3 commits September 24, 2020 23:21
Co-authored-by: Christian Neumüller <christian+github@neumueller.me>
Co-authored-by: Christian Neumüller <christian+github@neumueller.me>
@anuraaga
Copy link
Contributor Author

Sure, added that

@Oberon00 Oberon00 dismissed their stale review September 24, 2020 14:37

Main design point now clarified 👍

Copy link
Member

@Oberon00 Oberon00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think it would be OK to remove the whole bullet point list and "These methods SHOULD be the only way to create a SpanContext:", so that only "The API must implement methods to create a SpanContext." and "This functionality MUST be fully implemented in the API, and SHOULD NOT be overridable." remains, replacing "SpanContext MUST be a final (sealed) class." Details of creation parameters can then be specified in a separate PR.

specification/trace/api.md Outdated Show resolved Hide resolved
specification/trace/api.md Outdated Show resolved Hide resolved
specification/trace/api.md Outdated Show resolved Hide resolved
specification/trace/api.md Outdated Show resolved Hide resolved
@anuraaga
Copy link
Contributor Author

@Oberon00 Sure, less is always nicer.

Beware :) With #994 and some of the discussion here, I find a wilder idea - that we could make SpanContext itself optional. Since there's no way to do anything with a SpanContext without putting it in a Span, a separate SpanContext object is just a wrapper right? Not sure if we even need to require SpanContext as a separate object in the first place.

Co-authored-by: Christian Neumüller <christian+github@neumueller.me>
@Oberon00
Copy link
Member

that we could make SpanContext itself optional.

I also had been thinking about that and I think this makes much sense. 👍 The only place where you would come into contact with SpanContext since OTEP 66 (and even more since #875) is when you implement a propagator (or when you want for some reason to log the span/trace ID -- but that would work with APIs on the (PropagationOnly)Span just as well. Maybe create an issue for that.

@andrewhsu
Copy link
Member

from the issue triage mtg today with TC, need resolution on this PR by end of day today else it will be merged @bogdandrutu

@jmacd
Copy link
Contributor

jmacd commented Sep 26, 2020

@bogdandrutu wrote his position here: #970 (comment)

The specification already says the length of the values returned by SpanID() and TraceID() methods, so I think the concerns are addressed. Since @andrewhsu said this should either merge or not merge by end-of-day, I will merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:api Cross language API specification issue spec:trace Related to the specification/trace directory
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Proposal: remove restriction that SpanContext must be a sealed/final class
6 participants