Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should Exemplar be enabled by default? #2118

Closed
cijothomas opened this issue Nov 12, 2021 · 10 comments · Fixed by #2414
Closed

Should Exemplar be enabled by default? #2118

cijothomas opened this issue Nov 12, 2021 · 10 comments · Fixed by #2414
Assignees
Labels
area:sdk Related to the SDK release:required-for-ga Must be resolved before GA release, or nice to have before GA spec:metrics Related to the specification/metrics directory

Comments

@cijothomas
Copy link
Member

Originally opened PR to improve wording (#2105), but was clarified in the PR that, Exemplar was intentionally marked as a enabled-by-default feature.
Opening this issue, to discuss the question:
Should exemplar be enabled by default?

Since the feature has some perf penalty (at the minimum, check the current span from the context, and check if it is sampled in), I'd propose to make this as an opt-in feature. (I do not know if most metric backends support Exemplar features, so it'd be good to make it opt-in, as opposed to opt-out.) It was mentioned in this comment that Prometheus's Exemplar is an opt-in feature

@cijothomas cijothomas added the spec:metrics Related to the specification/metrics directory label Nov 12, 2021
@yurishkuro
Copy link
Member

+1

@reyang
Copy link
Member

reyang commented Nov 15, 2021

Discussed during 11/11/2021 Metrics SIG. So far the suggestion is to make it OFF by default.

@jonatan-ivanov mentioned during the SIG meeting: I thinks it's better to be off by default, because the only popular backend that has exemplar is Prometheus, and it needs to be enabled explicitly.

@reyang reyang added this to the Metrics API/SDK Stable Release milestone Nov 15, 2021
@reyang reyang added the area:sdk Related to the SDK label Nov 15, 2021
@jsuereth
Copy link
Contributor

I'm copying over what I wrote in that PR:

TL;DR: I think focusing Exemplars on Traces and tying its default to trace sampling is the right default, and promotes one of the core tenants to OTel -> promoting correlated telemetry.

I think there's an important distinction here that's possibly missing. By Default, Exemplar sampling is on only if there exists a Sampled Trace. The default requires:

  • Users have configured tracing
  • Tracing is in context
  • Tracing is sampling

Exemplar sampling, by default, is controlled via Trace sampling. The ExemplarFilter/ExemplarResorvoir hooks are only for advanced Exemplar usage.

This is unlike other systems where Exemplar sampling may or may not correlate with the existence of Traces. I'd expect a user of just the otel metrics SDK to never see an exemplar (by default). It's only once they've also instrumented traces that they see exemplars be generated.

I have a slightly connected question: should OTel Metrics support any Exemplar source or only OTel Tracing?

An Exemplar != a trace. The exemplar is the measurement, we just attach the trace_id to it. OTel is the only library (I know of) where attaching traces is first-class, e.g. in prometheus exemplars just attach labels to measurements.

That said, to the extent there's another system instrumenting traces, I'd hope they expose OTEL-friendly hooks for interacting with trace. I think it's out-of-scope for OTEL (which already provides pure-apis for tracing) to support backends which don't leverage those APIs. Instead we should just provide a bridge (similar to what we've done for OpenTracing + OpenCensus).

@reyang reyang added the release:required-for-ga Must be resolved before GA release, or nice to have before GA label Nov 23, 2021
@reyang
Copy link
Member

reyang commented Nov 23, 2021

From my understanding there are two major things that worth discussion:

  1. If both traces and metrics are enabled, do we want to have exemplars on-by-default?
    • If we look at the systems today, most of them do not support exemplars (just like in the old days most logging systems do not support correlation ids).
    • We (the Metrics SIG) believe that the design should be forward looking. It should be on-by-default given the industry trend.
  2. Would there be a performance issue if we collect exemplars?
    • If people only use metrics (without traces), exemplars will be OFF, the spec asks for "no perf cost".
    • If people use both metrics and traces, exemplars will be ON, and can be implemented in a performant way (for example, instead of having to check the active span when a measurement is being reported, we should use a flag).
    • There will be additional memory cost, but the cost is quite reasonable/manageable.

@reyang reyang assigned jsuereth and unassigned tigrannajaryan Nov 23, 2021
@yurishkuro
Copy link
Member

I just opened an issue #2155. I hope it illustrates how premature this discussion about "on by default" is.

@tedsuo
Copy link
Contributor

tedsuo commented Jan 18, 2022

Just following up on this issue for clarity.

It sounds like there is a desire to leave exemplars as an experimental feature, while we focus on stabilizing the rest of the metrics SDK. And, as long as it is an experimental feature, it should be off by default. I think that is fine and prudent (in general, we should probably leave experimental features off by default).

But I would like to see exemplars on by default once they are stable. Long term, this is an important low-level feature that many backends are going to adopt, and the performance of exemplars should be stabilized at an acceptable level of production overhead.

@cijothomas
Copy link
Member Author

Once the Exemplar feature itself is stable, I think it should be okay to make it On-By-Default, as long as it is fairly easy to turn it off for those who don't want it.

It is in the spirit of one of the stated metric design goals of "Being able to connect metrics to other signals."

@jsuereth
Copy link
Contributor

jsuereth commented Feb 1, 2022

I'd still like to understand criteria for marking Exemplar stable. Specifically, what more do we need to see, what issues exist in the current specification.

@alolita
Copy link
Member

alolita commented Mar 1, 2022

@jmacd can you respond to this question? Would like to understand next steps to take a decision.

@mtwo
Copy link
Member

mtwo commented Mar 4, 2022

Following up from the metrics burndown call today: exemplars should not be enabled by default yet (most of the implementations aren't ready yet, others have performance issues), but we could enable these by default after GA once implementations are in better shape.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:sdk Related to the SDK release:required-for-ga Must be resolved before GA release, or nice to have before GA spec:metrics Related to the specification/metrics directory
Projects
None yet
8 participants