diff --git a/specification/metrics/api.md b/specification/metrics/api.md index 6c45505176c..4af98ad46b7 100644 --- a/specification/metrics/api.md +++ b/specification/metrics/api.md @@ -131,11 +131,16 @@ This API MUST accept the following parameters: * [since 1.4.0] `schema_url` (optional): Specifies the Schema URL that should be recorded in the emitted telemetry. -It is unspecified whether or under which conditions the same or different -`Meter` instances are returned from this functions. - -Implementations MUST NOT require users to repeatedly obtain a `Meter` again with -the same name+version+schema_url to pick up configuration changes. This can be +Meters are identified by all of these fields. When more than one +Meter of the same `name`, `version`, and `schema_url` is created, it +is unspecified whether or under which conditions the same or different +`Meter` instances are returned. The term *identical* applied to +Meters describes instances where all identifying fields are equal. +The term *distinct* applied to Meters describes instances where at +least one identifying field has a different value. + +Implementations MUST NOT require users to repeatedly obtain a `Meter` with +the same identity to pick up configuration changes. This can be achieved either by allowing to work with an outdated configuration or by ensuring that new configuration applies also to previously returned `Meter`s. @@ -143,7 +148,7 @@ Note: This could, for example, be implemented by storing any mutable configuration in the `MeterProvider` and having `Meter` implementation objects have a reference to the `MeterProvider` from which they were obtained. 
If configuration must be stored per-meter (such as disabling a certain meter), the -meter could, for example, do a look-up with its name+version+schema_url in a map +meter could, for example, do a look-up with its identity in a map in the `MeterProvider`, or the `MeterProvider` could maintain a registry of all returned `Meter`s and actively update their configuration if it changes. @@ -174,22 +179,53 @@ Also see the respective sections below for more information on instrument creati ## Instrument Instruments are used to report [Measurements](#measurement). Each Instrument -will have the following information: +will have the following fields: * The `name` of the Instrument -* The `kind` of the Instrument - whether it is a [Counter](#counter) or other - instruments, whether it is synchronous or asynchronous +* The `kind` of the Instrument - whether it is a [Counter](#counter) or + one of the other kinds, whether it is synchronous or asynchronous * An optional `unit` of measure * An optional `description` -Instruments are associated with the Meter during creation, and are identified by -the name: +Instruments are associated with the Meter during creation. Instruments +are identified by all of these fields. + +Language-level features such as the distinction between integer and +floating point numbers SHOULD be considered as identifying. + + + +When more than one Instrument of the same `name` is created for +identical Meters, denoted *duplicate instrument registration*, the +implementation MUST create a valid Instrument in every case. Here, +"valid" means an instrument that is functional and can be expected to +export data, despite potentially creating a [semantic error in the +data +model](datamodel.md#opentelemetry-protocol-data-model-producer-recommendations). + +It is unspecified whether or under which conditions the same or +different Instrument instance will be returned as a result of +duplicate instrument registration. 
The term *identical* applied to +Instruments describes instances where all identifying fields are +equal. The term *distinct* applied to Instruments describes instances +where at least one field value is different. + +When more than one distinct Instrument is registered with the same +`name` for identical Meters, the implementation SHOULD emit a warning +to the user informing them of duplicate registration conflict(s). -* Meter implementations MUST return an error when multiple Instruments are - registered under the same Meter instance using the same name. -* Different Meters MUST be treated as separate namespaces. The names of the - Instruments under one Meter SHOULD NOT interfere with Instruments under - another Meter. +__Note the warning about duplicate Instrument registration conflicts +is meant to help avoid the semantic error state described in the +[OpenTelemetry Metrics data +model](datamodel.md#opentelemetry-protocol-data-model-producer-recommendations) +when more than one `Metric` is written for a given instrument `name` +and Meter identity by the same MeterProvider. + + + +Distinct Meters MUST be treated as separate namespaces for the +purposes of detecting [duplicate instrument registration +conflicts](#instrument-type-conflict-detection). @@ -212,7 +248,7 @@ DIGIT = %x30-39 ; 0-9 -The `unit` is an optional string provided by the author of the instrument. It +The `unit` is an optional string provided by the author of the Instrument. It SHOULD be treated as an opaque string from the API and SDK (e.g. the SDK is not expected to validate the unit of measurement, or perform the unit conversion). @@ -241,7 +277,7 @@ instrument. It MUST be treated as an opaque string from the API and SDK. * It MUST support at least 1023 characters. [OpenTelemetry API](../overview.md#api) authors MAY decide if they want to support more. 
-Instruments can be categorized based on whether they are synchronous or +Instruments are categorized by whether they are synchronous or asynchronous: @@ -266,6 +302,25 @@ Please note that the term *synchronous* and *asynchronous* have nothing to do with the [asynchronous pattern](https://en.wikipedia.org/wiki/Asynchronous_method_invocation). +The API MUST support creation of asynchronous instruments by passing +zero or more callback functions to be permanently registered to the +newly created instrument. + +The API SHOULD support registration of callback functions to +asynchronous instruments after they are created. + +Where the API supports registration of callback functions after +asynchronous instrument creation, it MUST return something (e.g., +a registration handle, receipt or token) to the user that supports +undoing the effect of callback registration. + +Callback functions SHOULD NOT take an indefinite amount of time. + +Callback functions SHOULD NOT make duplicate observations from asynchronous +instrument callbacks. The resulting behavior when a callback observes +multiple values for identical instrument and attributes is explicitly +not specified. + ### Counter `Counter` is a [synchronous Instrument](#synchronous-instrument) which supports @@ -398,9 +453,9 @@ The API MUST accept the following parameters: rule](#instrument-unit). * An optional `description`, following the [instrument description rule](#instrument-description). -* A `callback` function. +* Zero or more `callback` functions. [See the general requirements](#asynchronous-instrument). -The `callback` function is responsible for reporting the +The `callback` function is responsible for reporting [Measurement](#measurement)s. It will only be called when the Meter is being observed. [OpenTelemetry API](../overview.md#api) authors SHOULD define whether this callback function needs to be reentrant safe / thread safe or not. 
@@ -410,10 +465,6 @@ callback function reports the absolute value of the counter. To determine the reported rate the counter is changing, the difference between successive measurements is used. -The callback function SHOULD NOT take indefinite amount of time. If multiple -independent SDKs coexist in a running process, they MUST invoke the callback -function(s) independently. - [OpenTelemetry API](../overview.md#api) authors MAY decide what is the idiomatic approach. Here are some examples: @@ -484,10 +535,35 @@ meter.CreateObservableCounter("caesium_oscillates", () => clock.GetCaesi #### Asynchronous Counter operations -Asynchronous Counter is only intended for an asynchronous scenario. The only -operation is provided by the `callback`, which is registered during the +Asynchronous Counter uses an idiomatic interface for reporting +measurements through a `callback`, which is registered during [Asynchronous Counter creation](#asynchronous-counter-creation). +For callback functions registered after an asynchronous instrument is +created, the API is required to support a mechanism for +unregistration. For example, the object returned from `register_callback` +can support an `unregister()` method directly. + +```python +# Python +class Device: + """A device with one counter""" + + def __init__(self, meter, x): + self.x = x + counter = meter.create_observable_counter(name="usage", description="count of items used") + self.cb = counter.register_callback(self.counter_callback) + + def counter_callback(self, result): + result.Observe(self.read_counter(), {'x': self.x}) + + def read_counter(self): + return 100 # ... + + def stop(self): + self.cb.unregister() +``` + ### Histogram `Histogram` is a [synchronous Instrument](#synchronous-instrument) which can be @@ -615,17 +691,13 @@ The API MUST accept the following parameters: rule](#instrument-unit). * An optional `description`, following the [instrument description rule](#instrument-description). -* A `callback` function. 
+* Zero or more `callback` functions. [See the general requirements](#asynchronous-instrument). -The `callback` function is responsible for reporting the +The `callback` function is responsible for reporting [Measurement](#measurement)s. It will only be called when the Meter is being observed. [OpenTelemetry API](../overview.md#api) authors SHOULD define whether this callback function needs to be reentrant safe / thread safe or not. -The callback function SHOULD NOT take indefinite amount of time. If multiple -independent SDKs coexist in a running process, they MUST invoke the callback -function(s) independently. - [OpenTelemetry API](../overview.md#api) authors MAY decide what is the idiomatic approach. Here are some examples: @@ -699,10 +771,35 @@ meter.CreateObservableGauge("temperature", () => sensor.GetTemperature() #### Asynchronous Gauge operations -Asynchronous Gauge is only intended for an asynchronous scenario. The only -operation is provided by the `callback`, which is registered during the +Asynchronous Gauge uses an idiomatic interface for reporting +measurements through a `callback`, which is registered during [Asynchronous Gauge creation](#asynchronous-gauge-creation). +For callback functions registered after an asynchronous instrument is +created, the API is required to support a mechanism for +unregistration. For example, the object returned from `register_callback` +can support an `unregister()` method directly. + +```python +# Python +class Device: + """A device with one gauge""" + + def __init__(self, meter, x): + self.x = x + gauge = meter.create_observable_gauge(name="pressure", description="force/area") + self.cb = gauge.register_callback(self.gauge_callback) + + def gauge_callback(self, result): + result.Observe(self.read_gauge(), {'x': self.x}) + + def read_gauge(self): + return 100 # ... 
+ + def stop(self): + self.cb.unregister() +``` + ### UpDownCounter `UpDownCounter` is a [synchronous Instrument](#synchronous-instrument) which @@ -887,9 +984,9 @@ The API MUST accept the following parameters: rule](#instrument-unit). * An optional `description`, following the [instrument description rule](#instrument-description). -* A `callback` function. +* Zero or more `callback` functions. [See the general requirements](#asynchronous-instrument). -The `callback` function is responsible for reporting the +The `callback` function is responsible for reporting [Measurement](#measurement)s. It will only be called when the Meter is being observed. [OpenTelemetry API](../overview.md#api) authors SHOULD define whether this callback function needs to be reentrant safe / thread safe or not. @@ -899,10 +996,6 @@ the callback function reports the absolute value of the Asynchronous UpDownCounter. To determine the reported rate the Asynchronous UpDownCounter is changing, the difference between successive measurements is used. -The callback function SHOULD NOT take indefinite amount of time. If multiple -independent SDKs coexist in a running process, they MUST invoke the callback -function(s) independently. - [OpenTelemetry API](../overview.md#api) authors MAY decide what is the idiomatic approach. Here are some examples: @@ -975,9 +1068,34 @@ meter.CreateObservableUpDownCounter("memory.physical.free", () => WMI.Qu #### Asynchronous UpDownCounter operations -Asynchronous UpDownCounter is only intended for an asynchronous scenario. The -only operation is provided by the `callback`, which is registered during the -[Asynchronous UpDownCounter creation](#asynchronous-updowncounter-creation). +Asynchronous UpDownCounter uses an idiomatic interface for reporting +measurements through a `callback`, which is registered during +[Asynchronous UpDownCounter creation](#asynchronous-updowncounter-creation). 
+ +For callback functions registered after an asynchronous instrument is +created, the API is required to support a mechanism for +unregistration. For example, the object returned from `register_callback` +can support an `unregister()` method directly. + +```python +# Python +class Device: + """A device with one updowncounter""" + + def __init__(self, meter, x): + self.x = x + updowncounter = meter.create_observable_updowncounter(name="queue_size", description="items in process") + self.cb = updowncounter.register_callback(self.updowncounter_callback) + + def updowncounter_callback(self, result): + result.Observe(self.read_updowncounter(), {'x': self.x}) + + def read_updowncounter(self): + return 100 # ... + + def stop(self): + self.cb.unregister() +``` ## Measurement @@ -992,8 +1110,8 @@ for the interaction between the API and SDK. ## Compatibility requirements -All the metrics components SHOULD allow new APIs to be added to existing -components without introducing breaking changes. +All the metrics components SHOULD allow new APIs to be added to +existing components without introducing breaking changes. All the metrics APIs SHOULD allow optional parameter(s) to be added to existing APIs without introducing breaking changes, if possible. 
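Reviewer sketch (not part of the patch): the duplicate instrument registration rules added to `api.md` above can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the OpenTelemetry API: `SketchMeter`, `InstrumentIdentity`, and `create_instrument` are hypothetical names invented for illustration. It assumes an Instrument is identified by `name`, `kind`, `unit`, and `description`, that every registration yields a usable instrument, and that a warning is emitted only when a distinct Instrument reuses an existing `name` on the same Meter.

```python
import warnings
from dataclasses import dataclass


@dataclass(frozen=True)
class InstrumentIdentity:
    """Hypothetical record of the identifying fields of an Instrument."""
    name: str
    kind: str  # e.g. "counter" or "observable_gauge"
    unit: str = ""
    description: str = ""


class SketchMeter:
    """Hypothetical meter used only for illustration."""

    def __init__(self):
        # Maps an instrument name to the set of identities registered under it.
        self._by_name = {}

    def create_instrument(self, name, kind, unit="", description=""):
        identity = InstrumentIdentity(name, kind, unit, description)
        seen = self._by_name.setdefault(name, set())
        # Warn only when a *distinct* instrument reuses the name;
        # identical registrations are accepted silently.
        if seen and identity not in seen:
            warnings.warn(
                f"duplicate instrument registration conflict for {name!r}",
                stacklevel=2,
            )
        seen.add(identity)
        # Registration always succeeds: a value is returned in every case.
        return identity
```

In this sketch each `SketchMeter` keeps its own registry, so two distinct meters act as separate namespaces and never warn about each other's instrument names.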
diff --git a/specification/metrics/datamodel.md b/specification/metrics/datamodel.md index cc70ec0923c..a4da3e7dc04 100644 --- a/specification/metrics/datamodel.md +++ b/specification/metrics/datamodel.md @@ -15,6 +15,9 @@ * [Event Model](#event-model) * [Timeseries Model](#timeseries-model) * [OpenTelemetry Protocol data model](#opentelemetry-protocol-data-model) + + [OpenTelemetry Protocol data model: Producer recommendations](#opentelemetry-protocol-data-model-producer-recommendations) + + [OpenTelemetry Protocol data model: Consumer recommendations](#opentelemetry-protocol-data-model-consumer-recommendations) + + [Point kinds](#point-kinds) - [Metric points](#metric-points) * [Sums](#sums) * [Gauge](#gauge) @@ -28,8 +31,8 @@ - [Negative Scale: Extract and Shift the Exponent](#negative-scale-extract-and-shift-the-exponent) - [All Scales: Use the Logarithm Function](#all-scales-use-the-logarithm-function) - [Positive Scale: Use a Lookup Table](#positive-scale-use-a-lookup-table) - - [Producer Recommendations](#producer-recommendations) - + [Consumer Expectations](#consumer-expectations) + + [ExponentialHistogram: Producer Recommendations](#exponentialhistogram-producer-recommendations) + + [ExponentialHistogram: Consumer Recommendations](#exponentialhistogram-consumer-recommendations) * [Summary (Legacy)](#summary-legacy) - [Exemplars](#exemplars) - [Single-Writer](#single-writer) @@ -233,7 +236,7 @@ consisting of several metadata properties: - Metric name - Attributes (dimensions) -- Kind of point (integer, floating point, etc) +- Value type of the point (integer, floating point, etc) - Unit of measurement The primary data of each timeseries are ordered (timestamp, value) points, with @@ -257,22 +260,89 @@ to map into, but is used as a reference throughout this document. ### OpenTelemetry Protocol data model -The OpenTelemetry protocol data model is composed of Metric data streams. These -streams are in turn composed of metric data points. 
Metric data streams -can be converted directly into Timeseries, and share the same identity -characteristics for a Timeseries. A metric stream is identified by: - -- The originating `Resource` -- The metric stream's `name`. -- The attached `Attribute`s -- The metric stream's point kind. - -It is possible (and likely) that more than one metric stream is created per -`Instrument` in the event model. - -**Note: The same `Resource`, `name` and `Attribute`s but differing point kind -coming out of an OpenTelemetry SDK is considered an "error state" that SHOULD -be handled by an SDK.** +The OpenTelemetry protocol (OTLP) data model is composed of Metric data +streams. These streams are in turn composed of metric data points. +Metric data streams can be converted directly into Timeseries. + +Metric streams are grouped into individual `Metric` objects, +identified by: + +- The originating `Resource` attributes +- The instrumentation `Scope` (e.g., instrumentation library name, version) +- The metric stream's `name` + +Including `name`, the `Metric` object is defined by the following +properties: + +- The data point type (e.g. `Sum`, `Gauge`, `Histogram`, `ExponentialHistogram`, `Summary`) +- The metric stream's `unit` +- The metric stream's `description` +- Intrinsic data point properties, where applicable: `AggregationTemporality`, `Monotonic` + +The data point type, `unit`, and intrinsic properties are considered +identifying, whereas the `description` field is explicitly not +identifying in nature. + +Extrinsic properties of specific points are not considered +identifying; these include but are not limited to: + +- Bucket boundaries of a `Histogram` data point +- Scale or bucket count of an `ExponentialHistogram` data point. + +The `Metric` object contains individual streams, identified by the set +of `Attributes`. Within the individual streams, points are identified +by one or two timestamps; details vary by data point type. 
+ +Within certain data point types (e.g., `Sum` and `Gauge`) there is +variation permitted in the numeric point value; in this case, the +associated variation (i.e., floating-point vs. integer) is not +considered identifying. + +#### OpenTelemetry Protocol data model: Producer recommendations + +Producers SHOULD prevent the presence of multiple `Metric` identities +for a given `name` with the same `Resource` and `Scope` attributes. +Producers are expected to aggregate data for identical `Metric` +objects as a basic feature, so the appearance of multiple `Metric` identities, +considered a "semantic error", generally requires duplicate +conflicting instrument registration to have occurred somewhere. + +Producers MAY be able to remediate the problem, depending on whether +they are an SDK or a downstream processor: + +1. If the potential conflict involves a non-identifying property (i.e., + `description`), the producer SHOULD choose the longer string. +2. If the potential conflict involves similar but disagreeing units + (e.g., "ms" and "s"), an implementation MAY convert units to avoid + semantic errors; otherwise an implementation SHOULD inform the user + of a semantic error and pass through conflicting data. +3. If the potential conflict involves an `AggregationTemporality` + property, an implementation MAY convert temporality using a + Cumulative-to-Delta or a Delta-to-Cumulative transformation; + otherwise, an implementation SHOULD inform the user of a semantic + error and pass through conflicting data. +4. Generally, for potential conflicts involving an identifying + property (i.e., all properties except `description`), the producer + SHOULD inform the user of a semantic error and pass through + conflicting data. + +When semantic errors such as these occur inside an implementation of +the OpenTelemetry API, there is a presumption of a fixed `Resource` +value. 
Consequently, SDKs implementing the OpenTelemetry API have +complete information about the origin of duplicate instrument +registration conflicts and are sometimes able to help users avoid +semantic errors. See the SDK specification for specific details. + +#### OpenTelemetry Protocol data model: Consumer recommendations + +Consumers MAY reject OpenTelemetry Metrics data containing semantic +errors (i.e., more than one `Metric` identity for a given `name`, +`Resource`, and `Scope`). + +OpenTelemetry does not specify any means for conveying such an outcome +to the end user, although this subject deserves attention. + +#### Point kinds A metric stream can use one of these basic point kinds, all of which satisfy the requirements above, meaning they define a decomposable @@ -653,7 +723,7 @@ For positive scales, lookup table methods have been demonstrated that are able to exactly compute the index in constant time from a lookup table with `O(2**scale)` entries. -##### Producer Recommendations +#### ExponentialHistogram: Producer Recommendations At the lowest or highest end of the 64 bit IEEE floating point, a bucket's range may only be partially representable by the floating @@ -674,7 +744,7 @@ perform an exact computation. As a result, ExponentialHistogram exemplars could map into buckets with zero count. We expect to find such values counted in the adjacent buckets. -#### Consumer Expectations +#### ExponentialHistogram: Consumer Recommendations ExponentialHistogram bucket indices are expected to map into buckets where both the upper and lower boundaries can be represented @@ -749,11 +819,11 @@ All metric data streams within OTLP MUST have one logical writer. This means, conceptually, that any Timeseries created from the Protocol MUST have one originating source of truth. In practical terms, this implies the following: -- All metric data streams produced by OTel SDKs MUST be globally uniquely - produced and free from duplicates. 
All metric data streams can be uniquely - identified in some way. +- All metric data streams produced by OTel SDKs SHOULD have globally + unique identity at any given point in time. [`Metric` identity is defined + above.](#opentelemetry-protocol-data-model-producer-recommendations) - Aggregations of metric streams MUST only be written from a single logical - source. + source at any given point in time. **Note: This implies aggregated metric streams must reach one destination**. In systems, there is the possibility of multiple writers sending data for the @@ -779,6 +849,13 @@ scenarios and take corrective actions. Additionally, it ensures that well-behaved systems can perform metric stream manipulation without undesired degradation or loss of visibility. +Note that violations of the Single-Writer principle are not semantic +errors; generally they result from misconfiguration. Whereas semantic +errors can sometimes be corrected by configuring Views, violations of +the Single-Writer principle can be corrected by differentiating the +`Resource` used or by ensuring that streams for a given `Resource` and +`Attribute` set do not overlap in time. + ## Temporality **Status**: [Stable](../document-status.md) diff --git a/specification/metrics/sdk.md b/specification/metrics/sdk.md index a38cfced952..d7d0139f6d0 100644 --- a/specification/metrics/sdk.md +++ b/specification/metrics/sdk.md @@ -19,6 +19,8 @@ + [Last Value Aggregation](#last-value-aggregation) + [Histogram Aggregation](#histogram-aggregation) + [Explicit Bucket Histogram Aggregation](#explicit-bucket-histogram-aggregation) + * [Observations inside asynchronous callbacks](#observations-inside-asynchronous-callbacks) + * [Resolving duplicate instrument registration conflicts](#resolving-duplicate-instrument-registration-conflicts) - [Attribute limits](#attribute-limits) - [Exemplar](#exemplar) * [ExemplarFilter](#exemplarfilter) @@ -416,6 +418,51 @@ instruments that record negative measurements, e.g. 
`UpDownCounter` or `Observab - Min (optional) `Measurement` value in population. - Max (optional) `Measurement` value in population. +### Observations inside asynchronous callbacks + +Callback functions MUST be invoked for the specific `MetricReader` +performing collection, such that observations made or produced by +executing callbacks only apply to the intended `MetricReader` during +collection. + +The implementation SHOULD disregard the accidental use of APIs +appurtenant to asynchronous instruments outside of registered +callbacks in the context of a single `MetricReader` collection. + +The implementation SHOULD use a timeout to prevent indefinite callback +execution. + +The implementation MUST complete the execution of all callbacks for a +given instrument before starting a subsequent round of collection. + +### Resolving duplicate instrument registration conflicts + +As [stated in the API +specification](api.md#instrument-type-conflict-detection), +implementations are REQUIRED to create valid instruments in case of +duplicate instrument registration, and the [data model includes +RECOMMENDATIONS on how to treat the consequent duplicate +conflicting](datamodel.md#opentelemetry-protocol-data-model-producer-recommendations) +`Metric` definitions. + +The implementation MUST aggregate data from identical Instruments +together in its export pipeline. + +The implementation SHOULD assist the user in managing conflicts by +reporting each duplicate-conflicting instrument registration that was +not corrected by a View as follows. When a potential conflict arises +between two non-identical `Metric` instances having the same `name`: + +1. If the potential conflict involves multiple `description` + properties, setting the `description` through a configured View + SHOULD avoid the warning. +2. 
If the potential conflict involves instruments that can be + distinguished by a supported View selector (e.g., instrument type), + a View recipe SHOULD be printed advising the user how to avoid the + warning by renaming one of the conflicting instruments. +3. Otherwise (e.g., use of multiple units), the implementation SHOULD + pass through the data by reporting both `Metric` objects. + ## Attribute limits Attributes which belong to Metrics are exempt from the