Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial cut at migrating jmacd's datamodel document into the spec #1512

Merged
merged 19 commits into from
Mar 18, 2021
Merged
Show file tree
Hide file tree
Changes from 17 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,8 @@ release.

### Metrics

- Adds new metric data model specification ([#1512](https://github.com/open-telemetry/opentelemetry-specification/pull/1512))

### Logs

### Semantic Conventions
Expand Down
249 changes: 249 additions & 0 deletions specification/metrics/datamodel.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,249 @@
# Metrics Data Model

**Status**: [Experimental](../document-status.md)

<!-- Re-generate TOC with `markdown-toc --no-first-h1 -i` -->

<!-- toc -->

<!-- tocstop -->

## Overview

The OpenTelemetry data model for metrics consists of a protocol specification
and semantic conventions for delivery of pre-aggregated metric timeseries data.
The data model is designed for importing data from existing systems and
exporting data into existing systems, as well as to support internal
OpenTelemetry use-cases for generating Metrics from streams of Spans or Logs.

Popular existing metrics data formats can be unambiguously translated into the
OpenTelemetry data model for metrics, without loss of semantics or fidelity.
Translation from the Prometheus and Statsd exposition formats is explicitly
specified.

The data model specifies a number of semantics-preserving data transformations
for use on the collection path, supporting flexible system configuration. The
model supports reliability and statelessness controls, through the choice of
cumulative and delta transport. The model supports cost controls, through
jmacd marked this conversation as resolved.
Show resolved Hide resolved
spatial and temporal reaggregation.

The OpenTelemetry collector is designed to accept metrics data in a number of
formats, transport data using the OpenTelemetry data model, and then export into
existing systems. The data model can be unambiguously translated into the
Prometheus Remote Write protocol without loss of features or semantics, through
well-defined translations of the data, including the ability to automatically
remove attributes and lower histogram resolution.

## Events → Data → Timeseries

The OTLP Metrics protocol is designed as a standard for transporting metric
data. To describe the intended use of this data and the associated semantic
meaning, OpenTelemetry metric data types will be linked into a framework
containing a higher-level model, about Metrics APIs and discrete input values,
and a lower-level model, defining the Timeseries and discrete output values.
The relationship between models is displayed in the diagram below.

![Events → Data → Timeseries Diagram](img/model-layers.png)

This protocol was designed to meet the requirements of the OpenCensus Metrics
system, particularly to meet its concept of Metrics Views. Views are
accomplished in the OpenTelemetry Metrics data model through support for data
transformation on the collection path.

OpenTelemetry has identified three kinds of semantics-preserving Metric data
transformation that are useful in building metrics collection systems as ways of
controlling cost, reliability, and resource allocation. The OpenTelemetry
Metrics data model is designed to support these transformations both inside an
SDK as the data originates, or as a reprocessing stage inside the OpenTelemetry
collector. These transformations are:

1. Temporal reaggregation: Metrics that are collected at a high-frequency can be
re-aggregated into longer intervals, allowing low-resolution timeseries to be
pre-calculated or used in place of the original metric data.
2. Spatial reaggregation: Metrics that are produced with unwanted dimensions can
be re-aggregated into metrics having fewer dimensions.
3. Delta-to-Cumulative: Metrics that are input and output with Delta temporality
unburden the client from keeping high-cardinality state. The use of deltas
allows downstream services to bear the cost of conversion into cumulative
timeseries, or to forego the cost and calculate rates directly.

OpenTelemetry Metrics data points are designed so that these transformations can
be applied automatically to points of the same type, subject to conditions
outlined below. Every OTLP data point has an intrinsic
[decomposable aggregate function](https://en.wikipedia.org/wiki/Aggregate_function#Decomposable_aggregate_functions)
making it semantically well-defined to merge data points across both temporal
and spatial dimensions. Every OTLP data point also has two meaningful timestamps
which, combined with intrinsic aggregation, make it possible to carry out the
standard metric data transformations for each of the model’s basic points while
ensuring that the result carries the intended meaning.

As in OpenCensus Metrics, metrics data can be transformed into one or more
Views, just by selecting the aggregation interval and the desired dimensions.
One stream of OTLP data can be transformed into multiple timeseries outputs by
configuring different Views, and the required Views processing may be applied
inside the SDK or by an external collector.

### Example Use-cases
jmacd marked this conversation as resolved.
Show resolved Hide resolved

The metric data model is designed around a series of "core" use cases. While
this list is not exhaustive, it is meant to be representative of the scope and
breadth of OTel metrics usage.

1. OTel SDK exports 10 second resolution to a single OTel collector, using
cumulative temporality for a stateful client, stateless server:
- Collector passes-through original data to an OTLP destination
- Collector re-aggregates into longer intervals without changing dimensions
- Collector re-aggregates into several distinct views, each with a subset of
the available dimensions, outputs to the same destination
2. OTel SDK exports 10 second resolution to a single OTel collector, using delta
jsuereth marked this conversation as resolved.
Show resolved Hide resolved
temporality for a stateless client, stateful server:
- Collector re-aggregates into 60 second resolution
- Collector converts delta to cumulative temporality
3. OTel SDK exports both 10 seconds resolution (e.g. CPU, request latency) and
15 minutes resolution (e.g. room temperature) to a single OTel Collector.
The collector exports streams upstream with or without aggregation.
4. A number of OTel SDKs running locally each exports 10 second resolution, each
reports to a single (local) OTel collector.
- Collector re-aggregates into 60 second resolution
- Collector re-aggregates to eliminate the identity of individual SDKs (e.g.,
distinct `service.instance.id` values)
- Collector outputs to an OTLP destination
5. Pool of OTel collectors receive OTLP and export Prometheus Remote Write
- Collector joins service discovery with metric resources
- Collector computes “up”, staleness marker
- Collector applies a distinct external label
6. OTel collector receives Statsd and exports OTLP
- With delta temporality: stateless collector
- With cumulative temporality: stateful collector
7. OTel SDK exports directly to 3P backend
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Is "3P" defined somewhere?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would replace with the explicit text :)


These are considered the "core" use-cases used to analyze tradeoffs and design
decisions within the metrics data model.

jsuereth marked this conversation as resolved.
Show resolved Hide resolved
### Out of Scope Use-cases

The metrics data model is NOT designed to be a perfect rosetta stone of metrics.
Here are a set of use cases that, while won't be outright unsupported, are not
in scope for key design decisions:

- Using OTLP as an intermediary format between two non-compatible formats
- Importing [statsd](https://github.com/statsd/statsd) => Prometheus PRW
- Importing [collectd](https://collectd.org/wiki/index.php/Binary_protocol#:~:text=The%20binary%20protocol%20is%20the,some%20documentation%20to%20reimplement%20it)
=> Prometheus PRW
- Importing Prometheus endpoint scrape => [statsd push | collectd | opencensus]
- Importing OpenCensus "oca" => any non OC or OTel format
- TODO: define others.
Comment on lines +129 to +135
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I fail a bit to understand this:

It is in scope to have Prometheus -> OTLP -> PRW and it is in scope to have StatsD -> OTLP then why is not in scope to have StatsD -> OTLP -> PRW? mathematically if the translation function OTLP -> PRW is defined and StatsD -> OTLP is defined then I can compose them to get StatsD -> OTLP -> PRW

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think StatsD -> OTLP -> PRW is in-scope. We haven't added support for raw sampled metric events, which is how they appear to the StatsD receiver, so somewhere in that picture is a stateful conversion to OTLP(cumulative) which is not a difficult task, just requires memory.

I believe there is an additional concept that belongs in the specification, and I'm sorry to introduce it here, that I think of as a counterpart to the Single-Writer definition we have discussed. It can be called a Single-Reader property, and it is required to perform a correct delta-to-cumulative translation.

Single-Reader: Having access to the whole stream of data for a metric. This happens when an OTel collector is configured as a per-node agent and every process on the node reports through that agent. This happens when an OTel collector pool runs the Kafka exporter and Kafka receiver with appropriate configuration.

StatsD -> OTLP -> PRW should be in-scope for a single-node deployment, and it ought to behave just as https://github.com/prometheus/statsd_exporter would.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bogdandrutu There's a lot of nuance to Statsd => PRW. While TECHNICALLY you can shove bits into PRW, when I say "compatibility" i mean the way we defined it for Prometheus: "A user of prometheus shouldn't be able to tell if OTLP was used to transport prometheues style metric to their backend". Similarly, for users of Statsd today, we should HONOR the statsd conventions and formats from statsd => PRW. The reason Statsd => PRW is out of scope is because of all the non-data-related nuance (e.g. naming conventions, namespace options etc. that a statsd user would expect).

To be frank: I don't think it's OTLP's job to define how Statsd maps into PRW, and we shouldn't sacrifice the design of OTLP for that use case. We can attempt to provide the best translation possible, but it should not "block" or otherwise inhibit progress. There's a difference between "out of scope" and "can't do". What I'm suggesting is we won't prioritize or make a ton of effort to solve statsd => PRW bugs that are inherent to the impedance mismatch of those two models.

@jmacd I'm actually tying your "Single-Reader" into the "SingleWriter rule. Specifically the way I'm phrasing Single-WRiter in the not-yet-public PR is that any given timeseries should originate from a --SingleWriter. This includes aggregation time series, meaning if the collector is manipulating a metrics, it becomes the new single-writer of that aggregation. It is also required for delta-to-cumulative manipulation (which in my opinion, that's a new series of data). To some extent I think this nuance is just terminology and we need to make sure we COVER the instance of when it's OK to do "Delta-Cumulative" and what that implies on your telemetry-pipeline architecture.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯

It is also required for delta-to-cumulative manipulation (which in my opinion, that's a new series of data).

Yes, that's how I've been thinking about it too. It's permitted to output the same metric name, in this case, because the incoming data is "terminal".


## Model Details

OpenTelemetry fragments metrics into three interacting models:

- An Event model, representing how instrumentation reports metric data.
- A TimeSeries model, representing how backends store metric data.
- The *O*pen*T*e*L*emetry *P*rotocol (OTLP) data model representing how metrics
are manipulated and transmitted between the Event model and the TimeSeries
storage.

### Event Model

This specification uses as its foundation a
[Metrics API consisting of 6 model instruments](api.md), each having distinct
semantics, that were prototyped in several OpenTelemetry SDKs between July 2019
and June 2020. The model instruments and their specific use-cases are meant to
anchor our understanding of the OpenTelemetry data model and are divided into
three categories:

- Synchronous vs. Asynchronous. The act of calling a Metrics API in a
synchronous context means the application/library calls the SDK, typically having
associated trace context and baggage; an Asynchronous instrument is called at
collection time, through a callback, and lacks context.
- Adding vs. Grouping. Whereas adding instruments express a sum, grouping
instruments characterize a group of measurements. The numbers passed to adding
instruments define division, in the algebraic sense, while the numbers passed
to grouping instruments are generally not. Adding instrument values are always
parts of a sum, while grouping instrument values are individual measurements.
- Monotonic vs. Non-Monotonic. The adding instruments are categorized by whether
the derivative of the quantity they express is non-negative. Monotonic
instruments are primarily useful for monitoring a rate value, whereas
non-monotonic instruments are primarily useful for monitoring a total value.

In the Event model, the primary data are (instrument, number) points, originally
observed in real time or on demand (for the synchronous and asynchronous cases,
respectively). The instruments and model use-cases will be described in greater
detail as we link the event model with the other two.

### Timeseries Model

In this low-level metrics data model, a Timeseries is defined by an entity
consisting of several metadata properties:

- Metric name and description
- Label set
jsuereth marked this conversation as resolved.
Show resolved Hide resolved
- Kind of point (integer, floating point, etc)
- Unit of measurement

The primary data of each timeseries are ordered (timestamp, value) points, for
jmacd marked this conversation as resolved.
Show resolved Hide resolved
three value types:

1. Counter (Monotonic, cumulative)
2. Gauge
3. Histogram

This model may be viewed as an idealization of
[Prometheus Remote Write](https://docs.google.com/document/d/1LPhVRSFkGNSuU1fBd81ulhsCPR4hkSZyyBj1SZ8fWOM/edit#heading=h.3p42p5s8n0ui).
Like that protocol, we are additionally concerned with knowing when a point
value is defined, as compared with being implicitly or explicitly absent. A
metric stream of delta data points defines time-interval values, not
point-in-time values. To precisely define presence and absence of data requires
further development of the correspondence between these models.

### OpenTelemetry Protocol data model

The OpenTelemetry data model for metrics includes four basic point kinds, all of
which satisfy the requirements above, meaning they define a decomposable
aggregate function (also known as a “natural merge” function) for points of the
same kind. <sup>[1](#otlpdatapointfn)</sup>

The basic point kinds are:

1. Monotonic Sum
2. Non-Monotonic Sum
3. Gauge
4. Histogram

Comparing the OpenTelemetry and Timeseries data models, OTLP carries an
additional kind of point. Whereas an OTLP Monotonic Sum point translates into a
Timeseries Counter point, and an OTLP Histogram point translates into a
Timeseries Histogram point, there are two OTLP data points that become Gauges
in the Timeseries model: the OTLP Non-Monotonic Sum point and OTLP Gauge point.

The two points that become Gauges in the Timeseries model are distinguished by
their built in aggregate function, meaning they define re-aggregation
differently. Sum points combine using addition, while Gauge points combine into
histograms.
jmacd marked this conversation as resolved.
Show resolved Hide resolved

## Single-Writer

Pending

## Temporarily

Pending

## Resources

Pending

## Temporal Alignment

Pending

## External Labels

Pending

## Footnotes

<a name="otlpdatapointfn">[1]</a>: OTLP supports data point kinds that do not
satisfy these conditions; they are well-defined but do not support standard
metric data transformations.
Binary file added specification/metrics/img/model-layers.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
10 changes: 8 additions & 2 deletions specification/overview.md
Original file line number Diff line number Diff line change
Expand Up @@ -264,9 +264,12 @@ supports both - push and pull model of setting the `Metric` value.

### Metrics data model and SDK

Metrics data model is defined in SDK and is based on
Metrics data model is [specified here](metrics/datamodel.md) and is based on
[metrics.proto](https://github.com/open-telemetry/opentelemetry-proto/blob/master/opentelemetry/proto/metrics/v1/metrics.proto).
This data model is used by all the OpenTelemetry exporters as an input.
This data model defines three semantics: An Event model used by the API, an
in-flight data model used by the SDK and OTLP, and a TimeSeries model which
denotes how exporters should interpret the in-flight model.

Different exporters have different capabilities (e.g. which data types are
supported) and different constraints (e.g. which characters are allowed in label
keys). Metrics is intended to be a superset of what's possible, not a lowest
Expand All @@ -279,6 +282,9 @@ validation and sanitization of the Metrics data. Instead, pass the data to the
backend, rely on the backend to perform validation, and pass back any errors
from the backend.

See [Metrics Data Model Specification](metrics/datamodel.md) for more
information.

## Log Signal

### Data model
Expand Down