Revise "Reference: Glossary" #1181

Merged
merged 20 commits into from
Feb 18, 2022
98 changes: 0 additions & 98 deletions docs/sources/glossary.md

This file was deleted.

104 changes: 104 additions & 0 deletions docs/sources/reference-glossary.md
@@ -0,0 +1,104 @@
---
title: "Reference: Glossary"
description: ""
weight: 10000
---

# Reference: Glossary

## Blocks storage

Blocks storage is the Mimir storage engine based on the Prometheus TSDB.
Grafana Mimir stores blocks in object stores such as AWS S3, Google Cloud Storage (GCS), Azure blob storage, or OpenStack Object Storage (Swift).
For the full list of supported backends and more information, refer to [Blocks storage]({{<relref "./blocks-storage/_index.md" >}}).

## Chunk

A chunk is an object containing encoded timestamp-value pairs for one series.

## Churn

Churn is the frequency at which series become idle.

A series become idle once it's no longer exported by the monitored targets.

Contributor

Suggested change
A series become idle once it's no longer exported by the monitored targets.
A series becomes idle once it's no longer exported by the monitored targets.

Contributor

@replay replay Feb 17, 2022

or

Suggested change
A series become idle once it's no longer exported by the monitored targets.
Series become idle once they're no longer exported by the monitored targets.

Typically, series become idle when a monitored target process or node gets terminated.
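
As a rough illustration, the series that became idle between two scrapes are those present in the earlier scrape and absent from the later one. The following Python sketch uses hypothetical names (`idle_series` and the two scrape sets) and made-up series; it is not how Grafana Mimir measures churn internally.

```python
def idle_series(previous_scrape, current_scrape):
    """Return the series seen in the previous scrape but missing from the current one."""
    return set(previous_scrape) - set(current_scrape)


previous = {
    'up{instance="10.0.0.1"}',
    'up{instance="10.0.0.2"}',
}
# The target 10.0.0.2 was terminated and replaced by 10.0.0.3.
current = {
    'up{instance="10.0.0.1"}',
    'up{instance="10.0.0.3"}',
}

became_idle = idle_series(previous, current)
```

Counting how many series become idle per unit of time gives a simple measure of churn under these assumptions.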

## Flushing

Flushing is the operation run by ingesters to offload time series from memory and store them in the long-term storage.

Contributor

I think one could argue that "flushing" and "shipping" are two separate operations in the ingester, considering that they get triggered by separate tickers and a failure of shipping does not result in a failure of flushing.

Member Author

Tried a clarification in 56efa6f, what do you think?

Collaborator

I can see you reverted your clarification in 1a11857. Was that intentional?

Member Author

I reverted that because of #1181 (comment) but perhaps I misunderstood?

Collaborator

Right. I think we never came up with a good definition of "flushing". I've always understood it as "from ingesters to long-term storage" because "flush" makes me think we "remove" it from ingesters. I have no strong opinion on this.


## Gossip

Collaborator

I would also add a `## Memberlist` entry that refers to `## Gossip`.


Gossip is a protocol by which components coordinate with the need for a centralized [key-value store]({{<relref "#key-value-store" >}}).

Collaborator

What do you mean by "with the need for a centralized key-value store"? Maybe it's just my poor English, but I have the feeling it doesn't make clear that when using gossip you don't have to run any centralized key-value store, because Mimir components just use this p2p protocol to talk to each other and coordinate.

Contributor

I agree that this wording is a bit confusing.

Member Author

Sorry yeah that was a typo. It was meant to be "without" 🤦

Perhaps that still isn't the best wording. Happy to replace with something else.

Collaborator

I think fixing the typo could be a good start. At least it makes it correct ;)


## HA tracker

The HA tracker is a feature of the Grafana Mimir distributor.
It deduplicates time series received from two or more Prometheus servers that are configured to scrape the same targets.
To configure HA tracking, refer to [Configure HA deduplication]({{<relref "./operating-grafana-mimir/configure-ha-deduplication.md" >}}).
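
The deduplication idea can be sketched as electing one replica per cluster and dropping samples from the others until the elected replica has been silent for too long. The Python sketch below is illustrative only: the class name, the `cluster` and `replica` arguments, and the 30-second failover timeout are assumptions made for this example, not the distributor's actual implementation or defaults.

```python
class HATrackerSketch:
    """Accept samples from one elected replica per cluster and drop the rest."""

    def __init__(self, failover_timeout_seconds=30):
        self.failover_timeout_seconds = failover_timeout_seconds
        # cluster name -> (elected replica, time the elected replica was last seen)
        self.elected = {}

    def accept(self, cluster, replica, now):
        """Return True if a sample from this replica should be kept."""
        current = self.elected.get(cluster)
        if current is None:
            # First replica seen for this cluster becomes the elected one.
            self.elected[cluster] = (replica, now)
            return True
        elected_replica, last_seen = current
        if replica == elected_replica:
            self.elected[cluster] = (replica, now)
            return True
        if now - last_seen > self.failover_timeout_seconds:
            # The elected replica has been silent for too long: fail over.
            self.elected[cluster] = (replica, now)
            return True
        # A non-elected replica while the elected one is healthy: drop.
        return False


tracker = HATrackerSketch(failover_timeout_seconds=30)
decisions = [
    tracker.accept("prod", "prometheus-a", now=0),
    tracker.accept("prod", "prometheus-b", now=5),
    tracker.accept("prod", "prometheus-a", now=15),
    tracker.accept("prod", "prometheus-b", now=50),
]
```

In this run, `prometheus-b` is dropped while `prometheus-a` is healthy and is accepted only after `prometheus-a` has been silent for longer than the timeout.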

## Hash ring

The hash ring is a distributed data structure used by Grafana Mimir for sharding, replication, and service discovery.

Contributor

@replay replay Feb 17, 2022

How is it distributed? I thought (except in the case of memberlist) it's pretty centralized in the k/v store.

Collaborator

We usually call it a distributed data structure because, although we use a centralized KV store to "share" it, the ring is kept in each process's memory, and each process can make local decisions (based on its in-memory copy of the ring) without having to look up the KV store each time.

Components use a [key-value store]({{<relref "#key-value-store" >}}) or [gossip]({{<relref "#gossip" >}}) to share the hash ring data structure.
For more information, refer to [About the hash ring]({{<relref "./architecture/about-the-hash-ring.md" >}}).
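
To make the sharding and replication roles concrete, the following Python sketch shows a generic consistent-hashing ring: each instance registers a few tokens, and a key is assigned to the first distinct instances found when walking the ring clockwise from the key's token. The hashing function, token count, and instance names are assumptions chosen for this example and do not reflect the actual Grafana Mimir ring implementation.

```python
import bisect
import hashlib


def token_for(key):
    """Map a string to a 32-bit token (hashing choice is an assumption of this sketch)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class HashRingSketch:
    def __init__(self, instances, tokens_per_instance=4):
        pairs = []
        for instance in instances:
            for i in range(tokens_per_instance):
                pairs.append((token_for(f"{instance}-{i}"), instance))
        pairs.sort()
        self.tokens = [token for token, _ in pairs]
        self.owners = [owner for _, owner in pairs]

    def lookup(self, key, replication_factor=1):
        """Walk the ring clockwise from the key's token, collecting distinct instances."""
        start = bisect.bisect_left(self.tokens, token_for(key))
        found = []
        for offset in range(len(self.tokens)):
            owner = self.owners[(start + offset) % len(self.tokens)]
            if owner not in found:
                found.append(owner)
            if len(found) == replication_factor:
                break
        return found


instances = ["ingester-1", "ingester-2", "ingester-3"]
series = 'node_cpu_seconds_total{instance="10.0.0.1",mode="system"}'

ring = HashRingSketch(instances)
replicas = ring.lookup(series, replication_factor=2)

# A second process holding its own copy of the same ring reaches the same decision.
other_process_replicas = HashRingSketch(instances).lookup(series, replication_factor=2)
```

Because the lookup depends only on the ring contents, any process holding the same copy of the ring computes the same owners without contacting a central service for each key.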

## Key-value store

A key-value store is a database that associates keys with values.
To understand how Grafana Mimir uses key-value stores, refer to [About the key-value store]({{<relref "./architecture/about-the-key-value-store.md" >}}).

## Org

Refer to [Tenant]({{<relref "#tenant" >}}).

## Ring

Refer to [Hash ring]({{<relref "#hash-ring" >}}).

## Sample

A sample is a single timestamped value in a time series.

Given the series `node_cpu_seconds_total{instance="10.0.0.1",mode="system"}`, its stream of samples may look like:

```
# Display format: <value> @<timestamp>
11775 @1603812134
11790 @1603812149
11805 @1603812164
11819 @1603812179
11834 @1603812194
```
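
As a small illustration, the display format above can be parsed into `(timestamp, value)` pairs, from which the average per-second increase over the window follows directly. The function name `parse_samples` is hypothetical, and the calculation is plain arithmetic rather than the PromQL `rate()` algorithm.

```python
def parse_samples(text):
    """Parse lines of the form '<value> @<timestamp>' into (timestamp, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the format comment
        value, timestamp = line.split(" @")
        samples.append((int(timestamp), float(value)))
    return samples


stream = """
# Display format: <value> @<timestamp>
11775 @1603812134
11790 @1603812149
11805 @1603812164
11819 @1603812179
11834 @1603812194
"""

samples = parse_samples(stream)
first_ts, first_value = samples[0]
last_ts, last_value = samples[-1]
# Average per-second increase across the whole window (59 over 60 seconds).
per_second_increase = (last_value - first_value) / (last_ts - first_ts)
```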

## Series

A series is a single stream of [samples]({{<relref "#sample" >}}) belonging to the same metric, with the same set of label key-value pairs.

Given a single metric `node_cpu_seconds_total`, you may have multiple series, each one uniquely identified by the combination of metric name and label key-value pairs:

```
node_cpu_seconds_total{instance="10.0.0.1",mode="system"}
node_cpu_seconds_total{instance="10.0.0.1",mode="user"}
node_cpu_seconds_total{instance="10.0.0.2",mode="system"}
node_cpu_seconds_total{instance="10.0.0.2",mode="user"}
```
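
One way to picture series identity is as a key built from the metric name and the sorted label pairs, so that label order does not matter. The helper `series_key` and the sample values below are hypothetical and only serve to illustrate the definition.

```python
def series_key(metric_name, labels):
    """Build a hashable identity from the metric name and sorted label pairs."""
    return (metric_name, tuple(sorted(labels.items())))


a = series_key("node_cpu_seconds_total", {"instance": "10.0.0.1", "mode": "system"})
# Same labels in a different order: still the same series.
b = series_key("node_cpu_seconds_total", {"mode": "system", "instance": "10.0.0.1"})
# A different label value: a different series of the same metric.
c = series_key("node_cpu_seconds_total", {"instance": "10.0.0.1", "mode": "user"})

# Group samples by series identity (sample values are illustrative).
samples_by_series = {}
samples_by_series.setdefault(a, []).append((1603812134, 11775.0))
samples_by_series.setdefault(b, []).append((1603812149, 11790.0))
samples_by_series.setdefault(c, []).append((1603812134, 4200.0))
```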

## Tenant

A tenant is the owner of a set of series written to and queried from Grafana Mimir.
Grafana Mimir isolates series and alerts belonging to different tenants.
To understand how Grafana Mimir authenticates tenants, refer to [About authentication and authorization]({{<relref "./about-authentication-and-authorization.md" >}}).

## Time series

Refer to [Series]({{<relref "#series" >}}).

## User

Refer to [Tenant]({{<relref "#tenant" >}}).

## Write-ahead log (WAL)

The write-ahead log (WAL) is an append-only log stored on disk by ingesters to recover their in-memory state after the process gets restarted.
For more information, refer to [The write path]({{<relref "./architecture/_index.md#the-write-path" >}}).
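
The recovery idea can be sketched in a few lines of Python: every write is appended to a file before the in-memory state is updated, and a restarted process rebuilds its state by replaying the file. The class name `WalSketch` and the JSON-lines record format are assumptions made for readability; the ingester's on-disk WAL format is different.

```python
import json
import os
import tempfile


class WalSketch:
    def __init__(self, path):
        self.path = path
        self.state = {}  # series -> list of (timestamp, value)
        self._replay()
        self._file = open(self.path, "a", encoding="utf-8")

    def _replay(self):
        """Rebuild the in-memory state from records already on disk."""
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                self._apply(json.loads(line))

    def _apply(self, record):
        self.state.setdefault(record["series"], []).append(
            (record["ts"], record["value"])
        )

    def append(self, series, ts, value):
        record = {"series": series, "ts": ts, "value": value}
        # Persist the record first, then update the in-memory state.
        self._file.write(json.dumps(record) + "\n")
        self._file.flush()
        os.fsync(self._file.fileno())
        self._apply(record)

    def close(self):
        self._file.close()


wal_path = os.path.join(tempfile.mkdtemp(), "wal.log")

wal = WalSketch(wal_path)
wal.append("up", 1603812134, 1.0)
wal.append("up", 1603812149, 1.0)
wal.close()  # simulate the process stopping

recovered = WalSketch(wal_path)  # simulate the process restarting
recovered_state = recovered.state
recovered.close()
```

Writing to disk before touching memory is the property that makes the replay sufficient: anything acknowledged to the writer is already in the log.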