Documented ingester migration to spread-minimizing tokens (#7174)
* Documented ingester migration to spread-minimizing tokens

Signed-off-by: Yuri Nikolic <durica.nikolic@grafana.com>

* Review and edits from KN

* Fixing review findings

Signed-off-by: Yuri Nikolic <durica.nikolic@grafana.com>

---------

Signed-off-by: Yuri Nikolic <durica.nikolic@grafana.com>
Co-authored-by: Kim Nylander <kim.nylander@grafana.com>
duricanikolic and knylander-grafana committed Jan 22, 2024
1 parent b1f3137 commit 911fd26
Showing 3 changed files with 108 additions and 0 deletions.
@@ -0,0 +1,108 @@
---
aliases:
- ../operators-guide/configure/configure-spread-minimizing-tokens/
description: Learn how to migrate ingesters to spread-minimizing tokens.
weight: 115
menuTitle: Spread-minimizing tokens
title: Migrate ingesters to spread-minimizing tokens
---

# Migrate ingesters to spread-minimizing tokens

Using this guide, you can configure Mimir's ingesters to use the _spread-minimizing token generation strategy_.

## Before you begin

Ensure that ingester time series replication is [configured with zone-awareness enabled](https://grafana.com/docs/mimir/latest/configure/configure-zone-aware-replication/#configuring-ingester-time-series-replication).

{{% admonition type="note" %}}Spread-minimizing tokens are recommended if [shuffle sharding](https://grafana.com/docs/mimir/latest/configure/configure-shuffle-sharding/#ingesters-shuffle-sharding) is disabled in your ingesters, or if shuffle sharding is enabled but most of the tenants of your system use all available ingesters.
{{% /admonition %}}

For simplicity, let’s assume that there are three configured availability zones named `zone-a`, `zone-b`, and `zone-c`. Migration to the _spread-minimizing token generation strategy_ is a complex process performed zone by zone to prevent any data loss.

## Step 1: Disable write requests to ingesters from `zone-a`

To disable write requests, configure the following flag on the distributor and the ruler:

```
-ingester.ring.excluded-zones=zone-a
```

Before proceeding to the next step, use the following query to ensure that there are no write requests to the ingesters from `zone-a`:

```
sum by(route) (
  rate(
    cortex_request_duration_seconds_count{
      namespace="<your-namespace>",
      container="ingester",
      pod=~"ingester-zone-a-.*",
      route="/cortex.Ingester/Push"
    }[5m]
  )
)
```

You should see something like this:

![No More Write Requests](no-more-write-requests.png)
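
If you prefer to run this check from a script, the following is a minimal sketch that builds the same query and inspects an instant-query response in the standard Prometheus HTTP API format. The helper names and the `base_url` value are hypothetical. The exact query path depends on your deployment, so treat the `/api/v1/query` suffix as an assumption.

```python
import json
import urllib.parse
import urllib.request


def build_push_rate_query(namespace: str, zone: str) -> str:
    """Return the Push-rate query from this step for the given zone."""
    return (
        "sum by(route) (rate(cortex_request_duration_seconds_count{"
        f'namespace="{namespace}",container="ingester",'
        f'pod=~"ingester-{zone}-.*",'
        'route="/cortex.Ingester/Push"}[5m]))'
    )


def writes_have_stopped(response: dict) -> bool:
    """Return True if an instant-query response reports no Push traffic.

    An empty result (no matching series) also counts as "no traffic".
    """
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response}")
    samples = response["data"]["result"]
    # Each sample looks like {"metric": {...}, "value": [timestamp, "<number>"]}.
    return all(float(sample["value"][1]) == 0.0 for sample in samples)


def query_prometheus(base_url: str, query: str) -> dict:
    """Run an instant query. base_url is an assumption about your setup."""
    url = f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": query})
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

For example, `writes_have_stopped(query_prometheus(base_url, build_push_rate_query("<your-namespace>", "zone-a")))` should return `True` before you continue with the next step.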

## Step 2: Shut down ingesters from `zone-a`

Next, ensure that all in-memory series of all ingesters from `zone-a` have been flushed to long-term storage, and that the ingesters from `zone-a` have been forgotten from the ring.
To do this, invoke the [/ingester/shutdown](https://github.com/grafana/mimir/blob/main/docs/sources/mimir/references/http-api/index.md#shutdown) endpoint on all the ingesters from `zone-a`.

Before proceeding to the next step, ensure that all the calls completed successfully.
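
As an illustration only, the calls can be scripted along these lines. The host names are hypothetical, port `8080` is an assumption about the ingesters' HTTP listen port, and treating any 2xx status as success is a simplification of this sketch.

```python
import urllib.error
import urllib.request


def shutdown_url(host: str, port: int = 8080) -> str:
    """Build the shutdown URL. The port is an assumption about your setup."""
    return f"http://{host}:{port}/ingester/shutdown"


def post_shutdown(url: str) -> int:
    """POST to the shutdown endpoint and return the HTTP status code.

    Connection errors are not caught here and propagate to the caller.
    """
    request = urllib.request.Request(url, method="POST")
    try:
        with urllib.request.urlopen(request, timeout=600) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def shutdown_zone(hosts, call=post_shutdown) -> list:
    """Call shutdown on each host and return the hosts that did not answer 2xx."""
    failed = []
    for host in hosts:
        status = call(shutdown_url(host))
        if not 200 <= status < 300:
            failed.append(host)
    return failed
```

An empty list returned by `shutdown_zone` means every call completed successfully. The `call` parameter exists so that the HTTP request can be replaced with a stub in a dry run.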

## Step 3: Enable spread-minimizing tokens for ingesters in `zone-a`

Configure the following flags on the ingesters from `zone-a`:

```
-ingester.ring.tokens-file-path=
-ingester.ring.token-generation-strategy=spread-minimizing
-ingester.ring.spread-minimizing-zones=zone-a,zone-b,zone-c
```

{{% admonition type="note" %}}
The example uses `zone-a,zone-b,zone-c` to denote a comma-separated list of configured availability zones.
{{% /admonition %}}

Before proceeding to the next step, ensure that all the ingester pods related to `zone-a` are up and running with the new configuration.

### Optional step: In-order registration of ingesters

Mimir can force the ring to perform an in-order registration of ingesters. When this feature is enabled, an ingester can register its tokens within the ring only after all previous ingesters (those with an ID lower than its own) have been registered.
This feature reduces the probability that a write request that should be handled by an ingester reaches the ring before that ingester has registered. In that case, another ingester handles the request,
which can cause the load distribution to deviate from the optimum.

To enable this feature, set the following flag:

```
-ingester.ring.spread-minimizing-join-ring-in-order=true
```
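
The ordering rule described above can be illustrated with a toy model. This is not Mimir's implementation: it assumes that ingester IDs end in a numeric index starting at 0 (for example `ingester-zone-a-3`), and the function names are hypothetical.

```python
def ingester_index(ingester_id: str) -> int:
    """Extract the trailing numeric index, e.g. 'ingester-zone-a-3' gives 3."""
    return int(ingester_id.rsplit("-", 1)[1])


def can_register(ingester_id: str, registered: set) -> bool:
    """Return True if every lower-indexed ingester with the same prefix is registered."""
    prefix = ingester_id.rsplit("-", 1)[0]
    return all(
        f"{prefix}-{i}" in registered
        for i in range(ingester_index(ingester_id))
    )
```

In this model, `ingester-zone-a-2` has to wait until both `ingester-zone-a-0` and `ingester-zone-a-1` are in the `registered` set, while `ingester-zone-a-0` can always register immediately.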

## Step 4: Re-enable write requests to ingesters from `zone-a`

To re-enable write requests, revert the configuration change made in [Step 1: Disable write requests to ingesters from `zone-a`](#step-1-disable-write-requests-to-ingesters-from-zone-a).

At this point, you can check the number of in-memory time series of ingesters from `zone-a` using the following query:

```
sum by(pod) (
  cortex_ingester_memory_series{
    namespace="<your-namespace>",
    pod=~"ingester-zone-a-.*"
  }
)
```

If everything went smoothly, you should see something like this:

![Successful migration of ingesters from a zone](migration-of-a-zone.png)
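
To put a number on how evenly the series are distributed, you could post-process the query result. The sketch below assumes a response in the standard Prometheus instant-query format. The "relative spread" it computes is an ad hoc measure chosen for this example, not a metric defined by Mimir.

```python
def series_per_pod(response: dict) -> dict:
    """Map each pod label to its in-memory series count."""
    return {
        sample["metric"]["pod"]: float(sample["value"][1])
        for sample in response["data"]["result"]
    }


def relative_spread(counts: dict) -> float:
    """Return (max - min) / max over all pods, or 0.0 if there is no data."""
    values = list(counts.values())
    if not values or max(values) == 0:
        return 0.0
    return (max(values) - min(values)) / max(values)
```

A value close to 0 indicates that the ingesters in the zone hold a similar number of in-memory series.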

## Step 5: Migrate ingesters from `zone-b`

Repeat steps 1 to 4, replacing all the occurrences of `zone-a` with `zone-b`.

## Step 6: Migrate ingesters from `zone-c`

Repeat steps 1 to 4, replacing all the occurrences of `zone-a` with `zone-c`.
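
The zone-by-zone repetition can be summarized as a checklist generator. This is only a sketch for planning purposes: the function name is hypothetical, it uses only the flags shown in this guide, and it does not apply any configuration itself.

```python
def migration_plan(zones: list) -> list:
    """Return the ordered list of actions for migrating each zone in turn."""
    all_zones = ",".join(zones)
    plan = []
    for zone in zones:
        plan += [
            f"[{zone}] distributor and ruler: set -ingester.ring.excluded-zones={zone}",
            f"[{zone}] ingesters: call /ingester/shutdown on every ingester",
            f"[{zone}] ingesters: set -ingester.ring.tokens-file-path= "
            "-ingester.ring.token-generation-strategy=spread-minimizing "
            f"-ingester.ring.spread-minimizing-zones={all_zones}",
            f"[{zone}] distributor and ruler: remove -ingester.ring.excluded-zones={zone}",
        ]
    return plan


if __name__ == "__main__":
    for action in migration_plan(["zone-a", "zone-b", "zone-c"]):
        print(action)
```

Each zone is fully migrated, including the verification queries from the corresponding steps, before the next zone starts.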