Add metrics backfill support to mimirtool #1822

Merged on Jul 20, 2022 (73 commits)


@aknuds1 aknuds1 commented May 5, 2022

What this PR does

Add support for metrics backfill to mimirtool.

TODOs:

  • Take individual blocks instead of a directory of blocks
  • Write integration tests that spin up Mimir and try to upload different blocks (Prometheus/Thanos/Mimir)

Which issue(s) this PR fixes or relates to

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@aknuds1 aknuds1 added the enhancement label May 5, 2022
@aknuds1 aknuds1 force-pushed the feat/backfill-mimirtool branch 2 times, most recently from 6a36db9 to eed40f0 Compare June 2, 2022 13:40
CLAassistant commented Jun 2, 2022

CLA assistant check
All committers have signed the CLA.

@aknuds1 aknuds1 force-pushed the feat/backfill-mimirtool branch 4 times, most recently from 55984f3 to dd1d8f4 Compare June 24, 2022 06:22
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
}

// Upload each block file
if err := filepath.WalkDir(dpath, func(pth string, e fs.DirEntry, err error) error {
Member

Can we gather the files to upload first, and then upload them concurrently to speed up the process?

}
for _, e := range es {
if err := c.backfillBlock(ctx, filepath.Join(source, e.Name()), logger); err != nil {
return err
Member

We will need to react to various errors differently, and keep going even if some blocks fail to upload.

We should give the user a summary at the end of how many blocks uploaded correctly, how many already existed, and how many failed to upload.

Final exit code should depend on this too. If all blocks uploaded correctly or already existed, we can report exit code 0 (all good!). If there were some non-recoverable upload errors, let's use exit code 1.

Contributor Author

@pstibrany do you know how to make the Kingpin framework exit with a certain code (e.g. 1)? Should I just implement this by returning an error if one or more blocks failed to upload?

Member

I don't know. I would try the same (returning an error) and if that fails, check the Kingpin framework documentation :)

@@ -104,20 +104,19 @@ func New(cfg Config) (*MimirClient, error) {

// Query executes a PromQL query against the Mimir cluster.
func (r *MimirClient) Query(ctx context.Context, query string) (*http.Response, error) {

query = fmt.Sprintf("query=%s&time=%d", query, time.Now().Unix())
escapedQuery := url.PathEscape(query)
Member

This escaping is wrong for a query string. We should use url.QueryEscape, and apply it to each query parameter individually. Escaping the whole query string is wrong.

Contributor Author

@aknuds1 Jun 28, 2022

@pstibrany thanks, fixed. It's an old problem though, the code wasn't introduced by my PR :) PTAL.

Member

You're right. Let's move that to a separate PR.

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
aknuds1 and others added 6 commits July 15, 2022 13:02
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
@pstibrany
Member

I've fixed logging (#1822 (comment)), so now mimirtool backfill provides this output:

INFO[0000] Backfilling                                   blocks="/opt/homebrew/var/prometheus/01G7RCEZSF6H4J9BD8JFYQP7R7,/opt/homebrew/var/prometheus/01G7TAB1MA64QV9DSBVRP4WFNE,/opt/homebrew/var/prometheus/01G7W908TRFGK9KV37RQSBTBB8,/opt/homebrew/var/prometheus/01G7Y5VFKEHXRWAFZ4GXTTVCPQ" user=anonymous
INFO[0000] making request to start block upload          block=01G7RCEZSF6H4J9BD8JFYQP7R7 file=meta.json path=/opt/homebrew/var/prometheus/01G7RCEZSF6H4J9BD8JFYQP7R7
INFO[0000] uploading block file                          block=01G7RCEZSF6H4J9BD8JFYQP7R7 file=index path=/opt/homebrew/var/prometheus/01G7RCEZSF6H4J9BD8JFYQP7R7 size=283551
INFO[0000] uploading block file                          block=01G7RCEZSF6H4J9BD8JFYQP7R7 file=chunks/000001 path=/opt/homebrew/var/prometheus/01G7RCEZSF6H4J9BD8JFYQP7R7 size=6294804
INFO[0000] block uploaded successfully                   block=01G7RCEZSF6H4J9BD8JFYQP7R7 path=/opt/homebrew/var/prometheus/01G7RCEZSF6H4J9BD8JFYQP7R7
INFO[0000] making request to start block upload          block=01G7TAB1MA64QV9DSBVRP4WFNE file=meta.json path=/opt/homebrew/var/prometheus/01G7TAB1MA64QV9DSBVRP4WFNE
INFO[0000] uploading block file                          block=01G7TAB1MA64QV9DSBVRP4WFNE file=index path=/opt/homebrew/var/prometheus/01G7TAB1MA64QV9DSBVRP4WFNE size=274240
INFO[0000] uploading block file                          block=01G7TAB1MA64QV9DSBVRP4WFNE file=chunks/000001 path=/opt/homebrew/var/prometheus/01G7TAB1MA64QV9DSBVRP4WFNE size=6452725
INFO[0001] block uploaded successfully                   block=01G7TAB1MA64QV9DSBVRP4WFNE path=/opt/homebrew/var/prometheus/01G7TAB1MA64QV9DSBVRP4WFNE
INFO[0001] making request to start block upload          block=01G7W908TRFGK9KV37RQSBTBB8 file=meta.json path=/opt/homebrew/var/prometheus/01G7W908TRFGK9KV37RQSBTBB8
INFO[0001] uploading block file                          block=01G7W908TRFGK9KV37RQSBTBB8 file=index path=/opt/homebrew/var/prometheus/01G7W908TRFGK9KV37RQSBTBB8 size=276571
INFO[0001] uploading block file                          block=01G7W908TRFGK9KV37RQSBTBB8 file=chunks/000001 path=/opt/homebrew/var/prometheus/01G7W908TRFGK9KV37RQSBTBB8 size=5507840
INFO[0002] block uploaded successfully                   block=01G7W908TRFGK9KV37RQSBTBB8 path=/opt/homebrew/var/prometheus/01G7W908TRFGK9KV37RQSBTBB8
INFO[0002] making request to start block upload          block=01G7Y5VFKEHXRWAFZ4GXTTVCPQ file=meta.json path=/opt/homebrew/var/prometheus/01G7Y5VFKEHXRWAFZ4GXTTVCPQ
INFO[0002] uploading block file                          block=01G7Y5VFKEHXRWAFZ4GXTTVCPQ file=index path=/opt/homebrew/var/prometheus/01G7Y5VFKEHXRWAFZ4GXTTVCPQ size=244971
INFO[0002] uploading block file                          block=01G7Y5VFKEHXRWAFZ4GXTTVCPQ file=chunks/000001 path=/opt/homebrew/var/prometheus/01G7Y5VFKEHXRWAFZ4GXTTVCPQ size=4917324
INFO[0002] block uploaded successfully                   block=01G7Y5VFKEHXRWAFZ4GXTTVCPQ path=/opt/homebrew/var/prometheus/01G7Y5VFKEHXRWAFZ4GXTTVCPQ
INFO[0002] finished uploading blocks                     already_exists=0 failed=0 succeeded=4

In case of errors:

INFO[0000] Backfilling                                   blocks="/opt/homebrew/var/prometheus/01G7RCEZSF6H4J9BD8JFYQP7R7,/opt/homebrew/var/prometheus/01G7TAB1MA64QV9DSBVRP4WFNE,/opt/homebrew/var/prometheus/01G7W908TRFGK9KV37RQSBTBB8,/opt/homebrew/var/prometheus/01G7Y5VFKEHXRWAFZ4GXTTVCPQ" user=anonymous
INFO[0000] making request to start block upload          block=01G7RCEZSF6H4J9BD8JFYQP7R7 file=meta.json path=/opt/homebrew/var/prometheus/01G7RCEZSF6H4J9BD8JFYQP7R7
ERRO[0000] server returned HTTP status 400 Bad Request: block upload is disabled  status="400 Bad Request"
ERRO[0000] failed uploading block                        error="request to start block upload failed: POST request to http://localhost:8006/api/v1/upload/block/01G7RCEZSF6H4J9BD8JFYQP7R7 failed: server returned HTTP status 400 Bad Request: block upload is disabled" path=/opt/homebrew/var/prometheus/01G7RCEZSF6H4J9BD8JFYQP7R7
INFO[0000] making request to start block upload          block=01G7TAB1MA64QV9DSBVRP4WFNE file=meta.json path=/opt/homebrew/var/prometheus/01G7TAB1MA64QV9DSBVRP4WFNE
ERRO[0000] server returned HTTP status 400 Bad Request: block upload is disabled  status="400 Bad Request"
ERRO[0000] failed uploading block                        error="request to start block upload failed: POST request to http://localhost:8006/api/v1/upload/block/01G7TAB1MA64QV9DSBVRP4WFNE failed: server returned HTTP status 400 Bad Request: block upload is disabled" path=/opt/homebrew/var/prometheus/01G7TAB1MA64QV9DSBVRP4WFNE
INFO[0000] making request to start block upload          block=01G7W908TRFGK9KV37RQSBTBB8 file=meta.json path=/opt/homebrew/var/prometheus/01G7W908TRFGK9KV37RQSBTBB8
ERRO[0000] server returned HTTP status 400 Bad Request: block upload is disabled  status="400 Bad Request"
ERRO[0000] failed uploading block                        error="request to start block upload failed: POST request to http://localhost:8006/api/v1/upload/block/01G7W908TRFGK9KV37RQSBTBB8 failed: server returned HTTP status 400 Bad Request: block upload is disabled" path=/opt/homebrew/var/prometheus/01G7W908TRFGK9KV37RQSBTBB8
INFO[0000] making request to start block upload          block=01G7Y5VFKEHXRWAFZ4GXTTVCPQ file=meta.json path=/opt/homebrew/var/prometheus/01G7Y5VFKEHXRWAFZ4GXTTVCPQ
ERRO[0000] server returned HTTP status 400 Bad Request: block upload is disabled  status="400 Bad Request"
ERRO[0000] failed uploading block                        error="request to start block upload failed: POST request to http://localhost:8006/api/v1/upload/block/01G7Y5VFKEHXRWAFZ4GXTTVCPQ failed: server returned HTTP status 400 Bad Request: block upload is disabled" path=/opt/homebrew/var/prometheus/01G7Y5VFKEHXRWAFZ4GXTTVCPQ
INFO[0000] finished uploading blocks                     already_exists=0 failed=4 succeeded=0
mimirtool: error: blocks failed to upload 4 block(s), try --help

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
…gration test.

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
@pstibrany pstibrany marked this pull request as ready for review July 19, 2022 08:44
@pstibrany pstibrany requested a review from pracucci July 19, 2022 08:44
@pstibrany pstibrany changed the title WIP: Add metrics backfill support to mimirtool Add metrics backfill support to mimirtool Jul 19, 2022
Collaborator

@pracucci pracucci left a comment

Good job! I mostly focused on UX, and I left a few comments that I would be glad if you could take a look at. Other than the comments:

  • We need a CHANGELOG entry.
  • We should update the doc at docs/sources/operators-guide/tools/mimirtool.md. Not required to be done in this PR, but I just want to make sure we'll update it.

cmd := app.Command("backfill", "Upload metrics blocks to Grafana Mimir.")
cmd.Action(c.backfill)
cmd.Arg("block", "block to upload").Required().SetValue(&c.blocks)
cmd.Flag("address", "Address of the Grafana Mimir cluster").Required().StringVar(&c.clientConfig.Address)
Collaborator

Which component specifically? I would be clearer.

Member

I don't think we should. The fact that the endpoint is handled by the compactor is not relevant for mimirtool.

Collaborator

I don't feel strongly, but it would help an OSS user deploying in microservices mode (e.g. using Helm) if we just told them which microservice the request should be sent to.

Member

Perhaps we should include the compactor endpoint in the default nginx configuration instead?

I checked what we do in other commands ("rules", "alerts") and we do mention "ruler" or "Alertmanager" in those, although not in the address option. What if we modify the main backfill --help description:

Upload Prometheus TSDB blocks to Grafana Mimir compactor.

WDYT?

pstibrany and others added 7 commits July 19, 2022 15:13
Co-authored-by: Marco Pracucci <marco@pracucci.com>
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
Collaborator

@pracucci pracucci left a comment

Good job, LGTM! I would still reconsider the -user CLI flag (see dedicated comment).

Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
@pstibrany
Member

Documentation PR: #2481

@pstibrany pstibrany merged commit 96a9824 into main Jul 20, 2022
@pstibrany pstibrany deleted the feat/backfill-mimirtool branch July 20, 2022 10:08