-
Notifications
You must be signed in to change notification settings - Fork 512
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add metrics backfill support to mimirtool #1822
Conversation
2ff118e
to
f07fac1
Compare
6a36db9
to
eed40f0
Compare
55984f3
to
dd1d8f4
Compare
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
dd1d8f4
to
b127814
Compare
pkg/mimirtool/client/backfill.go
Outdated
} | ||
|
||
// Upload each block file | ||
if err := filepath.WalkDir(dpath, func(pth string, e fs.DirEntry, err error) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we gather files to upload first, and then upload them concurrently to speed up the process?
pkg/mimirtool/client/backfill.go
Outdated
} | ||
for _, e := range es { | ||
if err := c.backfillBlock(ctx, filepath.Join(source, e.Name()), logger); err != nil { | ||
return err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will need to react on various errors differently, and keep going even if some blocks fail to upload.
We should give user a summary at the end of how many blocks uploaded correctly, how many already existed, and how many failed to upload.
Final exit code should depend on this too. If all blocks uploaded correctly or already existed, we can report exit code 0 (all good!). If there were some non-recoverable upload errors, let's use exit code 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pstibrany do you know how to make the Kingpin framework exit with a certain code (e.g. 1)? Should I just implement this by returning an error if one or more blocks failed to upload?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know. I would try the same (returning error) and if that fails, check Kingpin framwork documentation :)
pkg/mimirtool/client/client.go
Outdated
@@ -104,20 +104,19 @@ func New(cfg Config) (*MimirClient, error) { | |||
|
|||
// Query executes a PromQL query against the Mimir cluster. | |||
func (r *MimirClient) Query(ctx context.Context, query string) (*http.Response, error) { | |||
|
|||
query = fmt.Sprintf("query=%s&time=%d", query, time.Now().Unix()) | |||
escapedQuery := url.PathEscape(query) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This escaping is wrong for query string. We should use "url.QueryEscape", and apply it on each query parameter individually. Escaping whole query string is wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pstibrany thanks, fixed. It's an old problem though, code isn't introduced by my PR :) PTAL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. Let's move that to separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
5a37110
to
b8e35bb
Compare
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
I've fixed logging (#1822 (comment)), so now
In case of errors:
|
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
…gration test. Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job! I mostly focused on UX, and I left few comments I would be glad if you could take a look at. Other than comments:
- We need a CHANGELOG entry.
- We should update the doc at
docs/sources/operators-guide/tools/mimirtool.md
. Not required to be done in this PR, but I just want to make sure we'll update it.
pkg/mimirtool/commands/backfill.go
Outdated
cmd := app.Command("backfill", "Upload metrics blocks to Grafana Mimir.") | ||
cmd.Action(c.backfill) | ||
cmd.Arg("block", "block to upload").Required().SetValue(&c.blocks) | ||
cmd.Flag("address", "Address of the Grafana Mimir cluster").Required().StringVar(&c.clientConfig.Address) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which component specifically? I would be more clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should. The fact that the endpoint is handled by compactor is not relevant for mimirtool
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't feel strong, but an OSS user deploying in microservices mode (e.g. using Helm) would be facilitated if we just tell them to which microservice the request should be sent to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should include the compactor endpoint in default nginx configuration instead?
I checked what we do in other commands ("rules", "alerts") and we do mention "ruler" or "Alertmanager" in those, although not in address
option. What about we modify main backfill --help
description:
Upload Prometheus TSDB blocks to Grafana Mimir compactor.
WDYT?
Co-authored-by: Marco Pracucci <marco@pracucci.com>
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job, LGTM! I would still reconsider the -user
CLI flag (see dedicated comment).
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
Documentation PR: #2481 |
What this PR does
Add support for metrics backfill to mimirtool.
TODOs:
(Prometheus/Thanos/Mimir)Which issue(s) this PR fixes or relates to
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]