Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump prometheus and thanos to master #3000

Merged
merged 4 commits into from
Aug 12, 2020

Conversation

yeya24
Copy link
Contributor

@yeya24 yeya24 commented Aug 8, 2020

Signed-off-by: Ben Ye yb532204897@gmail.com

What this PR does:

This pr updates Prometheus and Thanos version to the current master, required by this pr

Update Prometheus to prometheus/prometheus@2899773.

Update Thanos to thanos-io/thanos@9b578af.

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@pstibrany
Copy link
Contributor

Cortex changes look good to me.

@yeya24
Copy link
Contributor Author

yeya24 commented Aug 10, 2020

Cortex changes look good to me.

Thanks for the review! Do you think a changelog entry is needed?

@pstibrany
Copy link
Contributor

Cortex changes look good to me.

Thanks for the review! Do you think a changelog entry is needed?

I don't think so in this case. There are some new Thanos metrics (thanos_memcached_client_info, thanos_memcached_operation_data_size_bytes), that will get exposed by Cortex, but I don't think we need to document them.

Copy link
Contributor

@annanay25 annanay25 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR!

@@ -63,7 +63,7 @@ type PusherAppendable struct {
}

// Appender returns a storage.Appender
func (t *PusherAppendable) Appender() storage.Appender {
func (t *PusherAppendable) Appender(_ context.Context) storage.Appender {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should propagate context from here to Pusher.Push(). It will also help to remove userID parameter from pusherAppender struct. We can encode the userID into context, in ruler.go here

return r.managerFactory(ctx, userID, notifier, logger, reg), nil

.. and extract userID from context in this file as part of DefaultTenantManagerFactory

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will also help to remove userID parameter from pusherAppender struct. We can encode the userID into context, in ruler.go here

In other areas of the code, we prefer to pass userID explicitly, and not hide it into Context. But Cortex codebase is definitely not consistent about it :(

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, thanks @pstibrany! Would you prefer we separate it out? Its definitely a by-product in this PR, its more important we pass context along :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you prefer we separate it out?

Yes, I think it would be bit cleaner approach. Thanks! (And sorry for not reacting earlier)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries @pstibrany :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, let's be explicit here and pass userID as parameter.

ctx = user.InjectUserID(ctx, userID)
manager := r.managerFactory(ctx, notifier, logger, reg)
if manager == nil {
return nil, errors.New("unable to create managerFactory")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s/managerFactory/manager :)

Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this PR! I'm personally a bit worried about the big refactoring in TSDB, but we internally agreed to move on with this PR and eventually take precautions when deploying it to production.

Could you add a CHANGELOG entry like this, please?

- [CHANGE] Experimental blocks storage: the `operation` label value `getrange` has changed into `get_range` for the metrics `thanos_store_bucket_cache_operation_requests_total` and `thanos_store_bucket_cache_operation_hits_total`. #3000

FYI, other then Prometheus and Thanos, I also checked these direct dependencies upgrade:

  • github.com/aws/aws-sdk-go v1.33.5 to v1.33.12: checked the CHANGELOG, no change affecting Cortex

@pracucci
Copy link
Contributor

Please be aware of this issue causing segfault. Not sure Prometheus we could be affected as well, but worth to keep an eye on it: prometheus/prometheus#7776

@yeya24 yeya24 force-pushed the bump-prometheus-thanos branch 2 times, most recently from c9b7bbb to 74f17f9 Compare August 12, 2020 02:25
@yeya24
Copy link
Contributor Author

yeya24 commented Aug 12, 2020

Sorry for the delay, I just updated the pr and update Prometheus version to current master. Please help review and let me know if I need to miss something.

@yeya24 yeya24 force-pushed the bump-prometheus-thanos branch 2 times, most recently from 4775778 to 24ccacd Compare August 12, 2020 03:03
yeya24 and others added 3 commits August 12, 2020 16:01
Signed-off-by: Ben Ye <yb532204897@gmail.com>
Signed-off-by: Ben Ye <yb532204897@gmail.com>
Signed-off-by: Marco Pracucci <marco@pracucci.com>
@pracucci
Copy link
Contributor

@yeya24 FYI I rebased master and fixed the CHANGELOG entry ordering in this commit.

Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I investigated the issue prometheus/prometheus#7776 and I believe it's not a problem for us because, looking at the code, the issue should just affect the usage of db.ChunkQuerier() when querying the TSDB head, but we don't use it yet in any of our code paths (Cortex or Thanos code we do rely on).

@yeya24
Copy link
Contributor Author

yeya24 commented Aug 12, 2020

@yeya24 FYI I rebased master and fixed the CHANGELOG entry ordering in this commit.

Awesome! Thanks for the help.

Copy link
Contributor

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall change looks good to me.

I would still prefer to revert change that pushes injecting of userID into context to upper layer, and instead keep dependency on userID as explicit as possible.

@pracucci
Copy link
Contributor

I run a local stress test for about 30 minutes. No issue found.

@pracucci
Copy link
Contributor

I would still prefer to revert change that pushes injecting of userID into context to upper layer, and instead keep dependency on userID as explicit as possible.

I take care of that so we can proceed merging it and unblocking several people (@annanay25 included).

Signed-off-by: Marco Pracucci <marco@pracucci.com>
@pracucci pracucci merged commit a4aad5d into cortexproject:master Aug 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants