Refactor Ruler, introduce MultiTenantManager interface #3019

Merged
merged 15 commits into from
Aug 17, 2020

annanay25
@annanay25 annanay25 commented Aug 12, 2020

What this PR does:
This PR refactors the ruler package by introducing a MultiTenantManager interface, for two reasons.

First, it splits the Ruler struct into distinct components. The Ruler becomes a module that ties together the rule store, the ring used for sharding, and the Manager that evaluates rules and sends notifications.

+---------------------------------------------------------------+
|                                                               |
|                   Query       +-------------+                 |
|            +------------------>             |                 |
|            |                  |    Store    |                 |
|            | +----------------+             |                 |
|            | |     Rules      +-------------+                 |
|            | |                                                |
|            | |                                                |
|            | |                                                |
|       +----+-v----+   Filter  +------------+                  |
|       |           +----------->            |                  |
|       |   Ruler   |           |    Ring    |                  |
|       |           <-----------+            |                  |
|       +-------+---+   Rules   +------------+                  |
|               |                                               |
|               |                                               |
|               |                                               |
|               |    Load      +-----------------+              |
|               +-------------->                 |              |
|                              |     Manager     |              |
|                              |                 |              |
|                              +-----------------+              |
|                                                               |
+---------------------------------------------------------------+
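
A minimal sketch of the flow in the diagram, assuming simplified types: the Ruler queries the Store, filters the result through the Ring, and loads what it owns into the Manager. The `Store`, `Ring`, and `Manager` interfaces, `RuleGroup`, and the in-memory fakes below are hypothetical stand-ins for illustration, not the types in `pkg/ruler`.

```go
package main

import "fmt"

// RuleGroup is a simplified, hypothetical stand-in for a stored rule group.
type RuleGroup struct {
	Tenant string
	Name   string
}

// Store, Ring and Manager are illustrative interfaces for this sketch only;
// they are not the Cortex types.
type Store interface {
	ListRuleGroups() []RuleGroup
}

type Ring interface {
	Owns(g RuleGroup) bool
}

type Manager interface {
	Load(groupsByTenant map[string][]RuleGroup)
}

// Ruler ties the three components together.
type Ruler struct {
	store   Store
	ring    Ring
	manager Manager
}

// Sync queries the store, keeps only the groups the ring assigns to this
// instance, and loads the result into the manager.
func (r *Ruler) Sync() {
	owned := make(map[string][]RuleGroup)
	for _, g := range r.store.ListRuleGroups() {
		if r.ring.Owns(g) {
			owned[g.Tenant] = append(owned[g.Tenant], g)
		}
	}
	r.manager.Load(owned)
}

// In-memory fakes so the sketch runs on its own.
type memStore []RuleGroup

func (s memStore) ListRuleGroups() []RuleGroup { return s }

// nameRing owns a group when its name maps to true.
type nameRing map[string]bool

func (n nameRing) Owns(g RuleGroup) bool { return n[g.Name] }

// recordingManager remembers the last set of groups it was asked to load.
type recordingManager struct {
	loaded map[string][]RuleGroup
}

func (m *recordingManager) Load(groupsByTenant map[string][]RuleGroup) {
	m.loaded = groupsByTenant
}

func main() {
	m := &recordingManager{}
	r := &Ruler{
		store: memStore{
			{Tenant: "tenant-a", Name: "g1"},
			{Tenant: "tenant-a", Name: "g2"},
			{Tenant: "tenant-b", Name: "g3"},
		},
		ring:    nameRing{"g1": true, "g3": true},
		manager: m,
	}
	r.Sync()
	// Number of groups loaded per tenant after filtering through the ring.
	fmt.Println(len(m.loaded["tenant-a"]), len(m.loaded["tenant-b"]))
}
```

Because the Ruler only depends on the three interfaces, each component can be replaced or faked independently.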

Second, it allows custom functionality to be added to the Ruler based on custom fields present in the RuleGroup description, by overriding the DefaultMultiTenantManager. For example, this makes it possible to use different appenders depending on special config fields present in the RuleGroup definition.
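
As a rough illustration of the overriding idea, the sketch below embeds a default manager in a custom one and overrides only the sync step to pick an appender from a custom field. Everything here is an assumption made for the example: the simplified `MultiTenantManager` signatures, the `Appender` field, and the `defaultManager` and `appenderAwareManager` types are hypothetical and do not mirror the real Cortex or Prometheus types.

```go
package main

import "fmt"

// RuleGroup is a hypothetical, simplified rule group. Appender stands in for
// a "special config field"; an empty value means "use the default".
type RuleGroup struct {
	Name     string
	Appender string
}

// MultiTenantManager is a simplified version of the interface for this
// sketch; the real method signatures differ.
type MultiTenantManager interface {
	SyncRuleGroups(ruleGroups map[string][]RuleGroup)
	GetRules(userID string) []RuleGroup
	Stop()
}

// defaultManager keeps the synced groups per tenant.
type defaultManager struct {
	rules map[string][]RuleGroup
}

func newDefaultManager() *defaultManager {
	return &defaultManager{rules: make(map[string][]RuleGroup)}
}

func (m *defaultManager) SyncRuleGroups(ruleGroups map[string][]RuleGroup) {
	m.rules = make(map[string][]RuleGroup, len(ruleGroups))
	for user, groups := range ruleGroups {
		m.rules[user] = append([]RuleGroup(nil), groups...)
	}
}

func (m *defaultManager) GetRules(userID string) []RuleGroup { return m.rules[userID] }

func (m *defaultManager) Stop() { m.rules = nil }

// appenderAwareManager reuses the default behaviour through embedding and
// overrides SyncRuleGroups to record which appender each group asked for.
type appenderAwareManager struct {
	*defaultManager
	appenderFor map[string]string // key: "<tenant>/<group>"
}

func (m *appenderAwareManager) SyncRuleGroups(ruleGroups map[string][]RuleGroup) {
	m.appenderFor = make(map[string]string)
	for user, groups := range ruleGroups {
		for _, g := range groups {
			appender := g.Appender
			if appender == "" {
				appender = "default"
			}
			m.appenderFor[user+"/"+g.Name] = appender
		}
	}
	// Delegate the rest of the work to the embedded default manager.
	m.defaultManager.SyncRuleGroups(ruleGroups)
}

// Compile-time check that the custom manager satisfies the interface.
var _ MultiTenantManager = (*appenderAwareManager)(nil)

func main() {
	m := &appenderAwareManager{defaultManager: newDefaultManager()}
	m.SyncRuleGroups(map[string][]RuleGroup{
		"tenant-a": {{Name: "cpu", Appender: "remote"}, {Name: "mem"}},
	})
	fmt.Println(m.appenderFor["tenant-a/cpu"], m.appenderFor["tenant-a/mem"], len(m.GetRules("tenant-a")))
}
```

Embedding keeps `GetRules` and `Stop` unchanged, so the custom manager only carries the behaviour that actually differs.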

Which issue(s) this PR fixes:
Fixes #

Checklist
Will drop draft status once:

  • Bump prometheus and thanos to master #3000 is merged (this PR is built on top of that one)
  • Tests updated
  • Documentation added
  • NA CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

/cc @jtlisi

yeya24 and others added 8 commits August 12, 2020 16:01
Signed-off-by: Ben Ye <yb532204897@gmail.com>
Signed-off-by: Ben Ye <yb532204897@gmail.com>
Signed-off-by: Marco Pracucci <marco@pracucci.com>
Signed-off-by: Marco Pracucci <marco@pracucci.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>
@annanay25 annanay25 marked this pull request as ready for review August 12, 2020 16:58
Signed-off-by: Annanay <annanayagarwal@gmail.com>

@pracucci pracucci left a comment

Very good job! I patiently went through the diff and it looks like a no-op refactoring to me, except for the few comments I left.

annanay25 and others added 2 commits August 17, 2020 12:30
Co-authored-by: Marco Pracucci <marco@pracucci.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>

@pstibrany pstibrany left a comment

I have left a few nit-picky comments, but overall LGTM.

Signed-off-by: Annanay <annanayagarwal@gmail.com>
@annanay25

Thanks for the reviews, @pracucci & @pstibrany!

@pracucci pracucci left a comment

LGTM, thanks!

@@ -149,59 +139,64 @@ func (cfg *Config) RegisterFlags(f *flag.FlagSet) {
f.DurationVar(&cfg.ResendDelay, "ruler.resend-delay", time.Minute, `Minimum amount of time to wait before resending an alert to Alertmanager.`)
}

// MultiTenantManager is the interface of interaction with a Manager that is tenant aware.
type MultiTenantManager interface {
// SyncRuleGroups is used to sync the Manager with rules from the RuleStore.

In the next PR, we can mention that ruleGroups is per tenant.

// GetRules fetches rules for a particular tenant (userID).
GetRules(userID string) []*promRules.Group
// Stop stops all Manager components.
Stop()

Seeing a Stop method in the generic interface looks like an anti-pattern to me. As the Ruler is not creating the manager, I don't think it should be the one stopping it either.

(But that can be addressed in another PR)
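
One way to picture the ownership concern, as a sketch only: whoever creates the manager also stops it, and the Ruler receives a narrower interface without `Stop`. The `Manager`, `RuleReader`, and `Ruler` types below are hypothetical and much simpler than the real ones; this is not what the PR implements.

```go
package main

import "fmt"

// Manager is a hypothetical manager that holds rule group names per tenant.
type Manager struct {
	rules   map[string][]string
	stopped bool
}

func NewManager() *Manager {
	return &Manager{rules: make(map[string][]string)}
}

func (m *Manager) GetRules(userID string) []string { return m.rules[userID] }

// Stop is called by the creator of the Manager, not by its consumers.
func (m *Manager) Stop() { m.stopped = true }

// RuleReader is the narrow view handed to the Ruler: it has no Stop method,
// so the Ruler cannot shut down something it did not create.
type RuleReader interface {
	GetRules(userID string) []string
}

type Ruler struct {
	rules RuleReader
}

func (r *Ruler) RuleCount(userID string) int {
	return len(r.rules.GetRules(userID))
}

func main() {
	m := NewManager()
	defer m.Stop() // the creator owns the lifecycle

	m.rules["tenant-a"] = []string{"cpu", "mem"}
	r := &Ruler{rules: m}
	fmt.Println(r.RuleCount("tenant-a"))
}
```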

I agree. We need the Stop() function, but I had a feeling the other function should be named StartOrSyncRuleGroups(). That sounds a little long, but do you have any other suggestions?

@@ -149,59 +139,64 @@ func (cfg *Config) RegisterFlags(f *flag.FlagSet) {
f.DurationVar(&cfg.ResendDelay, "ruler.resend-delay", time.Minute, `Minimum amount of time to wait before resending an alert to Alertmanager.`)
}

// MultiTenantManager is the interface of interaction with a Manager that is tenant aware.

Suggested change
- // MultiTenantManager is the interface of interaction with a Manager that is tenant aware.
+ // MultiTenantManager interface describes interaction between Ruler, that periodically
+ // loads rules from the store, and object that keeps rules in memory for each tenant.

It sounds to me like "multi-tenant aware rules manager" :)

> and object that keeps rules in memory for each tenant

The DefaultMultiTenantManager uses a mapper to write rules to a file and passes the file path to the Prometheus Manager. Maybe we can modify this to:

// MultiTenantManager interface describes interaction between Ruler, that periodically 
// loads rules from the store, and object that syncs rules to the Prometheus manager
// for every tenant.

@pstibrany pstibrany merged commit 757b9d1 into cortexproject:master Aug 17, 2020