Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ruler: skips rule files based on their names without any error/warning log. #4232

Closed
ffilippopoulos opened this issue May 13, 2021 · 5 comments · Fixed by #4468
Closed

Ruler: skips rule files based on their names without any error/warning log. #4232

ffilippopoulos opened this issue May 13, 2021 · 5 comments · Fixed by #4468

Comments

@ffilippopoulos
Copy link

Thanos version used:

thanos image: quay.io/thanos/thanos:v0.17.2

What happened:

We run thanos-rule with the following flag:

--rule-file=/var/thanos/rules/*.yaml

to load rules from all the yaml files we put under the above location.
We use kustomize configMapGenerator to create a configMap with the needed files:

configMapGenerator:
  - name: alerts
    files:
      - accounting.yaml=resources/alerts/accounting.yaml
      - billing.yaml=resources/alerts/billing.yaml
      - cbc.yaml=resources/alerts/cbc.yaml
      - partner-data.yaml=resources/alerts/partner-data.yaml
      - partner.yaml=resources/alerts/partner.yaml
      - payments.yaml=resources/alerts/payments.yaml
      - prometheus.yaml=resources/alerts/prometheus.yaml
      - unicom.yaml=resources/alerts/unicom.yaml

and mount it under the passed location:

        volumeMounts:
        - mountPath: /var/thanos/rules
          name: rules
      volumes:
      - configMap:
          defaultMode: 420
          name: alerts-845c2ttgdg
        name: rules

As a result, we can see all the rule files mounted in the pods:

$ kubectl --context=dev-merit --namespace=sys-mon exec -ti thanos-rule-56f565b58b-7lvnw -- ls /var/thanos/rules
accounting.yaml    partner-data.yaml  prometheus.yaml
billing.yaml       partner.yaml       unicom.yaml
cbc.yaml           payments.yaml

Moreover, thanos ruler is able to see all 8 files on startup:

level=info ts=2021-05-11T16:15:45.776401996Z caller=rule.go:836 component=rules msg="reload rule files" numFiles=8

Thanos only loads rules from 6/8 files. In particular, the following:

/var/thanos/rules/accounting.yaml
/var/thanos/rules/billing.yaml
/var/thanos/rules/cbc.yaml
/var/thanos/rules/payments.yaml
/var/thanos/rules/prometheus.yaml
/var/thanos/rules/unicom.yaml

The end result is that we are missing rules defined in 2 yaml files.

What you expected to happen:

Load all the rules, or give an error on why it failed to do so.

How to reproduce it (as minimally and precisely as possible):

Following our example should be sufficient here.

Full logs to relevant components:

No useful logs from thanos regarding this one. We were profiling the issue via trying different sets of our rule files.

Anything else we need to know:

A workaround is to prefix all mounted files with _, which seems to do the trick and allows the ruler to load all our rules fine.

In particular, we believe that this behaviour is a result of:

newFn := filepath.Join(m.workDir, s.String(), strings.TrimLeft(fn, m.workDir))
and the way file names are trimmed to be written into tmp dir.

Environment:
Kube version: v1.21.0
OS: Flatcar Container Linux by Kinvolk 2765.2.3 (Oklo)
kernel: 5.10.32-flatcar
container runtime: containerd://1.4.4

@KamalSinghKhanna
Copy link

KamalSinghKhanna commented Jun 1, 2021

I would like to work on this issue suggest me way to proceed
do we have to write Tmp instead of Trim in this line
"newFn := filepath.Join(m.workDir, s.String(), strings.TrimLeft(fn, m.workDir))"

@matej-g
Copy link
Collaborator

matej-g commented Jul 19, 2021

@wiardvanrij I'd take this one if it's not being worked on

@daiwei233
Copy link

daiwei233 commented Jul 19, 2021

@wiardvanrij I'd take this one if it's not being worked on

Solution

Hi, I'm trying to fix it out.
And change this:

newFn := filepath.Join(m.workDir, s.String(), strings.TrimLeft(fn, m.workDir))

newFn := filepath.Join(m.workDir, s.String(), strings.TrimLeft(fn, m.workDir))
to
newFn := filepath.Join(m.workDir, s.String(), filepath.Base(fn))

What do you think?

reappear

The bug reappear:

package main

import (
	"fmt"
	"strings"
)

func main() {
	// source code:
	// newFn := filepath.Join(m.workDir, s.String(), strings.TrimLeft(fn, m.workDir))
	// s.String() = "ABORT"
	fn := "/opt/programs/thanos-0.19.0.linux-amd64/rules/test.yml"             // fn
	fn1 := "/opt/programs/thanos-0.19.0.linux-amd64/rules/test1.yml"           // fn1
	workdir := "/opt/programs/thanos-0.19.0.linux-amd64/ruler_data/.tmp-rules" // m.workDir

	fmt.Println(strings.TrimLeft(fn, workdir))
	fmt.Println(strings.TrimLeft(fn1, workdir))
}

And they have the same output:

yml
yml

daiwei233 pushed a commit to daiwei233/thanos that referenced this issue Jul 20, 2021
Signed-off-by: weidai218169 <weidai218169@sohu-inc.com>
@matej-g
Copy link
Collaborator

matej-g commented Jul 22, 2021

Hey @daiwei233, please have a look at my proposed solution at #4468 and let me know your thoughts!

I think best way here is to not trim / alter the filename, since the trimming does not seem to have the desired behavior and leaving the full filename in does not seem to pose an issue (I'm wondering what was the original intention behind using TrimLeft). With using filepath.Base, I think there is a potential for conflict between filenames in different sub-directories, if they include files with identical names.

@daiwei233
Copy link

Hey @daiwei233, please have a look at my proposed solution at #4468 and let me know your thoughts!

I think best way here is to not trim / alter the filename, since the trimming does not seem to have the desired behavior and leaving the full filename in does not seem to pose an issue (I'm wondering what was the original intention behind using TrimLeft). With using filepath.Base, I think there is a potential for conflict between filenames in different sub-directories, if they include files with identical names.

You are right, i think it will be work.#4468

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants