Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flaky test TestRulerEvaluationDelay #4857

Open
56quarters opened this issue Apr 27, 2023 · 8 comments
Open

Flaky test TestRulerEvaluationDelay #4857

56quarters opened this issue Apr 27, 2023 · 8 comments

Comments

@56quarters
Copy link
Contributor

https://github.com/grafana/mimir/actions/runs/4817437976/jobs/8578292294?pr=4855

 === RUN   TestRulerEvaluationDelay
08:21:12 Starting minio-9000
08:21:13 Ports for container: e2e-mimir-test-minio-9000 Mapping: map[9000:33076]
08:21:13 Started minio-9000
08:21:13 Starting mimir
08:21:14 Ports for container: e2e-mimir-test-mimir Mapping: map[9009:33077]
08:21:14 Started mimir
    ruler_test.go:312: 
        	Error Trace:	/home/runner/work/mimir/mimir/integration/ruler_test.go:312
        	Error:      	Not equal: 
        	            	expected: 2
        	            	actual  : 4
        	Test:       	TestRulerEvaluationDelay
08:21:27 Killing mimir
08:21:27 Killing minio-9000
08:21:27 Killed minio-9000
08:21:27 Killed mimir
08:21:27 error while removing sharedDir /home/runner/work/mimir/mimir/integration/e2e_integration_test2407693794 err: unlinkat /home/runner/work/mimir/mimir/integration/e2e_integration_test2407693794/rule_tmp/anonymous/ns: permission denied
--- FAIL: TestRulerEvaluationDelay (15.03s)
@zenador
Copy link
Contributor

zenador commented Oct 19, 2023

https://github.com/grafana/mimir/actions/runs/6570005457/job/17846842247?pr=6391

=== RUN   TestRulerEvaluationDelay
05:04:05 Starting minio-9000
05:04:06 Ports for container: e2e-mimir-test-minio-9000 Mapping: map[9000:33051]
05:04:06 Started minio-9000
05:04:06 Starting mimir
05:04:07 Ports for container: e2e-mimir-test-mimir Mapping: map[9009:33052]
05:04:07 Started mimir
    ruler_test.go:312: 
        	Error Trace:	/home/runner/work/mimir/mimir/integration/ruler_test.go:312
        	Error:      	Not equal: 
        	            	expected: 2
        	            	actual  : 4
        	Test:       	TestRulerEvaluationDelay
05:04:20 Killing mimir
05:04:20 Killing minio-9000
05:04:20 Killed minio-9000
05:04:20 Killed mimir
05:04:20 error while removing sharedDir /home/runner/work/mimir/mimir/integration/e2e_integration_test2060902976 err: unlinkat /home/runner/work/mimir/mimir/integration/e2e_integration_test2060902976/rule_tmp/anonymous/ns: permission denied
--- FAIL: TestRulerEvaluationDelay (15.46s)

@charleskorn
Copy link
Contributor

charleskorn commented Oct 24, 2023

Another example:

=== RUN   TestRulerEvaluationDelay
09:40:41 Starting minio-9000
09:40:42 Ports for container: e2e-mimir-test-minio-9000 Mapping: map[9000:33050]
09:40:42 Started minio-9000
09:40:42 Starting mimir
09:40:42 Ports for container: e2e-mimir-test-mimir Mapping: map[9009:33051]
09:40:42 Started mimir
    ruler_test.go:312: 
        	Error Trace:	/home/runner/work/mimir/mimir/integration/ruler_test.go:312
        	Error:      	Not equal: 
        	            	expected: 2
        	            	actual  : 4
        	Test:       	TestRulerEvaluationDelay
09:40:55 Killing mimir
09:40:55 Killing minio-9000
09:40:55 Killed minio-9000
09:40:55 Killed mimir
09:40:55 error while removing sharedDir /home/runner/work/mimir/mimir/integration/e2e_integration_test2458340898 err: unlinkat /home/runner/work/mimir/mimir/integration/e2e_integration_test2458340898/rule_tmp/anonymous/ns: permission denied
--- FAIL: TestRulerEvaluationDelay (14.20s)

@zenador
Copy link
Contributor

zenador commented Oct 31, 2023

Again

    ruler_test.go:312: 
        	Error Trace:	/home/runner/work/mimir/mimir/integration/ruler_test.go:312
        	Error:      	Not equal: 
        	            	expected: 2
        	            	actual  : 4
        	Test:       	TestRulerEvaluationDelay

@dimitarvdimitrov
Copy link
Contributor

Another instance

=== RUN   TestRulerEvaluationDelay
16:51:01 Starting minio-9000
16:51:01 Ports for container: e2e-mimir-test-minio-9000 Mapping: map[9000:32987]
16:51:01 Started minio-9000
16:51:01 Starting mimir
16:51:02 Ports for container: e2e-mimir-test-mimir Mapping: map[9009:32988]
16:51:02 Started mimir
16:51:03 mimir: ts=2023-12-04T16:51:03.672636524Z caller=loader.go:116 level=warn msg="bucket index not found" user=anonymous
    ruler_test.go:312: 
        	Error Trace:	/home/runner/work/mimir/mimir/integration/ruler_test.go:312
        	Error:      	Not equal: 
        	            	expected: 2
        	            	actual  : 4
        	Test:       	TestRulerEvaluationDelay
16:51:15 Killing mimir
16:51:15 Killing minio-9000
16:51:15 Killed minio-9000
16:51:15 Killed mimir
16:51:16 error while removing sharedDir /home/runner/work/mimir/mimir/integration/e2e_integration_test3902039097 err: unlinkat /home/runner/work/mimir/mimir/integration/e2e_integration_test3902039097/rule_tmp/anonymous/ns: permission denied
--- FAIL: TestRulerEvaluationDelay (14.89s)

@dimitarvdimitrov
Copy link
Contributor

another instance

=== RUN   TestRulerEvaluationDelay
11:48:01 Starting minio-9000
11:48:01 Ports for container: e2e-mimir-test-minio-9000 Mapping: map[9000:32988]
11:48:01 Started minio-9000
11:48:01 Starting mimir
11:48:02 Ports for container: e2e-mimir-test-mimir Mapping: map[9009:32989]
11:48:02 Started mimir
11:48:03 mimir: ts=2024-01-17T11:48:03.672226245Z caller=loader.go:116 level=warn msg="bucket index not found" user=anonymous
    ruler_test.go:312: 
        	Error Trace:	/home/runner/work/mimir/mimir/integration/ruler_test.go:312
        	Error:      	Not equal: 
        	            	expected: 2
        	            	actual  : 4
        	Test:       	TestRulerEvaluationDelay
11:48:15 Killing mimir
11:48:15 Killing minio-9000
11:48:15 Killed minio-9000
11:48:15 Killed mimir
11:48:15 error while removing sharedDir /home/runner/work/mimir/mimir/integration/e2e_integration_test2871173288 err: unlinkat /home/runner/work/mimir/mimir/integration/e2e_integration_test2871173288/rule_tmp/anonymous/ns: permission denied
--- FAIL: TestRulerEvaluationDelay (14.51s)

@dimitarvdimitrov
Copy link
Contributor

maybe #7217 helps with fixing this

@dimitarvdimitrov
Copy link
Contributor

there was another failure. It looks like the first sample is missing. The rest of the samples look ok. I don't understand why that sample wouldn't be returned by the query.

The sample should have timestamp of 1713780848667 (2024-04-22 10:14:08.667) and the query should be requesting all samples after 1713780848.66672 (2024-04-22 10:14:08.66672). A rounding error can't explain why a sample was omitted because prometheus's query API's start parameter is inclusive.

=== RUN   TestRulerEvaluationDelay
10:18:51 Starting minio-9000
10:18:52 Ports for container: e2e-mimir-test-minio-9000 Mapping: map[9000:32990]
10:18:52 Started minio-9000
10:18:52 Starting mimir
10:18:52 Ports for container: e2e-mimir-test-mimir Mapping: map[9009:32991]
10:18:52 Started mimir
10:19:08 mimir: ts=2024-04-22T10:19:08.674126494Z caller=loader.go:120 level=warn msg="bucket index not found" user=anonymous
    ruler_test.go:296: querying from  2024-04-22 10:14:08.666719842 +0000 UTC m=+70.656273830 to 2024-04-22 10:19:08.666719842 +0000 UTC m=+370.656273830
    ruler_test.go:314: 
        	Error Trace:	/home/runner/work/mimir/mimir/integration/ruler_test.go:314
        	Error:      	Not equal: 
        	            	expected: 2
        	            	actual  : 4
        	Test:       	TestRulerEvaluationDelay
    ruler_test.go:315: expected value 2 actual value 4 actual timestamp 1713780849.667 expected timestamp 2024-04-22 10:14:08.666719842 +0000 UTC m=+70.656273830
    ruler_test.go:314: 
        	Error Trace:	/home/runner/work/mimir/mimir/integration/ruler_test.go:314
        	Error:      	Not equal: 
        	            	expected: 4
        	            	actual  : 8
        	Test:       	TestRulerEvaluationDelay
    ruler_test.go:315: expected value 4 actual value 8 actual timestamp 1713780851.667 expected timestamp 2024-04-22 10:14:09.666719842 +0000 UTC m=+71.656273830
    ruler_test.go:314: 
        	Error Trace:	/home/runner/work/mimir/mimir/integration/ruler_test.go:314
        	Error:      	Not equal: 
        	            	expected: 8
        	            	actual  : 10
        	Test:       	TestRulerEvaluationDelay
    ruler_test.go:315: expected value 8 actual value 10 actual timestamp 1713780852.667 expected timestamp 2024-04-22 10:14:11.666719842 +0000 UTC m=+73.656273830
    ruler_test.go:314: 
        	Error Trace:	/home/runner/work/mimir/mimir/integration/ruler_test.go:314
        	Error:      	Not equal: 
        	            	expected: 10
        	            	actual  : 12
        	Test:       	TestRulerEvaluationDelay
    ruler_test.go:315: expected value 10 actual value 12 actual timestamp 1713780853.667 expected timestamp 2024-04-22 10:14:12.666719842 +0000 UTC m=+74.656273830
    ruler_test.go:314: 
        	Error Trace:	/home/runner/work/mimir/mimir/integration/ruler_test.go:314
        	Error:      	Not equal: 
        	            	expected: 12
        	            	actual  : 14
        	Test:       	TestRulerEvaluationDelay
    ruler_test.go:315: expected value 12 actual value 14 actual timestamp 1713780854.667 expected timestamp 2024-04-22 10:14:13.666719842 +0000 UTC m=+75.656273830
    ruler_test.go:314: 
        	Error Trace:	/home/runner/work/mimir/mimir/integration/ruler_test.go:314
        	Error:      	Not equal: 
        	            	expected: 14
        	            	actual  : 16
        	Test:       	TestRulerEvaluationDelay
    ruler_test.go:315: expected value 14 actual value 16 actual timestamp [1713](https://github.com/grafana/mimir/actions/runs/8782252977/job/24096123714?pr=7931#step:15:1714)780855.667 expected timestamp 2024-04-22 10:14:14.666719842 +0000 UTC m=+76.656273830
    ruler_test.go:314: 
        	Error Trace:	/home/runner/work/mimir/mimir/integration/ruler_test.go:314
        	Error:      	Not equal: 
        	            	expected: 16
        	            	actual  : 18
        	Test:       	TestRulerEvaluationDelay
    ruler_test.go:315: expected value 16 actual value 18 actual timestamp 1713780856.667 expected timestamp 2024-04-22 10:14:15.666719842 +0000 UTC m=+77.656273830
    ruler_test.go:314: 
        	Error Trace:	/home/runner/work/mimir/mimir/integration/ruler_test.go:314
        	Error:      	Not equal: 
        	            	expected: 18
        	            	actual  : 20
        	Test:       	TestRulerEvaluationDelay
    ruler_test.go:315: expected value 18 actual value 20 actual timestamp 1713780857.667 expected timestamp 2024-04-22 10:14:16.666719842 +0000 UTC m=+78.656273830
    ruler_test.go:315: expected value 20 actual value 20 actual timestamp 1713780858.667 expected timestamp 2024-04-22 10:14:17.666719842 +0000 UTC m=+79.656273830
10:19:20 Killing mimir
10:19:20 Killing minio-9000
10:19:20 Killed minio-9000
10:19:20 Killed mimir

@dimitarvdimitrov
Copy link
Contributor

same problem as above

Details

=== RUN   TestRulerEvaluationDelay
14:22:03 Starting minio-9000
14:22:03 Ports for container: e2e-mimir-test-minio-9000 Mapping: map[9000:33003]
14:22:03 Started minio-9000
14:22:04 Starting mimir
14:22:05 Ports for container: e2e-mimir-test-mimir Mapping: map[9009:33004]
14:22:05 Started mimir
14:22:20 mimir: ts=2024-06-24T14:22:20.673332672Z caller=loader.go:120 level=warn msg="bucket index not found" user=anonymous
    ruler_test.go:296: querying from  2024-06-24 14:17:20.64879261 +0000 UTC m=+146.840826920 to 2024-06-24 14:22:20.64879261 +0000 UTC m=+446.840826920
    ruler_test.go:314: 
        	Error Trace:	/home/runner/work/mimir/mimir/integration/ruler_test.go:314
        	Error:      	Not equal: 
        	            	expected: 2
        	            	actual  : 4
        	Test:       	TestRulerEvaluationDelay
    ruler_test.go:315: expected value 2 actual value 4 actual timestamp 1719238642.649 expected timestamp 2024-06-24 14:17:20.64879261 +0000 UTC m=+146.840826920
    ruler_test.go:314: 
        	Error Trace:	/home/runner/work/mimir/mimir/integration/ruler_test.go:314
        	Error:      	Not equal: 
        	            	expected: 4
        	            	actual  : 8
        	Test:       	TestRulerEvaluationDelay
    ruler_test.go:315: expected value 4 actual value 8 actual timestamp 1719238644.649 expected timestamp 2024-06-24 14:17:21.64879261 +0000 UTC m=+147.840826920
    ruler_test.go:314: 
        	Error Trace:	/home/runner/work/mimir/mimir/integration/ruler_test.go:314
        	Error:      	Not equal: 
        	            	expected: 8
        	            	actual  : 10
        	Test:       	TestRulerEvaluationDelay
    ruler_test.go:315: expected value 8 actual value 10 actual timestamp 1719238645.649 expected timestamp 2024-06-24 14:17:23.64879261 +0000 UTC m=+149.840826920
    ruler_test.go:314: 
        	Error Trace:	/home/runner/work/mimir/mimir/integration/ruler_test.go:314
        	Error:      	Not equal: 
        	            	expected: 10
        	            	actual  : 12
        	Test:       	TestRulerEvaluationDelay
    ruler_test.go:315: expected value 10 actual value 12 actual timestamp [1719](https://github.com/grafana/mimir/actions/runs/9646885310/job/26604585831?pr=8501#step:15:1720)238646.649 expected timestamp 2024-06-24 14:17:24.64879261 +0000 UTC m=+150.840826920
    ruler_test.go:314: 
        	Error Trace:	/home/runner/work/mimir/mimir/integration/ruler_test.go:314
        	Error:      	Not equal: 
        	            	expected: 12
        	            	actual  : 14
        	Test:       	TestRulerEvaluationDelay
    ruler_test.go:315: expected value 12 actual value 14 actual timestamp 1719238647.649 expected timestamp 2024-06-24 14:17:25.64879261 +0000 UTC m=+151.840826920
    ruler_test.go:314: 
        	Error Trace:	/home/runner/work/mimir/mimir/integration/ruler_test.go:314
        	Error:      	Not equal: 
        	            	expected: 14
        	            	actual  : 16
        	Test:       	TestRulerEvaluationDelay
    ruler_test.go:315: expected value 14 actual value 16 actual timestamp 1719238648.649 expected timestamp 2024-06-24 14:17:26.64879261 +0000 UTC m=+152.840826920
    ruler_test.go:314: 
        	Error Trace:	/home/runner/work/mimir/mimir/integration/ruler_test.go:314
        	Error:      	Not equal: 
        	            	expected: 16
        	            	actual  : 18
        	Test:       	TestRulerEvaluationDelay
    ruler_test.go:315: expected value 16 actual value 18 actual timestamp 1719238649.649 expected timestamp 2024-06-24 14:17:27.64879261 +0000 UTC m=+153.840826920
    ruler_test.go:314: 
        	Error Trace:	/home/runner/work/mimir/mimir/integration/ruler_test.go:314
        	Error:      	Not equal: 
        	            	expected: 18
        	            	actual  : 20
        	Test:       	TestRulerEvaluationDelay
    ruler_test.go:315: expected value 18 actual value 20 actual timestamp 1719238650.649 expected timestamp 2024-06-24 14:17:28.64879261 +0000 UTC m=+154.840826920
    ruler_test.go:315: expected value 20 actual value 20 actual timestamp 1719238651.649 expected timestamp 2024-06-24 14:17:29.64879261 +0000 UTC m=+155.840826920
14:22:32 Killing mimir
14:22:32 Killing minio-9000
14:22:32 Killed minio-9000
14:22:32 Killed mimir
14:22:32 error while removing sharedDir /home/runner/work/mimir/mimir/integration/e2e_integration_test1320575141 err: unlinkat /home/runner/work/mimir/mimir/integration/e2e_integration_test1320575141/rule_tmp/anonymous/ns: permission denied
--- FAIL: TestRulerEvaluationDelay (29.34s)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants