
compact: consider handling duplicate labels, or continuing on error #497

Closed
erilane opened this issue Aug 30, 2018 · 9 comments

erilane commented Aug 30, 2018

Thanos, Prometheus and Golang version used
thanos, version 0.1.0-rc.2 (branch: HEAD, revision: 53e4d69)
build user: root@c7199d758b5e
build date: 20180705-12:54:50
go version: go1.10.3

prometheus, version 2.3.2 (branch: HEAD, revision: 71af5e29e815795e9dd14742ee7725682fa14b7b)
build user: root@5258e0bd9cc1
build date: 20180712-14:02:52
go version: go1.10.3

What happened
Thanos compact logs the following error and then exits with status 1

level=error ts=2018-08-30T23:08:03.919366788Z caller=main.go:160 msg="running command failed" err="compaction: gather index issues for block /thanos-compact-scratch/compact/0@{dc=\"dc1\",env=\"stg\",product=\"mongodb\",prom_config=\"mongodb-stg-dc1\"}/01CNNJJVN77F43F7Y5QA4ESWRE: out-of-order label set {__name__=\"mongodb_collection_avg_objsize_bytes\",dc=\"dc1\",env=\"stg\",host=\"mongoc3.stg.dc1.thousandeyes.com\",instance=\"mongoc3.stg.dc1.thousandeyes.com:9126\",job=\"hosts\",ns=\"config.actionlog\",ns=\"config.actionlog\",product=\"mongodb\",project=\"msc-cluster\"} for series 20551"

Note the duplicate ns="config.actionlog".

What you expected to happen
Compactor to de-dup the labels (?), or simply log the error and move on to other work

How to reproduce it (as minimally and precisely as possible):
I'm still trying to figure out how this happened. Maybe a bad scrape target. It seems to have stopped happening in my environment. But if I intentionally create a bad scrape target exposing something like:

bad_metric{a="1"} 1
bad_metric{a="1",a="1"} 1

Prometheus will show two different timeseries with identical labels when I query for bad_metric.
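
For reference, a throwaway exporter like the sketch below can serve that malformed exposition for a test scrape job. The port and path are arbitrary, and this assumes the Prometheus version in use accepts the duplicate label on scrape, as observed above.

```go
// Minimal fake scrape target serving the malformed exposition shown above.
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	http.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, `bad_metric{a="1"} 1`)
		// Same series again, this time with a duplicate label name.
		fmt.Fprintln(w, `bad_metric{a="1",a="1"} 1`)
	})
	log.Fatal(http.ListenAndServe(":9999", nil))
}
```

Point a scrape job at localhost:9999/metrics and the duplicate-label series should end up in TSDB blocks that the compactor later chokes on.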

Allex1 (Contributor) commented Sep 14, 2018

+1

bwplotka (Member) commented Oct 3, 2018

> What you expected to happen
> Compactor to de-dup the labels (?), or simply log the error and move on to other work

So... would you rather we ignore all of those issues? Even if, after 1000 compactions and downsampling operations, they grow into a serious, unfixable problem and you need to delete a couple of months of data? That can happen when you feed a malformed block into the compaction logic, which was never tested against blocks like this.

I doubt this particular issue will have such serious consequences, so what needs to be done is a short unit test checking whether we can compact those blocks and what the result looks like afterwards. If all is good, we can switch to a soft notification -> a metric and a log line, and continue the compaction work (:
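
To make the "metric and log line" idea concrete, here is a rough, standalone sketch of the kind of per-series check that currently turns into a hard failure; the Label type is a stand-in, not the actual Prometheus/Thanos type.

```go
// Standalone sketch of a per-series label-set check: names must be sorted
// and unique. Roughly what the "gather index issues" step flags above.
package main

import (
	"fmt"
	"sort"
)

type Label struct{ Name, Value string }

func checkLabelSet(lset []Label) error {
	if !sort.SliceIsSorted(lset, func(i, j int) bool { return lset[i].Name < lset[j].Name }) {
		return fmt.Errorf("out-of-order label set %v", lset)
	}
	for i := 1; i < len(lset); i++ {
		if lset[i].Name == lset[i-1].Name {
			return fmt.Errorf("duplicate label name %q", lset[i].Name)
		}
	}
	return nil
}

func main() {
	bad := []Label{
		{"__name__", "mongodb_collection_avg_objsize_bytes"},
		{"ns", "config.actionlog"},
		{"ns", "config.actionlog"}, // the duplicate from the reported block
	}
	// With a soft-failure policy this would increment a metric and log a
	// warning instead of aborting the whole compaction run.
	fmt.Println(checkLabelSet(bad))
}
```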

bwplotka (Member) commented Oct 20, 2018

See https://improbable-eng.slack.com/archives/CA4UWKEEN/p1539959413000100

You need to apply relabelling to get rid of From (move it to from). In the meantime we need to find a way to:

  • alert on this as soon as possible
  • auto fix this? (lowercase; see the sketch after this list)
  • allow this, but the consequences are roughly unknown (we know it is inefficient, but we don't know how much)
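
As a sketch of the "auto fix" option: lowercase the label names, re-sort, and drop exact duplicate names, keeping the first value. This is only an illustration, not Thanos code; dropping one of two duplicate labels is a policy decision and silently loses data whenever the values differ.

```go
// Rough "auto fix" sketch for malformed label sets: lowercase names,
// re-sort, and keep only the first occurrence of each name.
package main

import (
	"fmt"
	"sort"
	"strings"
)

type label struct{ Name, Value string }

func fixLabelSet(in []label) []label {
	out := make([]label, 0, len(in))
	for _, l := range in {
		l.Name = strings.ToLower(l.Name)
		out = append(out, l)
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	fixed := out[:0]
	for i, l := range out {
		if i == 0 || l.Name != fixed[len(fixed)-1].Name {
			fixed = append(fixed, l)
		}
	}
	return fixed
}

func main() {
	fmt.Println(fixLabelSet([]label{
		{"From", "a@example.com"}, // uppercase first letter
		{"ns", "config.actionlog"},
		{"ns", "config.actionlog"}, // duplicate name
	}))
	// Output: [{from a@example.com} {ns config.actionlog}]
}
```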

sbueringer (Contributor) commented Mar 2, 2019

I have a similar issue with time series produced by kube-state-metrics. Example: two annotations on a namespace:

	namespace-provisioner/secret: test1
	namespace.provisioner/secret: test2

This leads to a timeseries with the following labels:

{__name__=\"kube_namespace_annotations\",annotation_namespace_provisioner_secret=\"test1\",annotation_namespace_provisioner_secret=\"test2\",..}

which produces this error:

level=error ts=2019-03-02T15:46:12.99163908Z caller=main.go:181 msg="running command failed" err="error executing compaction: compaction failed: compaction: gather index issues for block /var/thanos/compact/data/compact/0@{monitor=\"prometheus\",replica=\"2\"}/01D4Y5G8ZQQAZ58T81QQPSJ586: out-of-order label set {__name__=\"kube_namespace_annotations\",...

Sorry, I'm probably asking in the wrong place. Are duplicate labels allowed in the Prometheus format? (And is this therefore a bug in kube-state-metrics or Thanos?)

EDIT: When I query Prometheus for this timeseries, Prometheus shows only one of the 2 duplicate labels.
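
For context on why both annotations collide: the annotation keys are sanitized into Prometheus label names by replacing characters that are not valid in label names with underscores, so both keys map to the same label name. A rough approximation of that mapping (not the actual kube-state-metrics code):

```go
// Approximation of annotation-key -> label-name sanitization: replace any
// character outside [a-zA-Z0-9_] with "_" and add the annotation_ prefix.
package main

import (
	"fmt"
	"regexp"
)

var invalidLabelChars = regexp.MustCompile(`[^a-zA-Z0-9_]`)

func annotationToLabelName(key string) string {
	return "annotation_" + invalidLabelChars.ReplaceAllString(key, "_")
}

func main() {
	fmt.Println(annotationToLabelName("namespace-provisioner/secret")) // annotation_namespace_provisioner_secret
	fmt.Println(annotationToLabelName("namespace.provisioner/secret")) // annotation_namespace_provisioner_secret
}
```

Both calls print the same label name, which is how two distinct annotations end up as duplicate labels on one series.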

@sbueringer (Contributor)

Would be fixed by #848, right?

@sbueringer (Contributor)

@bwplotka

FUSAKLA (Member) commented Mar 23, 2019

Hi @sbueringer, I believe so, yes. It should fix the issue with duplicate label names.

The only issue that persists is uppercase first letters in label names, which should be possible to overcome once #953 is merged, but that's a different case.

Could you please try it out with current master to see whether the issue is resolved?

Thanks!

@sbueringer (Contributor)

Sorry for the late answer. I verified it with the current Thanos version and the issue is fixed, so in my opinion you can close the issue.

FUSAKLA (Member) commented Aug 7, 2019

Great to hear, thanks for verifying!

FUSAKLA closed this as completed Aug 7, 2019