Merge remote-tracking branch 'upstream/master' into update-prometheus
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
codesome committed May 18, 2020
2 parents 7f45a48 + d1ef032 commit b0f7eec
Showing 64 changed files with 1,363 additions and 294 deletions.
10 changes: 7 additions & 3 deletions CHANGELOG.md
@@ -14,15 +14,18 @@ We use *breaking* word for marking changes that are not backward compatible (rel
### Fixed

- [#2536](https://github.com/thanos-io/thanos/pull/2536) minio-go: Fixed AWS STS endpoint url to https for Web Identity providers on AWS EKS
- [#2501](https://github.com/thanos-io/thanos/pull/2501) Query: gracefully handle additional fields in `SeriesResponse` protobuf message that may be added in the future.
- [#2568](https://github.com/thanos-io/thanos/pull/2568) Query: does not close the connection of strict, static nodes if establishing a connection had succeeded but Info() call failed
- [#2501](https://github.com/thanos-io/thanos/pull/2501) Query: Gracefully handle additional fields in `SeriesResponse` protobuf message that may be added in the future.
- [#2568](https://github.com/thanos-io/thanos/pull/2568) Query: Does not close the connection of strict, static nodes if establishing a connection had succeeded but Info() call failed
- [#2615](https://github.com/thanos-io/thanos/pull/2615) Rule: Fix bugs where rules were out of sync.
- [#2548](https://github.com/thanos-io/thanos/pull/2548) Query: Fixed rare cases of double counter reset accounting when querying `rate` with deduplication enabled.

### Added

- [#2502](https://github.com/thanos-io/thanos/pull/2502) Added `hints` field to `SeriesResponse`. Hints is an opaque data structure that can be used to carry additional information from the store and its content is implementation specific.
- [#2521](https://github.com/thanos-io/thanos/pull/2521) Sidecar: add `thanos_sidecar_reloader_reloads_failed_total`, `thanos_sidecar_reloader_reloads_total`, `thanos_sidecar_reloader_watch_errors_total`, `thanos_sidecar_reloader_watch_events_total` and `thanos_sidecar_reloader_watches` metrics.
- [#2412](https://github.com/thanos-io/thanos/pull/2412) ui: add React UI from Prometheus upstream. Currently only accessible from Query component as only `/graph` endpoint is migrated.
- [#2532](https://github.com/thanos-io/thanos/pull/2532) Store: Added hidden option for experimental caching bucket, that can cache chunks into shared memcached. This can speed up querying and reduce number of requests to object storage.
- [#2532](https://github.com/thanos-io/thanos/pull/2532) Store: Added hidden option `--store.caching-bucket.config=<yaml content>` (or `--store.caching-bucket.config-file=<file.yaml>`) for experimental caching bucket, that can cache chunks into shared memcached. This can speed up querying and reduce number of requests to object storage.
- [#2579](https://github.com/thanos-io/thanos/pull/2579) Store: Experimental caching bucket can now cache metadata as well. Config has changed from #2532.

### Changed

@@ -31,6 +34,7 @@ We use *breaking* word for marking changes that are not backward compatible (rel
- [2513](https://github.com/thanos-io/thanos/pull/2513) Tools: Moved `thanos bucket` commands to `thanos tools bucket`, also
moved `thanos check rules` to `thanos tools rules-check`. `thanos tools rules-check` also takes rules by `--rules` repeated flag not argument
anymore.
- [#2548](https://github.com/thanos-io/thanos/pull/2548/commits/53e69bd89b2b08c18df298eed7d90cb7179cc0ec) Store, Querier: remove duplicated chunks on StoreAPI.
- [#2596](https://github.com/thanos-io/thanos/pull/2596) Update to Prometheus [@cd73b3d33e064bbd846fc7a26dc8c313d46af382](https://github.com/prometheus/prometheus/commit/cd73b3d33e064bbd846fc7a26dc8c313d46af382) which falls in between v2.17.0 and v2.18.0.
- TSDB now supports isolation of append and queries.
- TSDB now holds less WAL files after Head Truncation.
2 changes: 1 addition & 1 deletion Dockerfile
@@ -1,6 +1,6 @@
FROM quay.io/prometheus/busybox:latest
LABEL maintainer="The Thanos Authors"

COPY thanos /bin/thanos
COPY /thanos_tmp_for_docker /bin/thanos

ENTRYPOINT [ "/bin/thanos" ]
25 changes: 13 additions & 12 deletions Makefile
@@ -1,4 +1,3 @@
PREFIX ?= $(shell pwd)
FILES_TO_FMT ?= $(shell find . -path ./vendor -prune -o -name '*.go' -print)

DOCKER_IMAGE_REPO ?= quay.io/thanos/thanos
@@ -59,8 +58,7 @@ PROMTOOL ?= $(GOBIN)/promtool-$(PROMTOOL_VERSION)
# systems gsed won't be installed, so will use sed as expected.
SED ?= $(shell which gsed 2>/dev/null || which sed)

MIXIN_ROOT ?= mixin
THANOS_MIXIN ?= mixin/thanos
THANOS_MIXIN ?= mixin
JSONNET_VENDOR_DIR ?= mixin/vendor

WEB_DIR ?= website
@@ -176,8 +174,8 @@ react-app-start: $(REACT_APP_NODE_MODULES_PATH)
.PHONY: build
build: ## Builds Thanos binary using `promu`.
build: check-git deps $(PROMU)
@echo ">> building binaries $(GOBIN)"
@$(PROMU) build --prefix $(PREFIX)
@echo ">> building Thanos binary in $(GOBIN)"
@$(PROMU) build --prefix $(GOBIN)

.PHONY: crossbuild
crossbuild: ## Builds all binaries for all platforms.
@@ -193,8 +191,11 @@ deps: ## Ensures fresh go.mod and go.sum.
.PHONY: docker
docker: ## Builds 'thanos' docker with no tag.
docker: build
@echo ">> copying Thanos from $(GOBIN) to ./thanos_tmp_for_docker"
@cp $(GOBIN)/thanos ./thanos_tmp_for_docker
@echo ">> building docker image 'thanos'"
@docker build -t "thanos" .
@rm ./thanos_tmp_for_docker

.PHONY: docker-multi-stage
docker-multi-stage: ## Builds 'thanos' docker image using multi-stage.
@@ -212,13 +213,13 @@ docker-push:
.PHONY: docs
docs: ## Regenerates flags in docs for all thanos commands.
docs: $(EMBEDMD) build
@EMBEDMD_BIN="$(EMBEDMD)" SED_BIN="$(SED)" scripts/genflagdocs.sh
@EMBEDMD_BIN="$(EMBEDMD)" SED_BIN="$(SED)" THANOS_BIN="$(GOBIN)/thanos" scripts/genflagdocs.sh
@find . -type f -name "*.md" | SED_BIN="$(SED)" xargs scripts/cleanup-white-noise.sh

.PHONY: check-docs
check-docs: ## checks docs against discrepancy with flags, links, white noise.
check-docs: $(EMBEDMD) $(LICHE) build
@EMBEDMD_BIN="$(EMBEDMD)" SED_BIN="$(SED)" scripts/genflagdocs.sh check
@EMBEDMD_BIN="$(EMBEDMD)" SED_BIN="$(SED)" THANOS_BIN="$(GOBIN)/thanos" scripts/genflagdocs.sh check
@$(LICHE) --recursive docs --exclude "(couchdb.apache.org/bylaws.html|cloud.tencent.com|alibabacloud.com|zoom.us)" --document-root .
@$(LICHE) --exclude "goreportcard.com" --document-root . *.md
@find . -type f -name "*.md" | SED_BIN="$(SED)" xargs scripts/cleanup-white-noise.sh
@@ -402,20 +403,20 @@ examples/tmp:
$(JSONNET) -J ${JSONNET_VENDOR_DIR} -m examples/tmp/ ${THANOS_MIXIN}/separated_alerts.jsonnet | xargs -I{} sh -c 'cat {} | $(GOJSONTOYAML) > {}.yaml; rm -f {}' -- {}

.PHONY: examples/dashboards # to keep examples/dashboards/dashboards.md.
examples/dashboards: $(JSONNET) ${THANOS_MIXIN}/mixin.libsonnet ${THANOS_MIXIN}/defaults.libsonnet ${THANOS_MIXIN}/dashboards/*
examples/dashboards: $(JSONNET) ${THANOS_MIXIN}/mixin.libsonnet ${THANOS_MIXIN}/config.libsonnet ${THANOS_MIXIN}/dashboards/*
-rm -rf examples/dashboards/*.json
$(JSONNET) -J ${JSONNET_VENDOR_DIR} -m examples/dashboards ${THANOS_MIXIN}/dashboards.jsonnet

examples/alerts/alerts.yaml: $(JSONNET) $(GOJSONTOYAML) ${THANOS_MIXIN}/mixin.libsonnet ${THANOS_MIXIN}/defaults.libsonnet ${THANOS_MIXIN}/alerts/*
examples/alerts/alerts.yaml: $(JSONNET) $(GOJSONTOYAML) ${THANOS_MIXIN}/mixin.libsonnet ${THANOS_MIXIN}/config.libsonnet ${THANOS_MIXIN}/alerts/*
$(JSONNET) ${THANOS_MIXIN}/alerts.jsonnet | $(GOJSONTOYAML) > $@

examples/alerts/rules.yaml: $(JSONNET) $(GOJSONTOYAML) ${THANOS_MIXIN}/mixin.libsonnet ${THANOS_MIXIN}/defaults.libsonnet ${THANOS_MIXIN}/rules/*
examples/alerts/rules.yaml: $(JSONNET) $(GOJSONTOYAML) ${THANOS_MIXIN}/mixin.libsonnet ${THANOS_MIXIN}/config.libsonnet ${THANOS_MIXIN}/rules/*
$(JSONNET) ${THANOS_MIXIN}/rules.jsonnet | $(GOJSONTOYAML) > $@

.PHONY: jsonnet-vendor
jsonnet-vendor: $(JSONNET_BUNDLER) $(MIXIN_ROOT)/jsonnetfile.json $(MIXIN_ROOT)/jsonnetfile.lock.json
jsonnet-vendor: $(JSONNET_BUNDLER) $(THANOS_MIXIN)/jsonnetfile.json $(THANOS_MIXIN)/jsonnetfile.lock.json
rm -rf ${JSONNET_VENDOR_DIR}
cd ${MIXIN_ROOT} && $(JSONNET_BUNDLER) install
cd ${THANOS_MIXIN} && $(JSONNET_BUNDLER) install

JSONNETFMT_CMD := $(JSONNETFMT) -n 2 --max-blank-lines 2 --string-style s --comment-style s

13 changes: 10 additions & 3 deletions cmd/thanos/rule.go
@@ -763,8 +763,9 @@ func reloadRules(logger log.Logger,
metrics *RuleMetrics) error {
level.Debug(logger).Log("msg", "configured rule files", "files", strings.Join(ruleFiles, ","))
var (
errs tsdberrors.MultiError
files []string
errs tsdberrors.MultiError
files []string
seenFiles = make(map[string]struct{})
)
for _, pat := range ruleFiles {
fs, err := filepath.Glob(pat)
@@ -774,7 +775,13 @@ func reloadRules(logger log.Logger,
continue
}

files = append(files, fs...)
for _, fp := range fs {
if _, ok := seenFiles[fp]; ok {
continue
}
files = append(files, fp)
seenFiles[fp] = struct{}{}
}
}

level.Info(logger).Log("msg", "reload rule files", "numFiles", len(files))
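The hunk above stops `reloadRules` from loading the same rule file twice when several glob patterns match it. A minimal standalone sketch of that technique follows. The helper names `dedupPaths` and `expandPatterns` are hypothetical and do not appear in the commit; only `filepath.Glob` and the "seen" set mirror the diff.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// dedupPaths returns paths in first-seen order with duplicates removed.
// Hypothetical helper: it isolates the seenFiles logic from the diff.
func dedupPaths(paths []string) []string {
	seen := make(map[string]struct{}, len(paths))
	out := make([]string, 0, len(paths))
	for _, p := range paths {
		if _, ok := seen[p]; ok {
			continue // already collected via an earlier pattern
		}
		seen[p] = struct{}{}
		out = append(out, p)
	}
	return out
}

// expandPatterns globs every pattern and de-duplicates the combined matches.
// Unlike reloadRules, this sketch returns on the first malformed pattern
// instead of collecting errors and continuing.
func expandPatterns(patterns []string) ([]string, error) {
	var all []string
	for _, pat := range patterns {
		matches, err := filepath.Glob(pat)
		if err != nil {
			return nil, fmt.Errorf("glob %q: %w", pat, err)
		}
		all = append(all, matches...)
	}
	return dedupPaths(all), nil
}

func main() {
	// Overlapping patterns such as "rules/*.yaml" and "rules/a.yaml"
	// would yield the same path twice before de-duplication.
	fmt.Println(dedupPaths([]string{"rules/a.yaml", "rules/b.yaml", "rules/a.yaml"}))
	// prints: [rules/a.yaml rules/b.yaml]
}
```

Keeping a slice next to the set preserves the order in which files were first matched, which a plain map iteration would not.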
2 changes: 1 addition & 1 deletion cmd/thanos/store.go
@@ -45,7 +45,7 @@ func registerStore(m map[string]setupFunc, app *kingpin.Application) {
httpBindAddr, httpGracePeriod := regHTTPFlags(cmd)
grpcBindAddr, grpcGracePeriod, grpcCert, grpcKey, grpcClientCA := regGRPCFlags(cmd)

dataDir := cmd.Flag("data-dir", "Data directory in which to cache remote blocks.").
dataDir := cmd.Flag("data-dir", "Local data directory used for caching purposes (index-header, in-mem cache items and meta.jsons). If removed, no data will be lost, just store will have to rebuild the cache. NOTE: Putting raw blocks here will not cause the store to read them. For such use cases use Prometheus + sidecar.").
Default("./data").String()

indexCacheSize := cmd.Flag("index-cache-size", "Maximum size of items held in the in-memory index cache. Ignored if --index-cache.config or --index-cache.config-file option is specified.").
2 changes: 1 addition & 1 deletion cmd/thanos/tools.go
@@ -85,7 +85,7 @@ func (g *ThanosRuleGroups) Validate() (errs []error) {
if _, ok := set[g.Name]; ok {
errs = append(
errs,
fmt.Errorf("groupname: \"%s\" is repeated in the same file", g.Name),
fmt.Errorf("groupname: %q is repeated in the same file", g.Name),
)
}

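The change from `\"%s\"` to `%q` in this hunk is more than cosmetic: `%q` quotes the value as a Go string literal, so quotes or control characters inside a group name are escaped rather than printed raw. A small sketch, assuming a hypothetical wrapper `repeatedGroupErr` around the same format string:

```go
package main

import "fmt"

// repeatedGroupErr builds the same message as the updated Validate method.
// The function itself is hypothetical; only the format string is from the diff.
func repeatedGroupErr(name string) error {
	return fmt.Errorf("groupname: %q is repeated in the same file", name)
}

func main() {
	fmt.Println(repeatedGroupErr("alerts"))
	// prints: groupname: "alerts" is repeated in the same file

	// Embedded quotes are escaped, so the name stays unambiguous.
	fmt.Println(repeatedGroupErr(`my "odd" group`))
	// prints: groupname: "my \"odd\" group" is repeated in the same file
}
```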
3 changes: 1 addition & 2 deletions docs/components/compact.md
@@ -74,8 +74,7 @@ In order to achieve this co-ordination, blocks are not deleted directly. Instead

## Flags

[embedmd]: # "flags/compact.txt $"

[embedmd]:# (flags/compact.txt $)
```$
usage: thanos compact [<flags>]
88 changes: 74 additions & 14 deletions docs/components/store.md
@@ -27,7 +27,7 @@ In general about 1MB of local disk space is required per TSDB block stored in th
## Flags
[embedmd]: # "flags/store.txt $"
[embedmd]:# (flags/store.txt $)
```$
usage: thanos store [<flags>]

@@ -68,7 +68,13 @@ Flags:
TLS CA to verify clients against. If no client
CA is specified, there is no client
verification on server side. (tls.NoClientCert)
--data-dir="./data" Data directory in which to cache remote blocks.
--data-dir="./data" Local data directory used for caching purposes
(index-header, in-mem cache items and
meta.jsons). If removed, no data will be lost,
just store will have to rebuild the cache.
NOTE: Putting raw blocks here will not cause
the store to read them. For such use cases use
Prometheus + sidecar.
--index-cache-size=250MB Maximum size of items held in the in-memory
index cache. Ignored if --index-cache.config or
--index-cache.config-file option is specified.
@@ -137,11 +143,51 @@ Flags:
Prometheus relabel-config syntax. See format
details:
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config
--consistency-delay=30m Minimum age of all blocks before they are being read.
--consistency-delay=0s Minimum age of all blocks before they are being
read. Set it to safe value (e.g 30m) if your
object storage is eventually consistent. GCS
and S3 are (roughly) strongly consistent.
--ignore-deletion-marks-delay=24h
Duration after which the blocks marked for deletion will be filtered out while fetching blocks.
The idea of ignore-deletion-marks-delay is to ignore blocks that are marked for deletion with some delay. This ensures store can still serve blocks that are meant to be deleted but do not have a replacement yet. If delete-delay duration is provided to compactor or bucket verify component, it will upload deletion-mark.json file to mark after what duration the block should be deleted rather than deleting the block straight away.
If delete-delay is non-zero for compactor or bucket verify component, ignore-deletion-marks-delay should be set to (delete-delay)/2 so that blocks marked for deletion are filtered out while fetching blocks before being deleted from bucket. Default is 24h, half of the default value for --delete-delay on compactor.
Duration after which the blocks marked for
deletion will be filtered out while fetching
blocks. The idea of ignore-deletion-marks-delay
is to ignore blocks that are marked for
deletion with some delay. This ensures store
can still serve blocks that are meant to be
deleted but do not have a replacement yet. If
delete-delay duration is provided to compactor
or bucket verify component, it will upload
deletion-mark.json file to mark after what
duration the block should be deleted rather
than deleting the block straight away. If
delete-delay is non-zero for compactor or
bucket verify component,
ignore-deletion-marks-delay should be set to
(delete-delay)/2 so that blocks marked for
deletion are filtered out while fetching blocks
before being deleted from bucket. Default is
24h, half of the default value for
--delete-delay on compactor.
--web.external-prefix="" Static prefix for all HTML links and redirect
URLs in the bucket web UI interface. Actual
endpoints are still served on / or the
web.route-prefix. This allows thanos bucket web
UI to be served behind a reverse proxy that
strips a URL sub-path.
--web.prefix-header="" Name of HTTP request header used for dynamic
prefixing of UI links and redirects. This
option is ignored if web.external-prefix
argument is set. Security risk: enable this
option only if a reverse proxy in front of
thanos is resetting the header. The
--web.prefix-header=X-Forwarded-Prefix option
can be useful, for example, if Thanos UI is
served via Traefik reverse proxy with
PathPrefixStrip option enabled, which sends the
stripped prefix value in X-Forwarded-Prefix
header. This allows thanos UI to be served on a
sub-path.

```

## Time based partitioning
@@ -234,31 +280,45 @@ While the remaining settings are **optional**:

## Caching Bucket

Thanos Store Gateway supports a "caching bucket" with chunks caching to speed up loading of chunks from TSDB blocks. Currently only memcached "backend" is supported:
Thanos Store Gateway supports a "caching bucket" with chunks and metadata caching to speed up loading of chunks from TSDB blocks. To configure caching, one needs to use `--store.caching-bucket.config=<yaml content>` or `--store.caching-bucket.config-file=<file.yaml>`.

Currently only memcached "backend" is supported:

```yaml
backend: memcached
backend_config:
addresses:
- localhost:11211
caching_config:
chunk_subrange_size: 16000
max_chunks_get_range_requests: 3
chunk_object_size_ttl: 24h
chunk_subrange_ttl: 24h
chunk_subrange_size: 16000
max_chunks_get_range_requests: 3
chunk_object_size_ttl: 24h
chunk_subrange_ttl: 24h
blocks_iter_ttl: 5m
metafile_exists_ttl: 2h
metafile_doesnt_exist_ttl: 15m
metafile_content_ttl: 24h
metafile_max_size: 1MiB
```

`backend_config` field for memcached supports all the same configuration as memcached for [index cache](#memcached-index-cache).

`caching_config` is a configuration for chunks cache and supports the following optional settings:
Additional options to configure various aspects of chunks cache are available:

- `chunk_subrange_size`: size of segment of chunks object that is stored to the cache. This is the smallest unit that chunks cache is working with.
- `max_chunks_get_range_requests`: how many "get range" sub-requests may cache perform to fetch missing subranges.
- `chunk_object_size_ttl`: how long to keep information about chunk file length in the cache.
- `chunk_subrange_ttl`: how long to keep individual subranges in the cache.

Note that chunks cache is an experimental feature, and these fields may be renamed or removed completely in the future.
Following options are used for metadata caching (meta.json files, deletion mark files, iteration result):

- `blocks_iter_ttl`: how long to cache result of iterating blocks.
- `metafile_exists_ttl`: how long to cache information about whether meta.json or deletion mark file exists.
- `metafile_doesnt_exist_ttl`: how long to cache information about whether meta.json or deletion mark file doesn't exist.
- `metafile_content_ttl`: how long to cache content of meta.json and deletion mark files.
- `metafile_max_size`: maximum size of cached meta.json and deletion mark file. Larger files are not cached.

Note that chunks and metadata cache is an experimental feature, and these fields may be renamed or removed completely in the future.
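
To make `chunk_subrange_size` concrete, here is a rough sketch of how a byte-range request could be split into fixed-size, aligned subranges, each of which could then be cached on its own. This is an assumption about the general technique, not the Thanos implementation. The type `subrange` and the function `alignedSubranges` are hypothetical names.

```go
package main

import "fmt"

// subrange is a half-open byte interval [Start, End) within a chunks object.
type subrange struct {
	Start, End int64
}

// alignedSubranges splits the request [offset, offset+length) into pieces
// aligned to multiples of subrangeSize. Hypothetical helper, not Thanos code.
// The last piece may extend past the end of the object; a real
// implementation would clamp it to the object size.
func alignedSubranges(offset, length, subrangeSize int64) []subrange {
	if offset < 0 || length <= 0 || subrangeSize <= 0 {
		return nil
	}
	start := (offset / subrangeSize) * subrangeSize // round down to a boundary
	end := offset + length
	var out []subrange
	for s := start; s < end; s += subrangeSize {
		out = append(out, subrange{Start: s, End: s + subrangeSize})
	}
	return out
}

func main() {
	// A 30000-byte read at offset 20000 with the 16000-byte subrange size
	// from the example config touches three subranges.
	for _, r := range alignedSubranges(20000, 30000, 16000) {
		fmt.Printf("[%d, %d)\n", r.Start, r.End)
	}
	// prints:
	// [16000, 32000)
	// [32000, 48000)
	// [48000, 64000)
}
```

Aligning to fixed boundaries means overlapping requests map to the same cache keys, which is what lets separate queries share cached subranges.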

## Index Header

8 changes: 8 additions & 0 deletions examples/alerts/alerts.md
@@ -447,6 +447,14 @@ rules:
for: 5m
labels:
severity: warning
- alert: ThanosReceiveNoUpload
annotations:
message: Thanos Receive {{$labels.job}} has not uploaded latest data to object
storage.
expr: increase(thanos_shipper_uploads_total{job=~"thanos-receive.*"}[2h]) == 0
for: 30m
labels:
severity: warning
```
## Replicate
8 changes: 8 additions & 0 deletions examples/alerts/alerts.yaml
@@ -210,6 +210,14 @@ groups:
for: 5m
labels:
severity: warning
- alert: ThanosReceiveNoUpload
annotations:
message: Thanos Receive {{$labels.job}} has not uploaded latest data to object
storage.
expr: increase(thanos_shipper_uploads_total{job=~"thanos-receive.*"}[2h]) == 0
for: 30m
labels:
severity: warning
- name: thanos-sidecar.rules
rules:
- alert: ThanosSidecarPrometheusDown