
Cache vertical shards in query frontend #5648

Merged
3 commits merged on Sep 4, 2022

Conversation

fpetkovski
Contributor

The vertical sharding middleware is currently executed after the
caching middleware. Because of this, individual vertical shards are
not cached when they succeed. Caching is only done when the entire
request, including all shards, completes successfully.

This commit moves the vertical sharding middleware before the caching
middleware. It also modifies caching keys to contain the total shards
and the shard number so that each vertical shard gets an independent
caching key.
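The cache-key change described above can be sketched as follows. This is a minimal illustration, not the actual Thanos query-frontend code; the function name `generateCacheKey` and its parameters are assumptions made for the example:

```go
package main

import "fmt"

// generateCacheKey sketches how a per-shard cache key could be built.
// Without shard information, all vertical shards of a query would map
// to the same key; appending the shard index and total shard count
// gives each vertical shard an independent cache entry.
func generateCacheKey(userID string, start, step int64, shardIndex, totalShards int) string {
	// Base key: tenant, aligned start interval, and step.
	key := fmt.Sprintf("%s:%d:%d", userID, start/step, step)
	if totalShards > 0 {
		// Vertical sharding is enabled: make the key shard-specific.
		key = fmt.Sprintf("%s:%d:%d", key, shardIndex, totalShards)
	}
	return key
}

func main() {
	fmt.Println(generateCacheKey("tenant-a", 1000, 10, 2, 4))
	fmt.Println(generateCacheKey("tenant-a", 1000, 10, 0, 0))
}
```

With this shape, a cached result for shard 2 of 4 can be reused even if another shard of the same query fails.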

  • I added a CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

Enable caching in query frontend for individual vertical shards.

Verification

Added unit tests and tested locally by purposefully causing failures in queriers.


// Allow other requests to execute
lock.Unlock()
<-time.After(200 * time.Millisecond)
Contributor Author

The query frontend fanout seems to cancel all requests as soon as one request fails. Because of that, it is hard to provoke partial failures, since the first failure will cause all parallel requests to fail.

This line adds a delay to failures so that successful requests can complete, but it can still lead to flakes. Not sure if there is a better option to test this scenario.

Member


Mhm, maybe we could check if numFailures == 0 { waitForAnotherRequestToCome(); } somehow? Perhaps a shared sync.Cond could be used?
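A minimal sketch of the suggested sync.Cond coordination, under the assumption that the failing request should block until another request has arrived. The names here (coordinator, arrive, waitForAnother) are illustrative, not the identifiers used in the actual test:

```go
package main

import (
	"fmt"
	"sync"
)

// coordinator uses a shared sync.Cond so a goroutine that is about to
// fail can wait until at least one other request has been observed,
// instead of sleeping for a fixed 200ms and hoping for the best.
type coordinator struct {
	mu           sync.Mutex
	cond         *sync.Cond
	requestsSeen int
}

func newCoordinator() *coordinator {
	c := &coordinator{}
	c.cond = sync.NewCond(&c.mu)
	return c
}

// arrive records one incoming request and wakes any waiting goroutines.
func (c *coordinator) arrive() {
	c.mu.Lock()
	c.requestsSeen++
	c.mu.Unlock()
	c.cond.Broadcast()
}

// waitForAnother blocks until more than n requests have been seen.
// The condition is re-checked in a loop, as sync.Cond.Wait requires.
func (c *coordinator) waitForAnother(n int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for c.requestsSeen <= n {
		c.cond.Wait()
	}
}

func main() {
	c := newCoordinator()
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { // the "failing" request waits for company before returning
		defer wg.Done()
		c.arrive()
		c.waitForAnother(1)
		fmt.Println("failure released after another request arrived")
	}()
	go func() { // a "successful" request simply arrives
		defer wg.Done()
		c.arrive()
	}()
	wg.Wait()
}
```

Unlike a fixed delay, this ordering guarantee does not depend on scheduler timing, which is what removes the flakiness.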

Contributor Author


This is a neat trick, seems to be a much better solution 👍

Contributor Author


Hm, looks like it's not possible to add this coordination here because we need to release other threads when the sharding middleware has seen a successful response. If we release them here, failed requests can still complete before successful ones.

yeya24 previously approved these changes Aug 26, 2022
Contributor

@yeya24 left a comment


Love this change. Thanks!


testutil.Equals(t, tc.expected, *res)

//if *res > tc.expected {
Member


Can we remove this? :P

Contributor Author


Thanks, removed



Contributor

@yeya24 left a comment


Looks good!

@GiedriusS GiedriusS merged commit a947f33 into thanos-io:main Sep 4, 2022
prajain12 pushed a commit to prajain12/thanos that referenced this pull request Sep 6, 2022

* Cache vertical shards in query frontend
* Adjust cache key tests
* Remove source of flakiness by using sync.Cond

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Signed-off-by: Prakul Jain <prakul.jain@udaan.com>