Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async query deduplication and cancellation support #14112

Closed
benjreinhart opened this issue Apr 14, 2021 · 8 comments
Closed

Async query deduplication and cancellation support #14112

benjreinhart opened this issue Apr 14, 2021 · 8 comments
Labels
global:async-query Related to Async Queries feature

Comments

@benjreinhart
Copy link
Contributor

SIP-39 - Global Async Query Support mentions a future effort around deduplicating and cancelling asynchronous queries. I'm opening this issue to document our implementation plans for visibility + feedback.

While we explored a few options for implementation, this is our current preferred approach.

cc active participants in SIP-39: @robdiciuccio @etr2460 @DiggidyDave @willbarrett @williaster

Overview

Deduplication will require the current flow, a 1:1 mapping of query to client, to move to a 1:N mapping of query to N clients.

State

Existing state

The application currently performs four writes to Redis to support both sync/async queries:

  • [sync/async] Write the query results (and some other metadata) to cache
  • [async] Write the query context (for validating user access when later requesting cached values)
  • [async] Write to global event stream (for WebSocket server)
  • [async] Write to user-specific event stream (for WebSocket server)

New state requirements

All of the above values are currently written after the background worker has executed the query. Deduplication and cancellation features will need some state before the query has executed. To track requests, we'll need additional state:

  • All job ids that are waiting for the results of a given query
  • Some job metadata such as the query key, channel id, and user id

Architecture

Async Query Deduplication

  • Each request generates a key unique to that particular request using the generated SQL and any other relevant parameters (e.g., impersonating user). Referring to this as the query key.
  • A set of job ids are associated with the query key. These are waiting on query completion.
  • Each job id has its own JSON blob in Redis, containing job metadata.
  • Deduplication occurs when a query request generates a query key that maps to an existing key in Redis with a non-empty set of job ids. In this case, a new job is created and added to the set of jobs waiting on that query to complete. No new celery tasks are enqueued.
  • When the Celery task running the query completes, it looks up all job ids waiting completion of the query. For each one, it grabs its job metadata and adds events to the Redis streams which will be consumed by the WebSocket server and forwarded to connected clients.
  • If a request comes in for a query whose results are already cached, the server should return the cached results.
  • Cancellation for a particular client can occur via removing its job id from the set. If the set of job ids awaiting query completion is empty, the query can be safely cancelled.

Pros

  • Existing state in Redis does not need to be modified
  • All Redis operations are atomic, no read-modify-write race conditions to consider

Cons

  • Some cognitive overhead given the number of k/v pairs in Redis needed
@junlincc junlincc added the global:async-query Related to Async Queries feature label Apr 21, 2021
@etr2460
Copy link
Member

etr2460 commented Apr 23, 2021

Does the query key you discuss here differ from the chart cache_key? Can we reuse the cache key logic here?

@benjreinhart
Copy link
Contributor Author

@etr2460 I agree that consistency here is desired. I think we wanted to leave the existing cache values as they are, but as for the logic that generates the cache key, the idea is for it to be the same hashing logic, but with a different key prefix.

Does that make sense, or are you thinking something different?

@etr2460
Copy link
Member

etr2460 commented Apr 27, 2021

Nope, that makes sense to me! Simply wanted to make sure that you knew we already had logic to generate this key that could be reused for this use case

@stale
Copy link

stale bot commented Jun 26, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue .pinned to prevent stale bot from closing the issue.

@stale stale bot added the inactive Inactive for >= 30 days label Jun 26, 2021
@csmith940
Copy link

Hey all, has there been any progress made on this?

@stale stale bot removed the inactive Inactive for >= 30 days label Jul 29, 2021
@stale
Copy link

stale bot commented Apr 30, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue .pinned to prevent stale bot from closing the issue.

@stale stale bot added the inactive Inactive for >= 30 days label Apr 30, 2022
@hushaoqing
Copy link
Contributor

Looking forward to this feature

@stale stale bot removed the inactive Inactive for >= 30 days label Aug 3, 2022
@rusackas
Copy link
Member

rusackas commented Feb 9, 2024

Love the proposal, but I'm closing this as stale since it's been silent for so long, and we're trying to steer toward a more actionable Issues backlog. If people are still encountering this in current versions (currently 3.x) please open a new Issue or a PR to address the problem.

CC @craig-rueda in case there's any interest to carry more of this forward in PRs or SIPs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
global:async-query Related to Async Queries feature
Projects
None yet
Development

No branches or pull requests

6 participants