This repository has been archived by the owner on Aug 23, 2023. It is now read-only.

Priority index locking and logging of long operations #1847

Merged: 13 commits from the log_long_idx_locks branch into grafana:master on Aug 26, 2020

Conversation

@shanson7 (Collaborator) commented on Jun 9, 2020

Still needs cleaning up and more testing, but this PR changes the locking behavior in the index so that long read queries can't block ingest (Update and AddOrUpdate calls).

There is a scenario (common in our case) where long index queries (say 5s) can block the acquisition of a write lock (like in the WriteQueue, or in add if not using the WriteQueue), which in turn blocks new read locks (like in Update). This means long index read ops can actually block ingest.

This PR alleviates that problem by differentiating between the possibly-slow operations and the almost-certainly-fast operations, so that a write lock is still blocked by long index operations, but the fast operations only ever get blocked by write locks (which we should make sure are fast).

It also adds logging around long lock waits/holds for non-high-priority locking. I decided not to track timings for the high-priority (fast) operations, since we may be doing them many thousands of times per second.
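For illustration, a minimal sketch of how a priority RW mutex along these lines can be built from two sync.RWMutex values. The type name PriorityRWMutex and the fields lock / lowPrioLock appear elsewhere in this PR; the method names, the one-second threshold, and the log calls here are assumptions, not the actual implementation.

```go
// Rough sketch of the two-tier locking scheme described above.
package memory

import (
	"log"
	"sync"
	"time"
)

const longLockWarnThreshold = time.Second // assumed threshold for "long"

type PriorityRWMutex struct {
	lock        sync.RWMutex // fast (high-priority) ops only contend on this
	lowPrioLock sync.RWMutex // possibly-slow read ops additionally hold this
}

// High-priority read lock: for almost-certainly-fast ops (Update/AddOrUpdate).
// These run many thousands of times per second, so no timings are tracked.
func (pm *PriorityRWMutex) RLockHigh()   { pm.lock.RLock() }
func (pm *PriorityRWMutex) RUnlockHigh() { pm.lock.RUnlock() }

// Low-priority read lock: for possibly-slow ops (long index queries).
// Waits and holds are timed so that long ones can be logged.
func (pm *PriorityRWMutex) RLockLow() time.Time {
	pre := time.Now()
	pm.lowPrioLock.RLock()
	pm.lock.RLock()
	if wait := time.Since(pre); wait > longLockWarnThreshold {
		log.Printf("index: waited %s for low-priority read lock", wait)
	}
	return time.Now() // pass back to RUnlockLow so the hold can be timed
}

func (pm *PriorityRWMutex) RUnlockLow(acquired time.Time) {
	pm.lock.RUnlock()
	pm.lowPrioLock.RUnlock()
	if held := time.Since(acquired); held > longLockWarnThreshold {
		log.Printf("index: held low-priority read lock for %s", held)
	}
}

// Write lock: takes both. While the writer waits on lowPrioLock for slow
// readers to finish, fast readers can still acquire lock and keep ingesting;
// only once the writer moves on to lock do fast readers queue behind it.
func (pm *PriorityRWMutex) Lock() {
	pm.lowPrioLock.Lock()
	pm.lock.Lock()
}

func (pm *PriorityRWMutex) Unlock() {
	pm.lock.Unlock()
	pm.lowPrioLock.Unlock()
}
```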

maxBuffered: maxBuffered,
maxDelay: maxDelay,
flushTrigger: make(chan struct{}, 1),
flushPending: false,
@shanson7 (author) commented:

This is a large change in WriteQueue behavior to make maxBuffered more of a heuristic than an absolute max. This was done to take more control of the locking in the flush function.
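As a rough illustration of that flush behavior (the field names match the hunk above; the element type, the method bodies, and the applyToIndex helper are assumptions for the sketch, not the PR's code):

```go
// Sketch only: Queue never flushes inline on the caller's path; it signals
// flushTrigger and lets a background loop do the flush. Entries that arrive
// while the loop is waiting for the index write lock keep getting buffered,
// which is why maxBuffered is a heuristic rather than a hard cap.
package memory

import (
	"sync"
	"time"
)

type WriteQueue struct {
	sync.Mutex
	buffer       []interface{} // element type is a placeholder
	maxBuffered  int
	maxDelay     time.Duration
	flushTrigger chan struct{} // capacity 1
	flushPending bool
}

// Queue appends an entry and requests a flush once maxBuffered is reached.
func (wq *WriteQueue) Queue(entry interface{}) {
	wq.Lock()
	defer wq.Unlock()
	wq.buffer = append(wq.buffer, entry)
	if len(wq.buffer) >= wq.maxBuffered && !wq.flushPending {
		wq.flushPending = true
		select {
		case wq.flushTrigger <- struct{}{}:
		default: // a trigger is already queued
		}
	}
}

// loop flushes when triggered or when maxDelay elapses, whichever comes first.
func (wq *WriteQueue) loop() {
	timer := time.NewTimer(wq.maxDelay)
	for {
		select {
		case <-wq.flushTrigger:
			if !timer.Stop() {
				<-timer.C
			}
		case <-timer.C:
		}
		wq.flush()
		timer.Reset(wq.maxDelay)
	}
}

// flush swaps the buffer out under the queue lock, then applies the batch to
// the index under its (possibly slow to acquire) write lock. Callers of Queue
// are only blocked for the brief swap, not for the index lock wait.
func (wq *WriteQueue) flush() {
	wq.Lock()
	batch := wq.buffer
	wq.buffer = nil
	wq.flushPending = false
	wq.Unlock()

	applyToIndex(batch) // hypothetical helper: takes the index write lock
}

func applyToIndex(batch []interface{}) {} // placeholder for the sketch
```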

if !timer.Stop() {
<-timer.C
}
return
}
}
}

func (wq *WriteQueue) isFlushPending() bool {
@shanson7 (author) commented:

The intent was to use this in testing. Or something similar.

// Wait for any pending writes
bc := BlockContext{lock: pm, preLockTime: time.Now()}
pm.lowPrioLock.RLock()
pm.lock.RLock()
@shanson7 (author) commented:

I'm not entirely convinced we actually need to lock both of these here, if Lock is guaranteed to acquire both write locks.

@Dieterbe (Contributor) commented:

At a high level, the PriorityRWMutex seems to make sense.

Here's my breakdown

Master

| Situation | Reads behavior | Writes behavior |
| --- | --- | --- |
| fast read executing | fast+slow reads exec immediately | write executes as soon as earlier reads complete |
| slow read executing | fast+slow reads exec immediately | write executes as soon as earlier reads complete |
| fast read executing, write lock waiting | fast+slow reads wait until read+write complete [1] | write executes as soon as earlier reads complete |
| slow read executing, write lock waiting | fast+slow reads wait until read+write complete [2] | write executes as soon as earlier reads complete |

[1] it wouldn't be so bad if we could squeeze in some fast reads while the original fast read is still executing, but how many do we want to squeeze in? Where do we want to draw the line as to how long we delay the write? This is a bit more of an ambiguous case and probably not worth any hassle.
[2] this sucks. we want to execute fast reads straight away

PR 1847

| Situation | Reads behavior | Writes behavior | Comment |
| --- | --- | --- | --- |
| fast read executing | fast+slow reads exec immediately | write executes as soon as earlier reads complete | write blocks on lock |
| slow read executing | fast+slow reads exec immediately | write executes as soon as earlier reads complete | write blocks on lowPrioLock |
| fast read executing, write lock waiting | fast+slow reads wait until read+write complete | write executes as soon as earlier reads complete | write blocks on lock |
| slow read executing, write lock waiting | fast reads exec immediately, slow reads wait until write completes | write executes as soon as earlier reads complete | write blocks on lowPrioLock |

So
the same:

  • slow reads still scheduled after writes (for fairness)

new:

  • fast reads can execute any time except during a write

So this tackles precisely the 2nd case.
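A small usage sketch of that second case, reusing the illustrative PriorityRWMutex sketch from earlier in this thread (the method names and the sleeps are illustrative only):

```go
package memory

import "time"

// Walkthrough of case [2]: a slow read is in progress, a write arrives and
// waits, and a fast read still proceeds immediately. The sleeps only stage
// the timeline for illustration.
func exampleSlowReadThenWrite(pm *PriorityRWMutex) {
	// slow index query (low-priority read)
	go func() {
		t := pm.RLockLow()
		defer pm.RUnlockLow(t)
		time.Sleep(5 * time.Second) // pretend this is a 5s index query
	}()

	time.Sleep(100 * time.Millisecond)

	// a write (e.g. a WriteQueue flush) arrives: Lock() parks on lowPrioLock
	// until the slow reader finishes
	go func() {
		pm.Lock()
		defer pm.Unlock()
	}()

	time.Sleep(100 * time.Millisecond)

	// a fast read (e.g. Update during ingest) arrives while the write waits:
	// it only needs pm.lock, which the writer has not reached yet, so it
	// returns right away instead of queueing behind the 5s query
	pm.RLockHigh()
	pm.RUnlockHigh()
}
```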

@shanson7 (author) commented:

Yes! Excellent breakdown.

@Dieterbe (Contributor) commented:

So, what difference in behavior do you observe with this PR applied? Were you observing missing data due to the ingestion blocks? Or was it more along the lines of instances marking themselves as unavailable due to too high priority (kafka lag) ?

I suspect that your request latencies are unchanged?

@shanson7 (author) commented:

> So, what difference in behavior do you observe with this PR applied? Were you observing missing data due to the ingestion blocks? Or was it more along the lines of instances marking themselves as unavailable due to too high priority (kafka lag)?

Yes, this mainly solves the issue of instances getting marked unavailable due to queries that took a few seconds to execute and blocked ingest at the same time a write was waiting.

> I suspect that your request latencies are unchanged?

We did see a very small improvement to long tail latencies, but I expect that was mostly due to fewer "bursts" of ingest after being blocked for a few seconds. Most of the latency improvement we got stemmed from enabling the write queue so the write locks were acquired less frequently.

@Dieterbe self-assigned this on Aug 24, 2020
@Dieterbe added this to the sprint-16 milestone on Aug 24, 2020
@Dieterbe (Contributor) commented:

the prioritymutex makes sense to me, though i need to review the writeQueue buffering/flushing changes still.
how do you feel about the quality of this? have you been running this in prod?

@shanson7 (author) commented:

> the prioritymutex makes sense to me, though i need to review the writeQueue buffering/flushing changes still.
> how do you feel about the quality of this? have you been running this in prod?

High confidence. We've been running it in prod for about 2 months now.

@Dieterbe changed the title from "WIP - Priority index locking and logging of long operations" to "Priority index locking and logging of long operations" on Aug 26, 2020
@Dieterbe (Contributor) commented:

The write queue change makes sense too. Since getting the index lock can be slow, we keep ingesting into the write queue until we acquire it, and thus may breach maxBuffered by however many entries come in while we wait for the index lock. Seems like a very sensible tradeoff.

@Dieterbe merged commit 4e5172e into grafana:master on Aug 26, 2020
@shanson7 deleted the log_long_idx_locks branch on October 22, 2021