Reduce concurrency bottlenecks in the table and block caches #172
Conversation
petermattis
commented
Jul 3, 2019
- Shard tableCache and cache.Cache
- Update tableCacheNode.refCount atomically.
Force-pushed from 0768b6d to c3c68fc
I've added a 3rd commit which implements the idea mentioned in https://github.com/petermattis/pebble/issues/11#issuecomment-508116863 for
Force-pushed from 9ac3500 to 7137a7d
The sharding of the table and block caches takes the same strategy: create num-cpu shards and select the shard based on the key. Each shard is configured with size/num-shards capacity. The table and block cache mutexes were showing up prominently on blocking profiles and were demonstrably limiting performance on concurrent scan workloads. On a 16 core machine, a scan-only workload previously showed:

```
  N     Before      After
  1    137,016    160,719 ( +17%)
 10    433,436    968,601 (+123%)
100    416,035    909,192 (+118%)
```

The first column (N) is the number of concurrent workers. Note that at 100 concurrent workers the performance was somewhat "jumpy", which is something that deserves further investigation.
Also update tableCacheShard.mu.releasing atomically, though the field is still present in tableCacheShard.mu because it is decremented while holding that lock. This removes the need to grab tableCacheShard.mu in the common case when closing an iterator.
Updating the tableCacheShard LRU lists requires mutually exclusive access to the list. Grabbing and releasing tableCacheShard.mu for each access shows up prominently on blocking profiles. There is a known technique for avoiding this overhead: record the access in a per-thread data structure and then batch apply the buffered accesses when the buffer is full. For a highly concurrent scan workload, this results in a significant improvement in both throughput and stability of the throughput numbers.

Here are numbers when running with 100 concurrent scanners before this change:

```
____optype__elapsed_____ops(total)___ops/sec(cum)
  scan_100   120.0s      109476226       912303.4
```

After:

```
____optype__elapsed_____ops(total)___ops/sec(cum)
  scan_100   120.0s      151736009      1264384.6
```

That's a 39% speedup. Even better, the instantaneous numbers during the run were stable. See #11
The Clock-PRO algorithm doesn't require exclusive access to internal state on hits. Switch to using a RWMutex, which reduces a source of contention in cached workloads. For a concurrent scan workload, performance before this commit was:

```
____optype__elapsed_____ops(total)___ops/sec(cum)
  scan_100   120.0s      151736009      1264384.6
```

After:

```
____optype__elapsed_____ops(total)___ops/sec(cum)
  scan_100   120.0s      170326751      1419354.7
```

That's a 12% improvement in throughput. See #11
Force-pushed from 502bcfb to 5689eb2
Reviewed 3 of 3 files at r1, 1 of 1 files at r2, 3 of 3 files at r3, 1 of 1 files at r4, 3 of 3 files at r5.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @ajkr)
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @petermattis)
table_cache.go, line 207 at r5 (raw file):
```go
// exclusively.
hits := c.hitsPool.Get().(*tableCacheHits)
hits.flushLocked()
```
I like this way of batching updates to the LRU list until a miss occurs (or 64 hits). MyRocks really could've used something like this too.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @petermattis)
table_cache.go, line 207 at r5 (raw file):
Previously, ajkr (Andrew Kryczka) wrote…
I like this way of batching updates to the LRU list until a miss occurs (or 64 hits). MyRocks really could've used something like this too.
Oh, this is the table cache. They needed it for the block cache as they use max_open_files==-1. Seems you are avoiding the problem in the block cache, too, though I'm not familiar with the clock-pro algorithm yet.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ajkr)
table_cache.go, line 207 at r5 (raw file):
Previously, ajkr (Andrew Kryczka) wrote…
Oh, this is the table cache. They needed it for the block cache as they use max_open_files==-1. Seems you are avoiding the problem in the block cache, too, though I'm not familiar with the clock-pro algorithm yet.
The same idea should work for the block cache, it just isn't needed there. The idea comes from https://drive.google.com/file/d/0B8oWjCpZGTn3YVhCc2dZSU5DNFBJRlBZa0s1STdaUWR5emxj/view.