Reuse VMs when possible #898
Conversation
Force-pushed from 3ecdfc3 to b2cd809
nice, this was fun to review, cheers. the benchmark results look promising 👍
```go
type vmPool struct {
	mutex     sync.Mutex
	available map[string][]*jsonnet.VM
}
```
I'm wondering if a more fine-grained locking structure would provide benefit, something like this:
- In the `vmPool`, have a `sync.RWMutex`
- Make `available`'s value type a `struct { []*jsonnet.VM; sync.Mutex }`
- Lock the `RWMutex` for reading unless creating a slice for the first time
That way, you would only have to hold the global lock for one call (to create a struct instance and put it in the map); otherwise, you would only lock your own slice.
wdyt?
More deep / hard stuff follows, might not be worth it
Alternatively, would it be possible to precompute all the hashes so that the empty pool is initialised in advance, eliminating the global `RWMutex`?
Then, building on that (assuming it can be done reasonably), you could go one step further still and have an unbuffered channel per optionset. The VMs would be directly handed between environments then - if you finish and you know someone is going to want your VM, block until they get started. Otherwise discard it.
```go
	return vm
}

// Release returns a Jsonnet VM to the pool
func (p *vmPool) Release(vm *jsonnet.VM, opts Opts) {
```
Looks like VMs are never going to be garbage collected, is that right? How heavy are these objects? Could that be a problem with a huge set of environments?
It's definitely a leak. I could make it an option (easy, as we pass the opts object). What do you think of that?
AFAICS the interesting part is figuring out when to do the release, so sounds good but hard to tell without seeing what it looks like 😁
Alternatively, `sync.Pool` might be a pragmatic way to do this, trading off being able to guarantee reuse for GC-ability? One pool per hash.
Whenever ext code and imports are unchanged, VMs have a cache that can be useful for future evaluations. This PR adds a dumb pool that keeps the latest VM that was used for that same eval path.

I also added a shuffling of envs, so that the likelihood of reusing a VM is higher. When envs are done evaluating, their VM is put back in the pool (with an enriched cache from their eval).

An important thing to note is that VMs cannot be used concurrently, so we need to remove them from the pool when they are used.
Force-pushed from b2cd809 to 22ecb22
I won't move forward with this one. The gain is too small to warrant the memory leak.