Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add memoise primop #8025

Open
Artturin opened this issue Mar 11, 2023 · 2 comments
Open

Add memoise primop #8025

Artturin opened this issue Mar 11, 2023 · 2 comments
Labels
feature Feature request or proposal language The Nix expression language; parser, interpreter, primops, evaluation, etc

Comments

@Artturin
Copy link
Member

Artturin commented Mar 11, 2023

just a issue to not have the memoise primop hidden and forgotten in a commit on a non merged branch 0395b9b

'builtins.memoise f x' is equal to 'f x', but uses a cache to speed up
repeated invocations of 'f' with the same 'x'. A typical application
is memoising evaluations of Nixpkgs in NixOps network
specifications. For example, with the patch below, the time to
evaluate a 10-machine NixOps network is reduced from 17.1s to 9.6s,
while memory consumption goes from 4486 MiB to 3089 MiB. (This is with
GC_INITIAL_HEAP_SIZE=16G.)

Priorities

Add 👍 to issues you find important.

@Artturin Artturin added the feature Feature request or proposal label Mar 11, 2023
@roberth roberth added the language The Nix expression language; parser, interpreter, primops, evaluation, etc label Mar 16, 2023
roberth referenced this issue Mar 17, 2023
'builtins.memoise f x' is equal to 'f x', but uses a cache to speed up
repeated invocations of 'f' with the same 'x'. A typical application
is memoising evaluations of Nixpkgs in NixOps network
specifications. For example, with the patch below, the time to
evaluate a 10-machine NixOps network is reduced from 17.1s to 9.6s,
while memory consumption goes from 4486 MiB to 3089 MiB. (This is with
GC_INITIAL_HEAP_SIZE=16G.)

Nixpkgs patch:

diff --git a/pkgs/top-level/impure.nix b/pkgs/top-level/impure.nix
index a9f21e45aed..f641067e022 100644
--- a/pkgs/top-level/impure.nix
+++ b/pkgs/top-level/impure.nix
@@ -79,7 +79,7 @@ in
 # not be passed.
 assert args ? localSystem -> !(args ? system || args ? platform);

-import ./. (builtins.removeAttrs args [ "system" "platform" ] // {
+builtins.memoise or (x: x) (import ./.) (builtins.removeAttrs args [ "system" "platform" ] // {
   inherit config overlays crossSystem;
   # Fallback: Assume we are building packages on the current (build, in GNU
   # Autotools parlance) system.
@roberth
Copy link
Member

roberth commented Mar 17, 2023

The function returned by memoise as implemented in 0395b9b is strict in its argument. I suppose this may be expected. However...

Nondeterministic strictness

Potentially worse is that it does not consistently perform a deep traversal, but rather evaluates attributes (etc) based on which other arguments the function has seen in the past. This breaks strong referential transparency. The strictness behavior of a function is observable, so it must not depend on unrelated past arguments.
This may be resolved by doing a proper deepSeq on the argument. This avoids the head scratching but does make memoise less useful and probably a little slower.
An alternative is to make the evaluation of the argument optional. If the comparison and lookup encounter an exception, we may catch the exception and behave accordingly. It's always possible to just assume that the argument is distinct from anything else, and therefore apply no memoization, but perhaps we could continue the comparison, maybe compare exceptions. If both the current argument and a cached argument behave the same, we may consider them equivalent. We might consider exceptions to be values throwing an exception to be smaller than everything for the purpose of MemoArgComparator. However, this assumes that the value produces an exception consistently and cheaply. This may not be the case! Some exceptions may be retried after tryEval.

Lazy memoise?

Intruiging idea, but too complex and suffers from similar issue with failing for the wrong reason because of memoization cache state (I may be wrong, but still quite confident that we don't have to try this)

Apologies for the not so polished braindump, but I have to share something, so that others can try to check my reasoning or just avoid this path altogether.

Wild idea; probably not useful but, but it feels close, so I'm sharing it anyway.

This one is probably too tricky. Return a proxy thunk representing the return value. The concept of a proxy value may also be useful for getFlake caching, or caching flake attributes between flakes. The proxy thunk remembers which thunk needs to be passed. When forced, it computes the whnf of the returned value, but unless it's a primitive value, it returns a copy where all the items are again proxy thunks. Whenever a thunk in the original argument is evaluated, a comparison can be made to other entries of the cache. Disadvantage: memoisation may not be effective at all in some cases, or until most of the computation is already done anyway. The proxy values create garbage that won't be deduplicated. Advantage: full lazy, it seems. Works for arguments that are identical by memory address. Strict memoisation can be recovered by passing a function that does a deepSeq first. Seems like an elegant property, but it is less efficient.
Perhaps if the proxy thunks are clever enough, they can evaluate precisely what's necessary to make sure that whatever computation they're doing is equivalent, by tracking that the evaluated parts of the input are equal. It seems that this should work, and that it supports deduplicating computations where the input is different but only in an irrelevant part of the input (which is not actually evaluated). Intriguing, but speculatively evaluating too much of the wrong thing may again mean that we crash for a completely arbitrary reason.

@roberth
Copy link
Member

roberth commented Mar 21, 2024

I've opened a draft PR for it, #10280.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Feature request or proposal language The Nix expression language; parser, interpreter, primops, evaluation, etc
Projects
None yet
Development

No branches or pull requests

2 participants