
Replace Z3 phantom reference map by doubly-linked list. #355

Merged
merged 15 commits into from
Feb 13, 2024

Conversation

ThomasHaas
Contributor

@ThomasHaas ThomasHaas commented Feb 4, 2024

I played around with the user propagator feature I implemented in #349 and ran into a performance problem related to phantom references. The issue is that the user propagator creates a BooleanFormula wrapper whenever the SMT solver reports a variable assignment, causing a phantom reference to get created.
This quickly generates hundreds of thousands (or even millions) of phantom references, and the HashMap that is used to manage them degrades heavily: 80%(!) of all solving time was spent just accessing the map (4 out of 5 minutes of total solving time!).

In this PR I propose a simple (general-purpose) solution by replacing the map with a doubly-linked list.
Doubly-linked lists provide the necessary O(1) deletion and insertion time while scaling to arbitrary sizes.
Furthermore, the memory footprint of the list should be strictly smaller than that of the previously used IdentityHashMap.

On my test benchmark, the solving time was reduced from 5 minutes down to 70 seconds, and the phantom reference overhead was just 10% rather than 80% of the total time. Interestingly, my profiler reported the 10% to be mostly from ReferenceQueue.poll rather than the actual cleanup code (the profiler might be inaccurate though).
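The core idea can be sketched as phantom references that are their own doubly-linked list nodes (an intrusive list). This is a simplified standalone illustration, not the PR's actual code (the real class is Z3AstReference and differs in detail):

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;

// Sketch: each phantom reference carries its own prev/next pointers, so
// insertion and removal are O(1) and the structure shrinks as soon as
// entries are unlinked. Names here are illustrative.
final class AstReference extends PhantomReference<Object> {
  final long ast; // native AST pointer to decRef once the wrapper is collected
  AstReference prev;
  AstReference next;

  AstReference(Object referent, ReferenceQueue<Object> queue, long ast) {
    super(referent, queue);
    this.ast = ast;
  }

  // Link this node directly after a sentinel head: O(1), no hashing, no resize.
  void insertAfter(AstReference head) {
    prev = head;
    next = head.next;
    if (head.next != null) {
      head.next.prev = this;
    }
    head.next = this;
  }

  // Remove this node: O(1), and its memory is reclaimable immediately.
  void unlink() {
    if (prev != null) {
      prev.next = next;
    }
    if (next != null) {
      next.prev = prev;
    }
    prev = null;
    next = null;
  }
}
```

With a sentinel head node, both operations touch only a constant number of pointers, regardless of how many millions of references are alive.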

@ThomasHaas ThomasHaas changed the title Replaced Z3 phantom reference map by doubly-linked list. Replace Z3 phantom reference map by doubly-linked list. Feb 4, 2024
@PhilippWendler
Member

Furthermore, the memory footprint of the list should be strictly smaller than that of the previously used IdentityHashMap.

Really?

The doubly-linked list has 2n references + n longs, so 192 bits per entry. We do not need to count the object header of Z3AstReference objects because we have the PhantomReference objects anyway.

The hash map consists of one array with a size of 2n to 4n references (references for key and value, with overhead for empty cells), so 128 to 256 bits per entry. The hash map also references Long objects, which are free if they happen to exist anyway because they are referenced somewhere else as well, or cost another 192 bits per entry. But I don't know which is the case here.

So if I did not overlook something, the list is not strictly smaller than the map, if the latter happens to be >= 75% full and the Longs are referenced anyway.

But I see a major additional argument: The memory consumption of the list shrinks again once fewer elements exist. IdentityHashMap does not implement any shrinking of the table, so if the program creates a lot of Z3Formula objects and frees them afterwards, the memory consumed by the table is kept occupied until the whole solver context is freed.

Of course, having add/remove be guaranteed O(1) instead of just amortized is also nice.
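The back-of-envelope numbers above can be restated explicitly. This assumes 64-bit reference and long slots (no compressed oops), as in the comment; these are estimates, not measurements:

```java
// Per-entry memory estimates from the discussion above, in bits.
// All numbers are rough estimates under the stated assumptions.
final class FootprintEstimate {
  static final int SLOT = 64; // bits per reference or per long, 64-bit JVM assumed

  // Doubly-linked list: prev + next reference plus one long per entry.
  static final int LIST_BITS_PER_ENTRY = 2 * SLOT + SLOT; // 192

  // IdentityHashMap: key + value slot per entry, table between 50% and
  // 100% full, i.e. 2n to 4n references for n entries.
  static final int MAP_BITS_PER_ENTRY_MIN = 2 * SLOT; // 128
  static final int MAP_BITS_PER_ENTRY_MAX = 4 * SLOT; // 256

  // A boxed Long that is not shared elsewhere: header plus value, roughly.
  static final int BOXED_LONG_BITS = 192;

  public static void main(String[] args) {
    System.out.println(LIST_BITS_PER_ENTRY);                      // 192
    System.out.println(MAP_BITS_PER_ENTRY_MIN);                   // 128
    System.out.println(MAP_BITS_PER_ENTRY_MAX + BOXED_LONG_BITS); // 448
  }
}
```

So depending on table occupancy and Long sharing, the map costs anywhere from 128 to 448 bits per entry, which is why the comparison with the list's fixed 192 bits is not clear-cut.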

src/org/sosy_lab/java_smt/solvers/z3/Z3FormulaCreator.java: three review threads (outdated, resolved)
Comment on lines 931 to 989

// Force clean all ASTs, even those which were not GC'd yet.
// Is a no-op if phantom reference handling is not enabled.
for (long ast : referenceMap.values()) {
Native.decRef(getEnv(), ast);
}
Member

What is the reason for deleting this?

Contributor Author

This was by accident because I misinterpreted the purpose. I added it back.

However, I noticed that if the PhantomReferences are already enqueued, then they will never get removed from the queue after closing (until the context object itself gets collected).
Unfortunately, ReferenceQueue.clear does not exist and one has to poll all entries to clear them.
Should I add this or does this not really matter?
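Since `ReferenceQueue` indeed has no `clear()`, such a cleanup on close would have to poll in a loop. A minimal sketch of what that could look like (illustrative, not the PR's code):

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

// ReferenceQueue offers no clear(); the only way to empty it is to poll
// until null is returned. poll() is non-blocking, so this terminates as
// soon as the queue is drained.
final class QueueDraining {
  static void drain(ReferenceQueue<?> queue) {
    Reference<?> ref;
    while ((ref = queue.poll()) != null) {
      // Deliberately do nothing with the reference: after close() the
      // native objects were already freed in bulk, so no per-reference
      // decRef may happen here.
    }
  }
}
```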

Member

You mean those references that have been enqueued before close() is called? These could in principle be cleaned up, but note that there will likely also be references that are enqueued after close() is called, and there is nothing we can do about them, so I do not think we need to bother with the already enqueued ones.

Once the solver context becomes unreachable and is cleaned up by GC, the queue and all PhantomReferences are also cleaned up. So this affects only users who close the context but then keep a reference to it for a longer time.

Contributor Author

See my answer to the bulk deletion point you raised further below.

@ThomasHaas
Contributor Author

Furthermore, the memory footprint of the list should be strictly smaller than that of the previously used IdentityHashMap.

Really?

I was expecting this response :).

So if I did not overlook something, the list is not strictly smaller than the map, if the latter happens to be >= 75% full and the Longs are referenced anyway.

It is true that I accidentally included the boxing of the longs in my calculation, but those boxed values already exist in all cases where the function is used. Nevertheless, the load factor of IdentityHashMap is 66.67% (it resizes if 3*size > 2*capacity), guaranteeing that for each stored object there is at least one free cell, hence the map uses between 3n and 4n references for n entries.
The proposed list is always 3n in memory size and thus smaller. A maximally filled map might be identical in size to the list, but for all practical purposes I would still claim that the list is strictly smaller :).

But I see a major additional argument: The memory consumption of the list shrinks again once fewer elements exist. IdentityHashMap does not implement any shrinking of the table, so if the program creates a lot of Z3Formula objects and frees them afterwards, the memory consumed by the table is kept occupied until the whole solver context is freed.

Good point. This also means the upper bound of 4n references does not even hold in general. The map's memory consumption is unbounded.

Of course, having add/remove be guaranteed O(1) instead of just amortized is also nice.

Yes, it performs A LOT better in extreme cases where the hashmap's operations start to approach linear time.

ThomasHaas and others added 2 commits February 6, 2024 10:43
Co-authored-by: Philipp Wendler <2545335+PhilippWendler@users.noreply.github.com>
@ThomasHaas
Contributor Author

ThomasHaas commented Feb 7, 2024

I noticed that the Z3FormulaCreator is full of calls to clearReferences, pretty much everywhere where objects get created.
How about moving the call to clearReferences to inside storePhantomReference so that all code that generates phantom references also clears them? This would make all existing calls to clearReferences obsolete.
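The suggestion could look roughly like this. Names and signatures are hypothetical, not the real Z3FormulaCreator API; the point is only that every creation of a phantom reference also drains the queue, making the scattered clearReferences() calls redundant:

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: storePhantomReference() itself performs the cleanup,
// so every code path that creates phantom references also clears them.
final class PhantomStore {
  static final class AstRef extends PhantomReference<Object> {
    final long ast;
    AstRef(Object referent, ReferenceQueue<Object> q, long ast) {
      super(referent, q);
      this.ast = ast;
    }
  }

  private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
  final List<Long> cleaned = new ArrayList<>(); // stands in for Native.decRef calls

  AstRef storePhantomReference(Object wrapper, long ast) {
    AstRef ref = new AstRef(wrapper, queue, ast); // real code also links it into the list
    clearReferences(); // piggy-back the cleanup on every creation
    return ref;        // returned only to keep this sketch testable
  }

  private void clearReferences() {
    Reference<?> ref;
    while ((ref = queue.poll()) != null) {
      cleaned.add(((AstRef) ref).ast); // real code: Native.decRef(env, ast) and unlink
    }
  }
}
```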

… comparison.

Z3 uses internal formula hashing,
thus there should not be multiple pointers to the same boolean constants.

This reduces the number of JNI calls for some simple cases.
Build-dependencies produce the largest artifact, around 160 MB for JavaSMT.
We do not need those artifacts to stay around for longer.
The latest artifact of a repository is kept for longer than this expiration time,
so this change should not affect the reuse of any build-cache.
Member

@kfriedberger kfriedberger left a comment

I could not detect a breaking example or unit test. The code itself looks valid.
However, I am not sure whether we need/want to merge this PR.

Executing a command like
java -cp bin:lib/java/runtime-z3/common.jar:lib/java/runtime-z3/com.microsoft.z3.jar org.sosy_lab.java_smt.example.nqueens_user_propagator.NQueens 11 2
or
java -cp bin:lib/java/runtime-z3/common.jar:lib/java/runtime-z3/com.microsoft.z3.jar org.sosy_lab.java_smt.example.nqueens_user_propagator.NQueens 11 3
showed no speedup; the example runs for about 6.0-6.5s with and without this change.
Perhaps the speedup was related to other optimizations implemented for Z3 or user propagation.

@ThomasHaas
Contributor Author

ThomasHaas commented Feb 13, 2024

I could not detect a breaking example or unit test. The code itself looks valid. However, I am not sure whether we need/want to merge this PR.

Executing a command like java -cp bin:lib/java/runtime-z3/common.jar:lib/java/runtime-z3/com.microsoft.z3.jar org.sosy_lab.java_smt.example.nqueens_user_propagator.NQueens 11 2 or java -cp bin:lib/java/runtime-z3/common.jar:lib/java/runtime-z3/com.microsoft.z3.jar org.sosy_lab.java_smt.example.nqueens_user_propagator.NQueens 11 3 showed no speedup; the example runs for about 60-65s with and without this change. Perhaps the speedup was related to other optimizations implemented for Z3 or user propagation.

Have you enabled phantom references (they are disabled by default)?
Also, I made the user propagator robust(*) against this feature by reusing existing wrappers and creating only half as many wrappers as before.
Even if I disable the optimization, Nqueens (N=13) does not run into performance problems but it still causes the reference map to reach a size of 7+ million entries, meaning it will consume over 1GB of memory until the whole context is dropped (due to the lack of shrinking).

(*) The robustness is only for formulas reported by the user propagator itself. If the user decides to create new formulas from inside the callbacks, they can still end up with millions of phantom references and a possibility of encountering bad map performance.

EDIT: Btw. on what hardware are you running the Nqueens example? My machine runs 11-Queens in 2 seconds rather than 60. I'm surprised by such a big discrepancy.

@kfriedberger
Member

kfriedberger commented Feb 13, 2024

First: I forgot to enable phantom references. Thanks for the hint.

Second: I copied the time numbers and forgot the decimal separator (too late in the night :-) ). My older private laptop requires about 6.0-6.5s for the example. I measured again on a bigger machine just now and get about 2s when executing NQueens of size 11 with method 2 or 3.

For n=12 and method 2, the big machine requires about 20s with map-stored phantom references, and also 20s with the list-based references.
For n=13 and method 2, it's about 230s with map-based references and 228s with list-based references.

Measuring further examples might not bring any benefit on my side. From a theoretical point of view, the space complexity of list and map is quite similar, and so is the general time complexity. The worst-case time complexity is better for the list-based approach. -> I will merge this PR.

@kfriedberger kfriedberger merged commit a24e437 into sosy-lab:master Feb 13, 2024
3 checks passed