Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

node_code_cache.cc and node_snapshot.cc generation is unreproducible #29108

Closed
ChALkeR opened this issue Aug 13, 2019 · 9 comments
Closed

node_code_cache.cc and node_snapshot.cc generation is unreproducible #29108

ChALkeR opened this issue Aug 13, 2019 · 9 comments
Labels
build Issues and PRs related to build files or the CI.

Comments

@ChALkeR
Copy link
Member

ChALkeR commented Aug 13, 2019

Refs: nodejs/build#589

Currently, performing these steps on Linux:

  1. Download source https://nodejs.org/dist/latest/node-v12.8.0.tar.xz (or some other version)
  2. Extract the archive to /tmp/node-v12.8.0, build once with default configuration, rename the dir to /tmp/node-v12.8.0-1
  3. Extract the archive to /tmp/node-v12.8.0, build again with default configuration, rename the dir to /tmp/node-v12.8.0-2.
  4. Diff the two folders with diff -qr /tmp/node-v12.8.0-1 /tmp/node-v12.8.0-2.

Produces this result:

Files node-v12.8.0-1/node and node-v12.8.0-2/node differ
Files node-v12.8.0-1/out/Release/node and node-v12.8.0-2/out/Release/node differ
Files node-v12.8.0-1/out/Release/obj/gen/node_code_cache.cc and node-v12.8.0-2/out/Release/obj/gen/node_code_cache.cc differ
Files node-v12.8.0-1/out/Release/obj/gen/node_snapshot.cc and node-v12.8.0-2/out/Release/obj/gen/node_snapshot.cc differ
Files node-v12.8.0-1/out/Release/obj.target/node/gen/node_code_cache.o and node-v12.8.0-2/out/Release/obj.target/node/gen/node_code_cache.o differ
Files node-v12.8.0-1/out/Release/obj.target/node/gen/node_snapshot.o and node-v12.8.0-2/out/Release/obj.target/node/gen/node_snapshot.o differ

Note: it is important to extract from archive (to preserve the same modification timestamps) and build it from the same path, at least at this moment.

Note: on macOS, the diff is larger due to something unreproducible happening on the linker stage, let's check on Linux only for now. But on macOS, node_code_cache.cc and node_snapshot.cc generation is also affected.

That blocks reproducible builds afaik, and it at the first glance seems to be the only major cause behind mismatching binaries produced in the same environment with the process above.

Those files differ in generated array contents.
Perhaps someone knows what exactly is going on there?

@ChALkeR ChALkeR added the build Issues and PRs related to build files or the CI. label Aug 13, 2019
@addaleax
Copy link
Member

/cc @joyeecheung

@joyeecheung
Copy link
Member

The array contents are generated directly from outputs of V8 APIs...are those supposed to be reproducible? cc @hashseed

@joyeecheung
Copy link
Member

joyeecheung commented Aug 14, 2019

The APIs in questions are:

ScriptCompiler::CompileFunctionInContext() -> ScriptCompiler::CreateCodeCacheForFunction()

SnapshotCreator::CreateBlob() (and a bunch of SnapshotCreator::Add*() calls).

@bnoordhuis
Copy link
Member

bnoordhuis commented Aug 14, 2019

For node_code_cache.cc the reason is entropy. This fixes it although it's probably not a good idea:

diff --git a/tools/code_cache/mkcodecache.cc b/tools/code_cache/mkcodecache.cc
index defa1462ce..5c1c5b8cb4 100644
--- a/tools/code_cache/mkcodecache.cc
+++ b/tools/code_cache/mkcodecache.cc
@@ -38,6 +38,11 @@ int main(int argc, char* argv[]) {
     return 1;
   }
 
+  v8::V8::SetEntropySource([] (unsigned char* buffer, size_t length) {
+    memset(buffer, 0, length);
+    return true;
+  });
+
   std::unique_ptr<v8::Platform> platform = v8::platform::NewDefaultPlatform();
   v8::V8::InitializePlatform(platform.get());
   v8::V8::Initialize();

For node_snapshot.cc, I suspect there are additionally things like timestamps getting encoded in.

@bnoordhuis
Copy link
Member

Looking into node_snapshot.cc more, after disabling entropy the offsets of the read-only snapshot data and the first context still shift around by 16 bytes for some inexplicable reason.

That suggests the startup snapshot data is variable in size? The layout is this:

  1. Header
  2. Startup snapshot data
  3. Read-only snapshot data (shifts around)
  4. First context (shifts around)
  5. Second context (fixed at 0x3CB88 - que pasa?)

Note: those shifts also change the checksum in the header that's at offset 8-16.

Disabling entropy at least ensures that the blob size is the same, otherwise it fluctuates in size by about 100 bytes...

@bnoordhuis
Copy link
Member

I enabled --profile_deserialization and indeed it shows that sizes fluctuate across runs:

  ACTION _Users_bnoordhuis_src_v1_x_node_gyp_node_target_node_mksnapshot /Users/bnoordhuis/src/v1.x/out/Release/obj/gen/node_snapshot.cc
[Verifying snapshot checksum took 0.111 ms]
[Deserializing isolate (156512 bytes) took 4.070 ms]
[Deserializing context #0 (50304 bytes) took 0.368 ms]
[Deserializing context #0 (50304 bytes) took 0.387 ms]
[Serializing to 2776 bytes took 0.228 ms]
[Serializing to 784 bytes took 0.025 ms]
[Serializing to 2544 bytes took 0.048 ms]
Deserialization will reserve:
    287000 bytes per isolate
    124544 bytes per context #0
    182824 bytes per context #1
Snapshot blob consists of:
    174056 bytes in 6 chunks for startup
     24272 bytes for read-only
     50312 bytes in 31 chunks for context #0
     70536 bytes in 46 chunks for context #1

Second run:

  ACTION _Users_bnoordhuis_src_v1_x_node_gyp_node_target_node_mksnapshot /Users/bnoordhuis/src/v1.x/out/Release/obj/gen/node_snapshot.cc
[Verifying snapshot checksum took 0.114 ms]
[Deserializing isolate (156512 bytes) took 4.278 ms]
[Deserializing context #0 (50304 bytes) took 0.379 ms]
[Deserializing context #0 (50304 bytes) took 0.347 ms]
[Serializing to 2776 bytes took 0.264 ms]
[Serializing to 784 bytes took 0.022 ms]
[Serializing to 2544 bytes took 0.060 ms]
Deserialization will reserve:
    287000 bytes per isolate
    124544 bytes per context #0
    182824 bytes per context #1
Snapshot blob consists of:
    174056 bytes in 6 chunks for startup
     24272 bytes for read-only
     50288 bytes in 31 chunks for context #0  <-- variable
     70376 bytes in 47 chunks for context #1  <-- variable

Third run:

  ACTION _Users_bnoordhuis_src_v1_x_node_gyp_node_target_node_mksnapshot /Users/bnoordhuis/src/v1.x/out/Release/obj/gen/node_snapshot.cc
[Verifying snapshot checksum took 0.180 ms]
[Deserializing isolate (156512 bytes) took 4.647 ms]
[Deserializing context #0 (50304 bytes) took 0.649 ms]
[Deserializing context #0 (50304 bytes) took 0.607 ms]
[Serializing to 2776 bytes took 0.259 ms]
[Serializing to 784 bytes took 0.030 ms]
[Serializing to 2544 bytes took 0.051 ms]
Deserialization will reserve:
    287000 bytes per isolate
    124544 bytes per context #0
    182824 bytes per context #1
Snapshot blob consists of:
    174064 bytes in 6 chunks for startup <-- variable
     24272 bytes for read-only
     50312 bytes in 31 chunks for context #0 <-- variable
     70584 bytes in 46 chunks for context #1 <-- variable
diff --git a/tools/snapshot/node_mksnapshot.cc b/tools/snapshot/node_mksnapshot.cc
index f52cccb705..63e4b7dd40 100644
--- a/tools/snapshot/node_mksnapshot.cc
+++ b/tools/snapshot/node_mksnapshot.cc
@@ -19,6 +19,12 @@ int wmain(int argc, wchar_t* argv[]) {
 int main(int argc, char* argv[]) {
 #endif  // _WIN32
 
+  v8::V8::SetFlagsFromString("--profile_deserialization");
+  v8::V8::SetEntropySource([] (unsigned char* buffer, size_t length) {
+    memset(buffer, 0, length);
+    return true;
+  });
+
   if (argc < 2) {
     std::cerr << "Usage: " << argv[0] << " <path/to/output.cc>\n";
     return 1;

@ChALkeR
Copy link
Member Author

ChALkeR commented Aug 14, 2019

How is entropy used in the code cache process? Like, what exact properties does it affect?

I would have suspected that it could be related to hash table impl security, but anything going on there is static either way and does not change between Node.js launches as the code cache is generated at the build time, so it is predictable in runtime given that the Node.js version is fixed.

@bnoordhuis
Copy link
Member

How is entropy used in the code cache process? Like, what exact properties does it affect?

The user-visible use of entropy is Math.random(). Non-visible is ASLR, hash table randomization, and - wait for it! - instruction scheduling. Bet you didn't see that last one coming!

I poked at it some more and I discovered that changing mkcodecache.cc and node_mksnapshot.cc to use --random_seed=42 makes the build reproducible.

diff --git a/tools/code_cache/mkcodecache.cc b/tools/code_cache/mkcodecache.cc
index defa1462ce..bc45d46a43 100644
--- a/tools/code_cache/mkcodecache.cc
+++ b/tools/code_cache/mkcodecache.cc
@@ -26,6 +26,8 @@ int wmain(int argc, wchar_t* argv[]) {
 int main(int argc, char* argv[]) {
 #endif  // _WIN32
 
+  v8::V8::SetFlagsFromString("--random_seed=42");
+
   if (argc < 2) {
     std::cerr << "Usage: " << argv[0] << " <path/to/output.cc>\n";
     return 1;
diff --git a/tools/snapshot/node_mksnapshot.cc b/tools/snapshot/node_mksnapshot.cc
index f52cccb705..29f9e1cbcb 100644
--- a/tools/snapshot/node_mksnapshot.cc
+++ b/tools/snapshot/node_mksnapshot.cc
@@ -19,6 +19,8 @@ int wmain(int argc, wchar_t* argv[]) {
 int main(int argc, char* argv[]) {
 #endif  // _WIN32
 
+  v8::V8::SetFlagsFromString("--random_seed=42");
+
   if (argc < 2) {
     std::cerr << "Usage: " << argv[0] << " <path/to/output.cc>\n";
     return 1;

out/Release/node -p 'Math.random()' still prints a different number each time. V8 reseeds itself through crypto::EntropySource(), or by reading from /dev/urandom in non-crypto builds.

The hash seed is reseeded too (i.e., test/pummel/test-hash-seed.js still passes) so it might just be a Good Enough(TM) solution.

bnoordhuis added a commit to bnoordhuis/io.js that referenced this issue Aug 15, 2019
Use a fixed random seed to ensure that the generated sources are
identical across runs.

The final node binary still reseeds itself on start-up so there should
be no security implications caused by predictable random numbers (e.g.,
`Math.random()`, ASLR, the hash seed, etc.)

Fixes: nodejs#29108
@ChALkeR
Copy link
Member Author

ChALkeR commented Aug 15, 2019

instruction scheduling. Bet you didn't see that last one coming!

I surely didn't. 😮Thanks for the explanation!

@Trott Trott closed this as completed in 5116a6a Aug 20, 2019
targos pushed a commit that referenced this issue Aug 20, 2019
Use a fixed random seed to ensure that the generated sources are
identical across runs.

The final node binary still reseeds itself on start-up so there should
be no security implications caused by predictable random numbers (e.g.,
`Math.random()`, ASLR, the hash seed, etc.)

Fixes: #29108

PR-URL: #29142
Reviewed-By: Gus Caplan <me@gus.host>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build Issues and PRs related to build files or the CI.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants