Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Metadata from incremental isn't the same as non-incremental #47066

Closed
johnklai1 opened this issue Dec 29, 2017 · 12 comments
Closed

Metadata from incremental isn't the same as non-incremental #47066

johnklai1 opened this issue Dec 29, 2017 · 12 comments
Labels
A-incr-comp Area: Incremental compilation A-metadata Area: Crate metadata

Comments

@johnklai1
Copy link

johnklai1 commented Dec 29, 2017

Updated report

Looks like metadata generated from a cold cache of incremental compilation ends up being byte-for-byte different than metadata generated from a warm cache of incremental compilation. This can be seen with this script: https://gist.github.com/alexcrichton/80c976db507a7f56a16a1f15b25293e4

Original report

It looks like #46846 may not be fully resolved.

After updating to a recent nightly I can no longer reproduce the non-determinism on Linux, but I am still seeing it on OS X:

jklai-mbp:~/serde $ md5 target/debug/deps/libserde_derive*
MD5 (target/debug/deps/libserde_derive-4b2dfc743823c6ab.dylib) = 21c2da4a6eebec6fd9dc7f80599b6f15
md5: target/debug/deps/libserde_derive-4b2dfc743823c6ab.dylib.dSYM: Is a directory
MD5 (target/debug/deps/libserde_derive_internals-99a522a1137b3a68.rlib) = 8802e6f9f3831842fc7af2708f87ce4d
jklai-mbp:~/serde $ rm target/debug/deps/libserde_derive-4b2dfc743823c6ab.dylib
jklai-mbp:~/serde $ cargo build
   Compiling serde_derive v1.0.26 (file:///Users/jklai/serde/serde_derive)
    Finished dev [unoptimized + debuginfo] target(s) in 2.79 secs
jklai-mbp:~/serde $ md5 target/debug/deps/libserde_derive*
MD5 (target/debug/deps/libserde_derive-4b2dfc743823c6ab.dylib) = 706cbc1d8a9946579e2f6a19690189c4
md5: target/debug/deps/libserde_derive-4b2dfc743823c6ab.dylib.dSYM: Is a directory
MD5 (target/debug/deps/libserde_derive_internals-99a522a1137b3a68.rlib) = 8802e6f9f3831842fc7af2708f87ce4d

serde commit:

jklai-mbp:~/serde $ git log -n 1
commit 9cfcd78c876c9d5f0806baf09514832b2d1d9115

rustc version:

jklai-mbp:~/serde $ rustc --version --verbose
rustc 1.24.0-nightly (77e189cd7 2017-12-28)
binary: rustc
commit-hash: 77e189cd79ce98cf8d39082157445db54696f316
commit-date: 2017-12-28
host: x86_64-apple-darwin
release: 1.24.0-nightly
LLVM version: 4.0

xcode version:

jklai-mbp:~/serde $ /usr/bin/xcodebuild -version
Xcode 7.3.1
Build version 7D1014
@johnklai1
Copy link
Author

@alexcrichton who helped look into #46846

@alexcrichton
Copy link
Member

alexcrichton commented Dec 30, 2017

Aha now this one's fascinating! Using the same script from #46846 I get:

--- output      2017-12-30 09:41:53.000000000 -0800
+++ output2     2017-12-30 09:40:24.000000000 -0800
@@ -204,7 +204,7 @@
 arg: -o
 arg: /Users/acrichton/code/serde/target/debug/deps/libserde_derive-e81c055c9829c0e5.dylib
 arg: /Users/acrichton/code/serde/target/debug/deps/serde_derive-e81c055c9829c0e5.crate.metadata.rcgu.o
- sha1: 30aae66513a4a2b872458585f9fbf409f4552fd0
+ sha1: 2f9b0e3dfc7ef9016e9fa06e5aac78b71d534cb5
 arg: /Users/acrichton/code/serde/target/debug/deps/serde_derive-e81c055c9829c0e5.crate.allocator.rcgu.o
  sha1: 0968fc565a4a0a2d8914a70cf05adba93efe0df0
 arg: -Wl,-dead_strip

So the metadata is changing! (cc @michaelwoerister, minimal test case)

Interestingly I can only reproduce this nondeterminism though on the first incremental rebuild. That is:

$ rm -rf target
$ cargo +nightly build -p serde_derive && md5 target/debug/deps/libserde_derive*.dylib
MD5 (target/debug/deps/libserde_derive-e81c055c9829c0e5.dylib) = 81edae9023e3aba77074074ca415a013
$ rm -rf target/debug/deps/libserde_derive* && cargo +nightly build -p serde_derive && md5 target/debug/deps/libserde_derive*.dylib
MD5 (target/debug/deps/libserde_derive-e81c055c9829c0e5.dylib) = ed0b3fc7813d7ec6c7ffcb44fb290550
$ rm -rf target/debug/deps/libserde_derive* && cargo +nightly build -p serde_derive && md5 target/debug/deps/libserde_derive*.dylib
MD5 (target/debug/deps/libserde_derive-e81c055c9829c0e5.dylib) = ed0b3fc7813d7ec6c7ffcb44fb290550
# ...

@johnklai1 is this what you're seeing as well? Or is the md5 of libserde_derive.dylib constantly changing? Or only after the first build?


I originally thought this was related to incremental compilation, but once I set CARGO_INCREMENTAL=0 then the md5 is actually always changing! What's really odd here though is that the input to the linker is constant (same sha1 and everything). That to me means that either the linker is nondeterministic (unlikely) or something else after linking (dsymutil maybe?) is nondeterministic.

I was having trouble narrowing down what's going on here. I took a diff of the xxd output on both files and there's only a few bytes of difference. (both files are included there as well).

Now I wasn't quite sure what was going on here so I tried to reduce it. Turns out ...

$ rustc +nightly /dev/null -C debuginfo=1 -o d.dylib --crate-type proc-macro && md5 d.dylib
MD5 (d.dylib) = 210b2835a8ac8f9d86ba809f5b0c35d6
$ rustc +nightly /dev/null -C debuginfo=1 -o d.dylib --crate-type proc-macro && md5 d.dylib
MD5 (d.dylib) = 971fc21184955eac740d207e38a47b2f

This is where things started to get real crazy. With the help of goblin (took me forever to realize to use that) I hunted this down to...

# Compile an object file with a symbol in it
$ echo 'void foo() {}' > foo.c
$ cc -g foo.c -o foo.o -c

# Link that object to a shared library, take a look at the output
$ cc foo.o -m64 -dynamiclib -o libfoo.dylib -Wl,-dylib
$ md5 libfoo.dylib
MD5 (libfoo.dylib) = e60e735b7c919c19259daddd04a625c8

# update the timestamp on the object file
$ sleep 1
$ touch foo.o

# now link the same way we did above
$ cc foo.o -m64 -dynamiclib -o libfoo.dylib -Wl,-dylib
$ md5 libfoo.dylib
MD5 (libfoo.dylib) = 9754a78562696bbe5912efd9fc892a83

Turns out with debuginfo on OSX somehow the timestamp of the object file is going into the final output. That's why this isn't showing up in incremental compilation because the timestamp isn't changing (we're reusing the object file). That is, however, why this is showing up with random empty files!

I... dunno what to do about this latter one. I suppose rustc could always set the mtime to all input object files to something deterministic though?

@alexcrichton
Copy link
Member

Looking at the open source code available for ld64 it appears there's no way to not use the literal mtime without completely emitting debuginfo altogether (the -S argument to the linker).

It looks like lld also unconditionally gets the mtime without an option to reset it...

Our only recourse on the latter issue here may be to just always reset the mtime to a known value. That wouldn't help anyone linking in C code but it could hopefully help pure-Rust projects at least?

@alexcrichton
Copy link
Member

I've also minimized the "first time incremental metadata changes" to https://gist.github.com/alexcrichton/80c976db507a7f56a16a1f15b25293e4 I believe.

@johnklai1
Copy link
Author

Wow, awesome investigation @alexcrichton !

@johnklai1 is this what you're seeing as well? Or is the md5 of libserde_derive.dylib constantly changing? Or only after the first build?

Yeah, this appears to be what I am seeing as well. If I run cargo clean followed by cargo build I get a new md5. Then if I remove the libserde_derive.dylib file and run cargo build again I get a different md5. But then removing the dylib and re-running cargo build no longer changes the md5.

I'm taking a look at Windows too and will report back with my findings.

@alexcrichton alexcrichton changed the title serde compilation is not deterministic Metadata from incremental isn't the same as non-incremental Dec 31, 2017
@alexcrichton
Copy link
Member

I've extracted the OSX issue to #47086, retitled this issue to focus on the incremental problem, and updated the OP to have the current test case I've been using.

@johnklai1 ok cool thanks for confirming. Windows will for sure be affected by this specific issue (incremental compilation with a warm and cold cache causes different outputs b/c the metadata object file is changing). Whether or not Windows suffers from an issue like #47086 though I'm not sure, and would for sure be good to investigate :)

@michaelwoerister I did some more debugging in why the metadata is changing. I'm not super familiar with metadata and such and I found it difficult to debug, but I'm relatively certain that what's happening here is that a Span is being encoded differently between two runs of the compiler. The good news is that it's the same span, it's just that the encoding is different.

I believe that the closure in the test case has the MIR make its way into the metadata (not sure why but hey). I think that the encoded MIR for this closure is what's different in the metadata (again though, I find it difficult to say this with certainty).

What I believe, though, is that with a cold cache (no incremental directory) a span is generated with the encoding 00 00 (two zero bytes). The span in question I believe has hi/lo of both zero. With a warm cache it appears to be encoded as fa bb c3 02 ac bc c3 02 (which if I got uleb128 right I think is 5299706 and 5299756). However with the logging I have I believe both spans are indeed decoded as 0 for hi/lo, so this is just an encoding issue and not a "oh no spans aren't deterministic" issue.

I noticed that there's a pretty tricky implementation for spans being decoded, and I wonder if that's involved here perhaps? Do you know if the set of imported_filemaps changes over time between compilation sessions?

@michaelwoerister
Copy link
Member

I think this problem arises due to our handling of invalid spans: Invalid spans (e.g. with lo > hi) cannot be properly represented in incr. comp. cache, so we map them to DUMMY_SP during serialization there. We don't do exactly the same for crate metadata. There we save them as they are but map them to DUMMY_SP during deserialization. As far as the compiler is concerned, there is no difference between the approaches (it always sees a DUMMP_SP) but the binary contents can be different.

This should be easily fixable by also adjusting invalid spans at serialization time for crate metadata.

Thanks for debugging into this, @alexcrichton!

@michaelwoerister michaelwoerister added A-incr-comp Area: Incremental compilation A-metadata Area: Crate metadata labels Jan 3, 2018
@petrochenkov
Copy link
Contributor

petrochenkov commented Jan 3, 2018

It's strange that spans with lo > hi even exist because they are eliminated in Span::new (

encode(&match lo <= hi {
true => SpanData { lo, hi, ctxt },
false => SpanData { lo: hi, hi: lo, ctxt },
})
).
You have to work with SpanData exclusively without ever going through Span to get spans with lo > hi, but this never happens in rustc as far as I can see.

@michaelwoerister
Copy link
Member

From what I can tell, these invalid spans originate mostly (or even exclusively) from macro expansion. Another case of invalid span is when lo points to a different FileMap than hi.

@petrochenkov
Copy link
Contributor

Oh wait, looks like decoder doesn't work with spans, but with raw byte positions from codemap

bors added a commit that referenced this issue Jan 7, 2018
…tadata-encoding, r=alexcrichton

Map invalid Spans to DUMMY_SP during crate metadata encoding.

This mirrors what we do for stabilizing the incr. comp. cache and is necessary for reproducible builds. For the incr. comp. cache, we *have* to do this because the encoding there cannot represent broken Spans. Metadata encoding has to be in sync with that as not to get unexpected interactions when compiling incrementally.

This should help with fixing issue #47066.

r? @alexcrichton
@michaelwoerister
Copy link
Member

The latest nightly should contain the fix from #47177.

@alexcrichton
Copy link
Member

Looks to be so, thanks @michaelwoerister!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-incr-comp Area: Incremental compilation A-metadata Area: Crate metadata
Projects
None yet
Development

No branches or pull requests

4 participants