Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wasm-emscripten-finalize: Make EM_ASM modifications optional #3044

Merged
merged 5 commits into from
Aug 19, 2020
Merged

Conversation

kripken
Copy link
Member

@kripken kripken commented Aug 14, 2020

wasm-emscripten-finalize renames EM_ASM calls to have the signature in
the name. This isn't actually useful - emscripten doesn't benefit from that. I
think it was optimized in fastcomp, and in upstream we copied the general
form but not the optimizations, and then EM_JS came along which is
easier to optimize anyhow.

This PR makes those changes optional: when not doing them, it just
leaves the calls as they are. Emscripten will need some changes to
handle that, but those are simple.

For convenience this adds a flag to "minimize wasm changes". The idea
is that this flag avoids needing a double-roll or other inconvenience
as the changes need to happen in tandem on the emscripten side.
The same flag can be reused for later changes similar to this one.
When they are all done we can remove the flag. (Note how the code
ifdefed by the flag can be removed once we no longer need the old
way of doing things - that is, the new approach is simpler on the
binaryen side).

See #3043

@kripken kripken requested a review from sbc100 August 14, 2020 22:54
@sbc100
Copy link
Member

sbc100 commented Aug 14, 2020

Would be nice to see the effect this change has in the test output. Could we run test/lld/ in this mode to see the difference?

@kripken
Copy link
Member Author

kripken commented Aug 14, 2020

Sure, here's the full diff:

https://gist.github.com/kripken/92242fbe40e6f1b008d946fa7cd65475

The tl;dr is that it avoids renaming the asm const functions, so no more signatures in the names:

- (import "env" "emscripten_asm_const_iii" (func $emscripten_asm_const_iii (param i32 i32 i32) (result i32)))
+ (import "env" "emscripten_asm_const_int" (func $emscripten_asm_const_int (param i32 i32 i32) (result i32)))
[..]
-   (call $emscripten_asm_const_iii
+   (call $emscripten_asm_const_int
[..]
   "declares": [
+    "emscripten_asm_const_int"
   ],

And the last part shows that the asm const are just normal imported functions now, added to the "declares" like any import.

Reading the full diff now, one odd part is this:

- (import "env" "emscripten_asm_const_i" (func $emscripten_asm_const_i (param i32) (result i32)))
+ (import "env" "_Z24emscripten_asm_const_intIJEEiPKcDpT_" (func $_Z24emscripten_asm_const_intIJEEiPKcDpT_ (param i32) (result i32)))

Were we demangling the C++ name mangling for asm consts..? Not sure I see where that would have been done. Anyhow, easy to support both on emscripten.

@sbc100
Copy link
Member

sbc100 commented Aug 14, 2020

We should probably declare those as extern "C" on the emscripten side I guess?

@kripken
Copy link
Member Author

kripken commented Aug 14, 2020

It looks like they already are. I verified that a C++ file built with em++ emits the C name. So it's just that a testcase here uses mangled names for some likely obsolete reason.

@sbc100
Copy link
Member

sbc100 commented Aug 14, 2020

OK, we should update the test input then (doesn't have to be part of this PR .. maybe we could do it first?)

I'm curious why these functions are now appearing in the "declares" section, but not being removing some some other section? Where were they appearing before? I guess they show up indirectly in asmConsts? In which case isn't adding them to declares redundant?

@kripken
Copy link
Member Author

kripken commented Aug 15, 2020

I don't think we can update the test outputs yet. The tests won't pass with the new outputs, not unless we flip the new flag on by default. But flipping it on by default will make it fail on emscripten CI.

These functions now appear as declares because that is how all imported functions are handled. Then the emscripten JS compiler tries to hook them up to JS library functions, etc. But the special code (that is disabled with the new flag, and will be removed in the future) used to do something else with them. It listed them in the metadata instead, and processed them on the emscripten side. (Note that we still need the metadata - it contains the JS contents of the EM_ASM code. The emscripten side will still parse that metadata, but it will do less with it.)

@kripken
Copy link
Member Author

kripken commented Aug 15, 2020

(Adding them to declares isn't redundant - it's just the normal behavior. And we need them to be in that list, so emscripten handles them normally. I.e., this is getting rid of special code and behavior on both sides.)

@sbc100
Copy link
Member

sbc100 commented Aug 15, 2020

What I meant my updating the tests inputs is just running scripts/test/generate_lld_tests.py. This will get rid of the C++ mangling assuming emscripten doesn't do that anymore. This script regenerates the wat file test inputs from .c files using emcc.

However I know you recently added a few tests to this directory wth .c equivalents so it might not work for all the tests. But I think its a good things to periodically run so the we know that our test inputs are truly representative of emcc outputs. Again, does not have to be part of this PR.

@sbc100
Copy link
Member

sbc100 commented Aug 15, 2020

For test inputs without .c files you might want to hand end them to remove the C++ mangling since emscripten doesn't mangle those names right?

@kripken
Copy link
Member Author

kripken commented Aug 15, 2020

Oh, I see - sorry, I misunderstood before.

Running that script now, it has no effect. Investigating, the reason is that the test with those mangled names doesn't have a C origin. It looks like there's just the wat. I can edit it by hand in a separate PR, sure.

kripken added a commit that referenced this pull request Aug 17, 2020
That had just a wat file, with no C. This adds a C file as best I
can guess - looks pretty close - and updates all the lld tests with
scripts/test/generate_lld_tests.py and ./auto_update_tests.py lld

As the diff shows, the handwritten wat was very different than what
emcc+lld emit now. I think we must have switches EM_ASMs to
be variadic at some point? That is what they currently are, and
would explain the diff.

See the discussion that led to this in #3044
@kripken kripken merged commit b43807a into master Aug 19, 2020
@kripken kripken deleted the emasmsimp branch August 19, 2020 21:55
kripken added a commit to emscripten-core/emscripten that referenced this pull request Aug 21, 2020
Emscripten counterpart to WebAssembly/binaryen#3044 , see background there,
and WebAssembly/binaryen#3043

This basically gets rid of the special handling of EM_ASMs. Instead, they are just
normal JS library methods like any other.

(The one interesting thing in them, unchanged from before this PR, is that we
need to read the varargs before doing any proxying, see the comment there for
more details.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants