Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename __invoke_* functions in wasm backend? #3081

Closed
kripken opened this issue Aug 27, 2020 · 8 comments
Closed

Rename __invoke_* functions in wasm backend? #3081

kripken opened this issue Aug 27, 2020 · 8 comments
Assignees

Comments

@kripken
Copy link
Member

kripken commented Aug 27, 2020

Currently wasm-emscripten-finalize finds functions starting with __invoke_ and renames them. This is for asm.js-style C++ exceptions and longjmp support. The renaming is important because the type in the wasm is the LLVM IR type, as this comment from wasm-emscripten.cpp says:

  // LLVM backend lowers
  //   invoke @func(arg1, arg2) to label %invoke.cont unwind label %lpad
  // into
  // ... (some code)
  //   call @invoke_SIG(func, arg1, arg2)
  // ... (some code)
  // SIG is a mangled string generated based on the LLVM IR-level function
  // signature. In LLVM IR, types are not lowered yet, so this mangling scheme
  // simply takes LLVM's string representtion of parameter types and concatenate
  // them with '_'. For example, the name of an invoke wrapper for function
  // void foo(struct mystruct*, int) will be
  // "__invoke_void_%struct.mystruct*_int".
  // This function converts the names of invoke wrappers based on their lowered
  // argument types and a return type. In the example above, the resulting new
  // wrapper name becomes "invoke_vii".

So there can be many invokes with different IR names that turn into the same invoke on wasm types. The ratio can be very large, for example all LLVM IR pointers (i32*, mystruct*, etc.) would get a different name in IR, but all we need in wasm is an i32 for them.

In #3043 we are working to remove the work done in wasm-emscripten-finalize, so we'd like to remove this. We could support all the invokes in emscripten directly, but the large number would be a significant regression in debug builds (where we don't do anything after link). Instead, could we rename them in the wasm backend? That is, I think I understand what that comment says about using LLVM IR types, as the emscripten-exceptions and -longjmp passes are on LLVM IR. But could we add a backend pass that renames them?

(The names don't need to match anything - it can be any naming scheme whatsoever. The only goal is to avoid one invoke per LLVM IR type, it's that amount that is the problem.)

cc @aheejin @tlively

@tlively
Copy link
Member

tlively commented Aug 27, 2020

Yes, I think this should be possible. What does wasm-emscripten-finalize rename them to? And what motivates the renaming?

@kripken
Copy link
Member Author

kripken commented Aug 27, 2020

The motivation is just the numbers issue - to have as few as possible. The number can get really out of hand, e.g. hello world with c++ iostreams with exceptions enabled has 177 invokes (many with long names like __invoke_%"class.std::runtime_error"*_%"class.std::runtime_error"*_%"class.std::__2::basic_string"* that bloat the binary), but after wasm-emscripten-finalize it's just 22.

The concrete renaming used atm is vii, ijdi etc. as in emscripten in general, where the first is v=void or the return type, and the others are i=i32, j=i64, f=f32, d=f64. It might be nice to do the same in the backend just to avoid confusion (but if there's something easier, any naming scheme would be fine, we'll be looking at the types and not the name).

@tlively
Copy link
Member

tlively commented Aug 27, 2020

I see thanks. This would probably be a good time to switch to a naming convention that can handle multivalue functions as well. Although my guess is that actually supporting multivalue would require additional upstream changes.

@aheejin
Copy link
Member

aheejin commented Aug 27, 2020

Those names are generated in LowerEmscriptenEHSjLj pass in LLVM, but that pass is an IR pass and we don't know the final lowered types yet, so I guess as @kripken said we need a new backend pass.

The current naming scheme assumes the first character to be a return type, but this might have to change in the presence of multivalue. I guess we can delimit the return type and param types with an additional _. For example, if a function takes a multivalue type (i32, i32) and returns i32 and i64, the function name can be something like __invoke_ii_ij.

@tlively, what additional upstream change do you think needs to happen?

(I rewrote this comment. Please ignore the comment I posted and deleted 10 mins ago)

@kripken
Copy link
Member Author

kripken commented Aug 27, 2020

Note that this itself could maybe be a backend pass that assumes multivalue is not enabled, since we could just say that fastcomp-style exceptions and longjmp don't support multivalue (but new wasm exceptions support would work with multivalue - and not need this pass).

@tlively
Copy link
Member

tlively commented Aug 27, 2020

@aheejin, I don't know of any particular upstream changes that would be necessary, but I also don't know that it would already work with multivalue, so I was just being cautious. I think @kripken is right though that we probably don't need to support multivalue here.

@kripken
Copy link
Member Author

kripken commented Sep 3, 2020

It looks like this LLVM change will soon be the last item blocking #3043, which blocks wasm debugging. We were hoping to get that done by the end of September. There's almost a whole month left, so no rush, but please let me know if that's not realistic.

@aheejin aheejin self-assigned this Sep 4, 2020
kripken added a commit to emscripten-core/emscripten that referenced this issue Sep 14, 2020
…CHANGES_AFTER_LINK option (#12173)

emscripten.py now tracks whether we actually need to run wasm-emscripten-finalize.
Currently we don't need to in a specific subset of -O0 builds, in which we don't
need legalization (by using BigInt support) and we don't need invokes (which means
we don't use longjmp or C++ exceptions; this restriction will be lifted once
WebAssembly/binaryen#3081 is fixed).

This new option will show an error if any binaryen work
is done after wasm-ld. That is, it ensures that we do not run wasm-emscripten-finalize
or wasm-opt. That is useful to verify that the link is as fast as possible and also that
it does not rewrite DWARF, which is necessary to support split DWARF (binaryen can
rewrite normal DWARF, but it's simpler and better to support split DWARF by just
not doing any binaryen work after wasm-ld).

See WebAssembly/binaryen#3043
aheejin added a commit to aheejin/binaryen that referenced this issue Oct 1, 2020
aheejin added a commit to aheejin/binaryen that referenced this issue Oct 1, 2020
Now that we rename invoke wrappers and `emscripten_longjmp_jmpbuf` in
the wasm backend, this deletes all related renaming routines and
relevant tests. But we still need to generate dynCalls for invokes; for
that this adds dynCall generations for invokes in `GenerateDynCalls`
pass, and moves related functions from wasm-emscripten.cpp to
GenerateDynCalls.cpp, given that now they are only used there.

Addresses: WebAssembly#3043 and WebAssembly#3081
Companions:
aheejin added a commit to aheejin/binaryen that referenced this issue Oct 1, 2020
Now that we rename invoke wrappers and `emscripten_longjmp_jmpbuf` in
the wasm backend, this deletes all related renaming routines and
relevant tests. But we still need to generate dynCalls for invokes; for
that this adds dynCall generations for invokes in `GenerateDynCalls`
pass, and moves related functions from wasm-emscripten.cpp to
GenerateDynCalls.cpp, given that now they are only used there.

Addresses: WebAssembly#3043 and WebAssembly#3081
Companions:
aheejin added a commit to aheejin/binaryen that referenced this issue Oct 1, 2020
Now that we rename invoke wrappers and `emscripten_longjmp_jmpbuf` in
the wasm backend, this deletes all related renaming routines and
relevant tests. But we still need to generate dynCalls for invokes; for
that this adds dynCall generations for invokes in `GenerateDynCalls`
pass, and moves related functions from wasm-emscripten.cpp to
GenerateDynCalls.cpp, given that now they are only used there.

Addresses: WebAssembly#3043 and WebAssembly#3081
Companions:
aheejin added a commit to aheejin/binaryen that referenced this issue Oct 1, 2020
Now that we rename invoke wrappers and `emscripten_longjmp_jmpbuf` in
the wasm backend, this deletes all related renaming routines and
relevant tests. But we still need to generate dynCalls for invokes; for
that this adds dynCall generations for invokes in `GenerateDynCalls`
pass, and moves related functions from wasm-emscripten.cpp to
GenerateDynCalls.cpp, given that now they are only used there.

Addresses: WebAssembly#3043 and WebAssembly#3081
Companions:
aheejin added a commit to aheejin/binaryen that referenced this issue Oct 1, 2020
Now that we rename invoke wrappers and `emscripten_longjmp_jmpbuf` in
the wasm backend, this deletes all related renaming routines and
relevant tests. But we still need to generate dynCalls for invokes; for
that this adds dynCall generations for invokes in `GenerateDynCalls`
pass, and moves related functions from wasm-emscripten.cpp to
GenerateDynCalls.cpp, given that now they are only used there.

Some lld tests now have new dynCalls in their results, which is
expected, because they have

Addresses: WebAssembly#3043 and WebAssembly#3081
Companions:
aheejin added a commit to aheejin/binaryen that referenced this issue Oct 1, 2020
Now that we rename invoke wrappers and `emscripten_longjmp_jmpbuf` in
the wasm backend, this deletes all related renaming routines and
relevant tests. But we still need to generate dynCalls for invokes; for
that this adds dynCall generations for invokes in `GenerateDynCalls`
pass, and moves related functions from wasm-emscripten.cpp to
GenerateDynCalls.cpp, given that now they are only used there.

Addresses: WebAssembly#3043 and WebAssembly#3081
aheejin added a commit to aheejin/binaryen that referenced this issue Oct 1, 2020
Now that we rename invoke wrappers and `emscripten_longjmp_jmpbuf` in
the wasm backend, this deletes all related renaming routines and
relevant tests. But we still need to generate dynCalls for invokes; for
that this adds dynCall generations for invokes in `GenerateDynCalls`
pass, and moves related functions from wasm-emscripten.cpp to
GenerateDynCalls.cpp, given that now they are only used there.

Addresses: WebAssembly#3043 and WebAssembly#3081
aheejin added a commit to aheejin/binaryen that referenced this issue Oct 1, 2020
Now that we are renaming invoke wrappers and `emscripten_longjmp_jmpbuf`
in the wasm backend, this deletes all related renaming routines and
relevant tests. But we still need to generate dynCalls for invokes; for
that this adds dynCall generations for invokes in `GenerateDynCalls`
pass, and moves related functions from wasm-emscripten.cpp to
GenerateDynCalls.cpp, given that now they are only used there.

Addresses: WebAssembly#3043 and WebAssembly#3081
aheejin added a commit to aheejin/emscripten that referenced this issue Oct 1, 2020
Now that we are renaming invoke wrappers and `emscripten_longjmp_jmpbuf`
in the wasm backend, using Emscripten EH or SjLj does not need
Binaryen's postprocessing. This makes Emscripten not enable Binaryen
postprocessing using wasm-emscripten-finalize when we are using them,
and deletes all special handling for `emscripten_longjmp_jmpbuf`, given
that Emscripten will not see it.

Addresses:
WebAssembly/binaryen#3043
WebAssembly/binaryen#3081
aheejin added a commit to llvm/llvm-project that referenced this issue Oct 7, 2020
Renaming for some Emscripten EH functions has so far been done in
wasm-emscripten-finalize tool in Binaryen. But recently we decided to
make a compilation/linking path that does not rely on
wasm-emscripten-finalize for modifications, so here we move that
functionality to LLVM.

Invoke wrappers are generated in LowerEmscriptenEHSjLj pass, but final
wasm types are not available in the IR pass, we need to rename them at
the end of the pipeline.

This patch also removes uses of `emscripten_longjmp_jmpbuf` in
LowerEmscriptenEHSjLj pass, replacing that with `emscripten_longjmp`.
`emscripten_longjmp_jmpbuf` is lowered to `emscripten_longjmp`, but
previously we generated calls to `emscripten_longjmp_jmpbuf` in
LowerEmscriptenEHSjLj pass because it takes `jmp_buf*` instead of `i32`.
But we were able use `ptrtoint` to make it use `emscripten_longjmp`
directly here.

Addresses:
WebAssembly/binaryen#3043
WebAssembly/binaryen#3081

Companions:
WebAssembly/binaryen#3191
emscripten-core/emscripten#12399

Reviewed By: dschuff, tlively, sbc100

Differential Revision: https://reviews.llvm.org/D88697
aheejin added a commit that referenced this issue Oct 11, 2020
Now that we are renaming invoke wrappers and `emscripten_longjmp_jmpbuf`
in the wasm backend, this deletes all related renaming routines and
relevant tests. Depends on #3192.

Addresses: #3043 and #3081

Companions:
https://reviews.llvm.org/D88697
emscripten-core/emscripten#12399
aheejin added a commit to emscripten-core/emscripten that referenced this issue Oct 11, 2020
Now that we are renaming invoke wrappers and `emscripten_longjmp_jmpbuf`
in the wasm backend, using Emscripten EH or SjLj does not need
Binaryen's postprocessing. This makes Emscripten not enable Binaryen
postprocessing using wasm-emscripten-finalize when we are using them,
and deletes all special handling for `emscripten_longjmp_jmpbuf`, given
that Emscripten will not see it.

Addresses:
WebAssembly/binaryen#3043
WebAssembly/binaryen#3081

Companions:
https://reviews.llvm.org/D88697
WebAssembly/binaryen#3191
@aheejin aheejin closed this as completed Oct 11, 2020
arichardson pushed a commit to arichardson/llvm-project that referenced this issue Mar 24, 2021
Renaming for some Emscripten EH functions has so far been done in
wasm-emscripten-finalize tool in Binaryen. But recently we decided to
make a compilation/linking path that does not rely on
wasm-emscripten-finalize for modifications, so here we move that
functionality to LLVM.

Invoke wrappers are generated in LowerEmscriptenEHSjLj pass, but final
wasm types are not available in the IR pass, we need to rename them at
the end of the pipeline.

This patch also removes uses of `emscripten_longjmp_jmpbuf` in
LowerEmscriptenEHSjLj pass, replacing that with `emscripten_longjmp`.
`emscripten_longjmp_jmpbuf` is lowered to `emscripten_longjmp`, but
previously we generated calls to `emscripten_longjmp_jmpbuf` in
LowerEmscriptenEHSjLj pass because it takes `jmp_buf*` instead of `i32`.
But we were able use `ptrtoint` to make it use `emscripten_longjmp`
directly here.

Addresses:
WebAssembly/binaryen#3043
WebAssembly/binaryen#3081

Companions:
WebAssembly/binaryen#3191
emscripten-core/emscripten#12399

Reviewed By: dschuff, tlively, sbc100

Differential Revision: https://reviews.llvm.org/D88697
mem-frob pushed a commit to draperlaboratory/hope-llvm-project that referenced this issue Oct 7, 2022
Renaming for some Emscripten EH functions has so far been done in
wasm-emscripten-finalize tool in Binaryen. But recently we decided to
make a compilation/linking path that does not rely on
wasm-emscripten-finalize for modifications, so here we move that
functionality to LLVM.

Invoke wrappers are generated in LowerEmscriptenEHSjLj pass, but final
wasm types are not available in the IR pass, we need to rename them at
the end of the pipeline.

This patch also removes uses of `emscripten_longjmp_jmpbuf` in
LowerEmscriptenEHSjLj pass, replacing that with `emscripten_longjmp`.
`emscripten_longjmp_jmpbuf` is lowered to `emscripten_longjmp`, but
previously we generated calls to `emscripten_longjmp_jmpbuf` in
LowerEmscriptenEHSjLj pass because it takes `jmp_buf*` instead of `i32`.
But we were able use `ptrtoint` to make it use `emscripten_longjmp`
directly here.

Addresses:
WebAssembly/binaryen#3043
WebAssembly/binaryen#3081

Companions:
WebAssembly/binaryen#3191
emscripten-core/emscripten#12399

Reviewed By: dschuff, tlively, sbc100

Differential Revision: https://reviews.llvm.org/D88697
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants