Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustc: Allow safe #[target_feature] on wasm #84988

Merged
merged 1 commit into from
Jun 3, 2021

Conversation

alexcrichton
Copy link
Member

This commit updates the compiler's handling of the #[target_feature]
attribute when applied to functions on WebAssembly-based targets. The
compiler in general requires that any functions with #[target_feature]
are marked as unsafe as well, but this commit relaxes the restriction
for WebAssembly targets where the attribute can be applied to safe
functions as well.

The reason this is done is that the motivation for this feature of the
compiler is not applicable for WebAssembly targets. In general the
#[target_feature] attribute is used to enhance target CPU features
enabled beyond the basic level for the rest of the compilation. If done
improperly this means that your program could execute an instruction
that the CPU you happen to be running on does not understand. This is
considered undefined behavior where it is unknown what will happen (e.g.
it's not a deterministic SIGILL).

For WebAssembly, however, the target is different. It is not possible
for a running WebAssembly program to execute an instruction that the
engine does not understand. If this were the case then the program would
not have validated in the first place and would not run at all. Even if
this were allowed in some hypothetical future where engines have some
form of runtime feature detection (which they do not right now) any
implementation of such a feature would generate a trap if a module
attempts to execute an instruction the module does not understand. This
deterministic trap behavior would still not fall into the category of
undefined behavior because the trap is deterministic.

For these reasons the #[target_feature] attribute is now allowed on
safe functions, but only for WebAssembly targets. This notably enables
the wasm-SIMD intrinsics proposed for stabilization in #74372 to be
marked as safe generally instead of today where they're all unsafe due
to the historical implementation of #[target_feature] in the compiler.

@rust-highfive
Copy link
Collaborator

r? @jackh726

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 6, 2021
@nagisa nagisa added the T-lang Relevant to the language team, which will review and decide on the PR/issue. label May 6, 2021
@nagisa
Copy link
Member

nagisa commented May 6, 2021

Marked this as T-lang (@rust-lang/lang) because this kind of change seems mostly a language change, rather than, say, an implementation one (which does look good to me).

@joshtriplett
Copy link
Member

Looks good to me!

@rfcbot merge

@rfcbot
Copy link

rfcbot commented May 6, 2021

Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels May 6, 2021
@scottmcm
Copy link
Member

Hmm, this does seem fine, but leaves me wondering where the edge is for the "WebAssembly which is a 100% safe target at execution time". Is it basically that target_feature is, in a sense, not part of the Rust Abstract Machine? Because I assume that we wouldn't want to allow, say, pointer dereferences in safe code in WASM even though it's a safe target where reading anything in the WASM memory is "fine", in a sense.

@joshtriplett
Copy link
Member

@scottmcm It's still Rust's responsibility to make sure you can't invoke UB via safe code. But given the WebAssembly verifier, if your program loads at all, your functions (with or without #[target_feature]) will all be callable, and you can't invoke UB by calling them.

@alexcrichton
Copy link
Member Author

The way I think about this is that Rust still has UB in the sense of raw pointers and such. This UB is exploited by the optimizer so while your final program will have deterministic behavior on wasm it may not be the behavior that you intended. The determinism is just a property of the target, not of the language.

For something like target-specific intrinsics the reason that #[target_feature] requires unsafe, and I believe the only reason, is that a guarantee cannot be provided about what happens when an old CPU executes a future instruction. Targets like x86 (afaik) provide no guarantees in this regard, so your code could actually be UB if you execute it on the wrong computer. Targets like WebAssembly, however, have no need for this restriction, due to the validation propeties of the target. WebAssembly specifically (and no other target) provides the guarantee that all old "cpus" will fault (aka trap) executing some future instruction.

Basically the target_feature attribute has precisely one reason why it's unsafe, and WebAssembly specifically is designed such that the reason does not apply. This does not apply to WebAssembly as a whole because the execution model still includes UB for things like optimizers and such. None of that UB is what target_feature is unsafe, though.

@RalfJung
Copy link
Member

RalfJung commented May 15, 2021

For something like target-specific intrinsics the reason that #[target_feature] requires unsafe, and I believe the only reason, is that a guarantee cannot be provided about what happens when an old CPU executes a future instruction.

Currently, target_feature-UB is listed in the reference as regular UB, which means the optimizer could exploit it, so making such things safe would be unsound even on a wasm target. If we want to change this, we need to somehow get a guarantee from LLVM that the optimizer will not exploit target_feature mismatches for optimizations. I think accepting this change must be blocked on that investigation. If/when this change lands, we then also need to adjust the reference and remove or at least qualify the target_feature-UB.

More generally, I am somewhat concerned about having safety depend on the target. Having the same code be safe or unsafe for different targets is very surprising IMO, and also reinforces an incorrect (but widespread) model of UB/unsafe that focuses on "what the target/hardware" does -- which is not an adequate way to think about UB/unsafe.

@jackh726
Copy link
Member

Since this is a language change, going to reassign to lang-team.

r? @joshtriplett

@alexcrichton
Copy link
Member Author

I'm happy to update the reference here, but I personally still believe that this should be a safe operation on wasm. All the concrete reasons this is unsafe on x86 and other architectures don't apply to wasm. For example there's no such thing as executing an unknown instruction. There's no such thing as interpreting one instruction as another. These are intrinsic properties of the wasm target.

It's a good point to not push too much into the realm of equating UB and machine behavior, but at the same time I think reasonable adjustments should still be possible. I'm not trying to say "wasm has no UB so Rust should have no unsafe on wasm". I very much understand why that's never what we would want. This very specific feature of wasm is the only thing in question (the #[target_feature] attribute).

I would not want to commit to requiring unsafe here without an example of where things could actually go wrong. All the examples in other threads of why it's unsafe on x86, for example, are not applicable to WebAssembly. To be more specific I would personally want something concrete to point to instead of just sort of being fearful of LLVM and its implementation. My interest in this is stabilizing wasm simd, though, and I do not have time to go learn the entire LLVM backend and deduce that this is safe or not. If that's the requirement then I'd ask @joshtriplett if he's ok removing the concern on the stabilization issue.

@RalfJung
Copy link
Member

RalfJung commented May 15, 2021

All the concrete reasons this is unsafe on x86 and other architectures don't apply to wasm. For example there's no such thing as executing an unknown instruction. There's no such thing as interpreting one instruction as another. These are intrinsic properties of the wasm target.

The question is, does LLVM perform optimizations based on "this target flag is not that in that function so that function will never call this code here"? If yes, that would be a reason this is unsafe that applies on all targets, since things happen on the LLVM IR level.

Crudely speaking, LLVM could transform

// Use your favorite wasm target feature instead.
#[target_feature(enable = "sse4")] unsafe fn foo_sse4() {}

fn main() {
  unsafe { foo_sse4() }
}

into

fn main() {
  unsafe { unreachable_unchecked() }
}

If it did, I hope we agree that this would mean we couldn't make such calls safe on wasm.

I don't know if it actually does this, or if it documents that it won't do this. It might well be that neither of these applies, which means we'd ride a dangerous line of relying on unspecified LLVM details -- but potentially we could submit upstream patches to improve the docs and thus establish this as a proper guarantee.
Cc @rust-lang/wg-llvm your expertise is needed here. :)

@alexcrichton
Copy link
Member Author

I can try to answer the question to the best of my ability, but I'm not an LLVM expert in that I'm aware of all of LLVM's design decisions.

The example program above, I think, is an optimization that can never happen. The whole premise of enabling extra target features on some functions is that you can transition to them from other functions that don't have the target features enabled. This can only safely happen if you properly guard first on platforms like x86, but from LLVM's perspective all it will ever see is function main called function foo_sse4. The pattern you listed above is already exactly how simd works on all other platforms (including x86).

I can't really imagine a scenario in which LLVM could legally apply such an optimization. Maybe if LLVM deduced that you didn't properly check for sse4 support then it could do such an optimization, but at that point we're working with an antagonistic instead of an optimizing compiler.

I'm not really sure I can say what needs to be said to satisfy your concerns though @RalfJung. I'm not an LLVM expert in that I'm aware of all of LLVM's historical design decisions. I've worked extensively with LLVM with lots of targets, but all I can say here is my opinion and anecdotal evidence. I don't have the connections (or the time/motivation really) to canvas LLVM at large or other folks at large for what we can do in this situation.

In some sense this to me feels like infinite loops. Infinite loops should never have been UB in LLVM from Rust's perspective, but they were in LLVM. Over time that was fixed in LLVM. Similarly calling future instructions really should not be UB in WebAssembly, it just really doesn't make sense for the target. If we discover something in the future where it's actually UB then I think we should fix that. I don't think we should be beholden to a fear that one day something will accidentally be UB even though you didn't write unsafe in Rust. That to me sounds like a compiler bug that needs fixing rather than something we need to redesign library APIs around. I'm pretty confident that there's no UB today and while I'm not confident that there won't be UB tomorrow I feel like we can fix it when we get there.

@nikomatsakis
Copy link
Contributor

@rfcbot reviewed

I was persuaded by our discussion in today's lang team meeting.

@rfcbot rfcbot added the final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. label May 18, 2021
@rfcbot
Copy link

rfcbot commented May 18, 2021

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot removed the proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. label May 18, 2021
@RalfJung
Copy link
Member

I can't really imagine a scenario in which LLVM could legally apply such an optimization. Maybe if LLVM deduced that you didn't properly check for sse4 support then it could do such an optimization, but at that point we're working with an antagonistic instead of an optimizing compiler.

I'm not one for anthropomorphizing compilers -- if LLVM's spec allows this, we might dislike the spec, but I wouldn't call that "antagonistic".

However, chances are the spec doesn't really say either way, so an upstream bugreport asking for spec clarification might be a good idea. I'm not sure about this though, hence me pinging our LLVM crew. :)

If we discover something in the future where it's actually UB then I think we should fix that. I don't think we should be beholden to a fear that one day something will accidentally be UB even though you didn't write unsafe in Rust. That to me sounds like a compiler bug that needs fixing rather than something we need to redesign library APIs around. I'm pretty confident that there's no UB today and while I'm not confident that there won't be UB tomorrow I feel like we can fix it when we get there.

That's totally fair -- basically we're taking a bet here that if actual spec issues come up, we can work with LLVM to have them change/extend their spec so that it works for us. Seems like a reasonable bet to me.
I can't judge the risk of LLVM already doing something like this right now, so I will leave that risk assessment to others. :)

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. to-announce Announce this issue on triage meeting and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels May 28, 2021
@alexcrichton
Copy link
Member Author

ping @joshtriplett would you be up for reviewing and r+'ing this?

@joshtriplett
Copy link
Member

Given that we have lang consensus, and that @nagisa said it looks fine from an implementation perspective:

@bors r+

@bors
Copy link
Contributor

bors commented Jun 2, 2021

📌 Commit 7fed92b has been approved by joshtriplett

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 2, 2021
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Jun 3, 2021
…sm, r=joshtriplett

rustc: Allow safe #[target_feature] on wasm

This commit updates the compiler's handling of the `#[target_feature]`
attribute when applied to functions on WebAssembly-based targets. The
compiler in general requires that any functions with `#[target_feature]`
are marked as `unsafe` as well, but this commit relaxes the restriction
for WebAssembly targets where the attribute can be applied to safe
functions as well.

The reason this is done is that the motivation for this feature of the
compiler is not applicable for WebAssembly targets. In general the
`#[target_feature]` attribute is used to enhance target CPU features
enabled beyond the basic level for the rest of the compilation. If done
improperly this means that your program could execute an instruction
that the CPU you happen to be running on does not understand. This is
considered undefined behavior where it is unknown what will happen (e.g.
it's not a deterministic `SIGILL`).

For WebAssembly, however, the target is different. It is not possible
for a running WebAssembly program to execute an instruction that the
engine does not understand. If this were the case then the program would
not have validated in the first place and would not run at all. Even if
this were allowed in some hypothetical future where engines have some
form of runtime feature detection (which they do not right now) any
implementation of such a feature would generate a trap if a module
attempts to execute an instruction the module does not understand. This
deterministic trap behavior would still not fall into the category of
undefined behavior because the trap is deterministic.

For these reasons the `#[target_feature]` attribute is now allowed on
safe functions, but only for WebAssembly targets. This notably enables
the wasm-SIMD intrinsics proposed for stabilization in rust-lang#74372 to be
marked as safe generally instead of today where they're all `unsafe` due
to the historical implementation of `#[target_feature]` in the compiler.
@bors
Copy link
Contributor

bors commented Jun 3, 2021

⌛ Testing commit 7fed92b with merge 016e9b5...

@bors
Copy link
Contributor

bors commented Jun 3, 2021

☀️ Test successful - checks-actions
Approved by: joshtriplett
Pushing 016e9b5 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jun 3, 2021
@bors bors merged commit 016e9b5 into rust-lang:master Jun 3, 2021
@rustbot rustbot added this to the 1.54.0 milestone Jun 3, 2021
alexcrichton added a commit to alexcrichton/stdarch that referenced this pull request Jun 4, 2021
This updates all simd intrinsics to be safe to account for
rust-lang/rust#84988. In the spirit of more safety it also removes the
pervasive usage of `transmute` in the intrinsics in favor of more
targeted casts.
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Jun 8, 2021
This commit fixes an issue not found during rust-lang#84988 where rustdoc is used
to document cross-platform intrinsics but it was requiring that
functions which use `#[target_feature]` are `unsafe` erroneously, even
if they're WebAssembly specific. Rustdoc today, for example, already has
a special case where it enables annotations like
`#[target_feature(enable = "simd128")]` on platforms other than
WebAssembly. The purpose of this commit is to relax the "require all
`#[target_feature]` functions are `unsafe`" requirement for all targets
whenever rustdoc is running, enabling all targets to fully document
other targets, such as WebAssembly, where intrinsics functions aren't
always `unsafe`.
JohnTitor added a commit to JohnTitor/rust that referenced this pull request Jun 9, 2021
Enable rustdoc to document safe wasm intrinsics

This commit fixes an issue not found during rust-lang#84988 where rustdoc is used
to document cross-platform intrinsics but it was requiring that
functions which use `#[target_feature]` are `unsafe` erroneously, even
if they're WebAssembly specific. Rustdoc today, for example, already has
a special case where it enables annotations like
`#[target_feature(enable = "simd128")]` on platforms other than
WebAssembly. The purpose of this commit is to relax the "require all
`#[target_feature]` functions are `unsafe`" requirement for all targets
whenever rustdoc is running, enabling all targets to fully document
other targets, such as WebAssembly, where intrinsics functions aren't
always `unsafe`.
@apiraino apiraino removed the to-announce Announce this issue on triage meeting label Jun 10, 2021
@jonas-schievink jonas-schievink added relnotes Marks issues that should be documented in the release notes of the next release. and removed relnotes Marks issues that should be documented in the release notes of the next release. labels Jul 17, 2021
wip-sync pushed a commit to NetBSD/pkgsrc-wip that referenced this pull request Aug 12, 2021
Pkgsrc changes:
 * Bump bootstrap requirements to 1.53.0.
 * Adjust patches, adapt to upstream changes, adjust cargo checksums
 * If using an external llvm, require >= 10.0

Upsteream changes:

Version 1.54.0 (2021-07-29)
============================

Language
-----------------------

- [You can now use macros for values in built-in attribute macros.][83366]
  While a seemingly minor addition on its own, this enables a lot of
  powerful functionality when combined correctly. Most notably you can
  now include external documentation in your crate by writing the following.
  ```rust
  #![doc = include_str!("README.md")]
  ```
  You can also use this to include auto-generated modules:
  ```rust
  #[path = concat!(env!("OUT_DIR"), "/generated.rs")]
  mod generated;
  ```

- [You can now cast between unsized slice types (and types which contain
  unsized slices) in `const fn`.][85078]
- [You can now use multiple generic lifetimes with `impl Trait` where the
   lifetimes don't explicitly outlive another.][84701] In code this means
   that you can now have `impl Trait<'a, 'b>` where as before you could
   only have `impl Trait<'a, 'b> where 'b: 'a`.

Compiler
-----------------------

- [Rustc will now search for custom JSON targets in
  `/lib/rustlib/<target-triple>/target.json` where `/` is the "sysroot"
  directory.][83800] You can find your sysroot directory by running
  `rustc --print sysroot`.
- [Added `wasm` as a `target_family` for WebAssembly platforms.][84072]
- [You can now use `#[target_feature]` on safe functions when targeting
  WebAssembly platforms.][84988]
- [Improved debugger output for enums on Windows MSVC platforms.][85292]
- [Added tier 3\* support for `bpfel-unknown-none`
   and `bpfeb-unknown-none`.][79608]

\* Refer to Rust's [platform support page][platform-support-doc] for more
   information on Rust's tiered platform support.

Libraries
-----------------------

- [`panic::panic_any` will now `#[track_caller]`.][85745]
- [Added `OutOfMemory` as a variant of `io::ErrorKind`.][84744]
- [ `proc_macro::Literal` now implements `FromStr`.][84717]
- [The implementations of vendor intrinsics in core::arch have been
   significantly refactored.][83278] The main user-visible changes are
   a 50% reduction in the size of libcore.rlib and stricter validation
   of constant operands passed to intrinsics. The latter is technically
   a breaking change, but allows Rust to more closely match the C vendor
   intrinsics API.

Stabilized APIs
---------------

- [`BTreeMap::into_keys`]
- [`BTreeMap::into_values`]
- [`HashMap::into_keys`]
- [`HashMap::into_values`]
- [`arch::wasm32`]
- [`VecDeque::binary_search`]
- [`VecDeque::binary_search_by`]
- [`VecDeque::binary_search_by_key`]
- [`VecDeque::partition_point`]

Cargo
-----

- [Added the `--prune <spec>` option to `cargo-tree` to remove a package from
  the dependency graph.][cargo/9520]
- [Added the `--depth` option to `cargo-tree` to print only to a certain depth
  in the tree ][cargo/9499]
- [Added the `no-proc-macro` value to `cargo-tree --edges` to hide procedural
  macro dependencies.][cargo/9488]
- [A new environment variable named `CARGO_TARGET_TMPDIR` is
  available.][cargo/9375]
  This variable points to a directory that integration tests and
  benches can use as a "scratchpad" for testing filesystem operations.

Compatibility Notes
-------------------
- [Mixing Option and Result via `?` is no longer permitted in
  closures for inferred types.][86831]
- [Previously unsound code is no longer permitted where different
  constructors in branches could require different lifetimes.][85574]
- As previously mentioned the [`std::arch` instrinsics now uses
  stricter const checking][83278] than before and may reject some
  previously accepted code.
- [`i128` multiplication on Cortex M0+ platforms currently
  unconditionally causes overflow when compiled with `codegen-units
  = 1`.][86063]

[85574]: rust-lang/rust#85574
[86831]: rust-lang/rust#86831
[86063]: rust-lang/rust#86063
[86831]: rust-lang/rust#86831
[79608]: rust-lang/rust#79608
[84988]: rust-lang/rust#84988
[84701]: rust-lang/rust#84701
[84072]: rust-lang/rust#84072
[85745]: rust-lang/rust#85745
[84744]: rust-lang/rust#84744
[85078]: rust-lang/rust#85078
[84717]: rust-lang/rust#84717
[83800]: rust-lang/rust#83800
[83366]: rust-lang/rust#83366
[83278]: rust-lang/rust#83278
[85292]: rust-lang/rust#85292
[cargo/9520]: rust-lang/cargo#9520
[cargo/9499]: rust-lang/cargo#9499
[cargo/9488]: rust-lang/cargo#9488
[cargo/9375]: rust-lang/cargo#9375
[`BTreeMap::into_keys`]: https://doc.rust-lang.org/std/collections/struct.BTreeMap.html#method.into_keys
[`BTreeMap::into_values`]: https://doc.rust-lang.org/std/collections/struct.BTreeMap.html#method.into_values
[`HashMap::into_keys`]: https://doc.rust-lang.org/std/collections/struct.HashMap.html#method.into_keys
[`HashMap::into_values`]: https://doc.rust-lang.org/std/collections/struct.HashMap.html#method.into_values
[`arch::wasm32`]: https://doc.rust-lang.org/core/arch/wasm32/index.html
[`VecDeque::binary_search`]: https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.binary_search
[`VecDeque::binary_search_by`]: https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.binary_search_by
[`VecDeque::binary_search_by_key`]: https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.binary_search_by_key
[`VecDeque::partition_point`]: https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.partition_point
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Oct 14, 2023
…eatures, r=RalfJung

const-eval: allow calling functions with targat features disabled at compile time in WASM

This is not unsafe on WASM, see rust-lang#84988

r? `@RalfJung`

Fixes rust-lang#116516
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Oct 15, 2023
Rollup merge of rust-lang#116576 - eduardosm:const-eval-wasm-target-features, r=RalfJung

const-eval: allow calling functions with targat features disabled at compile time in WASM

This is not unsafe on WASM, see rust-lang#84988

r? `@RalfJung`

Fixes rust-lang#116516
github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Oct 17, 2023
github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Oct 17, 2023
…r=RalfJung

const-eval: allow calling functions with targat features disabled at compile time in WASM

This is not unsafe on WASM, see rust-lang/rust#84988

r? `@RalfJung`

Fixes rust-lang/rust#116516
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.