add intrinsics for portable packed simd vector reductions #48983

Merged
merged 10 commits into from
Mar 17, 2018

Conversation

gnzlbg
Contributor

@gnzlbg gnzlbg commented Mar 13, 2018

Adds the following portable vector reduction intrinsics:

  • fn simd_reduce_add<T, U>(x: T) -> U;
  • fn simd_reduce_mul<T, U>(x: T) -> U;
  • fn simd_reduce_min<T, U>(x: T) -> U;
  • fn simd_reduce_max<T, U>(x: T) -> U;
  • fn simd_reduce_and<T, U>(x: T) -> U;
  • fn simd_reduce_or<T, U>(x: T) -> U;
  • fn simd_reduce_xor<T, U>(x: T) -> U;
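As a rough scalar model of what the seven reductions above compute, here is a minimal sketch. Everything in it is hypothetical: `Lanes` stands in for a packed vector type, the function names are illustrative rather than the intrinsics themselves, and wrapping on integer overflow is an assumption that this thread does not specify.

```rust
/// Hypothetical stand-in for a packed 4-lane integer vector (not a rustc type).
pub type Lanes = [i32; 4];

/// Sum of all lanes. Assumption: integer lanes wrap on overflow.
pub fn reduce_add(x: Lanes) -> i32 {
    x.iter().fold(0i32, |acc, &v| acc.wrapping_add(v))
}

/// Product of all lanes. Assumption: integer lanes wrap on overflow.
pub fn reduce_mul(x: Lanes) -> i32 {
    x.iter().fold(1i32, |acc, &v| acc.wrapping_mul(v))
}

/// Smallest lane.
pub fn reduce_min(x: Lanes) -> i32 {
    x.iter().copied().fold(i32::MAX, i32::min)
}

/// Largest lane.
pub fn reduce_max(x: Lanes) -> i32 {
    x.iter().copied().fold(i32::MIN, i32::max)
}

/// Bitwise AND of all lanes (identity: all bits set).
pub fn reduce_and(x: Lanes) -> i32 {
    x.iter().fold(!0i32, |acc, &v| acc & v)
}

/// Bitwise OR of all lanes (identity: zero).
pub fn reduce_or(x: Lanes) -> i32 {
    x.iter().fold(0i32, |acc, &v| acc | v)
}

/// Bitwise XOR of all lanes (identity: zero).
pub fn reduce_xor(x: Lanes) -> i32 {
    x.iter().fold(0i32, |acc, &v| acc ^ v)
}
```

In this model each function maps a whole vector `T` to one scalar `U`, which is the shape of the `<T, U>` signatures above; for example, `reduce_add([1, 2, 3, 4])` evaluates to `10`.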

I've also added:

  • fn simd_reduce_all<T>(x: T) -> bool;
  • fn simd_reduce_any<T>(x: T) -> bool;

These produce better code than what we are currently producing in stdsimd, but the code is still not optimal due to this LLVM bug: https://bugs.llvm.org/show_bug.cgi?id=36702

r? @alexcrichton

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 13, 2018
@gnzlbg
Contributor Author

gnzlbg commented Mar 13, 2018

I've basically no idea of what I am doing here so please review this thoroughly.

@alexcrichton
Member

Thanks! @gnzlbg mind explaining for me how this differs from the codegen that stdsimd does today?

// Vector reductions:
extern "C" LLVMValueRef
LLVMRustBuildVectorReduceFAdd(LLVMBuilderRef B, LLVMValueRef Acc, LLVMValueRef Src) {
return wrap(unwrap(B)->CreateFAddReduce(unwrap(Acc),unwrap(Src)));
Member

I think CI may be failing on these in the sense that they're not defined on LLVM 3.9, but it's fine to have some #ifdef here to only compile these in on 6.0+ and otherwise return errors on older versions

@gnzlbg
Contributor Author

gnzlbg commented Mar 14, 2018

So the main difference is that currently stdsimd just calls the intrinsics instead of using the LLVM builders, which created bad LLVM-IR in some cases. Some of the intrinsics have preconditions that require us to set fast-math flags, pass undef, etc. for the intrinsic call but we can't easily do this from stdsimd, so... they do IMO belong in the compiler.

Also, for vectors of bools, it is pretty important that the intrinsics get called on <N x i1> vectors, and we can't do that from stdsimd either.
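A scalar sketch of the boolean case, under stated assumptions: `Mask` is a hypothetical mask vector whose lanes are all-ones (true) or all-zeros (false), and `lane_to_bit` models narrowing each lane to a single bit, in the spirit of an <N x i1> vector. Whether rustc narrows lanes exactly this way is an assumption here, not something this thread confirms.

```rust
/// Hypothetical mask vector: each lane is -1 (all bits set, true) or 0 (false).
pub type Mask = [i32; 4];

/// Models narrowing one lane to a single bit, like one lane of an `i1` vector.
fn lane_to_bit(lane: i32) -> bool {
    lane & 1 == 1
}

/// True if every lane is set (an AND-reduction over the one-bit lanes).
pub fn reduce_all(m: Mask) -> bool {
    m.iter().all(|&lane| lane_to_bit(lane))
}

/// True if at least one lane is set (an OR-reduction over the one-bit lanes).
pub fn reduce_any(m: Mask) -> bool {
    m.iter().any(|&lane| lane_to_bit(lane))
}
```

With this model, `reduce_all([-1, 0, -1, -1])` is `false` while `reduce_any([-1, 0, -1, -1])` is `true`.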

I'd say that we should document these on stdsimd and iterate on the design there tweaking things here as necessary.

Semantically, this PR (and the stdsimd one) change the behavior of the reductions with respect to float vectors:

  • the reduction order is now implementation-defined: before, .sum() on a float vector of (0., 1., 2., 3.) was equivalent to 0. + 0. + 1. + 2. + 3. (the first 0. is an accumulator). Now it just adds 0., 1., 2., and 3. together in some order.

  • the min/max float reductions require that the vectors do not contain NaNs; if the vector contains NaNs, the behavior is unspecified. AFAICT the reduction will still return NaN, but it might not have the same representation as the NaN in your vector. This also applies if the vector contains multiple NaNs.
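To see why the reduction order matters for floats, here is a small scalar sketch. Both functions are hypothetical models, not the intrinsics: `sum_ordered` mimics the old accumulator-first, left-to-right behavior, and `sum_pairwise` is just one of the orders an implementation might pick.

```rust
/// Left-to-right sum starting from a 0.0 accumulator: 0. + x0 + x1 + x2 + x3.
pub fn sum_ordered(x: [f32; 4]) -> f32 {
    x.iter().fold(0.0f32, |acc, &v| acc + v)
}

/// One possible alternative order: add neighboring pairs, then add the partial sums.
pub fn sum_pairwise(x: [f32; 4]) -> f32 {
    (x[0] + x[1]) + (x[2] + x[3])
}
```

For (0., 1., 2., 3.) both orders give 6. For an input such as (1e20, 1., -1e20, 1.), the ordered sum is 1. while the pairwise sum is 0., because each 1. is absorbed when it is added to a value of magnitude 1e20.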

Since this behavior wasn't documented on stdsimd, I don't think it's a breaking change (plus the intrinsics have been available there for a couple of weeks only).

Whether these semantics are the right call, I don't know. We should explore the alternatives in stdsimd, iterating here accordingly.

I could change this PR to expose simd_reduce_{mul,sum}{,_ordered} and simd_reduce_{min,max}{,_nanless} so that we can iterate on stdsimd without changing rustc that much. But as https://bugs.llvm.org/show_bug.cgi?id=36702 evolves we are going to have to tune simd_reduce_{all,any}.
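A hedged sketch of what the difference between a NaN-handling minimum and a "nanless" one could look like in scalar terms. Both function names are hypothetical, and neither is claimed to match what LLVM emits; the point is only that a plain comparison-based minimum gives position-dependent results once a NaN is present.

```rust
/// NaN-ignoring minimum: `f32::min` returns the other operand when one is NaN.
pub fn min_nan_aware(x: [f32; 4]) -> f32 {
    x.iter().copied().fold(f32::INFINITY, f32::min)
}

/// Comparison-based minimum that is only meaningful when no lane is NaN.
/// Every comparison against NaN is false, so the result depends on where
/// the NaN sits in the vector.
pub fn min_nanless(x: [f32; 4]) -> f32 {
    let mut acc = x[0];
    for &v in &x[1..] {
        if v < acc {
            acc = v;
        }
    }
    acc
}
```

Without NaNs the two agree. With a NaN in the first lane, `min_nanless` returns NaN; with the same NaN in the second lane, it returns the smallest non-NaN lane instead.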

@alexcrichton
Member

Oh no worries I was mainly just curious! The boolean <N x i1> sounds like the killer reason here. I think things like undef are possible via mem::uninitialized(), but it's also true that the fast math flags require compiler help as opposed to something we can bind. In any case sounds good to me to merge when ready!

@sanxiyn
Member

sanxiyn commented Mar 14, 2018

LLVM vector reduction intrinsics are new in LLVM 5 (r302514 to be specific), so they are not present in LLVM 3.9. That's the cause of Travis failure.

@gnzlbg
Contributor Author

gnzlbg commented Mar 14, 2018

@alexcrichton

I think things like undef are possible via mem::uninitialized(),

I tried this but it did not work out because the intrinsics need to be passed undef and when they are passed undef then they need to have the fast-math flags enabled.

I am cleaning the PR a bit to actually expose the different combinations of the intrinsics so that we can experiment in stdsimd without having to change rustc to try new things. We'll still need to change rustc for the boolean intrinsics once the issue is resolved in LLVM upstream but there is nothing that we can do about that right now.

@sanxiyn thanks, yes, I'll add a commit that ifdefs those out in LLVM < 6.0.

}
extern "C" LLVMValueRef
LLVMRustBuildVectorReduceFMul(LLVMBuilderRef, LLVMValueRef, LLVMValueRef Src) {
return Src;
Contributor

This is going to lead to broken IR (you'll wind up with vectors where scalars are expected). To truly support earlier LLVM versions, you'll need a polyfill that does the reduction "by hand".
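For illustration, a reduction done "by hand" could follow the usual halving pattern: fold the upper half of the lanes onto the lower half until one lane remains. This is only a scalar sketch of the idea in Rust, with a hypothetical function name and a fixed 8-lane width; it is not the polyfill from this PR, and a real one would be emitted as IR.

```rust
/// Hypothetical "by hand" add-reduction over 8 lanes using log2(8) = 3 halving steps.
/// Each step adds lane `i + width` into lane `i`, then halves the active width.
pub fn reduce_add_by_hand(x: [f32; 8]) -> f32 {
    let mut lanes = x;
    let mut width = lanes.len();
    while width > 1 {
        width /= 2;
        for i in 0..width {
            lanes[i] += lanes[i + width];
        }
    }
    // Exactly one lane is left, so the result is a scalar rather than a vector.
    lanes[0]
}
```

For the lanes 1. through 8. this yields 36., and the return type is a scalar, which is what callers of a reduction expect.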

@gnzlbg
Contributor Author

gnzlbg commented Mar 14, 2018 via email

@hanna-kruppe
Contributor

I don't know why you didn't put the std::terminate there to begin with, but OK. I don't have much opinion on whether there should be a polyfill.

@hanna-kruppe
Contributor

Well, come to think of it, breaking the builds of people using these intrinsics seems bad. The current limitations around older LLVM versions that I know of keep things working with degraded quality (e.g., I think cfg(target_feature=...) are currently not set with an out-of-tree LLVM, but you don't get an error). But I defer to @alexcrichton

@gnzlbg
Contributor Author

gnzlbg commented Mar 14, 2018 via email

@gnzlbg
Contributor Author

gnzlbg commented Mar 14, 2018 via email

@alexcrichton
Member

Yes for functionality like this it needs to work in terms of linking and the compiler runs correctly on older LLVM versions (like 3.9). The feature itself, though, normally crashes and burns rustc if used. This means if these SIMD intrinsics are used on LLVM 3.9 then rustc would crash and burn. That's fine for now, and we can probably fix it later by having a #[cfg] which says "don't do the intrinsic thing, do the manual slow thing"

@sanxiyn
Member

sanxiyn commented Mar 15, 2018

Adding // min-llvm-version 5.0 to simd-intrinsic-generic-reduction.rs should make Travis happy.

@hsivonen
Member

Yes for functionality like this it needs to work in terms of linking and the compiler runs correctly on older LLVM versions (like 3.9)

What's the use case for supporting LLVM older than the one included in the official builds?

For the Linux distro scenario, I understand the use case of compiling with upstream LLVM without Rust-specific LLVM patches, but making it easy for distros to upgrade rustc without upgrading LLVM seems like a way to end up with a defective rustc that makes Rust look bad if e.g. SIMD code compiles but performs badly.

@gnzlbg
Contributor Author

gnzlbg commented Mar 15, 2018

Do not merge yet - I still need to add some fail tests and check for misuses. This is my largest rustc PR to date so I'd still like as much feedback as possible.


#else

void error_and_exit(const char* msg) {
Member

Another possible option (if you're willing to do this) would be to return nullptr and then bug! in rustc (to get a proper ICE message)
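A minimal Rust model of that suggestion, with assumptions stated up front: `build_vector_reduce_fadd` is a hypothetical stand-in for the C++ wrapper and returns `None` where the wrapper would return nullptr, a plain `panic!` stands in for rustc's `bug!`, and the version cutoff of 5 is an assumption (the thread mentions both 5.0 and 6.0).

```rust
/// Hypothetical model of the wrapper: `None` plays the role of `nullptr`
/// when the LLVM in use is too old to provide the reduction builder.
pub fn build_vector_reduce_fadd(llvm_major: u32, acc: f32, src: &[f32]) -> Option<f32> {
    // Assumption: the builders exist from LLVM 5 onwards.
    if llvm_major < 5 {
        return None;
    }
    Some(src.iter().fold(acc, |a, &v| a + v))
}

/// Caller side: a missing builder becomes a loud, descriptive failure
/// (standing in for `bug!`) instead of silently producing broken output.
pub fn vector_reduce_fadd(llvm_major: u32, acc: f32, src: &[f32]) -> f32 {
    match build_vector_reduce_fadd(llvm_major, acc, src) {
        Some(value) => value,
        None => panic!("LLVMRustBuildVectorReduceFAdd is not available in LLVM version < 5.0"),
    }
}
```

The design point is that the failure is reported on the Rust side with a proper message, rather than by terminating from inside the C++ shim.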

@alexcrichton
Member

@bors: r+

Thanks @gnzlbg!

@bors
Contributor

bors commented Mar 15, 2018

📌 Commit 19b81f6 has been approved by alexcrichton

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 15, 2018
@kennytm
Member

kennytm commented Mar 15, 2018

@bors r-

compile-fail/simd-intrinsic-generic-reduction.rs also needs that min-llvm-version header.

[00:59:00] ---- [compile-fail] compile-fail/simd-intrinsic-generic-reduction.rs stdout ----
[00:59:00] 	
[00:59:00] error: compiler encountered internal error
[00:59:00] status: exit code: 101
[00:59:00] command: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "/checkout/src/test/compile-fail/simd-intrinsic-generic-reduction.rs" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/compile-fail" "--target=x86_64-unknown-linux-gnu" "--error-format" "json" "-Zui-testing" "-C" "prefer-dynamic" "-o" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/compile-fail/simd-intrinsic-generic-reduction.stage2-x86_64-unknown-linux-gnu" "-Crpath" "-O" "-Zmiri" "-Zunstable-options" "-Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/compile-fail/simd-intrinsic-generic-reduction.stage2-x86_64-unknown-linux-gnu.aux" "-A" "unused"
[00:59:00] stdout:
[00:59:00] ------------------------------------------
[00:59:00] 
[00:59:00] ------------------------------------------
[00:59:00] stderr:
[00:59:00] ------------------------------------------
[00:59:00] {"message":"librustc_trans/builder.rs:966: LLVMRustBuildVectorReduceFAdd is not available in LLVM version < 5.0","code":null,"level":"error: internal compiler error","spans":[],"children":[],"rendered":"error: internal compiler error: librustc_trans/builder.rs:966: LLVMRustBuildVectorReduceFAdd is not available in LLVM version < 5.0\n\n"}
[00:59:00] thread 'rustc' panicked at 'Box<Any>', librustc_errors/lib.rs:540:9
[00:59:00] note: Run with `RUST_BACKTRACE=1` for a backtrace.
[00:59:00] 
[00:59:00] note: the compiler unexpectedly panicked. this is a bug.
[00:59:00] 
[00:59:00] note: we would appreciate a bug report: https://github.com/rust-lang/rust/blob/master/CONTRIBUTING.md#bug-reports
[00:59:00] 
[00:59:00] note: rustc 1.26.0-dev running on x86_64-unknown-linux-gnu
[00:59:00] 
[00:59:00] note: compiler flags: -Z ui-testing -Z miri -Z unstable-options -C prefer-dynamic -C rpath
[00:59:00] 
[00:59:00] 
[00:59:00] ------------------------------------------
[00:59:00] 
[00:59:00] thread '[compile-fail] compile-fail/simd-intrinsic-generic-reduction.rs' panicked at 'explicit panic', tools/compiletest/src/runtest.rs:2903:9
[00:59:00] 
[00:59:00] 
[00:59:00] failures:
[00:59:00]     [compile-fail] compile-fail/simd-intrinsic-generic-reduction.rs
[00:59:00] 
[00:59:00] test result: FAILED. 2304 passed; 1 failed; 15 ignored; 0 measured; 0 filtered out
[00:59:00] 
[00:59:00] thread 'main' panicked at 'Some tests failed', tools/compiletest/src/main.rs:478:22

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Mar 15, 2018
@gnzlbg
Contributor Author

gnzlbg commented Mar 15, 2018

@kennytm fixed

@kennytm
Member

kennytm commented Mar 15, 2018

@bors r=alexcrichton

@bors
Contributor

bors commented Mar 15, 2018

📌 Commit f173a4c has been approved by alexcrichton

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 15, 2018
@kennytm
Member

kennytm commented Mar 15, 2018

@bors r-

So this PR failed in asmjs because of the LLVM 5 issue again.

It seems compiletest/rustbuild cannot detect the dynamic librustc_trans, and thinks the asmjs test is using LLVM 6, although it is really using LLVM 4.

Let's // ignore-emscripten here until we have fixed compiletest/rustbuild?

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Mar 15, 2018
@gnzlbg
Contributor Author

gnzlbg commented Mar 16, 2018

@kennytm I've added // ignore-emscripten to both test files :)

@kennytm
Member

kennytm commented Mar 16, 2018

@bors r=alexcrichton

@bors
Contributor

bors commented Mar 16, 2018

📌 Commit 06148cb has been approved by alexcrichton

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 16, 2018
kennytm added a commit to kennytm/rust that referenced this pull request Mar 17, 2018
add intrinsics for portable packed simd vector reductions

bors added a commit that referenced this pull request Mar 17, 2018
Rollup of 8 pull requests

- Successful merges: #48943, #48960, #48983, #49055, #49057, #49077, #49082, #49083
- Failed merges:
@bors bors merged commit 06148cb into rust-lang:master Mar 17, 2018
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
8 participants