Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Release builds using AVX code produce incorrect output #54583

Closed
vladglv opened this issue Sep 26, 2018 · 6 comments · Fixed by #55073
Closed

Release builds using AVX code produce incorrect output #54583

vladglv opened this issue Sep 26, 2018 · 6 comments · Fixed by #55073

Comments

@vladglv
Copy link

vladglv commented Sep 26, 2018

Code

#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

fn main() {
    unsafe {
        let f = _mm256_set_pd(2.0, 2.0, 2.0, 2.0);
        let r = _mm256_mul_pd(f, f);

        println!("{:?}", r);
    }
}

Output

The expected output is

__m256d(4.0, 4.0, 4.0, 4.0)

The actual output is

__m256d(4.0, 4.0, 0.0, 0.0)

Notes

  • The code built in debug mode produces the expected result. This issue only occurs in a release build.
  • Using _mm instead of _mm256 results in correct output in both debug and release mode.
  • Code using only _mm or _mm256 yields the same performance (I have another piece of code for that. I can provide it if needed).

Versions

The issue can be reproduced with 1.30.0-nightly (2018-09-24), 1.30.0-beta.7, 1.29.0.

@nicokoch
Copy link
Contributor

nicokoch commented Sep 26, 2018

shouldn‘t be a #[target_feature(enable = "avx2")] in there?

Edit: It works when you add the target feature. playground link

I think the compiler should deny the code from the OP though, instead of producing wrong code.

@hanna-kruppe
Copy link
Contributor

Looks like a duplicate of #50154 at a glance.

@vladglv
Copy link
Author

vladglv commented Sep 26, 2018

Yes, I forgot to add #[target_feature(enable = "avx2")]. I agree with @nicokoch. The code should not compile. I think that it should be the case for both debug and release builds.

Do you think that clippy should provide warnings for this kind of code?

@theotherphil
Copy link
Contributor

I just ran into this issue and spent quite a while getting very confused. It's very surprising that code using AVX2 intrinsics compiles and then behaves nonsensically without the correct target feature attribute.

alexcrichton added a commit to alexcrichton/rust that referenced this issue Oct 14, 2018
The issue of passing around SIMD types as values between functions has
seen [quite a lot] of [discussion], and although we thought [we fixed
it][quite a lot] it [wasn't]! This PR is a change to rustc to, again,
try to fix this issue.

The fundamental problem here remains the same, if a SIMD vector argument
is passed by-value in LLVM's function type, then if the caller and
callee disagree on target features a miscompile happens. We solve this
by never passing SIMD vectors by-value, but LLVM will still thwart us
with its argument promotion pass to promote by-ref SIMD arguments to
by-val SIMD arguments.

This commit is an attempt to thwart LLVM thwarting us. We, just before
codegen, will take yet another look at the LLVM module and demote any
by-value SIMD arguments we see. This is a very manual attempt by us to
ensure the codegen for a module keeps working, and it unfortunately is
likely producing suboptimal code, even in release mode. The saving grace
for this, in theory, is that if SIMD types are passed by-value across
a boundary in release mode it's pretty unlikely to be performance
sensitive (as it's already doing a load/store, and otherwise
perf-sensitive bits should be inlined).

The implementation here is basically a big wad of C++. It was largely
copied from LLVM's own argument promotion pass, only doing the reverse.
In local testing this...

Closes rust-lang#50154
Closes rust-lang#52636
Closes rust-lang#54583
Closes rust-lang#55059

[quite a lot]: rust-lang#47743
[discussion]: rust-lang#44367
[wasn't]: rust-lang#50154
bors added a commit that referenced this issue Oct 14, 2018
rustc: Fix (again) simd vectors by-val in ABI

The issue of passing around SIMD types as values between functions has
seen [quite a lot] of [discussion], and although we thought [we fixed
it][quite a lot] it [wasn't]! This PR is a change to rustc to, again,
try to fix this issue.

The fundamental problem here remains the same, if a SIMD vector argument
is passed by-value in LLVM's function type, then if the caller and
callee disagree on target features a miscompile happens. We solve this
by never passing SIMD vectors by-value, but LLVM will still thwart us
with its argument promotion pass to promote by-ref SIMD arguments to
by-val SIMD arguments.

This commit is an attempt to thwart LLVM thwarting us. We, just before
codegen, will take yet another look at the LLVM module and demote any
by-value SIMD arguments we see. This is a very manual attempt by us to
ensure the codegen for a module keeps working, and it unfortunately is
likely producing suboptimal code, even in release mode. The saving grace
for this, in theory, is that if SIMD types are passed by-value across
a boundary in release mode it's pretty unlikely to be performance
sensitive (as it's already doing a load/store, and otherwise
perf-sensitive bits should be inlined).

The implementation here is basically a big wad of C++. It was largely
copied from LLVM's own argument promotion pass, only doing the reverse.
In local testing this...

Closes #50154
Closes #52636
Closes #54583
Closes #55059

[quite a lot]: #47743
[discussion]: #44367
[wasn't]: #50154
alexcrichton added a commit to alexcrichton/rust that referenced this issue Oct 16, 2018
The issue of passing around SIMD types as values between functions has
seen [quite a lot] of [discussion], and although we thought [we fixed
it][quite a lot] it [wasn't]! This PR is a change to rustc to, again,
try to fix this issue.

The fundamental problem here remains the same, if a SIMD vector argument
is passed by-value in LLVM's function type, then if the caller and
callee disagree on target features a miscompile happens. We solve this
by never passing SIMD vectors by-value, but LLVM will still thwart us
with its argument promotion pass to promote by-ref SIMD arguments to
by-val SIMD arguments.

This commit is an attempt to thwart LLVM thwarting us. We, just before
codegen, will take yet another look at the LLVM module and demote any
by-value SIMD arguments we see. This is a very manual attempt by us to
ensure the codegen for a module keeps working, and it unfortunately is
likely producing suboptimal code, even in release mode. The saving grace
for this, in theory, is that if SIMD types are passed by-value across
a boundary in release mode it's pretty unlikely to be performance
sensitive (as it's already doing a load/store, and otherwise
perf-sensitive bits should be inlined).

The implementation here is basically a big wad of C++. It was largely
copied from LLVM's own argument promotion pass, only doing the reverse.
In local testing this...

Closes rust-lang#50154
Closes rust-lang#52636
Closes rust-lang#54583
Closes rust-lang#55059

[quite a lot]: rust-lang#47743
[discussion]: rust-lang#44367
[wasn't]: rust-lang#50154
kennytm added a commit to kennytm/rust that referenced this issue Oct 18, 2018
rustc: Fix (again) simd vectors by-val in ABI

The issue of passing around SIMD types as values between functions has
seen [quite a lot] of [discussion], and although we thought [we fixed
it][quite a lot] it [wasn't]! This PR is a change to rustc to, again,
try to fix this issue.

The fundamental problem here remains the same, if a SIMD vector argument
is passed by-value in LLVM's function type, then if the caller and
callee disagree on target features a miscompile happens. We solve this
by never passing SIMD vectors by-value, but LLVM will still thwart us
with its argument promotion pass to promote by-ref SIMD arguments to
by-val SIMD arguments.

This commit is an attempt to thwart LLVM thwarting us. We, just before
codegen, will take yet another look at the LLVM module and demote any
by-value SIMD arguments we see. This is a very manual attempt by us to
ensure the codegen for a module keeps working, and it unfortunately is
likely producing suboptimal code, even in release mode. The saving grace
for this, in theory, is that if SIMD types are passed by-value across
a boundary in release mode it's pretty unlikely to be performance
sensitive (as it's already doing a load/store, and otherwise
perf-sensitive bits should be inlined).

The implementation here is basically a big wad of C++. It was largely
copied from LLVM's own argument promotion pass, only doing the reverse.
In local testing this...

Closes rust-lang#50154
Closes rust-lang#52636
Closes rust-lang#54583
Closes rust-lang#55059

[quite a lot]: rust-lang#47743
[discussion]: rust-lang#44367
[wasn't]: rust-lang#50154
alexcrichton added a commit to alexcrichton/rust that referenced this issue Oct 19, 2018
The issue of passing around SIMD types as values between functions has
seen [quite a lot] of [discussion], and although we thought [we fixed
it][quite a lot] it [wasn't]! This PR is a change to rustc to, again,
try to fix this issue.

The fundamental problem here remains the same, if a SIMD vector argument
is passed by-value in LLVM's function type, then if the caller and
callee disagree on target features a miscompile happens. We solve this
by never passing SIMD vectors by-value, but LLVM will still thwart us
with its argument promotion pass to promote by-ref SIMD arguments to
by-val SIMD arguments.

This commit is an attempt to thwart LLVM thwarting us. We, just before
codegen, will take yet another look at the LLVM module and demote any
by-value SIMD arguments we see. This is a very manual attempt by us to
ensure the codegen for a module keeps working, and it unfortunately is
likely producing suboptimal code, even in release mode. The saving grace
for this, in theory, is that if SIMD types are passed by-value across
a boundary in release mode it's pretty unlikely to be performance
sensitive (as it's already doing a load/store, and otherwise
perf-sensitive bits should be inlined).

The implementation here is basically a big wad of C++. It was largely
copied from LLVM's own argument promotion pass, only doing the reverse.
In local testing this...

Closes rust-lang#50154
Closes rust-lang#52636
Closes rust-lang#54583
Closes rust-lang#55059

[quite a lot]: rust-lang#47743
[discussion]: rust-lang#44367
[wasn't]: rust-lang#50154
Manishearth added a commit to Manishearth/rust that referenced this issue Oct 20, 2018
The issue of passing around SIMD types as values between functions has
seen [quite a lot] of [discussion], and although we thought [we fixed
it][quite a lot] it [wasn't]! This PR is a change to rustc to, again,
try to fix this issue.

The fundamental problem here remains the same, if a SIMD vector argument
is passed by-value in LLVM's function type, then if the caller and
callee disagree on target features a miscompile happens. We solve this
by never passing SIMD vectors by-value, but LLVM will still thwart us
with its argument promotion pass to promote by-ref SIMD arguments to
by-val SIMD arguments.

This commit is an attempt to thwart LLVM thwarting us. We, just before
codegen, will take yet another look at the LLVM module and demote any
by-value SIMD arguments we see. This is a very manual attempt by us to
ensure the codegen for a module keeps working, and it unfortunately is
likely producing suboptimal code, even in release mode. The saving grace
for this, in theory, is that if SIMD types are passed by-value across
a boundary in release mode it's pretty unlikely to be performance
sensitive (as it's already doing a load/store, and otherwise
perf-sensitive bits should be inlined).

The implementation here is basically a big wad of C++. It was largely
copied from LLVM's own argument promotion pass, only doing the reverse.
In local testing this...

Closes rust-lang#50154
Closes rust-lang#52636
Closes rust-lang#54583
Closes rust-lang#55059

[quite a lot]: rust-lang#47743
[discussion]: rust-lang#44367
[wasn't]: rust-lang#50154
@andersk
Copy link
Contributor

andersk commented Dec 22, 2018

This test case still produces the incorrect result with -C opt-level=3 (or Cargo’s --release), in both 1.31.1 and 1.33.0-nightly (e40548b 2018-12-21).

Playground link.

I guess this is because #55073 was reverted as #55281? This should maybe be reopened then.

@nikic
Copy link
Contributor

nikic commented Dec 22, 2018

Yes, the fix was reverted. I don't think it's necessary to reopen this one, as the general issue is already tracked at #50154. This is just one more manifestation of the same problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants