AVX-512 intrinsics #146

Closed
hdevalence opened this issue Oct 23, 2017 · 5 comments

@hdevalence
Contributor

I tried getting an AVX-512 intrinsic to work and ran into a bunch of difficulties. Some points:

  • It looks like the combination of AVX512's masks and AVX512VL (which lets AVX512 instructions operate on 128- or 256-bit vectors) means that for most instructions there's one C intrinsic for each of {no mask, write mask, zero mask} x {xmm, ymm, zmm}.

  • These would probably be good to generate with a macro?

  • Because AVX512 uses mask registers, the constify! macro hacks are probably not needed for mask instructions.

  • The list of intrinsics linked in the readme doesn't seem to have non-masked versions; I don't know if this is just an accident of how it was made.

  • Trying to use the int_x86_avx512_mask_pmul_dq_512 intrinsic from that list using

#[link_name = "llvm.x86.avx512.mask.pmul.dq.512"]
fn mask_pmul_dq_512(a: i32x16, b: i32x16, src: i64x8, k: i8) -> i64x8;

didn't work, failing with

rustc: /checkout/src/llvm/include/llvm/Support/Casting.h:236: typename llvm::cast_retty<X, Y*>::ret_type llvm::cast(Y*) [with X = llvm::VectorType; Y = llvm::Type; typename llvm::cast_retty<X, Y*>::ret_type = llvm::VectorType*]: Assertion `isa<X>(Val) && "cast<Ty>() argument of incompatible type!"' failed.

which I guess means I was linking to the intrinsic incorrectly?

@alexcrichton reduced to this minimal example for the vpmuldq instruction: https://godbolt.org/g/VMCtYy and found https://github.com/rust-lang/rust/blob/4c053db233d69519b548e5b8ed7192d0783e582a/src/librustc_trans/cabi_x86_64.rs#L30-L31 which hardcodes the biggest vector as 256 bits (the size of a ymm register).
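
To make the {no mask, write mask, zero mask} x {xmm, ymm, zmm} grid from the list above concrete, here is a rough sketch of how a macro could stamp out the variants. It is not stdsimd code: it models vectors as plain arrays so it runs on any CPU, all function names are hypothetical (loosely following Intel's `_mm512_add_epi32` / `_mm512_mask_add_epi32` / `_mm512_maskz_add_epi32` naming pattern), and it uses a `u16` mask for every width as a simplification.

```rust
// Scalar model of AVX-512 masking. One macro invocation generates the three
// mask variants of a lane-wise binary operation for one vector width.
// All generated names are hypothetical; real intrinsics would wrap LLVM calls.
macro_rules! mask_variants {
    ($plain:ident, $mask:ident, $maskz:ident, $elem:ty, $lanes:expr, $op:expr) => {
        /// No mask: every lane is computed.
        fn $plain(a: [$elem; $lanes], b: [$elem; $lanes]) -> [$elem; $lanes] {
            let mut out = [0 as $elem; $lanes];
            for i in 0..$lanes {
                out[i] = ($op)(a[i], b[i]);
            }
            out
        }

        /// Write mask (merge): lanes whose bit in `k` is clear keep `src`.
        fn $mask(
            src: [$elem; $lanes],
            k: u16,
            a: [$elem; $lanes],
            b: [$elem; $lanes],
        ) -> [$elem; $lanes] {
            let mut out = src;
            for i in 0..$lanes {
                if ((k >> i) & 1) == 1 {
                    out[i] = ($op)(a[i], b[i]);
                }
            }
            out
        }

        /// Zero mask: lanes whose bit in `k` is clear become zero.
        fn $maskz(k: u16, a: [$elem; $lanes], b: [$elem; $lanes]) -> [$elem; $lanes] {
            $mask([0 as $elem; $lanes], k, a, b)
        }
    };
}

// One invocation per register width: 4, 8, and 16 lanes of i32
// stand in for xmm, ymm, and zmm.
mask_variants!(add_epi32_128, mask_add_epi32_128, maskz_add_epi32_128, i32, 4, i32::wrapping_add);
mask_variants!(add_epi32_256, mask_add_epi32_256, maskz_add_epi32_256, i32, 8, i32::wrapping_add);
mask_variants!(add_epi32_512, mask_add_epi32_512, maskz_add_epi32_512, i32, 16, i32::wrapping_add);
```

With a mask of `0b0101`, only lanes 0 and 2 are computed: the write-mask variant leaves lanes 1 and 3 equal to `src`, while the zero-mask variant sets them to 0. Because the mask is an ordinary runtime value here, nothing in this sketch needs a compile-time constant argument.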

@gnzlbg
Contributor

gnzlbg commented Oct 24, 2017

@hdevalence is there a rust bug for this?

@alexcrichton
Member

I've opened up a PR for some rustc features at f764eaf453162cd19ef484ece07cc21e14dfb2c1 rust-lang/rust#45528

@hdevalence
Contributor Author

Not sure if this is related to the above or unrelated, but building stdsimd at all on skylake-avx512 fails:

$ RUSTFLAGS="-C target_cpu=skylake-avx512" cargo build --release
   Compiling stdsimd v0.0.2 (file:///<stripped>/stdsimd)
LLVM ERROR: Cannot select: t83: v32i16 = X86ISD::VINSERT t82, t78, Constant:i64<0>
  t82: v32i16 = bitcast t81
    t81: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
      t80: i32 = Constant<0>
      t80: i32 = Constant<0>
      t80: i32 = Constant<0>
      t80: i32 = Constant<0>
      t80: i32 = Constant<0>
      t80: i32 = Constant<0>
      t80: i32 = Constant<0>
      t80: i32 = Constant<0>
      t80: i32 = Constant<0>
      t80: i32 = Constant<0>
      t80: i32 = Constant<0>
      t80: i32 = Constant<0>
      t80: i32 = Constant<0>
      t80: i32 = Constant<0>
      t80: i32 = Constant<0>
      t80: i32 = Constant<0>
  t78: i16 = truncate t5
    t5: i64,ch = CopyFromReg t0, Register:i64 %vreg2
      t4: i64 = Register %vreg2
  t79: i64 = Constant<0>
In function: _ZN61_$LT$stdsimd..v512..u16x32$u20$as$u20$core..fmt..LowerHex$GT$3fmt17h59e1149cebeaa07fE
error: Could not compile `stdsimd`.

To learn more, run the command again with --verbose.

This is with rustc 1.22.0-nightly (b7960878b 2017-10-18).

@alexcrichton
Member

@hdevalence hm... fascinating! In general you shouldn't need to build stdsimd with -C target-cpu (it uses #[target_feature] to do that on a per-function basis)

That being said this still shouldn't cause a problem! Mind opening a separate issue for that?
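
For readers unfamiliar with the per-function approach mentioned above, a minimal sketch follows. It is not taken from stdsimd: the function names are hypothetical, AVX2 is used only because it is widely available, and the attribute is written in the `#[target_feature(enable = "...")]` form accepted by current stable Rust, which may differ from the spelling the nightly in this thread expected.

```rust
// Portable fallback: compiled for the baseline target, valid on every CPU.
fn sum_scalar(xs: &[i32]) -> i32 {
    xs.iter().fold(0i32, |acc, &x| acc.wrapping_add(x))
}

// Same body, but the attribute lets the compiler use AVX2 instructions
// inside this one function only. It is `unsafe` to call because the caller
// must guarantee that the CPU actually supports AVX2.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(xs: &[i32]) -> i32 {
    xs.iter().fold(0i32, |acc, &x| acc.wrapping_add(x))
}

// Dispatcher: picks the AVX2 version at runtime when it is available,
// so the crate as a whole needs no `-C target-cpu` flag.
fn sum(xs: &[i32]) -> i32 {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was checked on the line above.
            return unsafe { sum_avx2(xs) };
        }
    }
    sum_scalar(xs)
}
```

Calling `sum(&[1, 2, 3, 4])` gives the same result on either path; only the instructions the compiler is allowed to emit differ.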

@alexcrichton
Member

Ok rust-lang/rust#45528 is now merged so I think these bugs should be fixed and we should be ready to go!
