Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement repr(packed) for repr(simd) #117116

Merged
merged 4 commits into from
Dec 11, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 8 additions & 1 deletion compiler/rustc_codegen_llvm/src/intrinsic.rs
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ use crate::value::Value;
use rustc_codegen_ssa::base::{compare_simd_types, wants_msvc_seh, wants_wasm_eh};
use rustc_codegen_ssa::common::{IntPredicate, TypeKind};
use rustc_codegen_ssa::errors::{ExpectedPointerMutability, InvalidMonomorphization};
use rustc_codegen_ssa::mir::operand::OperandRef;
use rustc_codegen_ssa::mir::operand::{OperandRef, OperandValue};
use rustc_codegen_ssa::mir::place::PlaceRef;
use rustc_codegen_ssa::traits::*;
use rustc_hir as hir;
Expand Down Expand Up @@ -946,6 +946,13 @@ fn generic_simd_intrinsic<'ll, 'tcx>(
tcx.normalize_erasing_late_bound_regions(ty::ParamEnv::reveal_all(), callee_ty.fn_sig(tcx));
let arg_tys = sig.inputs();

// Vectors must be immediates (non-power-of-2 #[repr(packed)] are not)
for (ty, arg) in arg_tys.iter().zip(args) {
if ty.is_simd() && !matches!(arg.val, OperandValue::Immediate(_)) {
return_error!(InvalidMonomorphization::SimdArgument { span, name, ty: *ty });
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment seems no longer accurate... maybe that got changed by #125311? Doing simd_mul etc on a packed SIMD type works just fine now. Even in debug builds this generates the code one would hope for.


if name == sym::simd_select_bitmask {
let (len, _) = require_simd!(arg_tys[1], SimdArgument);

Expand Down
18 changes: 16 additions & 2 deletions compiler/rustc_ty_utils/src/layout.rs
Original file line number Diff line number Diff line change
Expand Up @@ -435,7 +435,21 @@ fn layout_of_uncached<'tcx>(
.size
.checked_mul(e_len, dl)
.ok_or_else(|| error(cx, LayoutError::SizeOverflow(ty)))?;
let align = dl.vector_align(size);

let (abi, align) = if def.repr().packed() && !e_len.is_power_of_two() {
// Non-power-of-two vectors have padding up to the next power-of-two.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iirc LLVM doesn't guarantee vectors don't have more padding than just to the next power of two, e.g. padding <2 x i8> to 128-bits.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://llvm.org/docs/LangRef.html#vector-type seems to say that LLVM guarantees that vectors have no padding in their in memory representation? Padding <2 x i8> to 128 bits is something that LLVM does inside the vector registers (e.g. on aarch64), but never in memory.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's only for load/store, LLVM vector types still have big enough alignment that Rust requires padding in many cases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it's guaranteed, but rustc does make the assumption (somewhere, I forget where)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

core::mem::size_of:

More specifically, this is the offset in bytes between successive elements in an array with that item type including alignment padding.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, I meant rustc assumes that npot vectors round up to the next power of two.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's only for load/store, LLVM vector types still have big enough alignment that Rust requires padding in many cases.

I thought LLVM types didn't have any sort of intrinsic alignment? And even if they have a preferred alignment that's too big, can't you just always annotate load/store instructions with the smaller alignment?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's only for load/store, LLVM vector types still have big enough alignment that Rust requires padding in many cases.

I thought LLVM types didn't have any sort of intrinsic alignment? And even if they have a preferred alignment that's too big, can't you just always annotate load/store instructions with the smaller alignment?

they do have an intrinsic ABI alignment, it's used for default alignment of load/store/alloca when not explicitly specified and for stack spilling and a few other things

Copy link
Member

@programmerjake programmerjake Oct 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it's guaranteed, but rustc does make the assumption (somewhere, I forget where)

just found this (yeah, I know that's not what you meant, but I couldn't resist):
https://lang-team.rust-lang.org/frequently-requested-changes.html#size--stride

Rust assumes that the size of an object is equivalent to the stride of an object

// If we're a packed repr, remove the padding while keeping the alignment as close
// to a vector as possible.
(
Abi::Aggregate { sized: true },
AbiAndPrefAlign {
abi: Align::max_for_offset(size),
pref: dl.vector_align(size).pref,
},
)
} else {
(Abi::Vector { element: e_abi, count: e_len }, dl.vector_align(size))
};
let size = size.align_to(align.abi);

// Compute the placement of the vector fields:
Expand All @@ -448,7 +462,7 @@ fn layout_of_uncached<'tcx>(
tcx.mk_layout(LayoutS {
variants: Variants::Single { index: FIRST_VARIANT },
fields,
abi: Abi::Vector { element: e_abi, count: e_len },
abi,
largest_niche: e_ly.largest_niche,
size,
align,
Expand Down
59 changes: 59 additions & 0 deletions tests/ui/simd/repr_packed.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,59 @@
// run-pass

#![feature(repr_simd, platform_intrinsics)]
#![allow(non_camel_case_types)]

#[repr(simd, packed)]
struct Simd<T, const N: usize>([T; N]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From how packed usually works, I would expect this to mean that the type has alignment 1. But that doesn't seem to be the case, instead the alignment is the largest possible for the size, or something like that?

What happens with packed(N)?

Would be good to have the interaction of simd and packed documented somewhere.


#[repr(simd)]
struct FullSimd<T, const N: usize>([T; N]);

fn check_size_align<T, const N: usize>() {
use std::mem;
assert_eq!(mem::size_of::<Simd<T, N>>(), mem::size_of::<[T; N]>());
assert_eq!(mem::size_of::<Simd<T, N>>() % mem::align_of::<Simd<T, N>>(), 0);
}

fn check_ty<T>() {
check_size_align::<T, 1>();
check_size_align::<T, 2>();
check_size_align::<T, 3>();
check_size_align::<T, 4>();
check_size_align::<T, 8>();
check_size_align::<T, 9>();
check_size_align::<T, 15>();
}

extern "platform-intrinsic" {
fn simd_add<T>(a: T, b: T) -> T;
}

fn main() {
check_ty::<u8>();
check_ty::<i16>();
check_ty::<u32>();
check_ty::<i64>();
check_ty::<usize>();
check_ty::<f32>();
check_ty::<f64>();

unsafe {
// powers-of-two have no padding and work as usual
let x: Simd<f64, 4> =
simd_add(Simd::<f64, 4>([0., 1., 2., 3.]), Simd::<f64, 4>([2., 2., 2., 2.]));
assert_eq!(std::mem::transmute::<_, [f64; 4]>(x), [2., 3., 4., 5.]);

// non-powers-of-two have padding and need to be expanded to full vectors
fn load<T, const N: usize>(v: Simd<T, N>) -> FullSimd<T, N> {
unsafe {
let mut tmp = core::mem::MaybeUninit::<FullSimd<T, N>>::uninit();
std::ptr::copy_nonoverlapping(&v as *const _, tmp.as_mut_ptr().cast(), 1);
tmp.assume_init()
}
}
let x: FullSimd<f64, 3> =
simd_add(load(Simd::<f64, 3>([0., 1., 2.])), load(Simd::<f64, 3>([2., 2., 2.])));
assert_eq!(x.0, [2., 3., 4.]);
}
}
Loading