ARM64-SVE: Fix conditional select for Zeroing predicates #102904 #105737

Merged
merged 9 commits into from
Aug 13, 2024
18 changes: 13 additions & 5 deletions src/coreclr/jit/hwintrinsic.h
@@ -236,16 +236,18 @@ enum HWIntrinsicFlag : unsigned int
// then the intrinsic should be switched to a scalar only version.
HW_Flag_HasScalarInputVariant = 0x2000000,

// The intrinsic uses a mask in arg1 to select elements present in the result, and must use a low vector register.
HW_Flag_LowVectorOperation = 0x4000000,

// The intrinsic uses a mask in arg1 to select elements present in the result, which zeros inactive elements
// (instead of merging).
HW_Flag_ZeroingMaskedOperation = 0x8000000,

#endif // TARGET_XARCH

// The intrinsic is a FusedMultiplyAdd intrinsic
HW_Flag_FmaIntrinsic = 0x40000000,

#if defined(TARGET_ARM64)
// The intrinsic uses a mask in arg1 to select elements present in the result, and must use a low vector register.
HW_Flag_LowVectorOperation = 0x4000000,
#endif

HW_Flag_CanBenefitFromConstantProp = 0x80000000,
};

@@ -955,6 +957,12 @@ struct HWIntrinsicInfo
return (flags & HW_Flag_HasScalarInputVariant) != 0;
}

static bool IsZeroingMaskedOperation(NamedIntrinsic id)
{
const HWIntrinsicFlag flags = lookupFlags(id);
return (flags & HW_Flag_ZeroingMaskedOperation) != 0;
}
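
The helper added above is a single bit test on the intrinsic's flag word. As a minimal, self-contained sketch of that pattern: the two flag values below are copied from this diff, while `NI_Example_MergingOp`, `NI_Example_ZeroingLoad`, and the table inside `lookupFlags` are hypothetical stand-ins for the JIT's real intrinsic table, not its actual contents.

```cpp
#include <cassert>

// Flag values copied from the diff; everything else here is illustrative.
enum HWIntrinsicFlag : unsigned int
{
    HW_Flag_NoFlag                 = 0,
    HW_Flag_LowVectorOperation     = 0x4000000,
    HW_Flag_ZeroingMaskedOperation = 0x8000000,
};

// Hypothetical intrinsic ids, standing in for the JIT's NamedIntrinsic enum.
enum NamedIntrinsic
{
    NI_Example_MergingOp,
    NI_Example_ZeroingLoad,
    NI_Example_Count
};

// Hypothetical flag table; the real JIT derives flags from its intrinsic list headers.
static HWIntrinsicFlag lookupFlags(NamedIntrinsic id)
{
    static const unsigned int table[NI_Example_Count] = {
        HW_Flag_LowVectorOperation,                                  // merging op
        HW_Flag_LowVectorOperation | HW_Flag_ZeroingMaskedOperation, // zeroing load
    };
    assert(id >= 0 && id < NI_Example_Count);
    return static_cast<HWIntrinsicFlag>(table[id]);
}

// Same shape as the helper in the diff: true when the zeroing bit is set.
static bool IsZeroingMaskedOperation(NamedIntrinsic id)
{
    const HWIntrinsicFlag flags = lookupFlags(id);
    return (flags & HW_Flag_ZeroingMaskedOperation) != 0;
}
```

Because each flag occupies its own bit, an intrinsic can carry `HW_Flag_LowVectorOperation` and `HW_Flag_ZeroingMaskedOperation` together and each query stays independent.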

static NamedIntrinsic GetScalarInputVariant(NamedIntrinsic id)
{
assert(HasScalarInputVariant(id));
158 changes: 79 additions & 79 deletions src/coreclr/jit/hwintrinsiclistarm64sve.h

Large diffs are not rendered by default.

6 changes: 5 additions & 1 deletion src/coreclr/jit/lowerarmarch.cpp
@@ -4028,9 +4028,13 @@ GenTree* Lowering::LowerHWIntrinsicCndSel(GenTreeHWIntrinsic* cndSelNode)
// `trueValue`
GenTreeHWIntrinsic* nestedCndSel = op2->AsHWIntrinsic();
GenTree* nestedOp1 = nestedCndSel->Op(1);
GenTree* nestedOp2 = nestedCndSel->Op(2);
assert(varTypeIsMask(nestedOp1));
assert(nestedOp2->OperIsHWIntrinsic());

if (nestedOp1->IsMaskAllBitsSet())
NamedIntrinsic nestedOp2Id = nestedOp2->AsHWIntrinsic()->GetHWIntrinsicId();

if (nestedOp1->IsMaskAllBitsSet() && !HWIntrinsicInfo::IsZeroingMaskedOperation(nestedOp2Id))
{
GenTree* nestedOp2 = nestedCndSel->Op(2);
GenTree* nestedOp3 = nestedCndSel->Op(3);
@@ -56,17 +56,7 @@ namespace JIT.HardwareIntrinsics.Arm
test.RunStructFldScenario();

// Validates using inside ConditionalSelect with value falseValue
// Currently, using this operation in ConditionalSelect() gives incorrect result
// when falseReg == targetReg because this instruction uses Pg/Z to update the targetReg
// instead of Pg/M to merge it. As such, the value of falseReg is lost. Ideally, such
// instructions should be marked similar to RMW (a different flag name) to make sure that
// we do not assign falseReg/targetReg same. Then, we would do something like this:
//
// ldnf1sh target, pg/z, [x0]
// sel mask, target, target, falseReg
//
// This needs more careful thinking, so disabling it for now.
// test.ConditionalSelect_FalseOp();
test.ConditionalSelect_FalseOp();

// Validates using inside ConditionalSelect with zero falseValue
test.ConditionalSelect_ZeroOp();
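
The comment removed in this hunk describes why a zeroing (`Pg/Z`) instruction cannot write straight into the register that holds the false value, and why a separate `sel` is needed. The lane-level model below is only a sketch of that reasoning under simplifying assumptions: four 32-bit lanes, plain arrays in place of SVE registers, and hypothetical function names (`loadZeroing`, `sel`, `selectSharingRegister`, `selectWithSel`) that do not exist in the JIT.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLanes = 4; // assumed lane count, for illustration only
using Vec  = std::array<std::int32_t, kLanes>;
using Pred = std::array<bool, kLanes>;

// Models a zeroing-predicated load (Pg/Z): inactive lanes become 0.
Vec loadZeroing(const Pred& pg, const Vec& memory)
{
    Vec result{};
    for (std::size_t i = 0; i < kLanes; ++i)
    {
        result[i] = pg[i] ? memory[i] : 0;
    }
    return result;
}

// Models SEL: per lane, take trueValue where the predicate is set, else falseValue.
Vec sel(const Pred& pg, const Vec& trueValue, const Vec& falseValue)
{
    Vec result{};
    for (std::size_t i = 0; i < kLanes; ++i)
    {
        result[i] = pg[i] ? trueValue[i] : falseValue[i];
    }
    return result;
}

// The problematic case: falseReg == targetReg, so the zeroing load
// overwrites the false value in the inactive lanes.
Vec selectSharingRegister(const Pred& pg, const Vec& memory, const Vec& falseValue)
{
    Vec target = falseValue;          // target register initially holds falseValue
    target     = loadZeroing(pg, memory); // inactive lanes are now 0, falseValue is lost
    return target;
}

// The sequence sketched in the removed comment: load with Pg/Z, then SEL.
Vec selectWithSel(const Pred& pg, const Vec& memory, const Vec& falseValue)
{
    const Vec loaded = loadZeroing(pg, memory);
    return sel(pg, loaded, falseValue);
}
```

With the predicate `{true, false, true, false}`, memory `{10, 20, 30, 40}`, and false value `{-1, -2, -3, -4}`, `selectSharingRegister` yields `{10, 0, 30, 0}` while `selectWithSel` yields `{10, -2, 30, -4}`. This is the distinction the new `IsZeroingMaskedOperation` check in `LowerHWIntrinsicCndSel` appears to preserve by not folding the nested select for zeroing intrinsics.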
@@ -56,17 +56,7 @@ namespace JIT.HardwareIntrinsics.Arm
test.RunStructFldScenario();

// Validates using inside ConditionalSelect with value falseValue
// Currently, using this operation in ConditionalSelect() gives incorrect result
// when falseReg == targetReg because this instruction uses Pg/Z to update the targetReg
// instead of Pg/M to merge it. As such, the value of falseReg is lost. Ideally, such
// instructions should be marked similar to RMW (a different flag name) to make sure that
// we do not assign falseReg/targetReg same. Then, we would do something like this:
//
// ldnf1sh target, pg/z, [x0]
// sel mask, target, target, falseReg
//
// This needs more careful thinking, so disabling it for now.
// test.ConditionalSelect_FalseOp();
test.ConditionalSelect_FalseOp();

// Validates using inside ConditionalSelect with zero falseValue
test.ConditionalSelect_ZeroOp();
@@ -56,17 +56,7 @@ namespace JIT.HardwareIntrinsics.Arm
test.RunStructFldScenario();

// Validates using inside ConditionalSelect with value falseValue
// Currently, using this operation in ConditionalSelect() gives incorrect result
// when falseReg == targetReg because this instruction uses Pg/Z to update the targetReg
// instead of Pg/M to merge it. As such, the value of falseReg is lost. Ideally, such
// instructions should be marked similar to RMW (a different flag name) to make sure that
// we do not assign falseReg/targetReg same. Then, we would do something like this:
//
// ldnf1sh target, pg/z, [x0]
// sel mask, target, target, falseReg
//
// This needs more careful thinking, so disabling it for now.
// test.ConditionalSelect_FalseOp();
test.ConditionalSelect_FalseOp();

// Validates using inside ConditionalSelect with zero falseValue
test.ConditionalSelect_ZeroOp();
@@ -47,17 +47,7 @@ namespace JIT.HardwareIntrinsics.Arm
test.RunStructFldScenario();

// Validates using inside ConditionalSelect with value falseValue
// Currently, using this operation in ConditionalSelect() gives incorrect result
// when falseReg == targetReg because this instruction uses Pg/Z to update the targetReg
// instead of Pg/M to merge it. As such, the value of falseReg is lost. Ideally, such
// instructions should be marked similar to RMW (a different flag name) to make sure that
// we do not assign falseReg/targetReg same. Then, we would do something like this:
//
// ldnf1sh target, pg/z, [x0]
// sel mask, target, target, falseReg
//
// This needs more careful thinking, so disabling it for now.
// test.ConditionalSelect_FalseOp();
test.ConditionalSelect_FalseOp();

// Validates using inside ConditionalSelect with zero falseValue
test.ConditionalSelect_ZeroOp();