Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARM64-SVE: Add FloatingPointExponentialAccelerator #104649

Merged
merged 3 commits into from
Jul 12, 2024

Conversation

amanasifkhalid
Copy link
Member

@amanasifkhalid amanasifkhalid commented Jul 9, 2024

Part of #99957. I added a new test template that will probably go away soon; in #104478, I can update the fexpa tests to use the same template as the ConvertTo* APIs, and add some extra templating to wrap the API's result with the appropriate BitConverter method for the ConditionalSelect scenarios (I'm assuming the ConditionalSelect scenarios aren't testing anything interesting for this API, though).

Test output:

Starting test: .\Core_Root\corerun.exe .\HardwareIntrinsics_Arm_r\HardwareIntrinsics_Arm_r.dll Sve_FloatingPointExponentialAccelerator
===================Running default===================
------------------- {} -------------------
Passed test: _Sve_r::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_FloatingPointExponentialAccelerator_float_uint() : 7
Passed test: _Sve_r::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_FloatingPointExponentialAccelerator_double_ulong() : 7
===================Running jitstress===================
------------------- {'JitMinOpts': '1'} -------------------
------------------- {'JitStress': '1'} -------------------
------------------- {'JitStress': '2'} -------------------
------------------- {'JitStress': '1', 'TieredCompilation': '1'} -------------------
------------------- {'JitStress': '2', 'TieredCompilation': '1'} -------------------
------------------- {'TailcallStress': '1'} -------------------
------------------- {'ReadyToRun': '0'} -------------------
===================Running jitstressregs===================
------------------- {'JitStressRegs': '1'} -------------------
------------------- {'JitStressRegs': '2'} -------------------
------------------- {'JitStressRegs': '3'} -------------------
------------------- {'JitStressRegs': '4'} -------------------
------------------- {'JitStressRegs': '8'} -------------------
------------------- {'JitStressRegs': '0x10'} -------------------
------------------- {'JitStressRegs': '0x80'} -------------------
------------------- {'JitStressRegs': '0x1000'} -------------------
------------------- {'JitStressRegs': '0x2000'} -------------------
===================Running jitstress2-jitstressregs===================
------------------- {'JitStress': '2', 'JitStressRegs': '1'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '2'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '3'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '4'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '8'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x10'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x80'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x1000'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x2000'} -------------------

Starting test: .\Core_Root\corerun.exe .\HardwareIntrinsics_Arm_ro\HardwareIntrinsics_Arm_ro.dll Sve_FloatingPointExponentialAccelerator
===================Running default===================
------------------- {} -------------------
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_FloatingPointExponentialAccelerator_float_uint() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_FloatingPointExponentialAccelerator_double_ulong() : 7
===================Running jitstress===================
------------------- {'JitMinOpts': '1'} -------------------
------------------- {'JitStress': '1'} -------------------
------------------- {'JitStress': '2'} -------------------
------------------- {'JitStress': '1', 'TieredCompilation': '1'} -------------------
------------------- {'JitStress': '2', 'TieredCompilation': '1'} -------------------
------------------- {'TailcallStress': '1'} -------------------
------------------- {'ReadyToRun': '0'} -------------------
===================Running jitstressregs===================
------------------- {'JitStressRegs': '1'} -------------------
------------------- {'JitStressRegs': '2'} -------------------
------------------- {'JitStressRegs': '3'} -------------------
------------------- {'JitStressRegs': '4'} -------------------
------------------- {'JitStressRegs': '8'} -------------------
------------------- {'JitStressRegs': '0x10'} -------------------
------------------- {'JitStressRegs': '0x80'} -------------------
------------------- {'JitStressRegs': '0x1000'} -------------------
------------------- {'JitStressRegs': '0x2000'} -------------------
===================Running jitstress2-jitstressregs===================
------------------- {'JitStress': '2', 'JitStressRegs': '1'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '2'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '3'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '4'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '8'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x10'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x80'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x1000'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x2000'} -------------------

@dotnet/arm64-contrib PTAL, thanks!

Copy link

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

Copy link

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

{
uint index = op1 & 0b111111;
uint coeff = index switch
{
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tables were copied from the ARM docs.

Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
See info in area-owners.md if you want to be subscribed.

@amanasifkhalid
Copy link
Member Author

Ah, I see we have the ConvertFunc template parameter -- I think I can get rid of the new template now...

@amanasifkhalid
Copy link
Member Author

I tweaked the FloatingPointExponentialAccelerator tests to use the same template as the ConvertTo* APIs. The updated tests pass for both.

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to write an equivalent for double.

{Op1BaseType} iterResult = (mask[i] != 0) ? {GetIterResult} : falseVal[i];
if (iterResult != result[i])
{RetBaseType} iterResult = (mask[i] != 0) ? {GetIterResult} : falseVal[i];
if ({ConvertFunc}(iterResult) != {ConvertFunc}(result[i]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you please run the tests that uses the templates updated to make sure they pass?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure: The only template using this validation logic right now is SveSimpleVecOpDifferentRetTypeTest, which is only used by the ConvertTo* APIs (for now) and FloatingPointExponentialAccelerator. Both are passing.

@@ -5262,6 +5338,82 @@ public static double MultiplyExtended(double op1, double op2)
}
}

public static double FPExponentialAccelerator(ulong op1)
{
ulong index = op1 & 0b111111;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For N=32 and N=64, the index is the first 6 bits instead of the first 5. For this helper, I got the table from the N=64 case.

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@amanasifkhalid amanasifkhalid merged commit 72d00a8 into dotnet:main Jul 12, 2024
143 of 167 checks passed
@amanasifkhalid amanasifkhalid deleted the sve-fexpa branch July 12, 2024 16:43
@github-actions github-actions bot locked and limited conversation to collaborators Aug 12, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants