add neon instruction vmaxnm_f* vpmaxnm_f* vminnm_f* vpminnm_f* #1105

Merged: 4 commits merged into rust-lang:master on Apr 6, 2021


surechen
Contributor

@surechen surechen commented Apr 2, 2021

vmaxnm_f* : Floating-point Maximum Number (vector). This instruction compares corresponding vector elements in the two source SIMD&FP registers, writes the larger of the two floating-point values into a vector, and writes the vector to the destination SIMD&FP register.
vpmaxnm_f* : Floating-point Maximum Number Pairwise (vector). This instruction creates a vector by concatenating the vector elements of the first source SIMD&FP register after the vector elements of the second source SIMD&FP register, reads each pair of adjacent vector elements in the two source SIMD&FP registers, writes the largest of each pair of values into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are floating-point values.
vminnm_f* : Floating-point Minimum Number (vector). This instruction compares corresponding vector elements in the two source SIMD&FP registers, writes the smaller of the two floating-point values into a vector, and writes the vector to the destination SIMD&FP register.
vpminnm_f* : Floating-point Minimum Number Pairwise (vector). This instruction creates a vector by concatenating the vector elements of the first source SIMD&FP register after the vector elements of the second source SIMD&FP register, reads each pair of adjacent vector elements in the two source SIMD&FP registers, writes the smallest of each pair of floating-point values into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are floating-point values.
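The lane ordering described above can be modelled in plain scalar Rust. This is only a reference sketch, not the stdarch implementation: the function names `maxnm2`, `minnm2`, `pmaxnm2`, and `pminnm2` are hypothetical, two-lane `[f32; 2]` arrays stand in for `float32x2_t`, and `f32::max`/`f32::min` are assumed to be an adequate stand-in for the "Number" behaviour (a single NaN operand is ignored in favour of the numeric one).

```rust
// Scalar reference model of the two-lane f32 variants (hypothetical helper names).

/// Models vmaxnm_f32: the larger of each pair of corresponding lanes.
pub fn maxnm2(a: [f32; 2], b: [f32; 2]) -> [f32; 2] {
    [a[0].max(b[0]), a[1].max(b[1])]
}

/// Models vminnm_f32: the smaller of each pair of corresponding lanes.
pub fn minnm2(a: [f32; 2], b: [f32; 2]) -> [f32; 2] {
    [a[0].min(b[0]), a[1].min(b[1])]
}

/// Models vpmaxnm_f32: the adjacent lanes of `a` produce the low result lane,
/// the adjacent lanes of `b` produce the high result lane.
pub fn pmaxnm2(a: [f32; 2], b: [f32; 2]) -> [f32; 2] {
    [a[0].max(a[1]), b[0].max(b[1])]
}

/// Models vpminnm_f32: same pairing as `pmaxnm2`, keeping the smaller value.
pub fn pminnm2(a: [f32; 2], b: [f32; 2]) -> [f32; 2] {
    [a[0].min(a[1]), b[0].min(b[1])]
}
```

With `a = [1.0, 2.0]` and `b = [6.0, -3.0]`, the element-wise maximum is `[6.0, 2.0]` while the pairwise maximum is `[2.0, 6.0]`, which shows the difference between the two families.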

@Amanieu
Member

Amanieu commented Apr 2, 2021

vmaxnm_f32/vmaxnmq_f32 (and the min equivalents) are also available on ARM.

@surechen
Contributor Author

surechen commented Apr 2, 2021

> vmaxnm_f32/vmaxnmq_f32 (and the min equivalents) are also available on ARM.

Hi, thanks for reviewing. I checked these instructions on https://godbolt.org/ and got a compilation failure:

https://godbolt.org/z/1nM7eYvrz
https://godbolt.org/z/nMo63G89W

@Amanieu
Member

Amanieu commented Apr 2, 2021

These instructions were added in ARMv8 for both 32-bit and 64-bit mode. If you change armv7 to armv8 in Godbolt, the code will be accepted.

@surechen
Contributor Author

surechen commented Apr 2, 2021

> These instructions were added in ARMv8 for both 32-bit and 64-bit mode. If you change armv7 to armv8 in godbolt then it will be accepted.

Hi, thank you for your guidance.
If I remove the line #[cfg_attr(target_arch = "arm", target_feature(enable = "v7"))], is the following code right?

#[inline]
#[target_feature(enable = "neon")]
#[cfg_attr(all(test, target_arch = "arm"), assert_instr("vmaxnm"))]
#[cfg_attr(all(test, target_arch = "aarch64"), assert_instr(fmaxnm))]
pub unsafe fn vmaxnm_f32(a: float32x2_t, b: float32x2_t) -> float32x2_t {
    #[allow(improper_ctypes)]
    extern "C" {
        #[cfg_attr(target_arch = "arm", link_name = "llvm.arm.neon.vmaxnm.v2f32")]
        #[cfg_attr(target_arch = "aarch64", link_name = "llvm.aarch64.neon.fmaxnm.v2f32")]
        fn vmaxnm_f32_(a: float32x2_t, b: float32x2_t) -> float32x2_t;
    }
    vmaxnm_f32_(a, b)
}

@Amanieu
Member

Amanieu commented Apr 2, 2021

I think you need to enable the v8 target feature for this instruction.

@surechen
Contributor Author

surechen commented Apr 2, 2021

> I think you need to enable the v8 target feature for this instruction.

OK, thank you very much.

@bors
Contributor

bors commented Apr 2, 2021

☔ The latest upstream changes (presumably 15babf5) made this pull request unmergeable. Please resolve the merge conflicts.

@Amanieu
Member

Amanieu commented Apr 2, 2021

I looked into the compiler crash. You need to enable the fp-armv8 feature. But first you need to update rustc to expose this feature on ARM. This can be done by adding it to compiler/rustc_codegen_ssa/src/target_features.rs.

@surechen
Contributor Author

surechen commented Apr 3, 2021

> I looked into the compiler crash. You need to enable the fp-armv8 feature. But first you need to update rustc to expose this feature on ARM. This can be done by adding it to compiler/rustc_codegen_ssa/src/target_features.rs.

Hi, thank you very much. I'll try.

JohnTitor added a commit to JohnTitor/rust that referenced this pull request Apr 3, 2021
…ochenkov

add fp-armv8 for ARM_ALLOWED_FEATURES

For fixing err in rust-lang/stdarch#1105.
…d_vpmaxnm

# Conflicts:
#	crates/stdarch-gen/src/main.rs
@Amanieu
Member

Amanieu commented Apr 5, 2021

You need both v8 and fp-armv8.
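For callers, one consequence of the feature requirements discussed in this thread is that the intrinsic cannot be assumed to be available on every target. A minimal sketch of compile-time dispatch follows, assuming a hypothetical wrapper `max_num_lanes` that is not part of stdarch. On AArch64 it calls `vmaxnm_f32` through `std::arch::aarch64`; on every other target, including 32-bit ARM where the v8 and fp-armv8 features would have to be enabled first, it falls back to scalar `f32::max`.

```rust
/// Hypothetical wrapper: larger of each pair of corresponding lanes.
/// AArch64 path, using the NEON intrinsic from std::arch.
#[cfg(target_arch = "aarch64")]
pub fn max_num_lanes(a: [f32; 2], b: [f32; 2]) -> [f32; 2] {
    use std::arch::aarch64::{vld1_f32, vmaxnm_f32, vst1_f32};

    let mut out = [0.0f32; 2];
    // SAFETY: each load reads exactly two f32 values from a valid array and
    // the store writes exactly two. This sketch assumes a standard AArch64
    // target, where NEON is part of the baseline feature set.
    unsafe {
        let r = vmaxnm_f32(vld1_f32(a.as_ptr()), vld1_f32(b.as_ptr()));
        vst1_f32(out.as_mut_ptr(), r);
    }
    out
}

/// Scalar fallback for all other targets. `f32::max` returns the numeric
/// operand when exactly one operand is NaN.
#[cfg(not(target_arch = "aarch64"))]
pub fn max_num_lanes(a: [f32; 2], b: [f32; 2]) -> [f32; 2] {
    [a[0].max(b[0]), a[1].max(b[1])]
}
```

Calling `max_num_lanes([1.0, 2.0], [6.0, -3.0])` yields `[6.0, 2.0]` on either path. The 32-bit ARM intrinsic path is deliberately left out of the sketch because it depends on target features that were not yet exposed by rustc at the time of this discussion.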

@Amanieu Amanieu merged commit daae8f8 into rust-lang:master Apr 6, 2021