Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document safety in memchr, remove unsafe. #76488

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 11 additions & 4 deletions library/core/src/slice/memchr.rs
Original file line number Diff line number Diff line change
@@ -1,8 +1,6 @@
// Original implementation taken from rust-memchr.
// Copyright 2015 Andrew Gallant, bluss and Nicolas Koch

// ignore-tidy-undocumented-unsafe

use crate::cmp;
use crate::mem;

Expand Down Expand Up @@ -63,6 +61,8 @@ pub fn memchr(x: u8, text: &[u8]) -> Option<usize> {

if len >= 2 * usize_bytes {
while offset <= len - 2 * usize_bytes {
// SAFETY: offset + usize_bytes will never be out of bounds due to the bounds check above.
// All indexing is in range aligned_boundary to len - 2 * size_of::<usize>()
unsafe {
let u = *(ptr.add(offset) as *const usize);
let v = *(ptr.add(offset + usize_bytes) as *const usize);
Expand Down Expand Up @@ -97,8 +97,13 @@ pub fn memrchr(x: u8, text: &[u8]) -> Option<usize> {
let (min_aligned_offset, max_aligned_offset) = {
// We call this just to obtain the length of the prefix and suffix.
// In the middle we always process two chunks at once.
let (prefix, _, suffix) = unsafe { text.align_to::<(Chunk, Chunk)>() };
(prefix.len(), len - suffix.len())
let offset = ptr.align_offset(mem::align_of::<(Chunk, Chunk)>());
if offset > len {
(len, 0)
} else {
let (lower, rest) = text.split_at(offset);
(lower.len(), len - rest.align_to_offsets::<(Chunk, Chunk)>().1)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the previous code is better in this case. Even though it's unsafe, it is much clearer (IMO) than this replication of the body of align_to.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mark-Simulacrum One of the main reasons I removed it is because I was having some trouble figuring out the best way to document it. Alongside that, it wasn't immediately clear to me what align_to did, and why it was being used instead of some other more specialized function (which doesn't exist, in fact. shame.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, so the documentation on align_to is fairly clear to my eyes, but if you're not understanding it then we should definitely improve it! It's sort of a "core primitive" in my eyes for this sort of task. In the case of &[u8] and usize as the middle type, align_to is also safe in all cases, because transmutes of byte slices to slices of usize, when respecting alignment (and length), are always safe.

};

let mut offset = max_aligned_offset;
Expand All @@ -113,6 +118,8 @@ pub fn memrchr(x: u8, text: &[u8]) -> Option<usize> {
let chunk_bytes = mem::size_of::<Chunk>();

while offset > min_aligned_offset {
// SAFETY: offset will never be less than zero, guarenteed by the alignment check.
// All indexing is in range max_aligned_offset to min_aligned_offset.
unsafe {
let u = *(ptr.offset(offset as isize - 2 * chunk_bytes as isize) as *const Chunk);
let v = *(ptr.offset(offset as isize - chunk_bytes as isize) as *const Chunk);
Expand Down