Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix (A)Rc<[T]> slices larger than isize::MAX bytes when collecting from a TrustedLen iter #95252

Closed
wants to merge 1 commit into from

Conversation

the8472
Copy link
Member

@the8472 the8472 commented Mar 23, 2022

Due to ptr::add restrictions allocations must be <= isize::MAX bytes (see UCG). Since (A)RcBox is 2 words larger than the slice even when Vec ensures that the allocation is small enough the internal layout calculation of (A)Rc can still exceed isize::MAX, so we have to check it directly.

Affected methods:

  • Arc::new_uninit_slice
  • Arc::from_iter
  • Rc::new_uninit_slice
  • Rc::from_iter

Zulip discussion
Fixes #95334

@rust-highfive
Copy link
Collaborator

r? @m-ou-se

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 23, 2022
@the8472

This comment was marked as outdated.

Due to ptr::add restrictions allocations must be <= isize::MAX bytes. The default implementation goes
through Vec which ensures this, but the uninit and TrustedLen code paths lack the necessary check.
@eddyb
Copy link
Member

eddyb commented Mar 24, 2022

I haven't kept up with any other discussions (so this might be redundant), but I just wanted to say that because of the non-trivial presence of both of these among code published on e.g. crates.io:

  1. Layout "producers" / GlobalAlloc "users": smart pointers (including alloc::rc copies with small tweaks), collections, etc.
  2. Layout "consumers" / GlobalAlloc "providers": perhaps fewer of these, but anything built on top of OS APIs like mmap will expose > isize::MAX allocations (on 32-bit hosts) if they lack extra checks

IMO the only responsible option is to enforce the isize::MAX limit in Layout, which:

  • makes Layout sound in terms of only ever allowing allocations where (alloc_base_ptr: *mut u8).offset(size) is never UB
  • frees both "producers" and "consumers" of Layout from manually reimplementing the checks
    • manual checks can be risky, e.g. if the final size passed to the allocator isn't the one being checked
    • this applies retroactively, fixing the overall soundness of existing code with zero transition period or any changes required from users (as long as going through Layout is mandatory, making a "choke point")

Feel free to quote this comment onto any relevant issue, I might not be able to keep track of developments.
cc @RalfJung @rust-lang/wg-allocators

@Gankra
Copy link
Contributor

Gankra commented Mar 25, 2022

As someone who spent way too much time optimizing libcollections checks for this stuff and tried to splatter docs about it everywhere on the belief that it was a reasonable thing for people to manually take care of: I concede the point, it is not reasonable. I am wholy spiritually defeated by the fact that liballoc of all places is getting this stuff wrong. This isn't throwing shade at the folks who implemented these Rc features, but rather a statement of how impractical it is to expect anyone out in the wider ecosystem to enforce them if some of the most audited rust code in the library that defines the very notion of allocating memory can't even reliably do it.

We need the nuclear option of Layout enforcing this rule. Code that breaks this rule is deeply broken and any "regressions" from changing Layout's contract is a correctness fix. Anyone who disagrees and is sufficiently motivated can go around our backs but the standard library should 100% refuse to enable them.

@CAD97
Copy link
Contributor

CAD97 commented Mar 25, 2022

I agree that this limitation is very difficult to hold manually. My only concern to adding the check into Layout is that checking it generically may be more expensive than expected. (Well, that, and that from_size_align_unchecked would no longer be fully unchecked, but eh.)

Specifically, Layout itself doesn't require that size is a multiple of align, just that size rounded up to a multiple of align is less than usizeisize::MAX. This would make the check for too-large size more complicated; rather than size <= isize::MAX, it'd be (perhaps? my modular math brain is not fully awake) something like ((size - 1) & !(align - 1)) + align <= isize::MAX. (And running out of bits also seems like a problematic edge case...)

(EDIT: After checking: the current check is if size > usize::MAX - (align - 1) { return Err }; changing that to isize::MAX is a trivial change.)

And I made pad_to_align infallible.

So:

  • Yes these specific cases should be patched
  • Yes we should make the invariant harder to break
  • But we should be careful that this doesn't cause unnecessary checking with normal use if possible.
    • This probably means that Layout users using the building API need to use .and_then and .unwrap once at the end, rather than unwrapping along the way.
    • Alternatively, introduce a separate LayoutBuilder type which works infallibily (except for debug checked integer overflow) and only checks upon conversion to Layout proper.

Here with Rc the error would come from Layout::extend exceeding the allowed size.

@JohnCSimon JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 24, 2022
@JohnCSimon JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 22, 2022
@apiraino apiraino added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label May 23, 2022
@the8472
Copy link
Member Author

the8472 commented Jun 27, 2022

Closing since #95295 is getting FCPed

@the8472 the8472 closed this Jun 27, 2022
bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 10, 2022
Enforce that layout size fits in isize in Layout

As it turns out, enforcing this _in APIs that already enforce `usize` overflow_ is fairly trivial. `Layout::from_size_align_unchecked` continues to "allow" sizes which (when rounded up) would overflow `isize`, but these are now declared as library UB for `Layout`, meaning that consumers of `Layout` no longer have to check this before making an allocation.

(Note that this is "immediate library UB;" IOW it is valid for a future release to make this immediate "language UB," and there is an extant patch to do so, to allow Miri to catch this misuse.)

See also rust-lang#95252, [Zulip discussion](https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/Layout.20Isn't.20Enforcing.20The.20isize.3A.3AMAX.20Rule).
Fixes rust-lang#95334

Some relevant quotes:

`@eddyb,` rust-lang#95252 (comment)

> [B]ecause of the non-trivial presence of both of these among code published on e.g. crates.io:
>
>   1. **`Layout` "producers" / `GlobalAlloc` "users"**: smart pointers (including `alloc::rc` copies with small tweaks), collections, etc.
>   2. **`Layout` "consumers" / `GlobalAlloc` "providers"**: perhaps fewer of these, but anything built on top of OS APIs like `mmap` will expose `> isize::MAX` allocations (on 32-bit hosts) if they lack extra checks
>
> IMO the only responsible option is to enforce the `isize::MAX` limit in `Layout`, which:
>
>   * makes `Layout` _sound_ in terms of only ever allowing allocations where `(alloc_base_ptr: *mut u8).offset(size)` is never UB
>   * frees both "producers" and "consumers" of `Layout` from manually reimplementing the checks
>     * manual checks can be risky, e.g. if the final size passed to the allocator isn't the one being checked
>     * this applies retroactively, fixing the overall soundness of existing code with zero transition period or _any_ changes required from users (as long as going through `Layout` is mandatory, making a "choke point")
>
>
> Feel free to quote this comment onto any relevant issue, I might not be able to keep track of developments.

`@Gankra,` rust-lang#95252 (comment)

> As someone who spent way too much time optimizing libcollections checks for this stuff and tried to splatter docs about it everywhere on the belief that it was a reasonable thing for people to manually take care of: I concede the point, it is not reasonable. I am wholy spiritually defeated by the fact that _liballoc_ of all places is getting this stuff wrong. This isn't throwing shade at the folks who implemented these Rc features, but rather a statement of how impractical it is to expect anyone out in the wider ecosystem to enforce them if _some of the most audited rust code in the library that defines the very notion of allocating memory_ can't even reliably do it.
>
> We need the nuclear option of Layout enforcing this rule. Code that breaks this rule is _deeply_ broken and any "regressions" from changing Layout's contract is a _correctness_ fix. Anyone who disagrees and is sufficiently motivated can go around our backs but the standard library should 100% refuse to enable them.

cc also `@RalfJung` `@rust-lang/wg-allocators.` Even though this technically supersedes rust-lang#95252, those potential failure points should almost certainly still get nicer panics than just "unwrap failed" (which they would get by this PR).

It might additionally be worth recommending to users of the `Layout` API that they should ideally use `.and_then`/`?` to complete the entire layout calculation, and then `panic!` from a single location at the end of `Layout` manipulation, to reduce the overhead of the checks and optimizations preserving the exact location of each `panic` which are conceptually just one failure: allocation too big.

Probably deserves a T-lang and/or T-libs-api FCP (this technically solidifies the [objects must be no larger than `isize::MAX`](https://rust-lang.github.io/unsafe-code-guidelines/layout/scalars.html#isize-and-usize) rule further, and the UCG document says this hasn't been RFCd) and a crater run. Ideally, no code exists that will start failing with this addition; if it does, it was _likely_ (but not certainly) causing UB.

Changes the raw_vec allocation path, thus deserves a perf run as well.

I suggest hiding whitespace-only changes in the diff view.
workingjubilee pushed a commit to tcdi/postgrestd that referenced this pull request Sep 15, 2022
Enforce that layout size fits in isize in Layout

As it turns out, enforcing this _in APIs that already enforce `usize` overflow_ is fairly trivial. `Layout::from_size_align_unchecked` continues to "allow" sizes which (when rounded up) would overflow `isize`, but these are now declared as library UB for `Layout`, meaning that consumers of `Layout` no longer have to check this before making an allocation.

(Note that this is "immediate library UB;" IOW it is valid for a future release to make this immediate "language UB," and there is an extant patch to do so, to allow Miri to catch this misuse.)

See also #95252, [Zulip discussion](https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/Layout.20Isn't.20Enforcing.20The.20isize.3A.3AMAX.20Rule).
Fixes rust-lang/rust#95334

Some relevant quotes:

`@eddyb,` rust-lang/rust#95252 (comment)

> [B]ecause of the non-trivial presence of both of these among code published on e.g. crates.io:
>
>   1. **`Layout` "producers" / `GlobalAlloc` "users"**: smart pointers (including `alloc::rc` copies with small tweaks), collections, etc.
>   2. **`Layout` "consumers" / `GlobalAlloc` "providers"**: perhaps fewer of these, but anything built on top of OS APIs like `mmap` will expose `> isize::MAX` allocations (on 32-bit hosts) if they lack extra checks
>
> IMO the only responsible option is to enforce the `isize::MAX` limit in `Layout`, which:
>
>   * makes `Layout` _sound_ in terms of only ever allowing allocations where `(alloc_base_ptr: *mut u8).offset(size)` is never UB
>   * frees both "producers" and "consumers" of `Layout` from manually reimplementing the checks
>     * manual checks can be risky, e.g. if the final size passed to the allocator isn't the one being checked
>     * this applies retroactively, fixing the overall soundness of existing code with zero transition period or _any_ changes required from users (as long as going through `Layout` is mandatory, making a "choke point")
>
>
> Feel free to quote this comment onto any relevant issue, I might not be able to keep track of developments.

`@Gankra,` rust-lang/rust#95252 (comment)

> As someone who spent way too much time optimizing libcollections checks for this stuff and tried to splatter docs about it everywhere on the belief that it was a reasonable thing for people to manually take care of: I concede the point, it is not reasonable. I am wholy spiritually defeated by the fact that _liballoc_ of all places is getting this stuff wrong. This isn't throwing shade at the folks who implemented these Rc features, but rather a statement of how impractical it is to expect anyone out in the wider ecosystem to enforce them if _some of the most audited rust code in the library that defines the very notion of allocating memory_ can't even reliably do it.
>
> We need the nuclear option of Layout enforcing this rule. Code that breaks this rule is _deeply_ broken and any "regressions" from changing Layout's contract is a _correctness_ fix. Anyone who disagrees and is sufficiently motivated can go around our backs but the standard library should 100% refuse to enable them.

cc also `@RalfJung` `@rust-lang/wg-allocators.` Even though this technically supersedes #95252, those potential failure points should almost certainly still get nicer panics than just "unwrap failed" (which they would get by this PR).

It might additionally be worth recommending to users of the `Layout` API that they should ideally use `.and_then`/`?` to complete the entire layout calculation, and then `panic!` from a single location at the end of `Layout` manipulation, to reduce the overhead of the checks and optimizations preserving the exact location of each `panic` which are conceptually just one failure: allocation too big.

Probably deserves a T-lang and/or T-libs-api FCP (this technically solidifies the [objects must be no larger than `isize::MAX`](https://rust-lang.github.io/unsafe-code-guidelines/layout/scalars.html#isize-and-usize) rule further, and the UCG document says this hasn't been RFCd) and a crater run. Ideally, no code exists that will start failing with this addition; if it does, it was _likely_ (but not certainly) causing UB.

Changes the raw_vec allocation path, thus deserves a perf run as well.

I suggest hiding whitespace-only changes in the diff view.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

{Rc,Arc}::{new_uninit_slice,from_iter} create too large references
8 participants