Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stacked Borrows: who can read frozen data #87

Closed
arielb1 opened this issue Feb 6, 2019 · 3 comments · Fixed by rust-lang/miri#695
Closed

Stacked Borrows: who can read frozen data #87

arielb1 opened this issue Feb 6, 2019 · 3 comments · Fixed by rust-lang/miri#695
Labels
A-aliasing-model Topic: Related to the aliasing model (e.g. Stacked/Tree Borrows)

Comments

@arielb1
Copy link

arielb1 commented Feb 6, 2019

In the current specification of stacked borrows, freezing data (e.g., reborrowing a mutable reference) makes it accessible for raw pointers until the freeze is popped.

For example, this code is legal

fn start() {
    let mut x = 0u32;
    let p: *const u32 = &x;
    foo(&mut x, p);
}

fn foo(a: &mut u32, y: *const u32) {
    *a += 1;
    let b = &*a; // this freezes `x`.
    unsafe { use(*y); } // which makes this legal in stacked borrows
}

This has several bad consequences:

  1. Reborrows can't be DCEed or moved forward across something that can access a potentially-invalid pointer.

  2. If we have some code that does a reborrow, we lose the ability to assume that nobody is accessing things behind our back

    For example, in this (quite notorious) extend_from_slice example:

    fn extend_from_slice<T: Clone>(self: &mut Vec<T>, other: &[T]) {
        self.reserve(other.len());
        for datum in other {
            ptr::write(self.as_mut_ptr().offset(self.len()), T::clone(datum));
            self.set_len(self.len() + 1);
        }
    }

    We would like to avoid writing to self.len on every loop iteration, and instead only write to it either after the for-loop finishes or when there is a panic.

    However, this optimization is illegal if T::clone can access self.len, and it is often hard to prove that T::clone does not access self.len without relying on aliasing assumptions (note: it might be possible to optimize this for T = u8, because then within loop we only access locals, self, and an immutable reference that we know is disjoint from self, so all accesses are accounted for, but that is too subtle an argument for LLVM in practice, and in any case does not work if clone can e.g. call an allocator).

    self is an &mut reference, so you would expect that nobody would be able to read from self.len, but if one of the reborrows that happens here (in any of the method calls) had frozen &self, then it would have been possible to access self.len through a raw pointer, invalidating the optimization.

  3. This is (very) incompatible with anything resembling LLVM noalias on mutable reference arguments, even if some fix is done to the subtle issues in Stacked Borrows: Incompatibilities with LLVM noalias #86 (as any sane version of noalias would be able to optimize the extend_from_slice code).

@RalfJung RalfJung added the A-aliasing-model Topic: Related to the aliasing model (e.g. Stacked/Tree Borrows) label Feb 6, 2019
@RalfJung
Copy link
Member

RalfJung commented Feb 6, 2019

Thanks, good point!

We will have to distinguish between "broadcast to raw ptr (read) accesses" and "frozen" to fix this.

@RalfJung
Copy link
Member

any sane version of noalias would be able to optimize the extend_from_slice code

What if T::clone panics? Then if we delay the write, we'd panic with memory in a different state than the original code.


This is actually more subtle than I thought. Consider:

fn test(x: &mut i32) -> i32 {
  *x = 42;
  let x = &*x;
  unk();
  return x;
}

We want to argue that unk cannot read from or write to x, so we can move the write down (ignoring panics). Let's assume raw pointers are ruled out. We still have to argue that unk cannot have a shared reference that is allowed to read this. We allow any shared reference to read if it was created after memory got frozen, so we have to argue that unk cannot create such a shared reference. I think this is indeed true (the shared reference can only be created from another shared reference [continue by induction], from a mutable reference [but these are unique] or a raw pointer), but it is non-trivial and we have to be careful in some aspects of the model (like when creating a raw pointer to a location that has been raw-mutable-broadcast but not raw-shared-broadcast, we must make sure to pop off enough stuff).

Oh, and if you consider interior mutability, things become even worse: &mut Cell<T> is unique, but when you turn that into &Cell<T> we now have to maintain control over who can access this, but we must not broadcast this to all raw pointers. This breaks the idea "shared references to UnsafeCell behave like raw pointers", which means there's now a new case to be considered everywhere. :/

@RalfJung
Copy link
Member

This will be fixed by rust-lang/miri#695: Shared references get tracked with a unique ID ("tag") just like mutable references do, and the per-location borrow stack keeps track of which tags are allowed to read and maybe write to this location. That should fix all of the examples in this issue as well as the analysis concerns I raised above (which still assumed that we only have one "frozen" flag per location, instead of basically a list of shared references that are all allowed to read).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-aliasing-model Topic: Related to the aliasing model (e.g. Stacked/Tree Borrows)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants