Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for as_unsafe_cell stabilization #27708

Closed
aturon opened this issue Aug 12, 2015 · 27 comments
Closed

Tracking issue for as_unsafe_cell stabilization #27708

aturon opened this issue Aug 12, 2015 · 27 comments
Labels
B-unstable Blocker: Implemented in the nightly compiler and unstable. final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@aturon
Copy link
Member

aturon commented Aug 12, 2015

The Cell::as_unsafe_cell and RefCell::as_unsafe_cell methods return a reference to the underlying UnsafeCell in both cases.

It's not clear what the use cases are for these methods and whether they can be achieved in some alternative way. If you are using these methods, please leave a comment with your use case.

@aturon aturon added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. B-unstable Blocker: Implemented in the nightly compiler and unstable. labels Aug 12, 2015
@aturon
Copy link
Member Author

aturon commented Aug 12, 2015

cc @kmcallister, I think you may have employed these methods in Servo?

@jdm
Copy link
Contributor

jdm commented Aug 12, 2015

We do use the methods for RefCell: http://mxr.mozilla.org/servo/search?string=as_unsafe_cell . We appear to do so in cases where a GC could occur at virtually any time and the there may be outstanding borrows of fields that need to be traced. Similarly, it gets used for cases when we want to ensure certain resources are cleaned up if a panic occurs but there may be outstanding borrows that would prevent doing this through the stable, safe interface. Finally, we also use this for parallel code, but @pcwalton will need to explain the motivation there I think.

@alexcrichton
Copy link
Member

Nominating for resolution in 1.5

@alexcrichton
Copy link
Member

The libs team decided to not move this issue to FCP for this cycle.

@jdm, can you clarify something for us? It looks like this method is used to subvert any active borrows of a RefCell during tracing, but isn't this fundamentally unsafe? How can accessing a RefCell while it's borrowed mutably elsewhere be safe?

@jdm
Copy link
Contributor

jdm commented Sep 24, 2015

When you describe it like that, it does sound pretty incorrect.

@aturon
Copy link
Member Author

aturon commented Sep 24, 2015

@jdm @alexcrichton If this is done during a tracing phase in which the thread owning the RefCell (the "mutator" in GC parlance) is paused, it may be OK. (IMO, any GC has to have some interaction with RefCell that subverts the borrow tracking.)

@alexcrichton
Copy link
Member

I'd be a little worried in terms of optimizations here if we told LLVM something along the lines of "you have a mutable pointer, make any assumption you like knowing that no one else will ever have this pointer unless you hand it to them". That sort of contract would be violated where even if the mutator were paused an optimization may still trigger undefined behavior.

Concretely, if the source looked like:

let slot = a.borrow_mut();
println!("{}", slot.foo);
function();
println!("{}", slot.foo);

It seems like a valid optimization that the value loaded from slot.foo doesn't need to be reloaded across the call to function. The function has mutable access to the contents, so it's "guaranteed" that no one else has access to the pointer, but if some sort of mutation or something happened then badness may arise.

Again though, this is all very hypothetical, I'd certainly be willing to believe that the exact situation here actually works out due to some combination of some actions or whatnot.

@jdm
Copy link
Contributor

jdm commented Sep 24, 2015

Note that there should not be any mutation taking place in function in Servo's case. The use of unsafe cells like this is simply to avoid the difficulty of calling into any SM API that could trigger a GC while a mutable borrow is outstanding.

@seanmonstar
Copy link
Contributor

This function is unsafe because UnsafeCell's field is public.

It seems that the unsafe could be removed from this method, since you can no longer access the UnsafeCell's value directly.

@Columbus240
Copy link

I'm using this function to check if two RefCells point to the same memory location because there's no other way to do this in Rust. Could be stabilized in my opinion.

@SimonSapin
Copy link
Contributor

@Columbus240 Why is this needed to compare memory locations? Or rather, why is comparing the location of RefCells different from comparing that of any other type?

fn same_refs<T>(a: &T, b: &T) -> bool {
    let a: *const T = a;
    let b: *const T = b;
    a == b
}

@Columbus240
Copy link

@SimonSapin Thank you, I didn't know that worked. I couldn't infer from the books page about unsafe pointers how I had to use them. I thought the T in the expression *const T denoted the size of the pointer which makes no sense at all, now that I look at it. More explicit documentation would have made it easier.

@SimonSapin
Copy link
Contributor

I'd be a little worried in terms of optimizations here if we told LLVM something along the lines of "you have a mutable pointer, make any assumption you like knowing that no one else will ever have this pointer unless you hand it to them".

"You hand it to them" is UnsafeCell::get, isn’t it? UnsafeCell is a lang item with #[lang = "unsafe_cell"] and the compile can special-case access to it, but as far as I can tell RefCell is just a library type that could just as well be defined outside of the standard library.

as_unsafe_cell seems fine regarding assumption that LLVM can make. Sure, using it incorrectly can be unsound but so can from_raw_parts_mut, get_unchecked and many other unsafe fn functions/methods in the standard library.

@SimonSapin
Copy link
Contributor

Servo update: with servo/servo#11835 we have a copy of RefCell (based on std::cell::UnsafeCell) without the #[unstable] attributes so that Stylo is not blocked on as_unsafe_cell being stabilized anymore.

Still forking parts of the standard library is not great and we’d still like as_unsafe_cell to be stabilized eventually so that we can go back to using std::cell::RefCell.

@alexcrichton
Copy link
Member

🔔 This issue is now entering a cycle-long final comment period 🔔

The libs team was a bit up in the air about what to do about this method. It was pointed out that we may wish to prefer a method like as_ptr rather than as_unsafe_cell to get access to the innards, but there's still the open question of "when is it safe to actually use this?".

Of the three states that a RefCell can be in:

  • Not borrowed - seems like it's safe to read/write the pointer so long as it's in a controlled situation. This is basically just as-if borrow_mut succeeded but without checking.
  • Borrowed but shared - should be fine to treat as if an extra call to borrow succeeded, but mutation seems like it's definitely out of the question.
  • Borrowed mutably - it's unclear that anything is safe to do here. Some of this is tied up in memory model questions as well, though.

There was a general sentiment that to stabilize we need to document what you can actually do with the return value. That is, we're returning an unsafe pointer, but there should be some safe operations you can do on it which would otherwise be impossible with the borrow/borrow_mut methods.

It was thought that depending on what Servo needs here it may actually be a different abstraction that's needed (e.g. something that can always be traced regardless of borrow state), but we weren't actually 100% clear on what Servo was actually using these methods for still. @SimonSapin or @jdm, could you clarify exactly how Servo is using these methods today?

Overall the libs team feels like FCP is appropriate because:

  • If we determine that these methods should exist, we have clear semantics, and we're comfortable, then they're ready for stabilization.
  • If those conditions aren't met, they're ready for deprecation.

So basically, we'd like to foster discussion during the FCP to see what the fate should be!

@alexcrichton alexcrichton added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed I-nominated labels Jul 12, 2016
@SimonSapin
Copy link
Contributor

I’m not entirely sure why we use this in Servo layout. My current guess is that it’s what I’ll call a "YOLO RW lock". We have one thread (script) that can write to this data, and N threads (layout) that can read it. We don’t want lots of fine-grained actual RWLocks since all that synchronization is expensive, but we already have coarse-grained synchronization that makes sure these sets of threads don’t run at the same time.

But RefCell::borrow is not read-only, it writes to the borrow flag/counter. So we can’t use it from multiple threads without data races. So we use unsafe code to bypass the runtime borrow checking (dereferencing the raw pointer from .as_unsafe_cell().get()) and to bypass RefCell<T>: !Sync.

We don’t use a raw UnsafeCell<T> because the borrow() and borrow_mut() methods are still useful within the script (writing) thread.


It was thought that depending on what Servo needs here it may actually be a different abstraction that's needed

We can always build our own things on top of UnsafeCell<T>. And in a way it’s what we did, except the thing we have now happens to be 100% identical to std::cell::RefCell<T>, and duplicating that code seem unfortunate. (Say if a bug is found and fixed in std::cell::RefCell<T>, we’d have to backport it manually.)

@aturon
Copy link
Member Author

aturon commented Jul 13, 2016

@SimonSapin OK, I figured it was something like that -- a case where an external protocol is providing some additional guarantees that let you access the direct contents safely.

One other question: is there any reason this needs to yield an UnsafeCell rather than a raw pointer?

@SimonSapin
Copy link
Contributor

@aturon On one hand every call to .as_unsafe_cell() in Servo is immediately followed by .get() which returns a raw pointer, and indeed that’s pretty much the only useful thing you can do with &UnsafeCell<T>. On the other hand that seems less general for no particular reason.

@alexcrichton
Copy link
Member

That sounds pretty reasonable to me, so I wouldn't mind stabilizing something like:

/// Get a raw pointer to the data contained in this `RefCell`.
///
/// The returned pointer will point directly at the data contained within, and
/// this method will make no modifications to the borrow flag of this `RefCell`
/// as well.
///
/// Note that it is not safe to use this pointer if the `RefCell` is borrowed
/// mutably at this time, but if this `RefCell` is not mutably borrowed it is
/// safe to use the returned pointer as `&T` at least.
pub fn get_ptr(&self) -> *mut T { /* ... */ }

Or something akin to that.

@aturon
Copy link
Member Author

aturon commented Jul 13, 2016

Yep, I agree. @SimonSapin, the reason not to expose the UnsafeCell is that it's a "data structure" whose sole use is to alert the compiler to interior mutability within a larger data structure. Once you're extracting things out, a raw pointer gives you all the relevant functionality at a simpler type.

@SimonSapin
Copy link
Contributor

What’s the naming convention for as_ vs get_? Slices and strings have as_ptr.

@alexcrichton
Copy link
Member

Oh I'd be fine with as_ here as well

@aturon
Copy link
Member Author

aturon commented Jul 13, 2016

What’s the naming convention for as_ vs get_? Slices and strings have as_ptr.

There's a bit of a clash, since we use get when extracting from cells and containers, and as for conversions, but this is kind of both.

I think as_ptr is probably better, because get suggests some amount of "action" (like touching the borrowing status) that isn't happening.

@alexcrichton
Copy link
Member

Discussed recently, the libs team decided to stabilize.

Our conclusion was to call these methods as_ptr returning a raw pointer.

alexcrichton added a commit to alexcrichton/rust that referenced this issue Aug 18, 2016
Stabilized

* `Cell::as_ptr`
* `RefCell::as_ptr`
* `IpAddr::is_{unspecified,loopback,multicast}`
* `Ipv6Addr::octets`
* `LinkedList::contains`
* `VecDeque::contains`
* `ExitStatusExt::from_raw` - both on Unix and Windows
* `Receiver::recv_timeout`
* `RecvTimeoutError`
* `BinaryHeap::peek_mut`
* `PeekMut`
* `iter::Product`
* `iter::Sum`
* `OccupiedEntry::remove_entry`
* `VacantEntry::into_key`

Deprecated

* `Cell::as_unsafe_cell`
* `RefCell::as_unsafe_cell`
* `OccupiedEntry::remove_pair`

Closes rust-lang#27708
cc rust-lang#27709
Closes rust-lang#32313
Closes rust-lang#32630
Closes rust-lang#32713
Closes rust-lang#34029
Closes rust-lang#34392
Closes rust-lang#34285
Closes rust-lang#34529
bors added a commit that referenced this issue Aug 18, 2016
std: Stabilize APIs for the 1.12 release

Stabilized

* `Cell::as_ptr`
* `RefCell::as_ptr`
* `IpAddr::is_{unspecified,loopback,multicast}`
* `Ipv6Addr::octets`
* `LinkedList::contains`
* `VecDeque::contains`
* `ExitStatusExt::from_raw` - both on Unix and Windows
* `Receiver::recv_timeout`
* `RecvTimeoutError`
* `BinaryHeap::peek_mut`
* `PeekMut`
* `iter::Product`
* `iter::Sum`
* `OccupiedEntry::remove_entry`
* `VacantEntry::into_key`

Deprecated

* `Cell::as_unsafe_cell`
* `RefCell::as_unsafe_cell`
* `OccupiedEntry::remove_pair`

Closes #27708
cc #27709
Closes #32313
Closes #32630
Closes #32713
Closes #34029
Closes #34392
Closes #34285
Closes #34529
alexcrichton added a commit to alexcrichton/rust that referenced this issue Aug 19, 2016
Stabilized

* `Cell::as_ptr`
* `RefCell::as_ptr`
* `IpAddr::is_{unspecified,loopback,multicast}`
* `Ipv6Addr::octets`
* `LinkedList::contains`
* `VecDeque::contains`
* `ExitStatusExt::from_raw` - both on Unix and Windows
* `Receiver::recv_timeout`
* `RecvTimeoutError`
* `BinaryHeap::peek_mut`
* `PeekMut`
* `iter::Product`
* `iter::Sum`
* `OccupiedEntry::remove_entry`
* `VacantEntry::into_key`

Deprecated

* `Cell::as_unsafe_cell`
* `RefCell::as_unsafe_cell`
* `OccupiedEntry::remove_pair`

Closes rust-lang#27708
cc rust-lang#27709
Closes rust-lang#32313
Closes rust-lang#32630
Closes rust-lang#32713
Closes rust-lang#34029
Closes rust-lang#34392
Closes rust-lang#34285
Closes rust-lang#34529
bors added a commit that referenced this issue Aug 20, 2016
std: Stabilize APIs for the 1.12 release

Stabilized

* `Cell::as_ptr`
* `RefCell::as_ptr`
* `IpAddr::is_{unspecified,loopback,multicast}`
* `Ipv6Addr::octets`
* `LinkedList::contains`
* `VecDeque::contains`
* `ExitStatusExt::from_raw` - both on Unix and Windows
* `Receiver::recv_timeout`
* `RecvTimeoutError`
* `BinaryHeap::peek_mut`
* `PeekMut`
* `iter::Product`
* `iter::Sum`
* `OccupiedEntry::remove_entry`
* `VacantEntry::into_key`

Deprecated

* `Cell::as_unsafe_cell`
* `RefCell::as_unsafe_cell`
* `OccupiedEntry::remove_pair`

Closes #27708
cc #27709
Closes #32313
Closes #32630
Closes #32713
Closes #34029
Closes #34392
Closes #34285
Closes #34529
@nox
Copy link
Contributor

nox commented Aug 20, 2016

Our conclusion was to call these methods as_ptr returning a raw pointer.

So where is that method on Cell? On RefCell? We need this in Servo.

Edit: oh, didn't realise the docs online aren't up-to-date.

@alexcrichton
Copy link
Member

@nox Cell and RefCell

@nox
Copy link
Contributor

nox commented Aug 20, 2016

@alexcrichton Yeah sorry, just didn't wake up enough before opening my mouth.

alexcrichton added a commit to alexcrichton/rust that referenced this issue Aug 23, 2016
Stabilized

* `Cell::as_ptr`
* `RefCell::as_ptr`
* `IpAddr::is_{unspecified,loopback,multicast}`
* `Ipv6Addr::octets`
* `LinkedList::contains`
* `VecDeque::contains`
* `ExitStatusExt::from_raw` - both on Unix and Windows
* `Receiver::recv_timeout`
* `RecvTimeoutError`
* `BinaryHeap::peek_mut`
* `PeekMut`
* `iter::Product`
* `iter::Sum`
* `OccupiedEntry::remove_entry`
* `VacantEntry::into_key`

Deprecated

* `Cell::as_unsafe_cell`
* `RefCell::as_unsafe_cell`
* `OccupiedEntry::remove_pair`

Closes rust-lang#27708
cc rust-lang#27709
Closes rust-lang#32313
Closes rust-lang#32630
Closes rust-lang#32713
Closes rust-lang#34029
Closes rust-lang#34392
Closes rust-lang#34285
Closes rust-lang#34529
pmatos pushed a commit to LinkiTools/rust that referenced this issue Sep 27, 2016
Stabilized

* `Cell::as_ptr`
* `RefCell::as_ptr`
* `IpAddr::is_{unspecified,loopback,multicast}`
* `Ipv6Addr::octets`
* `LinkedList::contains`
* `VecDeque::contains`
* `ExitStatusExt::from_raw` - both on Unix and Windows
* `Receiver::recv_timeout`
* `RecvTimeoutError`
* `BinaryHeap::peek_mut`
* `PeekMut`
* `iter::Product`
* `iter::Sum`
* `OccupiedEntry::remove_entry`
* `VacantEntry::into_key`

Deprecated

* `Cell::as_unsafe_cell`
* `RefCell::as_unsafe_cell`
* `OccupiedEntry::remove_pair`

Closes rust-lang#27708
cc rust-lang#27709
Closes rust-lang#32313
Closes rust-lang#32630
Closes rust-lang#32713
Closes rust-lang#34029
Closes rust-lang#34392
Closes rust-lang#34285
Closes rust-lang#34529
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
B-unstable Blocker: Implemented in the nightly compiler and unstable. final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

7 participants