Rewrite nomicon references section #27911

Gankra · 2015-08-20T02:47:37Z

https://doc.rust-lang.org/nightly/nomicon/references.html

This involves solving the incredibly difficult question of "what on earth are Rust's True Pointer Aliasing Rules".

CC @aturon @arielb1 @nikomatsakis @pnkfelix @sunfishcode

Gankra · 2015-08-20T02:50:42Z

It has been argued that the references section should be written in terms of lvalue paths. I believe this is what the borrow checker reasons in terms of, and it is at the very least a concrete concept. However, this section should not simply model how the borrow checker thinks -- the entire point is that there needs to be a more fundamental model, one that the borrow checker models a subset of but is fundamentally unable to model in full. This is the model unsafe code should be written against, and the one the borrow checker can grow into if improved (e.g. nonlexical borrows).

Gankra · 2015-08-20T02:52:52Z

CC @RalfJung and co who are working on formally modeling Rust's semantics.

RalfJung · 2015-08-21T09:35:52Z

Indeed, we'll have some fun figuring this out ;-)

Speaking of which, "A reference cannot outlive its referent" is already something that's not actually enforced in my model. It's only when you use a reference that you have to prove that the referent is still alive, by showing that the lifetime of the reference is still active. As long as you don't use the reference, the model doesn't care whether it is valid.

I should also mention that "path" is not a thing in my formal model. I don't even have a stack. It's all about owning locations, or knowing the protocols that some locations are currently subject to. (Like, a shared borrow to a basic datatype follows the protocol that everybody can read it, and that multiple reads are guaranteed to deliver the same result. A mutable borrow to a basic datatype has the protocol that you can temporarily exchange your borrow for actual ownership of the referent, but until you change this back, it is impossible for the lifetime of the borrow to end.) The challenge will be to translate these protocols, and the even more implicit notions of separation/disjointness, back to something that makes sense when looking at surface Rust code...

aturon · 2015-08-21T17:03:02Z

@RalfJung

> Speaking of which, "A reference cannot outlive its referent" is already something that's not actually enforced in my model. It's only when you use a reference that you have to prove that the referent is still alive, by showing that the lifetime of the reference is still active. As long as you don't use the reference, the model doesn't care whether it is valid.

This was essentially what the whole mem::forget/thread::scoped drama was about, and it's equally true of Rust the language: the type system ensures that lifetimes are not usefully reachable outside the scope they describe, but you can e.g. stash them in a leaked Rc cycle.

I think the "path" description in the current document is a bit of a dead end for what the book is ultimately trying to do -- describe the constraints on unsafe code. I do think that a more precise version of the path explanation would be a good way to explain borrow checking, though.

Parakleta · 2015-11-10T02:04:48Z

I assume this is the correct issue to add this question to. I've discovered through some experimentation and by reading #10488 that

let _ = Iron::new(hello_world).http("localhost:3000").unwrap();

for example causes the destructor to be run immediately (i.e. the end of the statement) and so joins the thread and blocks further execution, but

let _ = &Iron::new(hello_world).http("localhost:3000").unwrap();

extends the lifetime to the enclosing block.

I can understand that

let _listen = Iron::new(hello_world).http("localhost:3000").unwrap();

extends the lifetime to the enclosing block because that is the scope of the variable _listen, even though it is unused. What I don't understand is how the lifetime of let _ = <rvalue> differs from let _ = &<rvalue>. Is this a difference I should be relying on? What is the correct method to control the lifetime of unused/anonymous objects?

Gankra · 2015-11-10T03:50:23Z

Interesting! @eddyb any thoughts on this?

nikomatsakis · 2015-11-10T19:48:19Z

On Mon, Nov 09, 2015 at 06:05:20PM -0800, Parakleta wrote:

> I assume this is the correct issue to add this question to. I've discovered through some experimentation and by reading #10488 that
> let _ = Iron::new(hello_world).http("localhost:3000").unwrap();
> for example causes the destructor to be run immediately (i.e. the end of the statement) and so joins the thread and blocks further execution, but
> let _ = &Iron::new(hello_world).http("localhost:3000").unwrap();
> extends the lifetime to the enclosing block.
>
> I can understand that
> let _listen = Iron::new(hello_world).http("localhost:3000").unwrap();
> extends the lifetime to the enclosing block because that is the scope of the variable _listen, even though it is unused. What I don't understand is how the lifetime of _ = <rvalue> differs from _ = &<rvalue>. Is this a difference I should be relying on? What is the correct method to control the lifetime of unused/anonymous objects?

Yes, these are all different. It's kind of the intersection of two
distinct rules. The mental model is roughly that the initializer is
stored into a temporary which has the lifetime of the statement. When
you do let <pat> = <initializer>, then, the pattern is matched
against this temporary. It may move things out of the temporary and
place them into fresh bindings, which then live as long as the block,
but things it does not move get dropped along with the temporary.

So something like let (foo, _) = <expr> is roughly as if you did:

let foo;
{
    let temp = <expr>;
    foo = temp.0;
}

Note that _ is not an identifier, it is a pattern which means "ignore this value".
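A minimal sketch of the desugaring above, for readers who want to observe it. The Noisy type, the make_pair function, and the event names are hypothetical helpers invented for this illustration; they are not from the thread. Noisy records its name in a shared log when dropped, so the returned list shows when each half of the tuple is destroyed.

```rust
use std::cell::RefCell;

/// Hypothetical helper: pushes its name onto a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Hypothetical stand-in for `<expr>`: returns a pair of droppable values.
fn make_pair<'a>(log: &'a RefCell<Vec<&'static str>>) -> (Noisy<'a>, Noisy<'a>) {
    (Noisy { name: "first", log }, Noisy { name: "second", log })
}

/// Returns the order in which events happen around `let (_foo, _) = ...`.
fn tuple_pattern_events() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        // `_foo` moves the first element into a binding that lives until the
        // end of the block. `_` binds nothing, so the second element stays in
        // the temporary and is dropped when the let statement ends.
        let (_foo, _) = make_pair(&log);
        log.borrow_mut().push("after let");
    }
    log.borrow_mut().push("after block");
    log.into_inner()
}
```

Calling tuple_pattern_events() should yield "second", "after let", "first", "after block": the ignored element goes away with the temporary, while the bound element survives to the end of the block.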

So in terms of your examples:

let _ = foo.unwrap() means: call unwrap and discard result (drops
immediately).

let _x = foo.unwrap() means: call unwrap and store result into a
variable called _x (drops when _x is dropped)

Meanwhile, orthogonally: &foo.unwrap() means "create a temporary
stack slot and store foo.unwrap() into it". Because the & expression
is the initializer of a let, the lifetime of this temporary is
extended to the enclosing block.
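The three cases can be compared side by side with a small sketch. As before, Noisy and the event names are hypothetical and exist only to make drop timing visible; Noisy::new stands in for foo.unwrap().

```rust
use std::cell::RefCell;

/// Hypothetical helper: pushes its name onto a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Noisy<'a> {
    fn new(name: &'static str, log: &'a RefCell<Vec<&'static str>>) -> Self {
        Noisy { name, log }
    }
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Returns the order of events for the three `let` forms discussed above.
fn let_form_events() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        // Wildcard pattern: nothing is bound, so the value is dropped at the
        // end of this statement.
        let _ = Noisy::new("wildcard", &log);
        log.borrow_mut().push("after wildcard");

        // Named binding: the value lives until the end of the block.
        let _x = Noisy::new("named", &log);
        log.borrow_mut().push("after named");

        // `&` directly in the initializer: the temporary is extended to the
        // end of the block, even though `_` discards the reference itself.
        let _ = &Noisy::new("borrowed", &log);
        log.borrow_mut().push("after borrowed");
    }
    log.into_inner()
}
```

The expected order is "wildcard", "after wildcard", "after named", "after borrowed", "borrowed", "named". The two values that survive to the end of the block are dropped in reverse order of creation.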

It's possible that the lifetime of the temporary we create when doing
pattern matching in a let should be the enclosing block, rather than
the let statement. This would be perhaps more analogous with the &
rules. But I wonder if this would break existing code; it's hard to
know. I'm not 100% sure why I didn't do it this way at the time,
because I remember being annoyed that let _ = foo() and let _x = foo() were not equivalent. That said, there are many who believe
they should not be; I can't find the issue now, but there was at one
point specific code in trans to ensure that let _ = foo() would drop
the result of foo() immediately.

Parakleta · 2015-11-10T20:38:28Z

The discussion in #10488 for the distinction between _ and _x makes sense to me, and I'm happy with the rationale that let _ = <rvalue> is essentially a no-op. I'm just confused by the & case. Does it mean that an & anywhere in an expression always creates a temporary that has a lifetime at least as long as the enclosing block? The statement &Iron::new().http().unwrap() doesn't, but I assume that's because it's a statement and not an expression.

nikomatsakis · 2015-11-11T23:50:07Z

On Tue, Nov 10, 2015 at 12:39:02PM -0800, Parakleta wrote:

> Does it mean that an & anywhere in an expression always creates a
> temporary that has the lifetime at least as long as the enclosing
> block?

No. The rules are more subtle than that. Temporaries usually live
until the end of the current statement (the let, in this case) but
if they appear in specific places, they are extended until the end of
the block. Basically, if they appear in a place where it is unambiguous
that they would be stored into the result of a let.

Hence, the following temporaries will last until end of block:

let <pat> = &<expr>
let <pat> = StructName { field: &<expr>, ... }

But (under current rules) this would not:

let <pat> = method(&<expr>);

See http://doc.rust-lang.org/reference.html#temporary-lifetimes for
more details and more examples.
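A sketch of the extended and non-extended positions listed above, assuming a hypothetical Noisy type that logs when it is dropped. Holder plays the role of StructName and name_len plays the role of method; both are made up for this example.

```rust
use std::cell::RefCell;

/// Hypothetical helper: pushes its name onto a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Noisy<'a> {
    fn new(name: &'static str, log: &'a RefCell<Vec<&'static str>>) -> Self {
        Noisy { name, log }
    }
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Hypothetical stand-in for `StructName { field: &<expr> }`.
#[allow(dead_code)]
struct Holder<'a, 'b> {
    field: &'a Noisy<'b>,
}

/// Hypothetical stand-in for `method(&<expr>)`; the result borrows nothing.
fn name_len(n: &Noisy<'_>) -> usize {
    n.name.len()
}

/// Returns the order of events for the three initializer shapes.
fn extension_events() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        // `let <pat> = &<expr>`: temporary lives to the end of the block.
        let _r = &Noisy::new("ref", &log);
        log.borrow_mut().push("after ref");

        // `let <pat> = StructName { field: &<expr> }`: also extended.
        let _h = Holder { field: &Noisy::new("field", &log) };
        log.borrow_mut().push("after field");

        // `let <pat> = method(&<expr>)`: not extended, so the temporary is
        // dropped at the end of this statement.
        let _n = name_len(&Noisy::new("arg", &log));
        log.borrow_mut().push("after arg");
    }
    log.into_inner()
}
```

The expected order is "after ref", "after field", "arg", "after arg", "field", "ref". Only the function-argument temporary is dropped before the statement that follows it.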

Parakleta · 2015-11-12T00:03:35Z

So the let _ = <rvalue>; statement lifetime and the let <pat> = &<rvalue>; statement lifetimes are different but my confusion comes from the idea (maybe I'm missing something though) that _ is a valid <pat> and &<rvalue> is also a valid <rvalue> so the statement let _ = &<rvalue> would match both rules?

Does this mean that let <pat> = &<rvalue> has higher priority than let _ = ... and should we assume this will always be true?

eddyb · 2015-11-12T08:02:11Z

@Parakleta let _ = <rvalue>; drops the RHS immediately but always evaluates it, so the rules for &<rvalue> still apply, even if the reference is dropped (which is a no-op because references are Copy).

Parakleta · 2015-11-13T00:05:31Z

Thanks, I just noticed the text "The compiler uses simple syntactic rules to decide" in the reference manual. This all seems a bit fragile, considering that the decision is made on Syntax rather than Semantics. For example, if I have let _ = nop!(&<rvalue>); the lifetime is unknown without knowing exactly the contents of the macro. It seems RFC66 (Issue #15023) has the potential to end up changing this anyway. I'll steer clear of let _ = &<rvalue> for now and stick with let _tmp = <rvalue> instead.

nikomatsakis · 2015-11-30T17:36:13Z

On Thu, Nov 12, 2015 at 04:06:03PM -0800, Parakleta wrote:

> Thanks, I just noticed the text "The compiler uses simple syntactic rules to decide" in the reference manual. This all seems a bit fragile, considering that the decision is made on Syntax rather than Semantics. For example, if I have let _ = nop!(&<rvalue>); the lifetime is unknown without knowing exactly the contents of the macro. It seems RFC66 (Issue #15023) has the potential to end up changing this anyway. I'll steer clear of let _ = &<rvalue> for now and stick with let _tmp = <rvalue> instead.

It is true that you have to know the contents of the macro. However,
the motivation for using syntax was precisely to make it easier to
follow -- people were nervous about relying on inference to decide
when a destructor runs, since inference algorithms might be overly
conservative, or change over time.

steveklabnik · 2017-03-24T19:27:47Z

Moving this to rust-lang/nomicon#7

Gankra added the A-nomicon label Aug 20, 2015

steveklabnik mentioned this issue Mar 24, 2017

Rewrite references section rust-lang/nomicon#7

Open

steveklabnik closed this as completed Mar 24, 2017
