Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refine scopes around temporaries generated in local accesses #92508

Conversation

dingxiangfei2009
Copy link
Contributor

@dingxiangfei2009 dingxiangfei2009 commented Jan 3, 2022

Fix #57017
Fix #72956

Related to #69663

Sorry for the long sabbatical. This PR takes inspiration from the comments by @pnkfelix and further extends the idea to at least allow refined scopes for temporaries generated while evaluating expressions like x.status(). In details, expressions like x.status() generates temporaries for place or place references to a local variable x, but they could be discarded at an earlier opportunity. What is left to improve the region analysis in a generator body is to identify regions where we can restrict those temporaries. They are tentatively set to if and match sub-expressions.

I have come to a rather late realization that #91032 could be solving the related problems in a more comprehensive setting. However, I would like to gather some feedback and join the relevant discussion with this opportunity.

@rustbot rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jan 3, 2022
@rust-highfive
Copy link
Collaborator

r? @petrochenkov

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 3, 2022
@petrochenkov
Copy link
Contributor

r? @pnkfelix

@eholk
Copy link
Contributor

eholk commented Jan 5, 2022

Author of #91032 here; I saw this show up as a reference when I was looking over comments in my PR. I'll add a couple review comments, but I wanted to leave a few comments here too.

It doesn't look like this PR fixes all of #69663. For example, I don't think this handles the case where you explicitly drop a variable before an await point.

This PR seems pretty orthogonal to mine (#91032), so I think it makes sense to merge both of them. My PR mostly addresses the drop case, while yours seems to handle temporaries in indexing or match positions. I don't think these will conflict and between the two of them there's a good chance it will completely take care of #69663!

Anyway, this is awesome to see this work!

@@ -358,6 +367,34 @@ impl ScopeTree {
self.rvalue_scopes.insert(var, lifetime);
}

pub fn record_local_access_scope(&mut self, var: hir::ItemLocalId, proposed_lifetime: Scope) {
debug!("record_local_access_scope(sub={:?}, sup={:?})", var, proposed_lifetime);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason not to have this as var={:?}, proposed_lifetime={:?} instead of sub={:?}, sup={:?}? I tend to find it tricky to remember what sub and sup are when I just see those two terms.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, updated with your suggestion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks!

}
let mut ref_level = SmallVec::<[_; 4]>::new();
let mut ops_iter = ops.into_iter().rev();
ops_iter.next().unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth a comment here saying that ops_iter always has at least one element because all of the arms from the previous match either push an item or return.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I simplified the tracking a little bit, so that it does not need to track the local variable itself.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, even better. Thanks!

async move {
match client.status() {
async {
match Client(Box::new(true)).status() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just cleaning up the test, or were changes needed so that it still tested the same behavior as before? In other words, did the error from the previous version go away due to your changes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error goes away with this change. However, I checked the issue #64130 and it seems that this test is intended for prompting hints for remedies when values living through yield points make generators non-Send or non-Sync. Therefore, I figured that it is good to keep this test by reproducing the previous errors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good!

@@ -0,0 +1,51 @@
// edition:2018
// run-pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

build-pass is probably enough here, or probably even check-pass (which I think is the default if you don't do anything).

Copy link
Contributor Author

@dingxiangfei2009 dingxiangfei2009 Jan 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think build-pass is better. It seems that there might be changes to MIR outputs, at least to intermediate MIR outputs.

@@ -0,0 +1,32 @@
// edition:2018
// run-pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here too, it looks like build-pass or check-pass should be enough.

For #91032, I tried to use check-pass when I could, and then build-pass when I had a case where the new algorithm might remove a type that would still be in the MIR, since build-pass runs the check that the types MIR found were a subset of those found in typeck.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is updated to perform build-pass test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@dingxiangfei2009
Copy link
Contributor Author

Thank you @eholk !

It doesn't look like this PR fixes all of #69663. For example, I don't think this handles the case where you explicitly drop a variable before an await point.

That is true. I think I incorrectly stated that this PR will fix #69663 entirely. This is mistakenly included as there are quite a few issues that this PR may affect. I have labelled it to be relevant instead. Hopefully our works could together improve the usability of generators. 😄

@dingxiangfei2009 dingxiangfei2009 force-pushed the fix-64130-refine-scopes branch 2 times, most recently from 9d8c451 to 2b56bcc Compare January 11, 2022 15:24
@pnkfelix
Copy link
Member

Looking at this now; thanks for your patience, @dingxiangfei2009 !

/// all temporaries are dropped while it may be the case that
/// only a subset of temporaries is requested to be dropped.
/// For instance, `x[y.z()]` indexing operation can discard the temporaries
/// generated in evaluating `y.z()`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For instance, x[y.z()] indexing operation can discard the temporaries generated in evaluating y.z().

I'll admit, I still need to reload my mental cache here, but: Is this actually true? Aren't there cases where the value produced by y.z() may hold references to those temporaries, so you'd need to keep them alive across the evaluation of the indexing operation?

I'll do some investigation to double-check my understanding here, and see if I can convince myself that you are correct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(While talking to niko about this, he notes the example of mutex.lock().something() as a potentially problematic example...)

Copy link
Contributor Author

@dingxiangfei2009 dingxiangfei2009 Feb 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was initially bending my mind as well. Let me try to elaborate. So the indexing operation, either through Index or IndexMut, produces borrows with lifetimes that must outlives self.
Let us take Index as an example. In this example, we attempt to allow indexing into A with a temporary reference &(), producing a value of type &B<'_>. The result of this indexing will need the temporary index &() to be kept alive

use std::ops::Index;

struct A;

struct B<'a>(&'a ());

impl<'a> B<'a> {
    fn foo(&self) {}
}

impl<'a> Index<&'a ()> for A {
    type Output = B<'a>;
    fn index<'s>(&'s self, idx: &'a ()) -> &'s Self::Output {
        unimplemented!() // (1)
    }
}

fn main() {
    let a = A;
    a[&()].foo();
}

At (1), we are expected to construct a borrow with a lifetime 's. However, the Index trait definition does allow us to impose Self::Output: 's or any outliving relation between 'a and 's in this scope. Overall, this means that if the index, which is '_ () in this example, has a lifetime 'a, 's must outlives 'a.

A side track I think there should be an implicit bound of Self::Output: 's to be satisfied, but the current compiler does not complain as long as the return value is !.
So the remaining question is in which way we shall honor this contract. Indeed, this may have been upheld so far through enlarging the terminating scope.

Regarding the example from @nikomatsakis, let us example the following elaborated code.

use std::ops::{Index, IndexMut, DerefMut};
use std::cell::Cell;
use std::sync::Mutex;

struct A<'a>(Cell<&'a mut ()>);

impl<'a> Index<&'a mut ()> for A<'a> {
    type Output = ();
    fn index(&self, idx: &'a mut ()) -> &Self::Output {
        &*self.0.replace(idx)
    }
}

impl<'a> IndexMut<&'a mut ()> for A<'a> {
    fn index_mut(&mut self, idx: &'a mut ()) -> &mut Self::Output {
        self.0.replace(idx)
    }
}

struct B(());

impl B {
    fn new() -> B {
        B(())
    }
    fn foo(&mut self) -> &mut () {
        &mut self.0
    }
}

fn main() {
    let t = &mut ();
    let mut a = A(Cell::new(t));
    let i = Mutex::new(());
    a[i.lock().unwrap().deref_mut()] = (); // <- temporary value is freed at the end of this statement
    a[B::new().foo()] = (); // <- same reason
    a[&mut ()] = ();
}

The expression a[i.lock().unwrap().deref_mut()] = () is currently rejected because the temporary mutex guard mutex.lock() is preserved only until the end of this assignment.

In any case, I am also experimenting ways to produce examples that requiring living temporaries after index to be extra sure.

Copy link
Member

@pnkfelix pnkfelix Feb 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay.

It looks like you are making an argument based on the lifetimes embedded in the Idx type of trait Index<Idx>.

I think I can see the reasoning there; its something like "the definition of the Index trait: pub trait Index<Idx: ?Sized> { type Output: ?Sized; fn index(&self, index: Idx) -> &Self::Output; }, will force any free lifetimes associated with the Idx to be bound at the scope of the impl<'a, 'b, ...> Index<...> for Concrete<...> for any concrete type Concrete<...>."

My follow-up question then is: Is there a case where the Idx type doesn't have any lifetimes in it, but the scope of temporaries of the expression computing Idx are nonetheless significant (and where the change suggested here could break code)?

My current suspicion is that the most natural cases where that could arise is that it could only happen in unsafe code that is incorrect (i.e., unsafe code that should be using lifetimes to express the constraints between the temporaries and the dynamic extent of the produced value, but is sidestepping doing so, and in the process opens up potential issues where user code can violate those now implicit constraints).

  • I think we have precedent for making changes like that, but I also think we need to be careful whenever we do so.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(In case its not clear, I'm going to poke around a bit more and see if I can make my concerns here concrete. My current thinking is that you may indeed be safe in the assertions you are making here, but I want to make sure I understand the edge cases really well.)

@bors
Copy link
Contributor

bors commented Feb 8, 2022

☔ The latest upstream changes (presumably #92007) made this pull request unmergeable. Please resolve the merge conflicts.

@bors
Copy link
Contributor

bors commented Feb 11, 2022

☔ The latest upstream changes (presumably #93893) made this pull request unmergeable. Please resolve the merge conflicts.

@pnkfelix
Copy link
Member

While playing around with the example provoided by @dingxiangfei2009 over here, I happened upon this small variant of what they posted which changes behavior under this PR:

use std::ops::{Index, IndexMut};
use std::cell::Cell;

struct A<'a>(Cell<&'a mut ()>);

impl<'a> Index<&'a mut ()> for A<'a> {
    type Output = ();
    fn index(&self, idx: &'a mut ()) -> &Self::Output {
        &*self.0.replace(idx)
    }
}

impl<'a> IndexMut<&'a mut ()> for A<'a> {
    fn index_mut(&mut self, idx: &'a mut ()) -> &mut Self::Output {
        self.0.replace(idx)
    }
}

fn main() {
    let t = &mut ();
    let mut a = A(Cell::new(t));
    a[&mut ()] = ();
}

Namely, the nightly (and stable) compiler accept the above code, but the compiler patched with your PR rejects it:

% rustc +stable --version
rustc 1.58.1 (db9d1b20b 2022-01-20)
% rustc +nightly --version
rustc 1.60.0-nightly (1e12aef3f 2022-02-13)
% rustc +stable demo-change.rs
% rustc +nightly demo-change.rs
16-33-45 rust-92508/objdir-dbgopt (git:fix-64130-refine-scopes) % ./build/x86_64-unknown-linux-gnu/stage1/bin/rustc demo-change.rs
error[E0716]: temporary value dropped while borrowed
  --> demo-change.rs:22:12
   |
22 |     a[&mut ()] = ();
   |     -------^^------
   |     |      | |
   |     |      | temporary value is freed at the end of this statement
   |     |      creates a temporary which is freed while still in use
   |     borrow later used here
   |
   = note: consider using a `let` binding to create a longer lived value

error: aborting due to previous error

For more information about this error, try `rustc --explain E0716`.

That implies to me that this PR cannot land as is.


I talked to @nikomatsakis a little while about this, and they were thinking that a better way to go here would be a more holistic approach and attacking RFC 66 "Better temporary lifetimes", (potentially on an edition boundary, if need be), rather than trying to make small adustments with as-yet unknown impact.

@dingxiangfei2009
Copy link
Contributor Author

dingxiangfei2009 commented Feb 15, 2022

@pnkfelix Thank you for providing this example. I did not realize that some temporaries like &mut () here could still be lifted to the block level.

I can restructure the PR to discard the scope rule concerning indexing and keep the scope rule for borrows of local variable. From here, I can continue to look at RFC 66. This would at least admit the program in the original issue so that in the future while working on RFC 66 the allowed programs here can be used as reference.

By the way, I am interested in attacking RFC 66 because it has the potential to replace the work here. It does seem like a great effort, so I will greatly appreciate any mentorship. 😄 In fact, if there is any zulip thread about this RFC, I can join the discussion there.

@dingxiangfei2009
Copy link
Contributor Author

dingxiangfei2009 commented Feb 15, 2022

@pnkfelix Actually, I would like to revisit your example again. I extend your example by duplicating the last expression.

use std::ops::{Index, IndexMut};
use std::cell::Cell;

struct A<'a>(Cell<&'a mut ()>);

impl<'a> Index<&'a mut ()> for A<'a> {
    type Output = ();
    fn index(&self, idx: &'a mut ()) -> &Self::Output {
        &*self.0.replace(idx)
    }
}

impl<'a> IndexMut<&'a mut ()> for A<'a> {
    fn index_mut(&mut self, idx: &'a mut ()) -> &mut Self::Output {
        self.0.replace(idx)
    }
}

fn main() {
    let t = &mut ();
    let mut a = A(Cell::new(t));
    a[&mut ()] = (); // <- Error
    a[&mut ()] = ();
}

This, on stable, gives this error.

error[[E0716]](https://doc.rust-lang.org/stable/error-index.html#E0716): temporary value dropped while borrowed
  [--> src/main.rs:22:12
](https://play.rust-lang.org/#)   |
22 |     a[&mut ()] = ();
   |            ^^      - temporary value is freed at the end of this statement
   |            |
   |            creates a temporary which is freed while still in use
23 |     a[&mut ()] = ();
   |     - borrow later used here
   |
   = note: consider using a `let` binding to create a longer lived value

Of cause, any access to a is made impossible afterwards. It is also impossible to implement Drop trait for A as well.

use std::ops::{Index, IndexMut};
use std::cell::Cell;

struct A<'a>(Cell<&'a mut ()>);

const UNIT: () = ();

impl<'a> Index<usize> for A<'a> {
    type Output = ();
    fn index(&self, _: usize) -> &Self::Output {
        &UNIT
    }
}

impl<'a> Index<&'a mut ()> for A<'a> {
    type Output = ();
    fn index(&self, idx: &'a mut ()) -> &Self::Output {
        &*self.0.replace(idx)
    }
}

impl<'a> IndexMut<&'a mut ()> for A<'a> {
    fn index_mut(&mut self, idx: &'a mut ()) -> &mut Self::Output {
        self.0.replace(idx)
    }
}

fn main() {
    let t = &mut ();
    let mut a = A(Cell::new(t));
    a[&mut ()] = ();
    a[0]; // <- same error here
}

// Similarly, even without the `a[0];` line, implementing Drop also leads to the same compilation error.
impl<'a> Drop for A<'a> {
    fn drop(&mut self) {
        println!("bye")
    }
}

However, I still intend to remove the scope rules for indexing. I just wanted to share that this example is very thought-provoking.

@pnkfelix
Copy link
Member

pnkfelix commented Feb 15, 2022

I can restructure the PR to discard the scope rule concerning indexing and keep the scope rule for borrows of local variable.

Okay, that might be good to look at.

I might recommend closing this PR and opening a fresh one if you go down that route, just because it might be confusing to have metadata linking from #72956 claiming that this PR (PR #92508) fixes that issue (issue #72956) when in fact the scope of PR #92508 has been reduced to not cover that issue. What do you think about that idea?

@nikomatsakis
Copy link
Contributor

@dingxiangfei2009 regarding RFC 66, I'd be open to working with you on that.

@bors
Copy link
Contributor

bors commented Feb 28, 2022

☔ The latest upstream changes (presumably #94427) made this pull request unmergeable. Please resolve the merge conflicts.

@JohnCSimon JohnCSimon added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 20, 2022
@apiraino
Copy link
Contributor

by reading the latest comments, I think I'll flip the review status to waiting on author to reflect that there seems to be some design work (in case, feel free to adjust, thanks!)

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 24, 2022
@JohnCSimon
Copy link
Member

@dingxiangfei2009 -
Ping from triage: Can you please post the status of this PR?
Thank you.

can you please address the design concerns

FYI: when a PR is ready for review, send a message containing
@rustbot ready to switch to S-waiting-on-review so the PR is in the reviewer's backlog.

apiraino commented on Mar 24
by reading the latest comments, I think I'll flip the review status to waiting on author to reflect that there seems to be some design work (in case, feel free to adjust, thanks!)

@JohnCSimon
Copy link
Member

@dingxiangfei2009
Ping from triage: I'm closing this due to inactivity, Please reopen when you are ready to continue with this.
Note: if you do please open the PR BEFORE you push to it, else you won't be able to reopen. This is a quirk of github.
Thanks for your contribution.

@rustbot label: +S-inactive

@JohnCSimon JohnCSimon closed this May 22, 2022
@rustbot rustbot added the S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. label May 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
10 participants