Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new rules for merging expected and supplied types in closure signatures #45072

Merged
merged 4 commits into from
Nov 5, 2017

Conversation

nikomatsakis
Copy link
Contributor

As uncovered in #38714, we currently have some pretty bogus code for combining the "expected signature" of a closure with the "supplied signature". To set the scene, consider a case like this:

fn foo<F>(f: F)
where
  F: for<'a> FnOnce(&'a u32) -> &'a u32
  // ^ *expected* signature comes from this where-clause
{
    ...
}

fn main() {
    foo(|x: &u32| -> &u32 { .. }
     // ^^^^^^^^^^^^^^^^^ supplied signature
     // comes from here 
}

In this case, the supplied signature (a) includes all the parts and (b) is the same as the expected signature, modulo the names used for the regions. But often people supply only some parts of the signature. For example, one might write foo(|x| ..), leaving everything to be inferred, or perhaps foo(|x: &u32| ...), which leaves the return type to be inferred.

In the current code, we use the expected type to supply the types that are not given, but otherwise use the type the user gave, except for one case: if the user writes fn foo(|x: _| ..) (i.e., an underscore at the outermost level), then we will take the expected type (rather than instantiating a fresh type variable). This can result in nonsensical situations, particularly with bound regions that link the types of parameters to one another or to the return type. Consider foo(|x: &u32| ...) -- if we literally splice the expected return type of &'a u32 together with what the user gave, we wind up with a signature like for<'a> fn(&u32) -> &'a u32. This is not even permitted as a type, because bound regions like 'a must appear also in the arguments somewhere, which is why #38714 leads to an ICE.

This PR institutes some new rules. These are not meant to be the final set of rules, but they are a kind of "lower bar" for what kind of code we accept (i.e., we can extend these rules in the future to be smarter in some cases, but -- as we will see -- these rules do accept some things that we then would not be able to back off from).

These rules are derived from a few premises:

  • First and foremost, anonymous regions in closure annotation are mostly requests for the code to "figure out the right lifetime" and shouldn't be read too closely. So for example when people write a closure signature like |x: &u32|, they are really intended for us to "figure out" the right region for x.
    • In contrast, the current code treats this supplied type as being more definitive. In particular, writing |x: &u32| would always result in the region of x being bound in the closure type. In other words, the signature would be something like for<'a> fn(&'a u32) -- this is derived from the fact that fn(&u32) expands to a type where the region is bound in the fn type.
    • This PR takes a different approach. The "binding level" for reference types appearing in closure signatures can be informed in some cases by the expected signature. So, for example, if the expected signature is something like (&'f u32), where the region of the first argument appears free, then for |x: &u32|, the new code would infer x to also have the free region 'f.
      • This inference has some limits. We don't do this for bindings that appear within the selected types themselves. So e.g. |x: fn(&u32)|, when combined with an expected type of fn(fn(&'f u32)), would still result in a closure that expects for<'a> fn(&'a u32). Such an annotation will ultimately result in an error, as it happens, since foo is supplying a fn(&'f u32) to the closure, but the closure signature demands a for<'a> fn(&'a u32). But still we choose to trust it and have the user change it.
      • I wanted to preserve the rough intuition that one can copy-and-paste a type out of the fn signature and into the fn body without dramatically changing its meaning. Interestingly, if one has |x: &u32|, then regardless of whether the region of x is bound or free in the closure signature, it is also free in the region body, and that is also true when one writes let x: &u32, so that intuition holds here. But the same would not be true for fn(&u32), hence the different behavior.
  • Second, we must take either all the references to bound regions from the expected type or none. The current code, as we saw, will happily take a bound region in the return type but drop the other place where it is used, in the parameters. Since bound regions are all about linking multiple things together, I think it's important not to do that. (That said, we could conceivably be a bit less strict here, since the subtyping rules will get our back, but we definitely don't want any bound regions that appear only in the return type.)
  • Finally, we cannot take the bound region names from the supplied types and "intermix" them with the names from the expected types.
    • We could potentially do some alpha renaming, but I didn't do that.
  • Ultimately, if the types the user supplied do not match expectations in some way that we cannot recover from, we fallback to deriving the closure signature solely from those expected types.
    • For example, if the expected type is u32 but the user wrote i32.
    • Or, more subtle, if the user wrote e.g. &'x u32 for some named lifetime 'x, but the expected type includes a bound lifetime (for<'a> (&'a u32)). In that case, preferring the type that the user explicitly wrote would hide an appearance of a bound name from the expected type, and we try to never do that.

The detailed rules that I came up with are found in the code, but for ease of reading I've also excerpted them into a gist. I am not convinced they are correct and would welcome feedback for alternative approaches.

(As an aside, the way I think I would ultimately prefer to think about this is that the conversion from HIR types to internal types could be parameterized by an "expected type" that it uses to guide itself. However, since that would be a pain, I opted in the code to first instantiate the supplied types as Ty<'tcx> and then "merge" those types with the Ty<'tcx> from the expected signature.)

I think we should probably FCP this before landing.

cc @rust-lang/lang
r? @arielb1

@nikomatsakis
Copy link
Contributor Author

(As an aside, the way I think I would ultimately prefer to think about this is that the conversion from HIR types to internal types could be parameterized by an "expected type" that it uses to guide itself. However, since that would be a pain, I opted in the code to first instantiate the supplied types as Ty<'tcx> and then "merge" those types with the Ty<'tcx> from the expected signature.)

FML. This PR #44633 kind of broke this approach. That PR causes us to store the "supplied" types as the entries for the various HIR types, but those supplied types may include inference variables or regions etc that never wind up in the final type. Annoying. Well, I'll think about how to resolve this, but the greater question of which rules we want still stands.

@carols10cents carols10cents added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 9, 2017
@arielb1
Copy link
Contributor

arielb1 commented Oct 12, 2017

Travis failure:

[00:52:16] stderr:
[00:52:16] ------------------------------------------
[00:52:16] error[E0282]: type annotations needed
[00:52:16]   --> /checkout/src/test/run-pass/closure-expected-type/expect-infer-supply-two-infers.rs:17:30
[00:52:16]    |
[00:52:16] 17 |     with_closure(|mut x: Vec<_>, y| {
[00:52:16]    |                              ^ cannot infer type for `_`
[00:52:16] 
[00:52:16] error: aborting due to previous error
[00:52:16] 
[00:52:16] 
[00:52:16] ------------------------------------------
[00:52:16] 
[00:52:16] thread '[run-pass] run-pass/closure-expected-type/expect-infer-supply-two-infers.rs' panicked at 'explicit panic', /checkout/src/tools/compiletest/src/runtest.rs:2433:8
[00:52:16] note: Run with `RUST_BACKTRACE=1` for a backtrace.
[00:52:16] 
[00:52:16] ---- [run-pass] run-pass/closure-expected-type/supply-just-return-type.rs stdout ----
[00:52:16] 	
[00:52:16] error: compilation failed!
[00:52:16] status: exit code: 101
[00:52:16] command: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "/checkout/src/test/run-pass/closure-expected-type/supply-just-return-type.rs" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-pass" "--target=x86_64-unknown-linux-gnu" "-C" "prefer-dynamic" "-o" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-pass/closure-expected-type/supply-just-return-type.stage2-x86_64-unknown-linux-gnu" "-Crpath" "-O" "-Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-pass/closure-expected-type/supply-just-return-type.stage2-x86_64-unknown-linux-gnu.run-pass.libaux"
[00:52:16] stdout:
[00:52:16] ------------------------------------------
[00:52:16] 
[00:52:16] ------------------------------------------
[00:52:16] stderr:
[00:52:16] ------------------------------------------
[00:52:16] error[E0282]: type annotations needed
[00:52:16]   --> /checkout/src/test/run-pass/closure-expected-type/supply-just-return-type.rs:33:31
[00:52:16]    |
[00:52:16] 33 |     let z = with_closure(|x: &_| -> Result<char, ()> { Ok(*x) });
[00:52:16]    |                               ^ cannot infer type for `_`
[00:52:16] 
[00:52:16] error: aborting due to previous error
[00:52:16] 
[00:52:16] 

/// due to the limits of Rust's syntax -- must incidentally be a named region).
/// - If E supplies a bound region but S does not, we error out.
///
/// # Why we use the supplied type with bound regions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll force the expected type to be equal to the supplied type anyway when we resolve the "closure: expectation" obligation. so we always want to use the supplied type except for bound regions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this comment. Keep in mind that the output of this algorithm is the ultimate type T which is assigned to the closure -- it may not be equal to what I've been calling the "supplied type", which is the type that the user actually wrote, at least as presently interpreted (in particular with respect to anonymous regions, which may wind up being mapped to other things).

It is this ultimate type T that will be equated with the expected type.

In other words, we never force the expected type to be equal to the supplied type.

/// Regions are allowed to differ as follows:
///
/// - If S supplies a late-bound region bound at the closure depth (1):
/// - If that region is anonymous, we use the region from E (which may also be free).
Copy link
Contributor

@arielb1 arielb1 Oct 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only works because closures don't do return type elision e.g. |x: &u32| -> &u32 is for<'a> |x: &'a u32| -> &'_ u32 rather than for<'a> |x: &'a u32| -> &'a u32. Worth noting and testing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woah. I .. didn't realize this was the case actually! Doesn't that seem like a bug? I at least expect |x: &u32| -> &u32 in a closure signature to mean the same as fn(&u32) -> &u32.

(Also, '_ in that position does do elision, as far as I know, but I suspect you mean "fresh region"? If so, hopefully a fresh free region, since a fresh bound region in that position would be broken.)

///
/// - If the expected type E is an (unbound) inference variable,
/// and S is not an inference variable, then the resulting type will be S,
/// so long as S does not contain anonymous regions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if S does contain anonymous regions, what happens (this needs to be specified).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will try to make this clearer. The intention was that, if none of the conditions in this list apply, then the "supplied types" will be used "as is" as the resulting closure type. Typically, this will then result in an error unless the "expected type" is precisely equal (because of invariance).

So, in this particular case, if S contains anonymous regions, then the supplied types (for all parameters) are used "as is".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(At least that's how I remember it. It's been a while, so I'll double check.)

/// use the supplied type):
/// - the supplied type includes any regions bound at the closure depth
/// - in this case, using the supplied type would inject names from the supplied
/// side into the closure binder
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That takes some effort to create and I would like a test for it:

type Foo<'a> = fn(&'a u32);

fn main() {
    foo(|_x: Foo| {});
}

Copy link
Contributor

@arielb1 arielb1 Oct 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, replacing anonymous regions that occur multiple times with different regions would also allow this to compile (I would like a test):

type Foo<'a, 'b> = &'a mut (&'a (), &'b ());
type Foo1<'a> = Foo<'a, 'a>;
fn give_foo<F: for<'a, 'b> Fn(Foo<'a, 'b>)>(_f: F) {}
fn assert_foo1(f: Foo1) {}

fn this_must_error() {
    give_foo(|f: Foo1| {
        // `f` here is a `Foo`, not necessarily a `Foo1`
        assert_foo1(f) //~ ERROR
    });
}

fn but_this_doesnt() {
    give_foo(|_f: Foo1| {
        // `f` here is a `Foo`, not necessarily a `Foo1`
    });
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arielb1

That takes some effort to create and I would like a test for it:

I had the following test:

fn expect_free_supply_free<'x>(x: &'x u32) {
    // Here, the type given for `'x` "obscures" a region from the
    // expected signature that is bound at closure level, so we don't
    // take anything from the expected type. You can see this in bits
    // of error message I have extracted.
    with_closure_expecting_fn_with_free_region(|x: fn(&'x u32), y| {});
    //~^ ERROR implements the trait `std::ops::FnOnce<(fn(&'x u32), _)>`
    //~| ERROR type mismatch
}

but I will add also the test you requested. (It errors.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding your second test, it covers an interesting case I had not considered. I think that it is troubling that but_this_doesnt does not error. It seems like if the user wrote Foo1 than they have "asserted" that they want the two regions to be the same. I suspect we want to treat repeated anonymous regions carefully -- i.e., rather like named regions?

@arielb1
Copy link
Contributor

arielb1 commented Oct 12, 2017

I think this meshes well with the view of "use the type specified, but take late-bound regions from the expected type if we can" - the specified and expected type will be unified later on anyway, so the only important thing in the expected type is LBRs.

@arielb1
Copy link
Contributor

arielb1 commented Oct 12, 2017

About the test failure: we're getting a E0282 which is the ambiguity you get from _: Sized. Someone must be pushing WF predicates.

@bors
Copy link
Contributor

bors commented Oct 13, 2017

☔ The latest upstream changes (presumably #45069) made this pull request unmergeable. Please resolve the merge conflicts.

@kennytm kennytm added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 14, 2017
@alexcrichton
Copy link
Member

triage ping for @nikomatsakis!

@nikomatsakis
Copy link
Contributor Author

@alexcrichton workin' on it.

@nikomatsakis
Copy link
Contributor Author

nikomatsakis commented Oct 31, 2017

OK, I think I found a new approach to this that I like better, and which doesn't require anything too complex. The new approach is as follows:

  • Let S be the (higher-ranked) signature that we derive from the user's annotations.
  • Let E be the (higher-ranked) signature that we derive from the expectations, if any.
  • If we have no expectation E, then the signature of the closure is S.
  • Otherwise, the signature of the closure is E. Moreover:
    • Instantiate all the late-bound regions bound in the closure within S with fresh variables, yielding S'
    • Require that E = S'

This seems to meet all of the criteria I wanted. For example, if the user writes |x: &u32| ..., that is effectively a request for "some region bound somewhere". Moreover, it doesn't require any really complex or subtle code / rules.

I implemented this idea in the last commit above but didn't have time to go through all the tests and correct them etc.

@arielb1
Copy link
Contributor

arielb1 commented Oct 31, 2017

@nikomatsakis

Nice idea - this looks better than iterating over types like that, and this is a good way of having "the non-higher-ranked structure should be identical"

Also:

[00:03:45] tidy error: /checkout/src/librustc_typeck/check/closure.rs:368: line longer than 100 chars

supplied_output_ty);
}

/// If there is no expected signature, then we will convert the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation

{
}

fn supplying_different() {
Copy link
Contributor

@arielb1 arielb1 Oct 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no error pattern? this shouldn't have errors in it, so add the #[rustc_error] attribute

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should also have tests for all 4 ways of mixing Different/Same, to check that they have the same behavior (if the bound is F: FnOnce(Different<'_, '_>) but the closure is |x: Same|, that would be expected not to compile).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests are just not passing yet. Will fix.

@arielb1
Copy link
Contributor

arielb1 commented Oct 31, 2017

This however does have the unfortunate effect where free lifetimes in the closure signature can be equated against bound lifetimes in the expected signature:

fn foo<F: Fn(&u32)>(f: F) { let x = 42; f(&x); }
fn bar<'a>(_x: &'a u32) {
    foo(|_y: &'a u32| {}); // this should be an error: `_y` isn't an `&'a u32`
}

fn main() {
    bar(&0);
}

Actually, this might correctly be an error, because you liberate (which is equivalent to skolemizing) the expected signature, so you basically do a higher-ranked subtyping operation between the expected and supplied signatures using the "closure liberated lifetimes" instead of ReSkol. A test for both this and the Different/Same case (all versions) would be nice.

@nikomatsakis
Copy link
Contributor Author

@arielb1

Actually, this might correctly be an error, because you liberate (which is equivalent to skolemizing) the expected signature

Right, exactly. Your example gives:

error[E0308]: mismatched types
 --> /home/nmatsakis/tmp/bar.rs:3:14
  |
3 |     foo(|_y: &'a u32| {}); // this should be an error: `_y` isn't an `&'a u32`
  |              ^^^^^^^ lifetime mismatch
  |
  = note: expected type `&u32`
             found type `&'a u32`
note: the lifetime 'a as defined on the function body at 2:1...
 --> /home/nmatsakis/tmp/bar.rs:2:1
  |
2 | / fn bar<'a>(_x: &'a u32) {
3 | |     foo(|_y: &'a u32| {}); // this should be an error: `_y` isn't an `&'a u32`
4 | | }
  | |_^
note: ...does not necessarily outlive the anonymous lifetime #2 defined on the body at 3:9
 --> /home/nmatsakis/tmp/bar.rs:3:9
  |
3 |     foo(|_y: &'a u32| {}); // this should be an error: `_y` isn't an `&'a u32`
  |         ^^^^^^^^^^^^^^^^

error[E0308]: mismatched types
 --> /home/nmatsakis/tmp/bar.rs:3:14
  |
3 |     foo(|_y: &'a u32| {}); // this should be an error: `_y` isn't an `&'a u32`
  |              ^^^^^^^ lifetime mismatch
  |
  = note: expected type `&u32`
             found type `&'a u32`
note: the anonymous lifetime #2 defined on the body at 3:9...
 --> /home/nmatsakis/tmp/bar.rs:3:9
  |
3 |     foo(|_y: &'a u32| {}); // this should be an error: `_y` isn't an `&'a u32`
  |         ^^^^^^^^^^^^^^^^
note: ...does not necessarily outlive the lifetime 'a as defined on the function body at 2:1
 --> /home/nmatsakis/tmp/bar.rs:2:1
  |
2 | / fn bar<'a>(_x: &'a u32) {
3 | |     foo(|_y: &'a u32| {}); // this should be an error: `_y` isn't an `&'a u32`
4 | | }
  | |_^

@nikomatsakis
Copy link
Contributor Author

Kind of annoying that it gives two errors though.

@nikomatsakis
Copy link
Contributor Author

@arielb1 also, that case is already covered by the expect-region-supply-region.rs (along with many other permutations).

fn expect_bound_supply_named<'x>() {
    let mut f: Option<&u32> = None;

    // Here we give a type annotation that `x` should be free. We get
    // an error because of that.
    closure_expecting_bound(|x: &'x u32| {
        //~^ ERROR mismatched types
        //~| ERROR mismatched types

        // And we still cannot let `x` escape into `f`.
        f = Some(x);
        //~^ ERROR cannot infer
    });
}

@nikomatsakis
Copy link
Contributor Author

OK @arielb1 I think the tests all pass now.

@arielb1
Copy link
Contributor

arielb1 commented Nov 1, 2017

@nikomatsakis

This seems to make closure errors messages worse when there are inconsistent type annotations (see test failures), because the expected type "overrides" the provided type. Maybe we should just wrap this logic in a commit_if_ok, and if we fail to match the expectation, continue with the provided type ignoring the expectation - this should cause a "type mismatch in closure arguments" error later on (I think in all cases - there's no other way to make a closure implement an fn trait except having the types match, but there might be some edge case).

@nikomatsakis
Copy link
Contributor Author

nikomatsakis commented Nov 1, 2017

@arielb1 Hmm. Another way to handle it would be to just not use demand but instead something that handles error reporting in a more customized way. For example, while the current stuff prints:

[00:39:53] error[E0631]: type mismatch in closure arguments
[00:39:53]   --> $DIR/anonymous-higher-ranked-lifetime.rs:18:5
[00:39:53]    |
[00:39:53] 18 |     g2(|_: (), _: ()| {});
[00:39:53]    |     ^^ ----------------- found signature of `fn((), ()) -> _`
[00:39:53]    |     |
[00:39:53]    |     expected signature of `for<'r> fn(&'r (), for<'s> fn(&'s ())) -> _`
[00:39:53]    |
[00:39:53]    = note: required by `g2`

we could readily identify a particular argument and issue a more customized error:

[00:39:53] error[E0631]: type mismatch in closure argument
[00:39:53]   --> $DIR/anonymous-higher-ranked-lifetime.rs:18:5
[00:39:53]    |
[00:39:53] 18 |     g2(|_: (), _: ()| {});
[00:39:53]    |            ^^ argument needs type `&()`
[00:39:53]    |     
[00:39:53]    = note: expected type `&()`
[00:39:53]               found type `()`

(One thing I don't know, however, is where the double error reporting currently comes from.)

That said, it might be easier to do as you suggest, and perhaps open a FIXME to make further improvements. I'd like to close this PR out and focus on other things after all.

@arielb1
Copy link
Contributor

arielb1 commented Nov 2, 2017

@nikomatsakis

A custom error message sounds like something you could open an A-diagnostics issue for, I'll just want to do the "avoid override thing" so we don't regress error messages.

@nikomatsakis
Copy link
Contributor Author

@arielb1 yep filed #45727

@bors
Copy link
Contributor

bors commented Nov 2, 2017

☔ The latest upstream changes (presumably #45409) made this pull request unmergeable. Please resolve the merge conflicts.

@kennytm kennytm added the T-lang Relevant to the language team, which will review and decide on the PR/issue. label Nov 2, 2017
Also, fix numbering in mir-opt tests. We are now anonymizing more
consistently, I think, and hence some of the `TyAnon` indices shifted.
This prevents regressions on some annoying cases.
@nikomatsakis
Copy link
Contributor Author

@arielb1 ok see last commit.

@arielb1
Copy link
Contributor

arielb1 commented Nov 5, 2017

@bors r+

@bors
Copy link
Contributor

bors commented Nov 5, 2017

📌 Commit e8a96c9 has been approved by arielb1

@kennytm kennytm added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Nov 5, 2017
@bors
Copy link
Contributor

bors commented Nov 5, 2017

⌛ Testing commit e8a96c9 with merge 666687a...

bors added a commit that referenced this pull request Nov 5, 2017
new rules for merging expected and supplied types in closure signatures

As uncovered in #38714, we currently have some pretty bogus code for combining the "expected signature" of a closure with the "supplied signature". To set the scene, consider a case like this:

```rust
fn foo<F>(f: F)
where
  F: for<'a> FnOnce(&'a u32) -> &'a u32
  // ^ *expected* signature comes from this where-clause
{
    ...
}

fn main() {
    foo(|x: &u32| -> &u32 { .. }
     // ^^^^^^^^^^^^^^^^^ supplied signature
     // comes from here
}
```

In this case, the supplied signature (a) includes all the parts and (b) is the same as the expected signature, modulo the names used for the regions. But often people supply only *some* parts of the signature. For example, one might write `foo(|x| ..)`, leaving *everything* to be inferred, or perhaps `foo(|x: &u32| ...)`, which leaves the return type to be inferred.

In the current code, we use the expected type to supply the types that are not given, but otherwise use the type the user gave, except for one case: if the user writes `fn foo(|x: _| ..)` (i.e., an underscore at the outermost level), then we will take the expected type (rather than instantiating a fresh type variable). This can result in nonsensical situations, particularly with bound regions that link the types of parameters to one another or to the return type. Consider `foo(|x: &u32| ...)` -- if we *literally* splice the expected return type of `&'a u32` together with what the user gave, we wind up with a signature like `for<'a> fn(&u32) -> &'a u32`. This is not even permitted as a type, because bound regions like `'a` must appear also in the arguments somewhere, which is why #38714 leads to an ICE.

This PR institutes some new rules. These are not meant to be the *final* set of rules, but they are a kind of "lower bar" for what kind of code we accept (i.e., we can extend these rules in the future to be smarter in some cases, but -- as we will see -- these rules do accept some things that we then would not be able to back off from).

These rules are derived from a few premises:

- First and foremost, anonymous regions in closure annotation are mostly requests for the code to "figure out the right lifetime" and shouldn't be read too closely. So for example when people write a closure signature like `|x: &u32|`, they are really intended for us to "figure out" the right region for `x`.
    - In contrast, the current code treats this supplied type as being more definitive. In particular, writing `|x: &u32|` would always result in the region of `x` being bound in the closure type. In other words, the signature would be something like `for<'a> fn(&'a u32)` -- this is derived from the fact that `fn(&u32)` expands to a type where the region is bound in the fn type.
    - This PR takes a different approach. The "binding level" for reference types appearing in closure signatures can be informed in some cases by the expected signature. So, for example, if the expected signature is something like `(&'f u32)`, where the region of the first argument appears free, then for `|x: &u32|`, the new code would infer `x` to also have the free region `'f`.
        - This inference has some limits. We don't do this for bindings that appear within the selected types themselves. So e.g. `|x: fn(&u32)|`, when combined with an expected type of `fn(fn(&'f u32))`, would still result in a closure that expects `for<'a> fn(&'a u32)`. Such an annotation will ultimately result in an error, as it happens, since `foo` is supplying a `fn(&'f u32)` to the closure, but the closure signature demands a `for<'a> fn(&'a u32)`. But still we choose to trust it and have the user change it.
        - I wanted to preserve the rough intuition that one can copy-and-paste a type out of the fn signature and into the fn body without dramatically changing its meaning. Interestingly, if one has `|x: &u32|`, then regardless of whether the region of `x` is bound or free in the closure signature, it is also free in the region body, and that is also true when one writes `let x: &u32`, so that intuition holds here. But the same would not be true for `fn(&u32)`, hence the different behavior.
- Second, we must take either **all** the references to bound regions from the expected type or **none**. The current code, as we saw, will happily take a bound region in the return type but drop the other place where it is used, in the parameters. Since bound regions are all about linking multiple things together, I think it's important not to do that. (That said, we could conceivably be a bit less strict here, since the subtyping rules will get our back, but we definitely don't want any bound regions that appear only in the return type.)
- Finally, we cannot take the bound region names from the supplied types and "intermix" them with the names from the expected types.
    - We *could* potentially do some alpha renaming, but I didn't do that.
- Ultimately, if the types the user supplied do not match expectations in some way that we cannot recover from, we fallback to deriving the closure signature solely from those expected types.
    - For example, if the expected type is `u32` but the user wrote `i32`.
    - Or, more subtle, if the user wrote e.g. `&'x u32` for some named lifetime `'x`, but the expected type includes a bound lifetime (`for<'a> (&'a u32)`). In that case, preferring the type that the user explicitly wrote would hide an appearance of a bound name from the expected type, and we try to never do that.

The detailed rules that I came up with are found in the code, but for ease of reading I've also [excerpted them into a gist](https://gist.github.com/nikomatsakis/e69252a2b57e6d97d044c2f254c177f1). I am not convinced they are correct and would welcome feedback for alternative approaches.

(As an aside, the way I think I would ultimately *prefer* to think about this is that the conversion from HIR types to internal types could be parameterized by an "expected type" that it uses to guide itself. However, since that would be a pain, I opted *in the code* to first instantiate the supplied types as `Ty<'tcx>` and then "merge" those types with the `Ty<'tcx>` from the expected signature.)

I think we should probably FCP this before landing.

cc @rust-lang/lang
r? @arielb1
@bors
Copy link
Contributor

bors commented Nov 5, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: arielb1
Pushing 666687a to master...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants