Change & to be a borrow operator. #248

Closed
wants to merge 4 commits into from

nrc
Member

@nrc nrc commented Sep 19, 2014

Change the address-of operator (&) to a borrow operator. This is an
alternative to #241 and #226 (cross-borrowing coercions). The borrow operator
would perform as many dereferences as possible and then take the address of the
result. The result of &expr would always have type &T where T does not
implement Deref.
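
As a rough illustration of the description above, here is what "dereference as far as possible, then take the address" looks like when spelled out by hand in present-day Rust. The helper `borrow_inner` and the choice of `Rc<Box<i32>>` are assumptions made for the example; the RFC text itself does not include this code.

```rust
use std::rc::Rc;

/// Explicit form of what the proposed operator would do for `Rc<Box<i32>>`:
/// dereference through every smart pointer, then take the address.
fn borrow_inner(shared: &Rc<Box<i32>>) -> &i32 {
    // *shared: Rc<Box<i32>>, **shared: Box<i32>, ***shared: i32
    &***shared
}

fn main() {
    let shared: Rc<Box<i32>> = Rc::new(Box::new(7));

    // Address-of semantics: `&shared` refers to the Rc handle itself.
    let outer: &Rc<Box<i32>> = &shared;

    // Under the proposal, `&shared` would instead behave like this explicit form.
    let inner: &i32 = &**shared;

    assert_eq!(*inner, 7);
    assert_eq!(*borrow_inner(outer), 7);
}
```

With the proposed operator, the `*`s in `&**shared` would be implied and the `outer` form would need a different spelling.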

@nrc nrc self-assigned this Sep 19, 2014
@CloudiDust
Contributor

Repurposing & surprises C/C++ programmers.

It also breaks the following symmetry:

If value v is of type T, then &v is a &T and &mut v is a &mut T.

There are too many surprises. If we do need a borrow operator, it should be another sigil.

But I still believe a semi-explicit coercion operator is a better solution. I'll try to address the "reference should be explicit" requirement in this RFC.

@CloudiDust
Contributor

Actually I think it is doable with my ~ operator proposal. But it will be changed to be a little more explicit than the one in the discuss forum.

&a and &mut a are unchanged.
a~ will pick a coercion rule that turns a into a value.
&a~ will pick a coercion rule that turns a into a reference to something other than a; normally it will be something that is inside a.
&mut a~ likewise.

@reem

reem commented Sep 19, 2014

I think that this additional complexity may be worth it.

I recently spent some time explaining Rust's ownership semantics, borrowing rules, and syntax to a relatively new user. The trickiest thing to explain was not the rules of ownership, but what is going on with Vec<T> and &[T] etc. and how that is different from Box<T> and &T and how that is different from T and &T.

Having & always produce the "borrowed" version of a type instead of always taking the literal address will confuse C/C++ programmers but is a huge win in consistency and understanding - to borrow a Box I use &, to borrow a Vec I use &, to borrow a regular type I use &, etc.
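
To make the inconsistency concrete, the sketch below borrows a plain value, a `Box<T>`, and a `Vec<T>` using explicit forms in present-day Rust syntax (not necessarily the exact 2014 spelling). The helper names `sum` and `double` are made up for the example.

```rust
/// Accepts the borrowed form of a sequence.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// Accepts the borrowed form of a single value.
fn double(value: &i32) -> i32 {
    *value * 2
}

fn main() {
    let plain: i32 = 1;
    let boxed: Box<i32> = Box::new(2);
    let list: Vec<i32> = vec![3, 4];

    let a: &i32 = &plain;      // plain value: `&` alone is enough
    let b: &i32 = &*boxed;     // Box<T>: dereference first, then borrow
    let c: &[i32] = &list[..]; // Vec<T>: take a full-range slice

    assert_eq!(double(a), 2);
    assert_eq!(double(b), 4);
    assert_eq!(sum(c), 7);
}
```

Three different spellings produce what is conceptually the same thing, the borrowed view of a value; the proposal would collapse them into a single `&`.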

@CloudiDust
Contributor

Even better, we can enable both prefix and postfix variants of ~, so &a~ above becomes ~a, &mut a~ becomes ~mut a, while a~ is not changed.

~ in ~a would just be the alternative sigil for the borrow operator, while suffix ~ means arbitrary coercion inference.

We can in fact adopt both proposals.

@CloudiDust
Contributor

@reem, & as a borrow operator is surprising even if we do not consider C/C++ programmers. Please see my first comment.

And having &v and &(v) do different things is even more surprising.

Let's use ~ as the borrow operator.

@thestinger

It wouldn't just confuse C / C++ programmers. It would make Rust less usable as a systems language and would break generics. This is the kind of operator overloading trickery that people look down on C++ for.
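
One way to read the generics concern, sketched in present-day Rust: generic code relies on `&value` having type `&T` for every `T`, whether or not `T` implements `Deref`. The function `address_of` is a hypothetical helper for the example, not something from the thread.

```rust
/// Generic helper: for any `T`, a `&T` is the address of that `T`.
fn address_of<T>(value: &T) -> *const T {
    value as *const T
}

fn main() {
    let number: i32 = 5;
    let boxed: Box<i32> = Box::new(5);

    // `&number` is `&i32`, `&boxed` is `&Box<i32>`: the same rule for both.
    let _p_number: *const i32 = address_of(&number);
    let p_box: *const Box<i32> = address_of(&boxed);

    // The heap contents live at a different address from the Box handle.
    let p_inner: *const i32 = address_of(&*boxed);
    assert_ne!(p_box as usize, p_inner as usize);
}
```

If `&boxed` silently meant `&*boxed`, the type and the address obtained inside generic code would depend on whether `T` happened to implement `Deref`.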

@reem

reem commented Sep 19, 2014

With & as the borrow operator, the symmetry is now not syntactic, but semantic, and I think that's better. Rather than & always being the address of something, it becomes always the borrowed form of something, which is rather more significant in Rust.

For instance, in implementing Hyper we have needed a way to represent an immutable view into Headers. Normally, this can be done through &Headers, however, Headers requires &mut self even for an immutable getter. As a result, we will have to implement a new HeadersView type that offers an immutable view into a Headers collection.

This is a general problem - often times you do not want the address of something, you want an immutable or mutable borrow of it.

@thestinger, you could always just use &(thing) to get the actual address, so I'm not sure why this makes Rust less usable as a systems language.

The interaction with generics needs to be explored further.

@thestinger

Needing to use &(T) to get an address in any generic code makes it less usable as a systems language. It's already inferior to C++ for writing low-level code due to extra verbosity and hoops to jump through and this would be yet another step in the wrong direction. Writing low-level code in Rust should be more clear and concise than in C / C++, not significantly worse. Readable unsafe code means fewer serious memory safety issues. The core goals of the language shouldn't be compromised for unimportant sugar.

@nrc
Member Author

nrc commented Sep 19, 2014

This does make conversions from T to &T (where T is Deref) less ergonomic. However, it makes conversions from Rc<T>, Box<T>, etc. to &T more ergonomic. I suspect there are vastly more of the latter, but we should get data to see if this is a good idea or not.

@mahkoh
Contributor

mahkoh commented Sep 19, 2014

Sounds potentially unsafe in combination with FFI code

#[repr(C)]
struct X {
    i: i16,
    j: i8,
    k: i8,
}

impl Deref<i8> for X {
    fn deref(&self) -> &i8 {
        &self.k
    }
}

fn main() {
    let x = X { i: 0, j: 0, k: 0 };
    let _: *const i16 = &x as *const _ as *const _;
}

@reem

reem commented Sep 19, 2014

@mahkoh You'd use &(x) in that case.

@thestinger

@reem: The need to use &(x) doesn't change the fact that it makes unsafe code more error prone and less readable. This will cause memory safety bugs, and for what?

@CloudiDust
Contributor

I'd say just the fact that &v and &(v) do different things is enough to say "no" to this proposal as it stands, which also signifies that we indeed need another sigil anyway. So just introduce another sigil for the borrow operator then.

Also, the borrow operator is actually a syntax sugar, so it should not supersede the more fundamental operation that is taking the address.

@mahkoh
Contributor

mahkoh commented Sep 19, 2014

@reem: If the code above compiles then it stores an invalid pointer in _.

@nrc
Member Author

nrc commented Sep 19, 2014

@mahkoh could you explain why please? The &x in your example would have the exact same behaviour as &*x today, so I'd worry if we got an invalid pointer out of it.

@nrc
Member Author

nrc commented Sep 19, 2014

@CloudiDust the &(expr) syntax is not integral to the proposal, I'd be happy to see a better suggestion since I'm not too happy with it myself.

The principle here is that in Rust, borrowing is actually a more fundamental operation than taking the address. (I'm not sure I agree 100% with this principle, but I am warming to it).

@reem

reem commented Sep 19, 2014

@mahkoh It doesn't seem like that is a new issue. I understand that this makes it slightly more error prone and I dislike that, but I think that the advantages in ergonomics for many extremely common types possibly outweighs the downsides.

@thestinger

&T is implemented as taking an address, but semantically it simply produces &T. In a systems language, that fundamental operation is very important.

@mahkoh
Contributor

mahkoh commented Sep 19, 2014

@nick29581 Maybe I'm misunderstanding something but &*x == &x.k which is a reference to an i8 variable.

  1. It's not properly aligned, and casting it to *const i16 is undefined behavior (I'm not 100% sure about that on its own, but in combination with 2 it certainly is).
  2. If you access it you read memory beyond the object.
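
The two points can be checked without any unsafe code. The sketch below restates the `X` example in present-day Rust, where `Deref` uses an associated `Target` type instead of a type parameter, and only compares addresses; it never reads through the bad pointer. The helper `deref_offset` is an assumption for the example, and the numbers assume `i16` has 2-byte alignment, as on mainstream targets.

```rust
use std::mem::{align_of, size_of};
use std::ops::Deref;

#[allow(dead_code)]
#[repr(C)]
struct X {
    i: i16,
    j: i8,
    k: i8,
}

impl Deref for X {
    type Target = i8;
    fn deref(&self) -> &i8 {
        &self.k
    }
}

/// Byte offset of the deref target (`k`) from the start of the struct.
fn deref_offset(x: &X) -> usize {
    let base = x as *const X as usize;
    let target = &**x as *const i8 as usize;
    target - base
}

fn main() {
    let x = X { i: 0, j: 0, k: 0 };
    let offset = deref_offset(&x);

    // With #[repr(C)], `k` sits after a 2-byte `i` and a 1-byte `j`.
    assert_eq!(offset, 3);
    // Point 1: an odd offset from a 2-byte-aligned base is misaligned for i16.
    assert_ne!(offset % align_of::<i16>(), 0);
    // Point 2: an i16 read at that offset would run past the end of the struct.
    assert!(offset + size_of::<i16>() > size_of::<X>());
}
```

So if `&x` meant `&*x`, the cast chain in the original example would yield a `*const i16` pointing at `k` rather than at `i`.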

@nrc
Member Author

nrc commented Sep 19, 2014

@mahkoh it seems the problem here is the cast, not the & operator

@nrc
Member Author

nrc commented Sep 19, 2014

@thestinger would you feel better about using a new operator? Leave & as is and use ~ or something for a borrow operator? I would prefer this option, but am wary about adding another sigil to Rust.

@reem

reem commented Sep 19, 2014

A possibility if we were to use ~ would be for it to work on types too, i.e. you could write -> ~Vec<T> to mean &[T].

@nrc
Member Author

nrc commented Sep 19, 2014

@reem with ~ or & this would fall out naturally if we implement Deref<[T]> for Vec<T>
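
For reference, the Rust that eventually shipped does give `Vec<T>` a `Deref` implementation targeting `[T]`, so the explicit form is expressible today. A minimal sketch, with `first` as a made-up helper:

```rust
/// Takes the borrowed, slice form of a sequence.
fn first(values: &[i32]) -> Option<i32> {
    values.first().copied()
}

fn main() {
    let v: Vec<i32> = vec![1, 2, 3];

    // Explicit: dereference the Vec to `[i32]`, then borrow it.
    let explicit: &[i32] = &*v;
    assert_eq!(first(explicit), Some(1));

    // Present-day Rust also coerces `&Vec<i32>` to `&[i32]` at the call site.
    assert_eq!(first(&v), Some(1));
}
```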

@thestinger

@nick29581: I'm happy with how it works today... :P

I'm strongly against changing it because I think it's already the sanest solution and I don't think we need another operator. Obscure operators are really syntactic salt rather than sugar; most people don't like sigils.

@mahkoh
Contributor

mahkoh commented Sep 19, 2014

@reem @nick29581

I understand that this makes it slightly more error prone and I dislike that, but I think that the advantages in ergonomics for many extremely common types possibly outweighs the downsides.

The downsides are undefined behavior and segfaults (best case.) If you want to make both things safe at the same time you have to disallow the wildcard in &x as *const _ which would make FFI code much more verbose.


How would this work with types that deref to themselves?

@nrc
Member Author

nrc commented Sep 19, 2014

@thestinger fair enough.

There does seem to be motivation for sugaring ref/deref somewhat though - &*expr and &**expr all over the place are pretty ugly and don't make the code easier to write or understand. I prefer a predictable operator to a less predictable coercion (such as #241), but I agree that adding another operator is annoying.

@nrc
Member Author

nrc commented Sep 19, 2014

@mahkoh I don't think having a borrow operator will lead to any more segfaults/undefined behaviour than the existing set of operators, in particular for your example, I don't see things getting worse just because you type one less *.

The question about types which deref to themselves is an interesting one! I expect we would have to detect that statically and forbid using the borrow operator with such types.
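
A small sketch of why such types are a problem for "dereference as far as possible", written in present-day Rust. The type `Loop` and the helper `deref_n_times` are hypothetical; the point is only that dereferencing never reaches a non-`Deref` type.

```rust
use std::ops::Deref;

struct Loop {
    id: u32,
}

// A type whose deref target is itself.
impl Deref for Loop {
    type Target = Loop;
    fn deref(&self) -> &Loop {
        self
    }
}

/// Applies `deref` a fixed number of times and returns where we ended up.
fn deref_n_times(start: &Loop, n: usize) -> &Loop {
    let mut current = start;
    for _ in 0..n {
        current = <Loop as Deref>::deref(current);
    }
    current
}

fn main() {
    let l = Loop { id: 1 };

    // However many times we dereference, we are still at the same `Loop`.
    let after = deref_n_times(&l, 1000);
    assert!(std::ptr::eq(after, &l));
    assert_eq!(after.id, 1);
}
```

An operator defined as "keep dereferencing until the type no longer implements `Deref`" has no stopping point here, which is why the case would need to be detected and rejected.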

@mahkoh
Contributor

mahkoh commented Sep 19, 2014

@nick29581 I think every piece of FFI code would have to use &(x) even if x doesn't implement Deref, especially if they don't control the code. If upstream ever implements Deref, their programs would most likely start to crash.

@CloudiDust
Contributor

@thestinger I think the plan for unification would be:

  1. Fix foo.bar so it does either direct access or "inner borrow";
  2. Do NOT implement Deref on anything that isn't pointer-like. (Vec<T> and String are not pointer-like);
  3. Introduce ~ for applying "inner borrow" elsewhere. (This is actually the optional part for now, as it is backwards compatible.)

@reem The problem is that "borrow" implies only one level of indirection, it simply means "borrow/take the address of the argument", and it doesn't mean "borrow the content in the most often used way". Though tempting, for different containers, the ways would be different, and they are hard to generalize into a single unary operator. But this operator will work for pointer-like types as those types have the same use pattern.

@thestinger

Introduce ~ for applying "inner borrow" elsewhere. (This is actually the optional part for now, as it is backwards compatible.)

I don't see the need for another operator. Why does everyone want to make the language into a complicated mess? There's no appreciation for orthogonality and simplicity at all in the discussions on all of these proposals. There's only the desire to keep adding every feature under the sun. Language design is as much about which features are left out but Rust's community doesn't seem to get that. It's headed full force ahead to being a more complicated beast than both C++ and Scala and I seem to be the only person who cares.

@CloudiDust
Contributor

@thestinger Well personally I don't mind writing &* too much, but I have seen code that takes &mut **foo and that is confusing at first glance.
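
For readers who have not met the pattern, here is a minimal present-day Rust sketch of where `&mut **foo` comes from. The function `bump` is made up for the example.

```rust
/// `foo` is a mutable reference to a Box; we want a mutable reference
/// to the integer inside it.
fn bump(foo: &mut Box<i32>) {
    // `*foo` is the Box, `**foo` is the i32 it owns,
    // and `&mut` reborrows that i32 mutably.
    let inner: &mut i32 = &mut **foo;
    *inner += 1;
}

fn main() {
    let mut boxed = Box::new(1);
    bump(&mut boxed);
    assert_eq!(*boxed, 2);
}
```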

Anyway this is personal preference, and syntax sugar can either be added now or later, or never.

@buster

buster commented Sep 20, 2014

Absolutely like the proposal. I don't care for confusion of C/C++ programmers. A good C/C++ programmer typically has enough technical background and understanding to get a new concept. But the goal should be to lower the barriers of entry for everyone. I very much prefer more convenience around borrowing to "non-confusion of group X".

@CloudiDust
Contributor

@buster Even if we don't care about C/C++ programmers (we should, the principle of least surprise and all that, let alone the fact they are our main target audience), using & as "magic borrow" is not even consistent with Rust's own type signatures, like I said in the first comment.

And I don't see how the concept of "magic borrow" (not "borrow" in the "take a reference to" sense, but the magic borrow proposed in this RFC proper) is easier to grasp for a programmer who has never been exposed to Rust or other systems programming languages, as "magic borrow" or not, we'll have to tell him/her what a reference is first, anyway. And just saying "use the borrow operator" without an explanation doesn't seem like a good way to teach Rust, a systems language.

@thestinger

@buster: Adding more complexity is not reducing the barrier to entry. The issue isn't that it would cause confusion for C++ programmers. It would make low-level code harder to write and would add more confusing magic to the language. It wouldn't result in any significant improvements to convenience.

@phaylon

phaylon commented Sep 20, 2014

I do agree that with the proposed changes writing borrowing code can be much easier. However, I'd say that reading code with explicit dereferences for borrows is much more informative about what the code is actually working on.

For method calls, implicit dereference is great, but there you have the context of the method name you call. With a borrow operator, you have much less context.
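
The contrast between the two cases can be sketched in present-day Rust as follows. The function `name_len` and the nested `Rc<Box<String>>` type are assumptions chosen to make the layers visible.

```rust
use std::rc::Rc;

/// Method call: the method name tells the reader which type is being used.
fn name_len(name: &Rc<Box<String>>) -> usize {
    // Auto-dereference walks through Rc and Box until it finds `String::len`.
    name.len()
}

fn main() {
    let name: Rc<Box<String>> = Rc::new(Box::new(String::from("rust")));

    assert_eq!(name_len(&name), 4);

    // Explicit borrow: each `*` documents one layer being looked through.
    let inner: &String = &**name;
    assert_eq!(inner.as_str(), "rust");
}
```

With a borrow operator the second form would shrink to a bare `&name`, and the reader would have to know the type of `name` to tell what was borrowed.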

…nge the discussion to focus more on smart pointers than the Deref trait.
@CloudiDust
Contributor

With the borrow operator here and Borrow trait proposed in #235, I think we have three definitions of borrow:

  1. The original borrow: taking a reference to/address of something (directly).
  2. The borrow operator: recursively deref something and take a reference to the non-Deref content.
  3. The Borrow trait: take a reference to most types, but take a slice view into Vec<T> and String.
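
The three meanings can be placed side by side in present-day Rust. This is only a sketch: definition 2 is written out by hand because the operator was never adopted, and definition 3 uses today's `std::borrow::Borrow`, which may differ in detail from what #235 proposed. The helper `slice_view` is hypothetical.

```rust
use std::borrow::Borrow;
use std::rc::Rc;

/// Definition 3 via today's `Borrow` trait: a slice view into a Vec.
fn slice_view(list: &Vec<i32>) -> &[i32] {
    <Vec<i32> as Borrow<[i32]>>::borrow(list)
}

fn main() {
    let shared: Rc<Box<i32>> = Rc::new(Box::new(1));
    let list: Vec<i32> = vec![1, 2, 3];
    let n: i32 = 4;

    // 1. The original borrow: a reference to the value as written.
    let direct: &Rc<Box<i32>> = &shared;

    // 2. The borrow operator, spelled out: deref through every pointer first.
    let innermost: &i32 = &**shared;

    // 3. The Borrow trait: most types borrow as themselves, Vec as a slice.
    let as_self: &i32 = <i32 as Borrow<i32>>::borrow(&n);
    let as_slice: &[i32] = slice_view(&list);

    assert_eq!(***direct, 1);
    assert_eq!(*innermost, 1);
    assert_eq!(*as_self, 4);
    assert_eq!(as_slice.len(), 3);
}
```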

Honestly I think the situation is sub-optimal. We need a coherent solution.

@mahkoh
Contributor

mahkoh commented Sep 22, 2014

When a function does want to borrow an owning reference (e.g., takes a &Box or &mut Vec), it would be more painful to call that function. I believe this situation is rare, however.

I don't think &mut Vec is less common than &mut [].

I believe the advantages of this approach vs an implicit coercion are:

* better integration with type inference (note no explicit type in the above
example);
Contributor

Your proposal doesn't seem to do any inference at all because it simply derefs as much as possible without looking at the context in which the reference is used. Proposal #241 on the other hand does look like it's doing type inference and the only reason you have to use y: &T in the example above is that y isn't constrained further. I'd expect let y = x; to create a &Rc<T> as expected and

fn foo(x: &Rc<T>) {
    bar(x);
}
fn bar(x: &T) { }

to just work.
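
For comparison, present-day Rust has deref coercions that make this shape of code compile. The sketch below uses `i32` in place of the generic `T` and adds return values so the behaviour can be checked; it makes no claim about how closely today's rules match #241.

```rust
use std::rc::Rc;

fn bar(x: &i32) -> i32 {
    *x + 1
}

fn foo(x: &Rc<i32>) -> i32 {
    // The expected argument type is `&i32`, so `&Rc<i32>` is coerced
    // through `Deref` at the call site.
    bar(x)
}

fn main() {
    let x: Rc<i32> = Rc::new(41);

    // With no expected type, nothing is coerced: `y` stays `&Rc<i32>`.
    let y = &x;
    let _still_rc: &Rc<i32> = y;

    assert_eq!(foo(y), 42);
}
```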

Member Author

That's correct, as long as there is some more context to infer the type from. In practice you sometimes need to give the inferencer some hints

@mahkoh
Contributor

mahkoh commented Sep 22, 2014

Rust is a safe language and sometimes more verbose than other languages to make this possible. One of the few places where Rust is not safe is FFI code and every change that makes unsafe code even less safe and even less predictable should have a very good justification.

The rust repo contains over 452k lines of Rust code (including comments) and only 1.7k lines (0.38%) contain "&*", over 1k of them in librustc (111k LOC.) This "borrow problem" seems to be nothing but a mild inconvenience. &** looks a bit ridiculous but that's really all there is to it.

When you remove librustc and only leave the libraries you can see that &* occurs about once every 430 lines.

@CloudiDust
Contributor

@mahkoh, thanks for the statistics. It seems that the need for the borrow operator is not that urgent. And &mut **foo is also not that "alien" now in my eyes. It just needs a little getting used to.

After I noticed that #235 has yet another different take on the meaning of borrow, I think we just cannot live with the confusion that three different definitions of borrow would create.

On the other hand, the autoderef problem in foo.bar does need fixing.

@nrc
Member Author

nrc commented Sep 22, 2014

@mahkoh

I don't think &mut Vec is less common than &mut [].

Why not? The former is a bit weird since Vec is already an (owning) reference to a sequence of data. I'm not sure why you'd use it rather than mut Vec unless it was working with generics. The slice type should be common because it is a generic view over different sequence types.

@mahkoh
Contributor

mahkoh commented Sep 22, 2014

@nick29581: I'm using &mut Vec quite often to have multiple functions push into the same vector, but maybe that's just me.
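
A sketch of that usage pattern in present-day Rust, with made-up function names. Growing the vector needs `&mut Vec<T>`; a `&mut [T]` can only change elements that already exist.

```rust
/// Appends the even numbers below `limit`; needs `&mut Vec` because it grows.
fn push_evens(out: &mut Vec<u32>, limit: u32) {
    for n in (0..limit).filter(|n| n % 2 == 0) {
        out.push(n);
    }
}

/// Appends the odd numbers below `limit` to the same vector.
fn push_odds(out: &mut Vec<u32>, limit: u32) {
    for n in (0..limit).filter(|n| n % 2 == 1) {
        out.push(n);
    }
}

/// Only rewrites existing elements, so a mutable slice is enough.
fn zero_all(values: &mut [u32]) {
    for value in values.iter_mut() {
        *value = 0;
    }
}

fn main() {
    let mut out = Vec::new();
    push_evens(&mut out, 5);
    push_odds(&mut out, 5);
    assert_eq!(out, vec![0, 2, 4, 1, 3]);

    zero_all(&mut out);
    assert_eq!(out, vec![0; 5]);
}
```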

@45ujksy5l03jcvf

With the borrow operator here and Borrow trait proposed in #235, I think we have three definitions of borrow

Three definitions of borrow, Deref for String and Vec, and thestinger seems to be the only person who cares..
How horrible.

@ben0x539

I'm also not a fan of this plan and would rather just have cross-borrowing and similar magical behavior go away at least for 1.0, but I don't really have any better arguments than thestinger, and me grouching on every RFC that I don't think adds more convenience than complexity doesn't help anyone. If the core team wanted an opinion poll on RFCs they'd probably use a more appropriate platform for that than pull request comments.

@CloudiDust
Contributor

@45ujksy5l03jcvf, well I think many care.

Reading through the recent RFCs and having some thoughts makes me wonder whether we are increasing language complexity for questionable gain, and whether we are dealing with complexity by piling up more complexity, because we asked the wrong question. And I'll admit maybe I've fallen victim to the temptation of feature creep. I'll try to get out. :)

@CloudiDust
Contributor

I think the following three questions are related:

  1. what should deref do?
  2. what exactly is a borrow?
  3. what does equality mean in Rust?

#245, this, the coercion proposals and the autoderef problem of foo.bar are signs that we should indeed provide a coherent view on these.

@45ujksy5l03jcvf

@CloudiDust
For what it's worth (who am I to argue with contributors), to borrow an object is to take some handle (pointer reference, iterator, slice, subslice, whatever) with read or write access to its internal representation, which has a potential for aliasing. So, there can't be a single borrow operator.
Of course, it's possible to have a trait BorrowTheWholeMeaninfulContentInAPreferredWay, but I think it would have questionable value and it certainly shouldn't be pulled into the core language. Essentially, I'm repeating Daniel Micay's words again.

@CloudiDust
Contributor

@45ujksy5l03jcvf, in case it's not clear, I was agreeing with @thestinger too. :)

I think your definition of borrow is not the same as the "original" one. Originally in Rust, borrow = address-of, simple as that, and it has nothing to do with "internal representation". But recently there are some redefinitions happening, which worries me. I tried to come up with terms like inner/outer borrow but on second thought, I was not satisfied.

I think part of the problem is, we cannot come up with a better name for the new concept so borrow is reused, which may or may not indicate that we are trying to name the wrong thing.

@nrc
Member Author

nrc commented Sep 30, 2014

Some data and a little discussion: https://gist.github.com/nrc/809614adb2bbb38232b7

@nrc
Member Author

nrc commented Oct 28, 2014

Closing - we don't have time to do this pre-1.0 (and we're not sure if we even want to) and it would be too big a breaking change for post-1.0.
