Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"discriminant_value" intrinsic returns u64, cannot fit all discriminants #70509

Closed
RalfJung opened this issue Mar 28, 2020 · 27 comments · Fixed by #70705
Closed

"discriminant_value" intrinsic returns u64, cannot fit all discriminants #70509

RalfJung opened this issue Mar 28, 2020 · 27 comments · Fixed by #70705
Assignees
Labels
A-intrinsics Area: intrinsics C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@RalfJung
Copy link
Member

RalfJung commented Mar 28, 2020

Ever since we had enums with repr(u128)/repr(i128), the type of the discriminant_value intrinsic was no longer adequate: it returns a u64, which simply cannot fit all discriminant values.

@eddyb says the type of the intrinsic should be adjusted. This will affect both its codegen and Miri implementation.

This issue has been assigned to @lcnr via this comment.

@eddyb
Copy link
Member

eddyb commented Mar 28, 2020

cc @rust-lang/libs I wonder if we're safe to change std::mem::discriminant.
We might need a crater run as I suspect there are people who transmute the result to u64.

@jonas-schievink jonas-schievink added A-intrinsics Area: intrinsics C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Mar 28, 2020
@sfackler
Copy link
Member

A crater run would be fine, but we should not block a change on people doing things like that IMO.

@dtolnay
Copy link
Member

dtolnay commented Mar 28, 2020

I wouldn't have any sympathy for people transmuting to u64.

If you want to be slick, you could make std::mem::Discriminant<T> have the same size as T's discriminant, rather than always 8 bytes. This supports enums with a 128 bit discriminant without increasing the size of Discriminant in the common case.

@lcnr
Copy link
Contributor

lcnr commented Mar 28, 2020

Would the following work?

Add a (potentially hidden) permanently unstable trait

trait DiscriminantKind {
    // the traits are required by `mem::Discriminant`
    type Discriminant: Clone + Copy + Debug + Eq + PartialEq + Hash + Send + Sync + Unpin;
}

which is automatically implemented for every type and change intrinsics::discriminant to return DiscriminantKind::Discriminant instead.

mem::Discriminant should now be able to use <T as DiscriminantKind>::Discriminant as its only field.

This allows for mem::Discriminant to be smaller in most cases while still being fairly simple to implement afaict.

@nagisa
Copy link
Member

nagisa commented Mar 28, 2020

The Discriminant type was originally made opaque due to exactly this issue IIRC (and a couple other reasons). If only there was a way to prohibit transmuting it, or mark the type Unsized...


@lcnr There’s no point in making the intrinsic hide its implementation details, as the intrinsics are inherently unstable.

@lcnr
Copy link
Contributor

lcnr commented Mar 28, 2020

The reason why I considered hiding it is to not pollute rustdoc by adding this (for the user irrelevant) impl to every type.

@Centril Centril added the T-lang Relevant to the language team, which will review and decide on the PR/issue. label Mar 29, 2020
@lcnr
Copy link
Contributor

lcnr commented Apr 1, 2020

@rustbot claim

@rustbot rustbot self-assigned this Apr 1, 2020
@SimonSapin
Copy link
Contributor

Rather than perma-unstable, I think the trait could be private to libcore. (The Discriminant struct doesn’t need a trait bound since the trait needs to be implemented for all types anyway: for example std::mem::discriminant(&true) compiles today.)

An associated type in a trait is required because that’s how the compiler supports having a field of different types in struct Discriminant<T> { … } depending on T, but these are implementation details that don’t need to be visible in any public API.

@lcnr
Copy link
Contributor

lcnr commented Apr 2, 2020

@SimonSapin While it is not required as a bound for Discriminant, it is required for the return type of intrinsics::discriminant_value which is public

@eddyb
Copy link
Member

eddyb commented Apr 2, 2020

intrinsics::discriminant_value which is public

We can make the intrinsic private, they're not really meant to be used directly anyway.

@SimonSapin
Copy link
Contributor

Couldn’t the return type of intrinsics::discriminant_value be changed to Discriminant?

@SimonSapin
Copy link
Contributor

This might not even require a change to the compiler side of the intrinsic, if we add #[repr(transparent)] to Discriminant

@nagisa
Copy link
Member

nagisa commented Apr 2, 2020

I’m not sure what’s exactly the point of changing the intrinsic. Intrinsics are inherently unstable, have changed in the past multiple times and people are expected to not rely on them being stable. It is an implementation detail.

What for are we trying to make our own lives harder by thinking about it at all?

We can make the intrinsic private, they're not really meant to be used directly anyway.

Not sure there’s any difference as people can just (unstably) extern it if they want.

@sfackler
Copy link
Member

sfackler commented Apr 2, 2020

Yeah, I would change the intrinsic in whatever way makes the implementation easier.

@lcnr
Copy link
Contributor

lcnr commented Apr 6, 2020

While implementing #70705 I realized that this breaks PartialEq for repr128 enums:

#![feature(core_intrinsics, repr128)]

use std::intrinsics::discriminant_value;

#[derive(PartialEq, Debug)]
#[repr(i128)]
enum Test {
    A = 0,
    B = u64::max_value() as i128 + 1, 
}

fn main() {
    println!("{}, {}", discriminant_value(&Test::A), discriminant_value(&Test::B));
    assert_eq!(Test::A, Test::B);
}

playground

@lcnr
Copy link
Contributor

lcnr commented Apr 6, 2020

memory unsafety ❤️

#![feature(repr128, arbitrary_enum_discriminant)]

#[derive(PartialEq, Debug)]
#[repr(i128)]
enum Test {
    A(Box<u64>) = 0,
    B(usize) = u64::max_value() as i128 + 1, 
}

fn main() {
    assert_eq!(Test::A(Box::new(2)), Test::B(0));
    //~^ Illegal instruction (core dumped)
}

@eddyb
Copy link
Member

eddyb commented Apr 6, 2020

At least #[repr(i128)] seems to still be unstable, I didn't realize that was the case.
I guess fixing this brings it closer to stabilization?

@nikomatsakis
Copy link
Contributor

@lcnr to clarify, when you say

I realized that this breaks PartialEq for repr128 enums:

you meant that the failure to represent 128-bit discriminants breaks PartialEq, right? That is, it's not that the approach of using an associated type of a magic trait breaks PartialEq, but rather the underlying bug.

@lcnr
Copy link
Contributor

lcnr commented Apr 28, 2020

Yes, in the current state PartialEq can lead to undefined behavior as discriminant_value truncates 128 bit discriminants. #70705 correctly compares the untruncated 128 bit values and fixes this issue.

@lcnr
Copy link
Contributor

lcnr commented Apr 28, 2020

#![derive(PartialEq)]
enum Foo {
    A(u8),
    B,
}

can be thought of as

impl PartialEq for Foo {
    fn eq(&self, rhs: &Self) -> bool {
        use core::intrinsics::discriminant_value;
        if discriminant_value(self) != discriminant_value(rhs) {
            return false;
        }

        match (self, rhs) {
            (Foo::A(l), Foo::A(r)) => l == r,
            (Foo::B, Foo::B) => true,
            _ => unsafe { core::hint::unreachable_unchecked() },
        }
    }
}

@SimonSapin
Copy link
Contributor

The output of --pretty expanded for #70509 (comment) includes:

impl ::core::cmp::PartialEq for Test {
    #[inline]
    fn eq(&self, other: &Test) -> bool {
        {
            let __self_vi =
                unsafe { ::core::intrinsics::discriminant_value(&*self) } as
                    i128;
            let __arg_1_vi =
                unsafe { ::core::intrinsics::discriminant_value(&*other) } as
                    i128;
            if true && __self_vi == __arg_1_vi {
                match (&*self, &*other) {
                    (&Test::A(ref __self_0), &Test::A(ref __arg_1_0)) =>
                    (*__self_0) == (*__arg_1_0),
                    (&Test::B(ref __self_0), &Test::B(ref __arg_1_0)) =>
                    (*__self_0) == (*__arg_1_0),
                    _ => unsafe { ::core::intrinsics::unreachable() }
                }
            } else { false }
        }
    }

My understanding is that in current Nightly, intrinsics::discriminant_value returns a u64 so u64::max_value() as i128 + 1 is incorrectly truncated to zero. Thus __self_vi == __arg_1_vi is true even though the two values have different a variant, so control flow would go to the last branch of match but with intrinsics::unreachable we get UB.

I believe #70705 would fix this.

@nikomatsakis
Copy link
Contributor

One note would be that, eventually, it seems like we could stabilize the DiscriminantKind trait, though I would never permit user-provided implementations. It'd be nice to work towards a "minimal set of lang items" that we can use as the basis for the rest of the stdlib.

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Apr 28, 2020

Although I guess that's an orthogonal thing -- i.e., in user code, Discriminant<T> is the stable way to access the "discriminant for this type". And the lang item itself is specific to how our stdlib impl works but not user exposed. Ok, never mind what I just wrote, except to note that exposing associated types of "magic traits" like T::Discriminant might be a nice option in the future.

@RalfJung
Copy link
Member Author

RalfJung commented May 2, 2020

So what is needed to unblock this? This PR is blocking some Miri cleanup of read_discriminant (that would be awkward to do right now because of the bug this fixes).

@nikomatsakis
Copy link
Contributor

@RalfJung by "this PR" you mean #70705 ?

@nikomatsakis
Copy link
Contributor

It seems like we're mostly good to go forward, there aren't (afaict) many public-facing implications of this beyond correctly handling larger discriminants.

@RalfJung
Copy link
Member Author

RalfJung commented May 4, 2020

Sorry, I meant to post that in the PR #70705, yes.

@bors bors closed this as completed in 963bf52 May 21, 2020
bors added a commit to rust-lang-ci/rust that referenced this issue Aug 18, 2021
Stabilize `arbitrary_enum_discriminant`

Closes rust-lang#60553.

----

## Stabilization Report

_copied from rust-lang#60553 (comment)

### Summary

Enables a user to specify *explicit* discriminants on arbitrary enums.

Previously, this was hard to achieve:

```rust
#[repr(u8)]
enum Foo {
    A(u8) = 0,
    B(i8) = 1,
    C(bool) = 42,
}
```

Someone would need to add 41 hidden variants in between as a workaround with implicit discriminants.

In conjunction with [RFC 2195](https://github.com/rust-lang/rfcs/blob/master/text/2195-really-tagged-unions.md), this feature would provide more flexibility for FFI and unsafe code involving enums.

### Test cases

Most tests are in [`src/test/ui/enum-discriminant`](https://github.com/rust-lang/rust/tree/master/src/test/ui/enum-discriminant), there are two [historical](https://github.com/rust-lang/rust/blob/master/src/test/ui/parser/tag-variant-disr-non-nullary.rs) [tests](https://github.com/rust-lang/rust/blob/master/src/test/ui/parser/issue-17383.rs) that are now covered by the feature (removed by this pr due to them being obsolete).

### Edge cases

The feature is well defined and does not have many edge cases.
One [edge case](rust-lang#70509) was related to another unstable feature named `repr128` and is resolved.

### Previous PRs

The [implementation PR](rust-lang#60732) added documentation to the Unstable Book, rust-lang/reference#1055 was opened as a continuation of rust-lang/reference#639.

### Resolution of unresolved questions

The questions are resolved in rust-lang#60553 (comment).

----

(someone please add `needs-fcp`)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-intrinsics Area: intrinsics C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.