Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a pod language item and marker trait. #1387

Closed
wants to merge 2 commits into from
Closed

Conversation

mahkoh
Copy link
Contributor

@mahkoh mahkoh commented Nov 30, 2015

Add a pod language item and marker trait. The trait can only be implemented by
a subset of all types and identifies objects that are valid when they contain
arbitrary bit patterns.

@mahkoh
Copy link
Contributor Author

mahkoh commented Nov 30, 2015

Forgot that rust doesn't have function pointers, only function references. It's not clear if those function references that are unsafe to call should be Pod. (Edit: extern functions are never unsafe to call unless they appear in an extern block.) Having extern fn() in ffi structs might be useful.

Edit: Since those references are not allowed to be 0, they cannot be Pod, even if they are extern.

@alexcrichton alexcrichton added T-lang Relevant to the language team, which will review and decide on the RFC. T-libs-api Relevant to the library API team, which will review and decide on the RFC. labels Nov 30, 2015
@mahkoh
Copy link
Contributor Author

mahkoh commented Nov 30, 2015

I've looked a bit into how LLVM treats padding:

  • The contents of padding bytes is undefined.
  • Even if the padding bytes are zeroed, the padding bytes in a copy of the same object are again undefined.

In order to avoid reading undef bytes, the compiler should fill up the padding with hidden fields and initialize them with unspecified values. Two objects of the same type need not have the same values stored in their padding bytes. When an object is overwritten (e.g., through a &mut) the compiler need not overwrite the padding fields. The only thing the user can rely on is that reading the bytes of Pod types does not produce undef values.

Note that this only affects Pod structs and tuples where padding should be rare anyway. For example, in the following struct the compiler would insert the padding field:

#[derive(Pod)]
struct X {
    a: u8,
    __padding: [u8; 7], // or 3 depending on the alignment of u64
    b: u64
}

Once C unions are added, something similar will have to be done for them.

@oli-obk
Copy link
Contributor

oli-obk commented Dec 1, 2015

All Option-like enums with a NonZero inner type with a 'static lifetime could also be Pod. Thus allowing c-like function pointers, while preventing the accidental creation of an Option<&'a T>.

@killercup
Copy link
Member

Rendered

@mahkoh
Copy link
Contributor Author

mahkoh commented Dec 1, 2015

@oli-obk Option<&'static u8>?

@oli-obk
Copy link
Contributor

oli-obk commented Dec 1, 2015

Uhm right. That's not better.

@mahkoh
Copy link
Contributor Author

mahkoh commented Dec 1, 2015

The lack of function pointers is a safety problem anyway. This RFC will not try to solve it.

@retep998
Copy link
Member

retep998 commented Dec 6, 2015

extern functions are never unsafe to call unless they appear in an extern block

Actually this is because extern fn in an extern block implicitly adds an unsafe to it. The actual type of such a function is really unsafe extern fn which is always unsafe to call.

@diwic
Copy link

diwic commented Dec 7, 2015

In general I like this RFC. Solving the padding problem by adding never used fields, seems like a reasonable workaround (that said - I haven't actually looked into LLVM, so not sure).

cannot modify the `len` field in order to guarantee safety. Since the compiler
cannot know if the private fields must satisfy certain invariants, an
implementation of the safe (!) `Pod` trait cannot be allowed for such types.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't agree with the reasoning w r t allowing safe Pod for types with private internals. If a struct is declared Pod, then it's an error to assume that there are private invariants. (A struct might want to keep its internals private for other reasons, such as being able to rename the fields without breaking the API.)

@strega-nil
Copy link

I disagree with the idea that all fields have to be public. Why would you have that restriction? Just make it a trait like Copy, all inside types have to be Pod in order for the outside type to be Pod.

@tbelaire
Copy link

tbelaire commented Dec 8, 2015

I feel like this would be quite useful for my embedded project.

I feel like I would otherwise be using the byteorder crate. Speaking of byteorder, could you amend this to clarify what happens when you cast a struct with 4 u8s into a struct with only a u32.

@strega-nil
Copy link

@tbelaire Well, it's implementation defined. On x86, it'll be little endian, and on ARM, it'll be big endian. The point is it's not memory unsafe.

Also, this makes me think of a second thing. Currently, we do not specify our struct layout. This would force us to make our struct layout equal to repr(C). I would argue that we should only be able to make repr(C) types PoD. Unless, perhaps, you have a different idea @mahkoh?

@mahkoh
Copy link
Contributor Author

mahkoh commented Dec 8, 2015

Changes in the last update:

  • Removed the struct restriction
  • Clarified how padding bytes are treated
  • Removed the justification section

Regarding padding: The behavior specified in the RFC is sufficient unless undef values have been explicitly stored in a pod object. In this case, the user has to explicitly overwrite the padding bytes with non-undef values.

@tbelaire: Whatever currently happens.

@GBGamer: How does it imply C layout?

@arcnmx
Copy link

arcnmx commented Dec 8, 2015

I've experimented with pod types and generally found it to be something better suited for an external library than stdlib. However, integrating it cleanly requires (the stabilization of) either OIBITs or syntax extensions.

The proposed feature of getting around LLVM's undef'd member padding is interesting though, since syntax extensions can't do that. My solution has been to simply require the structure to be packed to an alignment of 1.

@retep998
Copy link
Member

retep998 commented Dec 8, 2015

@arcnmx Note that if you do force it to be packed using #[repr(packed)], borrowing any fields of the struct will require unsafe.

@mahkoh
Copy link
Contributor Author

mahkoh commented Dec 8, 2015

@arcnmx: I don't see how it can work with either since 1. enums must not be pod and 2. syntax extensions don't have access to type information.

found it to be something better suited for an external library than stdlib

Then it is good that this RFC does not propose any changes to the stdlib. It provides the necessary building blocks to safely implement pod tools in any library.

@glaebhoerl
Copy link
Contributor

This seems closely related to ye olde trait Transmute (the trait with a thousand names - Cast in the linked RFC). In particular it seems like you could achieve a similar effect using Foo: Transmute<[u8; sizeof Foo]>, although the reverse direction is a bit finickier due to alignment, but could be solved in a number of ways (e.g. [u16; sizeof Foo / 2]: Transmute<Foo>, if Foo happens to have 2-byte alignment, or perhaps we could have an opaque Blob type parameterized over size and alignment constants, and then Blob<sizeof Foo, alignof Foo>: Transmute<Foo> and vice versa as well). This would need to be fleshed out in more detail to qualify as an actual proposal (what level of support for type-level integers it would require, for example, or how to work around its absence); at the moment it's just food for thought. (cc @gereeter)

@arcnmx
Copy link

arcnmx commented Dec 8, 2015

Note that if you do force it to be packed using #[repr(packed)], borrowing any fields of the struct will require unsafe.

Yeah, I wrote it all before that RFC existed. I also have a marker trait (could be OIBIT, isn't currently because they're unstable) specifying that all members are 1-aligned, which also works.

I don't see how it can work with either since 1. enums must not be pod and 2. syntax extensions don't have access to type information.

It does, though. Derive syntax extensions can make assertions that all member types satisfy a given trait constraint (i.e. is POD), and they can also refuse to apply to enum types. This is what my library does, and can be used on stable with syntex, etc.

Then it is good that this RFC does not propose any changes to the stdlib. It provides the necessary building blocks to safely implement pod tools in any library.

Sorry, misspoke. I meant "language" as I see little point for it to be a language item. The new things this brings to the table are:

  1. Making padding defined, which isn't quite specified in the RFC. Could fit as a separate #[repr(DefinedPadding)] feature/RFC.
  2. Making types with all public members automatically POD without an explicit derive, mark, or newtype.

@strega-nil
Copy link

@mahkoh It's currently undefined behavior to transmute non-repr(C). Therefore, it's UB to, say, do this:

struct Test {
    inner: u8
}

let array: &[u8] = [0u8; 128];
let slice = std::slice::from_raw_parts(array.as_ptr(), array.len());

for example.

Also, it should be something like Copy, imo:

#[repr(C)]
#[derive(Copy, Clone, Pod)]
pub struct ControllerAxis {
    pub start: Point<f32>,
    pub min: Point<f32>,
    pub max: Point<f32>,
    pub end: Point<f32>,
}

for an example from a game I'm currently working on.

@mahkoh
Copy link
Contributor Author

mahkoh commented Dec 9, 2015

@arcnmx

Derive syntax extensions can make assertions that all member types satisfy a given trait constraint (i.e. is POD), and they can also refuse to apply to enum types.

That's a clever trick and much more flexible than this RFC. I've implemented this in my code and it's working nicely apart from the usual array problems. The #[DefinedPadding] issue still has to be solved but that can be done in another RFC.

@GBGamer

It's currently undefined behavior to transmute non-repr(C).

The concept of undefined behavior does not apply to rust. So little of the language is actually defined that, if it were to apply to rust, you'd find that if you're using the standard library, the behavior is undefined.

For example, the manual does not define the behavior upon a transmute. Hence the behavior is undefined whether it's repr(C) or not.

@mahkoh
Copy link
Contributor Author

mahkoh commented Dec 9, 2015

Closing because this can be solved with compiler plugins apart from the padding issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC. T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants