Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Type privacy and private-in-public lints #2145

Merged
merged 4 commits into from
Feb 7, 2018

Conversation

petrochenkov
Copy link
Contributor

@petrochenkov petrochenkov commented Sep 10, 2017

Type privacy rules are documented.
Private-in-public errors are relaxed and turned into lints.

Rendered

@aturon aturon self-assigned this Sep 11, 2017
@aturon aturon added Ergonomics Initiative Part of the ergonomics initiative T-lang Relevant to the language team, which will review and decide on the RFC. labels Sep 11, 2017
@aturon
Copy link
Member

aturon commented Sep 13, 2017

Thanks much, @petrochenkov!

This came out of a long-running discussion on internals with oversight from the lang team. I'm going to go ahead and kick off the full review process:

@rfcbot fcp merge

@rfcbot
Copy link
Collaborator

rfcbot commented Sep 13, 2017

Team member @aturon has proposed to merge this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added the proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. label Sep 13, 2017
@aturon
Copy link
Member

aturon commented Sep 13, 2017

I've now reviewed this RFC myself, and have to say: I'm very impressed by this writeup (and continue to love the Guide/Reference distinction)! Kudos.

I don't have a ton to add to the technical content, but as to the unresolved question around associated types: I think the approach taken here is quite reasonable, and at any rate is the conservative choice. It does mean that there's a slight inconsistency in the style of the rules, but I expect this to be a much smaller papercut than the current rules are.

I'm also a fan of having the new lints be based on reachability. That was deemed problematic before when determining the meaning of privacy, but as a lint we have a lot more leeway to be smart about the analysis.

All told: this seems like a total slam dunk to me.

@nrc
Copy link
Member

nrc commented Sep 14, 2017

Just to be clear, the privacy rules description here changes nothing and the only change is changing the pub-in-priv error into lints?

Note that type privacy is not quite as bullet proof as made out here - if you can name the type you know its size and thus can transmute it to another type and get access to its data or even mutate it (possibly via C interop) therefore there is some benefit in the conservative pub-in-priv error over type privacy (its also not clear to me how this plays with macro hygiene).

I think we should not have two separate lints for the primary/secondary variations - seems like too much detail to be worth exposing to users.

Otherwise I am wildly in favour of this RFC!

I would actually like to go a bit further (though perhaps not right now in this RFC) - I'd like to have two lints - one which works like the current error at the module level and is warning (or even off) by default and one that works at crate scope and is error by default. My reasoning is that within a crate, a privacy leak is usually just annoying, nothing worse. But leaking private data outside a crate could easily be a security vulnerability.

@petrochenkov
Copy link
Contributor Author

@nrc

Just to be clear, the privacy rules description here changes nothing and the only change is changing the pub-in-priv error into lints?

Yes.
There are some minor holes in the current privacy implementation, but I hope to patch them this or next weekend.

(its also not clear to me how this plays with macro hygiene).

Type privacy checks are unhygienic currently, i.e. private types leaked through macros 2.0 will still generate errors to prevent cases like this

struct Priv {}
pub macro m() {
    Priv
}

// Another module
type A = m!(); // Priv is leaked

The rule can be relaxed to ignore intermediate types/values used by macros, but not "returned" by them, but this requires some more thought, has linkage implications and I don't want to include it in this RFC.

I think we should not have two separate lints for the primary/secondary variations - seems like too much detail to be worth exposing to users.

I separated them due to very different error probabilities.
The first lint indicates an almost guaranteed error on client side, the second is more in the "missing documentation" category.

Note that type privacy is not quite as bullet proof as made out here - if you can name the type you know its size and thus can transmute it to another type and get access to its data or even mutate it (possibly via C interop) therefore there is some benefit in the conservative pub-in-priv error over type privacy

I don't understand this paragraph.
With proposed rules you can't name a private type, so you can't call size_of on it.
(Also, privacy was never supposed to prevent brute force measures like converting to *mut u8 and reading or overwriting bytes.)

But leaking private data outside a crate could easily be a security vulnerability.

I don't understand this as well.
Leaked private data will still be unusable in other crates due to type privacy errors (the lints just try to catch these client-side errors early, on definition site). Also, privacy doesn't give any security guarantees due to conversions to *mut u8 still being available.

@nrc
Copy link
Member

nrc commented Sep 16, 2017

I separated them due to very different error probabilities.
The first lint indicates an almost guaranteed error on client side, the second is more in the "missing documentation" category.

I see the benefit, but I'm not sure its worth the cost of having two separate lints.

With proposed rules you can't name a private type, so you can't call size_of on it.

I don't think that is necessary to call transmute - the compiler will know the size and so you can transmute, aiui.

(Also, privacy was never supposed to prevent brute force measures like converting to *mut u8 and reading or overwriting bytes.)

Hmm, I guess you can always convert &T to *mut u8 so before or after this proposal, you'll still be able to access the memory.

I don't understand this as well.

Let's discuss this later. My intuition is that leaking a private type out of a module isn't that important, but leaking it out of a crate is.

@cramertj
Copy link
Member

cramertj commented Sep 19, 2017

Apologies for my delay in responding to this RFC!

The RFC itself is well-specified and the guide and reference-level explanations make sense to me. However, I'm struggling to understand the motivations behind this change. Primarily, it seems to me like allowing a private type or trait to appear in a public interface is nearly always going to result in a sub-par end-user experience. Documentation won't be able to provide information about the non-public type and values of type will be strangely unusable.

One major point that came up in previous discussions was the ability to make a private supertrait of a public trait so as to prevent users from providing their own implementation. Personally, I think this design would be really confusing to me as an end-user of a library: I'd still try to implement the trait, but then I'd get confused and stuck trying to understand how to properly name and implement a trait that I could clearly see in the documentation. If this is a primary motivation for this RFC, I'd strongly prefer we look to more natural and self-explanatory solutions such as first-class sealed traits, rather than hard-to-understand exceptions to Rust's privacy rules.

@rfcbot concern motivation

@petrochenkov
Copy link
Contributor Author

petrochenkov commented Sep 20, 2017

@cramertj

However, I'm struggling to understand the motivations behind this change.

The current Rules are based on local pub annotations, so they give relatively many false positives (see the inner/outer example) and don't meet people's expectations.

The private-in-public rules can be potentially relaxed in any way since they are not technically necessary (and not sufficient).
So, the motivation is to relax them to remove false positives and match expectations

The problem is that reachability-based rules that match those expectations are pretty vague, especially in presence of macros 2.0, so it would be really unfortunate to make them hard language rules rather than lints.

Those lints can certainly be deny-by-default (especially private_interfaces) if the practice becomes discouraged, but the issue is that "undocumentable" bounds you are talking about are actually useful and used right now and there are no first-class sealed traits on the horizon, and situations similar to "perfect derive" exist and not covered even by sealed traits, so we can't outright prohibit all this. In short term I'd rather improve support for "unnameable" bounds in rustdoc.
In long term they can be deprecated if/when some better replacement appears.

@cramertj
Copy link
Member

cramertj commented Sep 20, 2017

@petrochenkov
RE reachability false-positives / the inner/outer example: I don't really see this as a false positive. To me, it seems like the compiler should enforce that f is at most pub(super)/pub(in outer) (i.e. its annotated visibility scope should be at most as large as that of the types it uses). In my opinion, items should only be marked with pub if there is an intention to export them from a crate. If a pub-annotated item refers to a type which is not reachable from outside of the crate, an error should result.

The "perfect derive" question seems much thornier to me:

in short term I'd rather improve support for "unnameable" bounds in rustdoc.

Allowing these types (e.g. Wrap) to be externally visible exposes the trait implementations and other features of the internal types. If rustdoc or the compiler are to provide information about Wrap to external users, then Wrap should have to be pub.

But perhaps we're thinking along the same lines: IMO bounds should be able to use private types, and it should be the compiler's and rustdoc's responsibility to translate those bounds into "external-facing" terms:

struct Wrap<T>(T);
pub struct Foo<T>(Wrap(T));

impl<T> Clone for Wrap<T> where T: Clone + Copy + Sync {
    ....
}

// Autogenerated by derive:
impl<T> Clone for Foo<T> where Wrap<T>: Clone {
    ....
}
// translates to this in compiler errors / rustdoc:
"Foo<T> implements Clone only if T: Clone + Copy + Sync"

IIRC, bounds propagations like this are already done in some compiler errors. I wouldn't think it would be too much work to extend it to allow for things like Foo<T> implements Clone only for T=MyType1 and T=MyType2.

I don't want to support writing code that makes non-pub items visible in compiler errors or rustdoc. If we can find ways to hide certain patterns (e.g. Wrap and private supertraits) in rustdoc or compiler errors, then I'm fully supportive of allowing those usages of pub-in-priv.

@petrochenkov
Copy link
Contributor Author

@cramertj

I don't really see this as a false positive.

Me too! It's simple if you know the rules, but it's not expected for some reason.
I had to explain that this is indeed the expected behavior and not a bug too often, the last time
being just couple of weeks ago.

@petrochenkov
Copy link
Contributor Author

@cramertj

In my opinion, items should only be marked with pub if there is an intention to export them from a crate.

Aaand the lint about this is already a part of rust-lang/rust#44660 (this is not a universally supported opinion though).

@cramertj
Copy link
Member

As I'm the only remaining holdout on this RFC, I thought I should directly address some of the comments that had been made previously in support of the RFC. I reread the internals thread and noticed this comment by @nikomatsakis:

Presuming we want to keep that general property, then we have to make some practical choices. I think all things being equal the definition-side enforcement is better, but if we wish to avoid regressions, then it seems pretty acceptable to me do the strict enforcement at use-site, but supply lints at the definition site.

In fact, this may be the more ergonomic option: it means that you can get away with slightly inconsistent public/private declarations (such as public functions that return private types) so long as you don't use private types from the wrong places (that is the hard limit). Then you can come back and fix the lints once you've got your stuff setup.

As I said in my earlier comment, I'm worried that allowing users to ignore the public/private declaration rules will give them an incorrect understanding of how privacy works in Rust, and in particular an incorrect impression of the meaning of pub.

I'm also worried that users won't understand why their particular usage is "incorrect," and so rather than reaching out, learning about privacy, and fixing the lints, they'll just allow the lint and forget about it, casting it off as an annoyance caused by an overzealous compiler.

@aturon
Copy link
Member

aturon commented Sep 25, 2017

@cramertj

So, to me, a key point here is having a clear delineation between what is a hard error versus a lint. Personally, I prefer to tie hard errors to crisp, hard, valuable guarantees. I think there's a core set of such guarantees related to privacy that should, indeed, be a hard error, because you crucially rely on these guarantees when reasoning about your code.

On the other hand, lints are for things that are likely bugs or non-standard style, but are not critical components of guarantees. That's why, for example, in the modules RFC the "misuse" of pub on an item that is not, in fact, publicly exported is merely a lint; the kind of reasoning enabled by making it a hard error does not seem worth it.

Yes, people are more likely to ignore lints, but just as importantly, people can ignore lints when prototyping or first learning, and still get a useful, runnable result. Yes, that may represent a missed teaching opportunity, but the timing of teaching is important -- you're often in the middle of trying to do something else when you hit an error like this, and not in a great place to stop and learn something about the privacy system.

Finally, there's a strong enough culture in Rust that allow should be avoided that I don't worry too much about people shifting to allow this lint en masse.

@cramertj
Copy link
Member

cramertj commented Sep 25, 2017

@aturon @petrochenkov

Thanks for your responses! After talking this through, I feel like you've both listened to and understood my concerns. I'd still personally prefer alternative solutions that don't allow private types to appear in public interfaces, but I'm confident that we can find solutions to make this case more ergonomic.

For posterity's sake: as part of stabilizing this feature (or shortly after stabilizing this feature) I think there should be a work item to experiment with making the compiler and rustdoc omit the names of private items from types, bounds, and error messages. I believe this could prevent a great deal of confusion when using libraries that have private types their public interfaces.

@cramertj cramertj added final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. and removed proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. labels Sep 25, 2017
@rfcbot
Copy link
Collaborator

rfcbot commented Sep 25, 2017

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added the final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. label Sep 25, 2017
@rfcbot
Copy link
Collaborator

rfcbot commented Oct 5, 2017

The final comment period is now complete.

bors added a commit to rust-lang/rust that referenced this pull request Dec 21, 2017
Type privacy polishing

Various preparations before implementing rust-lang/rfcs#2145 containing final minor breaking changes (mostly for unstable code or code using `allow(private_in_public)`).
(Continuation of #42125, #44633 and #41332.)

It would be good to run crater on this.
r? @eddyb
@vks
Copy link

vks commented Feb 1, 2018

@cramertj

For posterity's sake: as part of stabilizing this feature (or shortly after stabilizing this feature) I think there should be a work item to experiment with making the compiler and rustdoc omit the names of private items from types, bounds, and error messages. I believe this could prevent a great deal of confusion when using libraries that have private types their public interfaces.

Maybe open an issue for this?

@aturon aturon merged commit 28bdbbc into rust-lang:master Feb 7, 2018
@aturon
Copy link
Member

aturon commented Feb 7, 2018

This RFC has been (very belatedly!) merged!

Tracking issue

@jswrenn
Copy link
Member

jswrenn commented Nov 14, 2020

@petrochenkov the term "private impl" is used in quite a few places, but I don't think it's precisely defined. The gist I get from the RFC is that a trait implementation is as visible the least visible of its:

  • trait
  • trait parameters
  • impl type
  • impl type parameters

I think this is what's meant by "traits or trait types are as visible as their least visible type argument or trait constructor."

So, given:

/* This type is private to this module. */
enum Private {}

/* This type is public outside of this module. */
pub enum Public {}

/* This type is public outside of this module */
pub trait Trait<Parameter> {
    fn function();
}

This is a public impl:

impl<Anything> Trait<Anything> for Public {
    fn function() {
        println!("Callable anywhere.");
    }
}

This is a private impl:

impl<Anything> Trait<Anything> for Private {
    fn function() {
        println!("Callable only where `Private` is in scope.");
    }
}

This is a private impl:

impl<Anything> Trait<Private> for Anything {
    fn function() {
        println!("Callable only where `Private` is in scope.");
    }
}

Do I have the right impression?

@petrochenkov
Copy link
Contributor Author

@jswrenn
There is this line

When associated function is defined in a private impl (i.e. the impl type or trait is private) it's guaranteed that the function can't be used outside of the impl's area of visibility.

where "impl type or trait" include their generic parameters as well.

So, everything you wrote matches this logic and looks correct.

@jswrenn
Copy link
Member

jswrenn commented Nov 14, 2020

Gotcha! This

where "impl type or trait" include their generic parameters as well.

was really the only thing I was uncertain about. Thanks for the quick reply!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-privacy Privacy related proposals & ideas A-typesystem Type system related proposals & ideas Ergonomics Initiative Part of the ergonomics initiative final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants