Implementation of RFC 2151, Raw Identifiers #48942

Lymia · 2018-03-11T22:57:55Z

See issue #48589.

This implements most of the actual compiler part, though diagnostics and documentation are somewhat lacking. Namely, rustdoc and errors do not yet refer to items with keyword names using raw identifiers.

rust-highfive · 2018-03-11T22:58:06Z

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @pnkfelix (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

bors · 2018-03-14T15:45:57Z

☔ The latest upstream changes (presumably #48811) made this pull request unmergeable. Please resolve the merge conflicts.

Lymia · 2018-03-15T09:13:41Z

I don't think there's much else worth adding to this, unless it'd be better to also modify rustdoc in this PR? I don't even know where to begin with diagnostics, so it's better to leave that to someone else.

bors · 2018-03-16T02:51:52Z

☔ The latest upstream changes (presumably #49051) made this pull request unmergeable. Please resolve the merge conflicts.

bors · 2018-03-16T22:48:10Z

☔ The latest upstream changes (presumably #48097) made this pull request unmergeable. Please resolve the merge conflicts.

Lymia · 2018-03-17T03:18:20Z

Could I get some help with this conflict? I'm not sure what the right thing to do is here.

petrochenkov · 2018-03-17T11:09:44Z

@Lymia

# Reset to the commit before merge
git reset --hard d644e652e65eea4e104c4df63a56781bc5648803
# Add rust-lang/rust as a remote in case this wasn't done before
git remote add upstream https://github.com/rust-lang/rust.git
# Fetch everything from rust-lang/rust
git fetch --all
# Rebase your changes on top of rust-lang/rust master
# In case of conflicts use `git rebase --continue` after resolving them
git rebase upstream/master
# Update this PR
git push -f origin master

petrochenkov · 2018-03-17T12:26:18Z

src/libsyntax/feature_gate.rs

@@ -455,6 +455,9 @@ declare_features! (

    // Parentheses in patterns
    (active, pattern_parentheses, "1.26.0", None, None),
+
+    // Raw identifiers allowing items with keyword names


allowing keyword names to be used in arbitrary identifier positions, not only item names

petrochenkov · 2018-03-17T12:35:40Z

src/libsyntax/parse/parser.rs

@@ -793,6 +793,9 @@ impl<'a> Parser<'a> {
                        return Err(err);
                    }
                }
+                if is_raw {
+                    self.sess.raw_identifier_spans.borrow_mut().push(self.span);


I'm not entirely sure all identifiers go through parse_ident_common; it's probably better to do this in the lexer, when an identifier is created.

Oh huh. I didn't catch that ParseSess was available in the lexer too. Definitely a better place for it.

petrochenkov · 2018-03-17T13:04:13Z

src/libsyntax/parse/token.rs

@@ -273,17 +310,32 @@ impl Token {
        }
    }

-    pub fn ident(&self) -> Option<ast::Ident> {
+    fn ident_common(&self, allow_raw: bool) -> Option<ast::Ident> {


This should be just pub fn ident(&self) -> Option<(ast::Ident, /* is_raw */ bool)> IMO.
It will avoid introducing all those many little helper functions.
(The fn nonraw_ident vs fn ident distinction also looks error-prone.)
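A minimal sketch of the shape suggested above, using toy stand-ins rather than the real libsyntax types. The `Ident` and `Token` definitions here are assumptions made for illustration; only the idea of returning the identifier together with an `is_raw` flag comes from the review comment.

```rust
// Toy stand-ins for the libsyntax types; their shapes are illustrative assumptions.
#[derive(Debug, Clone, PartialEq)]
struct Ident {
    name: String,
}

#[derive(Debug, Clone, PartialEq)]
enum Token {
    // The second field records whether the identifier was written as `r#name`.
    Ident(Ident, bool),
    Comma,
}

impl Token {
    /// Returns the identifier together with its rawness,
    /// or `None` for tokens that are not identifiers.
    fn ident(&self) -> Option<(Ident, /* is_raw */ bool)> {
        match self {
            Token::Ident(ident, is_raw) => Some((ident.clone(), *is_raw)),
            _ => None,
        }
    }
}

fn main() {
    let tok = Token::Ident(Ident { name: "match".to_string() }, true);

    // A caller that cares about rawness destructures both parts.
    if let Some((ident, is_raw)) = tok.ident() {
        println!("{} (raw: {})", ident.name, is_raw);
    }

    // A caller that does not care simply ignores the flag.
    let name_only = tok.ident().map(|(ident, _)| ident.name);
    println!("{:?}", name_only);
}
```

With a single accessor, every call site has to decide explicitly what to do with the flag, which is the property the comment is after when it calls the two-function split error-prone.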

petrochenkov · 2018-03-17T13:07:15Z

20d0962bfe851db9ce3e19da0c7c7674a091c266 updates the LLVM submodule unintentionally.

petrochenkov · 2018-03-17T14:57:58Z

src/libsyntax/parse/token.rs

@@ -203,6 +233,11 @@ impl Token {
        Token::Interpolated(Lrc::new((nt, LazyTokenStream::new())))
    }

+    /// Recovers a `Token` from an `ast::Ident`. This creates a raw identifier if necessary.
+    pub fn from_ast_ident(ident: ast::Ident) -> Token {
+        Ident(ident, is_reserved_ident(ident))


This function and all its uses are very suspicious.
It's used where we previously lost the "rawness" property and are now trying to guess it: if the identifier is a keyword, we assume it was a raw identifier, otherwise not. We shouldn't do this guessing; we need to keep the bool instead, or use the "r#{}" string encoding like in proc macros if keeping the bool is not possible for some reason.

This problem exists when we go from tokens to AST (including MetaItem) and then have to go back to tokens.

Where in the AST should this information be encoded? MetaItem can be handled with a new struct field, but the other use of this method was impl ToTokens for ast::Ident (how is ToTokens used anyway?). In the issue, you mentioned that ast::Ident would be a bad place for an is_raw parameter, because size could be an issue for it.

Putting it in the Symbol doesn't work in this case, because Symbol comparison is simply an integer comparison, and r#foo should match foo.
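To illustrate the point about integer comparison, here is a hedged sketch with a hypothetical miniature interner (not rustc's implementation; `Interner` and its `intern` method are invented for this example). If rawness were encoded in the interned string, `foo` and `r#foo` would receive different indices and stop comparing equal.

```rust
use std::collections::HashMap;

// Hypothetical miniature interner: a symbol is just an index into a table.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct Symbol(u32);

#[derive(Default)]
struct Interner {
    ids: HashMap<String, Symbol>,
}

impl Interner {
    /// Returns the existing symbol for `s`, or assigns the next free index.
    fn intern(&mut self, s: &str) -> Symbol {
        let next = Symbol(self.ids.len() as u32);
        *self.ids.entry(s.to_string()).or_insert(next)
    }
}

fn main() {
    let mut interner = Interner::default();

    // Encoding rawness inside the string gives two unrelated symbols.
    let plain = interner.intern("foo");
    let raw_encoded = interner.intern("r#foo");
    assert_ne!(plain, raw_encoded);

    // Keeping rawness next to the symbol leaves name comparison intact.
    let a = (interner.intern("foo"), false);
    let b = (interner.intern("foo"), true);
    assert_eq!(a.0, b.0);
    println!("plain = {:?}, raw_encoded = {:?}", plain, raw_encoded);
}
```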

I'm inclined to leave this as is for now; it's possible that we won't have this "Tokens -> AST -> Tokens" roundtrip in the future and will just keep the original tokens.
I need to think about this a bit more.
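The concern in this thread can be sketched with a toy model. Everything below is an assumption for illustration: the `KEYWORDS` list is abbreviated, and `IdentToken` and `from_name_guessing` are hypothetical names that only mimic the "guess rawness from keyword-ness" behaviour being discussed.

```rust
// Abbreviated, hypothetical keyword list for the sketch.
const KEYWORDS: &[&str] = &["fn", "match", "struct", "type"];

fn is_reserved(name: &str) -> bool {
    KEYWORDS.contains(&name)
}

#[derive(Debug, Clone, PartialEq)]
struct IdentToken {
    name: String,
    is_raw: bool,
}

/// Rebuilds a token from a bare name, guessing rawness from keyword-ness.
fn from_name_guessing(name: &str) -> IdentToken {
    IdentToken { name: name.to_string(), is_raw: is_reserved(name) }
}

fn main() {
    // `r#match` survives the roundtrip, because `match` is a keyword.
    let kw = IdentToken { name: "match".to_string(), is_raw: true };
    assert_eq!(kw, from_name_guessing(&kw.name));

    // `r#foo` does not: the guess turns it into a plain `foo`.
    let original = IdentToken { name: "foo".to_string(), is_raw: true };
    let roundtripped = from_name_guessing(&original.name);
    assert_ne!(original, roundtripped);
    println!("{:?} -> {:?}", original, roundtripped);
}
```

The guess is lossless only for keywords, which is why the review prefers carrying the flag through explicitly whenever that is feasible.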

petrochenkov · 2018-03-17T15:08:26Z

src/libsyntax/diagnostics/plugin.rs

@@ -82,10 +82,10 @@ pub fn expand_register_diagnostic<'cx>(ecx: &'cx mut ExtCtxt,
        token_tree.get(1),
        token_tree.get(2)
    ) {
-        (1, Some(&TokenTree::Token(_, token::Ident(ref code))), None, None) => {
+        (1, Some(&TokenTree::Token(_, token::Ident(ref code, false))), None, None) => {


Looks like the rawness property is not relevant to diagnostics/plugin.rs, these falses should probably be _.

Is this a huge deal either way? These are internal macros, aren't they?

Is this a huge deal either way?

No :)

petrochenkov · 2018-03-17T15:17:22Z

src/libsyntax/parse/lexer/mod.rs

+                return Ok(self.with_str_from(start, |string| {
+                    // FIXME: perform NFKC normalization here. (Issue #2253)
+                    if string == "_" {
+                        token::Underscore


What happens with r#_? It should be an error.
Could you add a test for it?

petrochenkov · 2018-03-17T15:19:43Z

src/libsyntax/parse/lexer/mod.rs

+                        if is_raw_ident && (
+                            ident.name == keywords::SelfValue.name() ||
+                            ident.name == keywords::SelfType.name() ||
+                            ident.name == keywords::Super.name()


The full list is Self, self, super, extern and crate, see fn is_path_segment_keyword.

(Well, there's also $crate, but it's never produced by lexer.)
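The two lexer comments above (rejecting `r#_` and rejecting path-segment keywords) can be combined into one small validation sketch. The function and error type are hypothetical, and the keyword list is copied from the review comment; the set rejected by any particular compiler release may differ.

```rust
#[derive(Debug, PartialEq)]
enum RawIdentError {
    /// `r#_` was written.
    Underscore,
    /// A path-segment keyword such as `self` was written with `r#`.
    PathSegmentKeyword(String),
}

/// Checks the name that follows `r#`. Hypothetical helper, not a rustc API.
fn check_raw_ident(name: &str) -> Result<(), RawIdentError> {
    // List as given in the review comment above.
    const PATH_SEGMENT_KEYWORDS: &[&str] = &["Self", "self", "super", "extern", "crate"];

    if name == "_" {
        return Err(RawIdentError::Underscore);
    }
    if PATH_SEGMENT_KEYWORDS.contains(&name) {
        return Err(RawIdentError::PathSegmentKeyword(name.to_string()));
    }
    Ok(())
}

fn main() {
    for name in ["match", "_", "self", "foo"] {
        println!("r#{} -> {:?}", name, check_raw_ident(name));
    }
}
```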

petrochenkov · 2018-03-17T15:36:38Z

Questions from #48589:

I've found some other unexpected places where raw identifiers might show up while implementing this

Raw identifiers can appear in any context normal identifiers can appear. It looks like this PR already implements this behavior.

Given this macro definition, which branch should test_macro!(r#a) match:

macro_rules! test_macro {
    (a) => { ... };
    (r#a) => { ... };
}

This is an interesting question.
Macros accept DSLs in which language keywords don't play any special role, and, on the other hand, a macro can invent its own "keywords" treated differently from other identifiers.
So the question is whether r# should provide an opt-out from "keywords" invented by macros (aka idents on the left side of the macro) or not. I think that it should.
In the case of the test_macro above, a is such a macro-specific keyword, treated specially by the DSL of that macro, and r#a provides an opt-out.
I see the implementation in this PR already implements this behavior. Nice!
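A small runnable version of the macro from the question, assuming the matching behaviour described in this thread (an identifier on the left side of a rule matches only when both name and rawness agree). The string bodies are placeholders added for the example.

```rust
macro_rules! test_macro {
    // Matches the macro-specific "keyword" `a`.
    (a) => { "plain" };
    // Matches only the raw form, which opts out of the rule above.
    (r#a) => { "raw" };
}

fn main() {
    println!("{}", test_macro!(a));
    println!("{}", test_macro!(r#a));
}
```

Under that assumption the first call selects the `(a)` rule and the second selects `(r#a)`, so a macro can observe whether its input was written raw.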

petrochenkov · 2018-03-19T23:14:09Z

src/libsyntax/feature_gate.rs

@@ -452,6 +452,9 @@ declare_features! (

    // `use path as _;` and `extern crate c as _;`
    (active, underscore_imports, "1.26.0", Some(48216), None),
+
+    // Raw identifiers allowing keyword names to be used


The sentence seems unfinished.

petrochenkov · 2018-03-19T23:21:24Z

r=me after addressing the new comments

petrochenkov · 2018-03-19T23:25:35Z

src/librustc_resolve/macros.rs

@@ -268,7 +268,7 @@ impl<'a> base::Resolver for Resolver<'a> {
                                if k > 0 {
                                    tokens.push(TokenTree::Token(path.span, Token::ModSep).into());
                                }
-                                let tok = Token::Ident(segment.identifier);
+                                let tok = Token::from_ast_ident(segment.identifier);


Same as https://github.com/rust-lang/rust/pull/48942/files#r175613878

I changed this in from_ast_ident, since it seems this is the best behavior in all these cases.

Lymia · 2018-03-22T15:45:41Z

@bors r=petrochenkov

bors · 2018-03-22T15:45:42Z

@Lymia: 🔑 Insufficient privileges: Not in reviewers

Lymia · 2018-03-22T15:46:00Z

@petrochenkov I'm not sure what you want me to do here. :D

I clearly can't do that.

petrochenkov · 2018-03-22T15:52:44Z

@bors r+
Thanks!

bors · 2018-03-22T15:52:45Z

📌 Commit 57f9c4d has been approved by petrochenkov

Rollup of 15 pull requests - Successful merges: #48265, #48528, #48552, #48624, #48883, #48909, #49028, #49030, #49102, #49160, #49169, #49203, #49262, #49272, #49295 - Failed merges: #48942, #49035

alexcrichton · 2018-03-23T18:41:04Z

@bors: r-

This unfortunately failed in a rollup, but fear not! I'm preemptively r-'ing this in case the rollup fails and we have to abandon it, but otherwise I've already fixed the issue.

If the rollup lands it's best to not rebase/push new commits here so this PR will automatically get closed, but if the rollup ends up getting closed then those fixes can be folded into this PR and we can re-r+

Lymia · 2018-03-24T03:33:55Z

What was the issue that ignore-pretty fixed, and how could I have detected that beforehand? For future reference.

alexcrichton · 2018-03-24T13:53:55Z

Oh the builders run more tests than the default ones locally (typically those test suites fail very rarely), but I think this one could in theory have been detected by ./x.py test src/test/run-pass/pretty

petrochenkov · 2018-03-25T23:51:34Z

@Lymia
As alexcrichton said, some tests are not run by x.py test by default because they are not super important and rarely break (but they are still run on PR merge). This includes pretty-printing tests.
#49369 fixes the pretty-printing issue caught by those tests.

rust-highfive assigned pnkfelix Mar 11, 2018

petrochenkov self-assigned this Mar 11, 2018

pietroalbini added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Mar 12, 2018

petrochenkov added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 12, 2018

Lymia force-pushed the master branch 3 times, most recently from 3d9ecae to d644e65 Compare March 14, 2018 08:35

Lymia changed the title ~~[WIP] Implementation of RFC 2151, Raw Identifiers~~ Implementation of RFC 2151, Raw Identifiers Mar 14, 2018

petrochenkov removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Mar 15, 2018

petrochenkov unassigned pnkfelix Mar 15, 2018

petrochenkov mentioned this pull request Mar 15, 2018

syntax: Make _ a reserved identifier #48842

Merged

petrochenkov reviewed Mar 17, 2018


petrochenkov added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 17, 2018

Initial implementation of RFC 2151, Raw Identifiers

fad1648

petrochenkov reviewed Mar 19, 2018


petrochenkov mentioned this pull request Mar 20, 2018

Tracking issue for RFC 2151, Raw Identifiers #48589

Closed

7 tasks

Lymia added 2 commits March 22, 2018 10:34

Clean up raw identifier handling when recovering tokens from AST.

bfb94ac

Clarify description of raw_identifiers feature flag.

57f9c4d

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 22, 2018

alexcrichton mentioned this pull request Mar 23, 2018

Rollup of 15 pull requests #49308

Merged

bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Mar 23, 2018

alexcrichton merged commit 57f9c4d into rust-lang:master Mar 23, 2018

nikomatsakis mentioned this pull request Mar 30, 2018

macros can observe raw identifier state [discuss] #49520

Open

scooter-dangle mentioned this pull request Apr 18, 2019

Add linter to check for keywords in the package name uber/prototool#281

Merged
