macro_rules! should support gensym for creating items #1266

Open
geofft opened this issue Aug 28, 2015 · 26 comments
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.

Comments

@geofft

geofft commented Aug 28, 2015

Currently macro_rules! doesn't support the ability to generate items with unique names. This generates an error that I'm defining struct Foo multiple times:

macro_rules! x {
    () => {struct Foo;}
}

x! {}
x! {}

The standard workaround is to accept an ident parameter in the macro, and name everything after that identifier somehow, even if only one item with that name is intended to be visible. lazy_static!, for instance, uses the ability of (non-tuple) structs to share a name with a value. Another option is to name a module after the ident and stuff everything inside that module. But these are incomplete workarounds: the former approach lets you create at most one type and at most one value, and the latter approach causes other hygiene problems because not everything expands the same inside a module. (For instance, ty parameters don't work right because paths to types aren't necessarily valid in the module.)
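As a rough illustration of the module-based workaround described above, here is a minimal sketch. The macro name `make_counter!` and the items inside it are hypothetical, chosen only for the example; the caller has to supply the identifier, which is exactly the burden this issue is about.

```rust
// Sketch of the "name a module after the ident" workaround.
// `make_counter!`, `State`, `new`, and `bump` are hypothetical names.
macro_rules! make_counter {
    ($name:ident) => {
        mod $name {
            // If `State` were emitted directly at the call site, two
            // invocations would clash; the module keeps each copy apart.
            pub struct State(pub u32);

            pub fn new() -> State {
                State(0)
            }

            pub fn bump(s: &mut State) -> u32 {
                s.0 += 1;
                s.0
            }
        }
    };
}

// The caller must invent a distinct name for every invocation.
make_counter!(first);
make_counter!(second);

fn demo() -> (u32, u32) {
    let mut a = first::new();
    let mut b = second::new();
    first::bump(&mut a);
    (first::bump(&mut a), second::bump(&mut b))
}
```

Note that anything passed in as a `ty` fragment would be resolved relative to the generated module, not the call site, which is the hygiene problem mentioned above.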

Having an actual gensym would solve this problem. I think you can do a simple syntax where a macro_rules! parameter of type gensym consumes no tokens and expands to a new, unique identifier per macro invocation, so you'd have e.g.

macro_rules! x {
    ($a:gensym) => {struct $a;}
}

which could still be invoked as x! {}, but can be expanded multiple times, creating a unique structure each time. And you could do $a:gensym $b:gensym if you needed multiple names. If that's sensible, I can write this up more formally or attempt implementing it.

Note that this is different from rust-lang/rust#19700: that one's about hygiene for existing items invoked by the macro, not the creation of new items. (Also I think $crate is a sufficient workaround for that issue in practice, but doesn't help at all here.) However, the backwards-compatibility rationale in that issue applies: we can't simply gensym all literal item names in macros the way we do for let-bindings, since that would break macros that create items with specific names. So there needs to be specific syntax for gensym.
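The contrast between let-bindings and items can be seen in a small sketch (all names here are hypothetical): a `let` introduced inside a macro is kept separate from the caller's variable of the same name, while an item introduced by a macro is visible under its literal name at the call site.

```rust
// A local binding created inside the macro is hygienic: the macro's `x`
// does not capture or shadow the caller's `x`.
macro_rules! double {
    ($e:expr) => {{
        let x = $e;
        x * 2
    }};
}

fn hygiene_demo() -> i32 {
    let x = 10;
    // `$e` is the caller's `x + 1`; the trailing `x` is still the caller's.
    double!(x + 1) + x
}

// An item created inside the macro is not hygienic: `helper` is
// reachable by that exact name outside the macro, so a second
// invocation in the same scope would be a duplicate definition.
macro_rules! define_helper {
    () => {
        fn helper() -> i32 {
            7
        }
    };
}

define_helper!();

fn item_demo() -> i32 {
    helper()
}
```

Existing macros rely on the second behaviour to export items with specific names, which is why it cannot simply be switched to implicit renaming.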

One answer is to put this off until macro! (or whatever you want to call macro_rules! 2.0), but if this can be done simply, clearly, and backwards-compatibly, it seems worth doing now, unless the new system is happening very soon (which it doesn't seem like it is).

@Kimundi
Member

Kimundi commented Aug 29, 2015

Alternatively, there could be some new macro syntax for declaring an identifier on the RHS of a macro:

macro_rules! x {
    () => {
        $gensym a;
        struct $a;
    }
}

@durka
Contributor

durka commented Sep 1, 2015

To throw out another idea, clojure's syntax for gensyms is a hash after the symbol, like struct $a#.

@durka
Contributor

durka commented Sep 1, 2015

For the record, I think it would be really great to have this functionality in some form.

@kylewlacy

+1 for the concept, although I think there's a much more sound solution:

First, there'd need to be support for macros in all ident positions (see the discussion around concat_idents! in rust-lang/rust#12249, rust-lang/rust#13294, rust-lang/rust#14266, and #215). Second, a new compiler-provided gensym! macro would be added. With both of these, the above example could look like this:

macro_rules! x {
    () => {
        struct gensym!();
    }
}

(Although maybe this is more appropriate as a longer-term solution, following the upcoming and ever-elusive "macro reform"; in which case, I'd be fine with either of the other two syntax proposals)

@durka
Contributor

durka commented Sep 1, 2015

@kylewlacy yes, that would be better. However, as you say, if every macro_rules improvement is blocked on another one, none of them will ever happen. Also, as a detail, you need to be able to refer back to a gensym (for example struct A; let a = A; where A was gensymmed), so the gensym!() you suggest would need to take a parameter, say an ident or a string literal.

@blaenk
Contributor

blaenk commented Sep 1, 2015

Yeah, I've run into this many times and expected gensym from lisp languages to have been available, but it wasn't 😞 Accepting an ident is what I've used but I ended up just avoiding creating a macro altogether because I didn't want the user to have to worry about the ident. It's leaking information/work that isn't relevant to the user.

@Stebalien
Contributor

@durka

To throw out another idea, clojure's syntax for gensyms is a hash after the symbol, like struct $a#.

$a# would be backwards incompatible but putting a symbol after the dollar sign would be safe ($$a, $&a, $!a...).

@durka
Contributor

durka commented Sep 5, 2015

@Stebalien is it backwards incompatible?

@jonas-schievink
Contributor

It might be compatible if you force the metavariable to be unused on the LHS, but that's a bit of a hack

@durka
Contributor

durka commented Sep 5, 2015

Actually I'd think you would want to at least lint if you have a capture and a gensym with the same identifier, but anyway. As far as I can tell "$foo#" isn't currently accepted at all, so no compatibility issues, but maybe there is a situation where it is.


@jonas-schievink
Contributor

This is accepted:

macro_rules! m {
    ( $a:item ) => ( $a# [test] fn t() {} );
}

fn main() {
    m!(fn t() {});
}

@durka
Contributor

durka commented Sep 5, 2015

Yeah, you are right, I just found a similar example.


@Stebalien
Contributor

Also,

macro_rules! m {
    ( $a:ident# ) => ();
    ( $a:ident ) => ( m!($a#) );
}

fn main() {
    m!(test);
}

@geofft
Author

geofft commented Sep 5, 2015

I feel like I'd prefer either gensym!(a) or $a:gensym for clarity over $a# (you can google what a gensym is, or search for it in the docs), anyway.

@Stebalien
Contributor

gensym!(a) is actually a really nice way of doing this because a can be a variable (any ident). This way, if $a == $b, gensym!($a) == gensym!($b).

@durka
Contributor

durka commented Sep 5, 2015

The doc point is a good one, but the question is where to put the $a:gensym -- it doesn't really fit in the LHS because it's not getting passed into the macro. Maybe the macro_rules syntax could be extended like this:

macro_rules! m {
    ($a:expr) ($b:gensym) => { foo($a, $b); }
}


@durka
Contributor

durka commented Sep 5, 2015

Unfortunately gensym!(a) would be nearly useless for the same reasons that concat_idents! isn't currently usable.


@nrc
Member

nrc commented Nov 12, 2015

Would having hygienic items solve this problem? If you write let x = ... in a macro, x is gensym'ed because of the hygiene rules; if we applied hygiene to item names, then the same ought to be true.

@durka
Contributor

durka commented Nov 12, 2015

Seems like it would definitely have to be controllable somehow, because backwards compatibility.

@nagisa
Member

nagisa commented Nov 12, 2015

@nrc it is not fully clear where the implicit gensyming should happen. Implicit doesn’t cover all the cases.

I.e. consider a macro that generates this:

impl A for B {
    fn symbol(){}
}

gensym'ing symbol here would be invalid, and I believe there's a non-trivial number of macros that do this. On the other hand, gensym'ing in

impl A {
    fn some_temp_fn1(){}
    fn some_temp_fn2(){}
    fn some_temp_fn3(){}
}

might be desirable.
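To make the first case concrete, here is a minimal sketch with hypothetical names (`Describe`, `impl_describe!`, `Cat`, `Dog`). The method name written inside the macro has to reach the output unchanged, because it must match the name declared by the trait; implicit renaming would break the impl.

```rust
// Hypothetical trait and macro, for illustration only.
trait Describe {
    fn describe(&self) -> String;
}

macro_rules! impl_describe {
    ($ty:ty, $text:expr) => {
        impl Describe for $ty {
            // This name must stay literally `describe` to satisfy the
            // trait; it cannot be replaced by a generated symbol.
            fn describe(&self) -> String {
                String::from($text)
            }
        }
    };
}

struct Cat;
struct Dog;

impl_describe!(Cat, "cat");
impl_describe!(Dog, "dog");

fn describe_all() -> Vec<String> {
    vec![Cat.describe(), Dog.describe()]
}
```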

nrc added the T-lang label Aug 22, 2016

@gyscos

gyscos commented Nov 12, 2019

Any news on this? Is there a currently recommended workaround?

@RustyYato

@gyscos the only workaround right now is to use procedural macros
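For the narrower case where the generated items never need to be named outside the expansion, a pattern not mentioned in this thread may be enough: wrapping the items in an anonymous `const _: () = { ... };` block, which gives each expansion its own scope. The sketch below uses hypothetical names (`Id`, `impl_id!`, `Helper`) and does not provide a real gensym, since nothing declared inside the block can be referred to from outside it.

```rust
// Hypothetical trait and macro, for illustration only.
trait Id {
    fn id(&self) -> u32;
}

macro_rules! impl_id {
    ($ty:ty, $n:expr) => {
        // Each expansion gets its own anonymous const, so the private
        // `Helper` below does not clash across invocations.
        const _: () = {
            struct Helper;

            impl Helper {
                fn value() -> u32 {
                    $n
                }
            }

            // The trait impl is still visible everywhere, even though
            // `Helper` is only nameable inside this block.
            impl Id for $ty {
                fn id(&self) -> u32 {
                    Helper::value()
                }
            }
        };
    };
}

struct A;
struct B;

impl_id!(A, 1);
impl_id!(B, 2);

fn ids() -> (u32, u32) {
    (A.id(), B.id())
}
```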

@jhpratt
Member

jhpratt commented Dec 9, 2020

For what it's worth, this is trivially worked around using a procedural macro:

// dependencies: sha2, hex
use proc_macro::{Ident, Span, TokenStream, TokenTree};
use sha2::{Digest, Sha256};

#[proc_macro]
pub fn gensym(input: TokenStream) -> TokenStream {
    TokenStream::from(TokenTree::Ident(Ident::new(
        &format!(
            "_{}",
            hex::encode(Sha256::digest(input.to_string().as_bytes()))
        ),
        Span::call_site(),
    )))
}

I haven't tested this beyond "this can create an ident", but I presume it's quite finicky, as it's likely sensitive to whitespace, even. The result is essentially guaranteed to be unique given a unique input, given that it relies on SHA-256.

No idea how useful this will be to people seeking a workaround (personally I'm fine providing a list of unique identifiers by hand), but it's possible. I would like to see a proper solution implemented in stdlib, however.

@kennytm
Member

kennytm commented Dec 9, 2020

proc macro or not, i don't think gensym!(stuff) will work the intended way, for the same reason that concat_idents! doesn't work:

macro_rules! outer {
    (...) => {
        struct gensym!(for some private type);
        //           ^ the macro won't be eagerly expanded and thus an "unexpected '!'" error here.
    }
}

outer!(...);

@jhpratt
Member

jhpratt commented Dec 9, 2020

That's where Rust needs eager macro expansion, which is a separate issue. I tested `let gensym!(foo bar) = …`, which worked as expected. Like I said, it has its limitations currently.

@kennytm
Member

kennytm commented Dec 9, 2020

For the actual workaround you need to wrap the entire macro body in the proc macro, like how paste works:

macro_rules! outer {
    (...) => {
        gensym_wrapper! {
            struct gensym!(whatever);
        }
    }
}

the gensym_wrapper proc macro then replaces the gensym!(…) by the generated ident. Though in most cases using paste should be good enough.
