consider including byte literals (alternative to integer literals for [u8] and u8) #4334

Closed
thestinger opened this issue Jan 2, 2013 · 29 comments
Labels
A-frontend Area: frontend (errors, parsing and HIR) A-unicode Area: Unicode C-enhancement Category: An issue proposing an enhancement or a PR with one. P-low Low priority

Comments

@thestinger
Contributor

It would be nice if b"foo" could be used instead of str::as_bytes_slice("foo"), and especially ~b"bar" instead of vec::from_slice(str::as_bytes_slice("bar")). Python uses this convention and only allows ASCII inside the byte string literals (along with byte escape codes). I don't really think it's necessary to forbid Unicode though - Python does that to make a very clear distinction between bytes and strings.

There could also be a byte version of a char literal, allowing any ASCII character (b'\n') - a 55u8 literal removes the need for escape codes there.
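
For readers arriving later: literals of this shape did eventually become part of Rust. The following is a minimal sketch, assuming a current stable toolchain rather than the 2013 compiler discussed here; the function names are made up for illustration.

```rust
/// Bytes written with the literal form requested in this issue.
fn literal_bytes() -> &'static [u8] {
    b"foo"
}

/// The same bytes obtained by converting an ordinary string literal.
fn converted_bytes() -> &'static [u8] {
    "foo".as_bytes()
}

/// A single-byte literal using an escape code.
fn newline() -> u8 {
    b'\n'
}
```

Both functions return the same three bytes; the literal form simply skips the conversion call.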

@catamorphism
Contributor

This could be implemented as a syntax extension, maybe? But it would have to be b!("foo") or ~b("bar").

@thestinger
Contributor Author

@catamorphism: that's a good idea, but do you think that could work for an &static/str equivalent? It can definitely work for the single byte literal form though (just verify that it's in the range, and convert).

@jdm
Contributor

jdm commented Jan 4, 2013

I believe that allowing byte literals like this would make using some C APIs like Spidermonkey significantly easier, since things like JSClass instances could finally just be constant data.

@graydon
Contributor

graydon commented Jan 24, 2013

A syntax extension would be able to produce &static/[u8], sure. It'd just be lowering bytes!("abc") to &[97_u8, 98_u8, 99_u8]
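
The lowering described above can be written out by hand. This is only a sketch in present-day Rust, not the bytes! extension itself; `lowered` and `from_str` are hypothetical names.

```rust
/// What a `bytes!("abc")`-style expansion would produce: a static slice
/// with each byte of the string spelled out as a u8 literal.
fn lowered() -> &'static [u8] {
    &[97_u8, 98_u8, 99_u8]
}

/// The same data taken from the string literal directly, for comparison.
fn from_str() -> &'static [u8] {
    "abc".as_bytes()
}
```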

@Aatch
Contributor

Aatch commented Apr 29, 2013

Nominating for well-covered

@graydon
Contributor

graydon commented May 2, 2013

accepted for feature complete

@sanxiyn
Member

sanxiyn commented May 16, 2013

@graydon, bytes! is now done, how should byte! (or b!?) work? byte!('a') expands to 97u8?

@emberian
Member

AFAICT he means byte!/b! is the same as what you added for bytes!

@bors closed this as completed in f198832 May 18, 2013
@sanxiyn reopened this May 20, 2013

@sanxiyn
Member

sanxiyn commented May 20, 2013

Now we have [u8] literal, but no u8 literal yet.

@emberian
Member

4u8, 0x3Au8?

@sanxiyn
Member

sanxiyn commented May 20, 2013

Well, if bytes!("a") is better than &[97u8], byte!('a') could be better than 97u8.

@emberian
Member

There's also 'a' as u8, which is shorter than byte!('a').

@sanxiyn
Member

sanxiyn commented May 20, 2013

One consideration is that since #6417 'a' as u8 can't be used as a pattern, while 97u8 (and a macro that expands to it) can.
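
The distinction can be sketched as follows, assuming present-day Rust rather than the compiler of the time: a cast expression cannot appear in pattern position, but an integer literal can, and so can a named constant whose value is computed with a cast. `A`, `is_lower_a` and `is_lower_a_const` are hypothetical names.

```rust
// A cast is allowed in a constant expression, so the constant can carry it.
const A: u8 = 'a' as u8;

/// Matches on the plain integer literal, which is a valid pattern.
fn is_lower_a(byte: u8) -> bool {
    match byte {
        97u8 => true,
        _ => false,
    }
}

/// Matches on the named constant instead of writing the cast in the pattern.
fn is_lower_a_const(byte: u8) -> bool {
    match byte {
        A => true,
        _ => false,
    }
}
```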

@emberian
Member

Ok, good reason :)

@huonw
Member

huonw commented Jul 1, 2013

@sanxiyn I don't think macros parse as patterns yet:

rusti> match 1 { some_macro!() => 1 }
<anon>:2:21: 2:22 error: expected `=>` but found `!`
<anon>:2  match 1 { some_macro!() => 1 } 
                              ^

@bstrie
Contributor

bstrie commented Jul 15, 2013

Syntax extensions can inspect types, right? Could bytes!() simply be made smart enough to generate [u8] when given a string and u8 when given a char? Is this way too magical? It just seems a tad redundant to have two syntax extensions where one of them is just sugar for 97u8.

@huonw
Member

huonw commented Jul 15, 2013

Syntax extensions can't inspect types properly, but they can tell whether a given literal is a char or string. (Personally I think that would be a little bit magical, since bytes!('a', 'b') would be a [u8] but bytes!('a') would be u8: could lead to head scratching.)

@chris-morgan
Member

Here are the basics of what I want. I've been needing to deal with bytes a lot in rust-http, and there will be many other similar projects that do have to deal with bytes a lot.

Firstly: b'x', equivalent to ('x' as u8), but a literal able to be used in patterns and such.

Secondly: b"xy", equivalent to &[b'x', b'y'].

Thirdly: sigil support for the two above which removes the implied &, as with "x" being &'static str and ~"x" being ~str. This is something that cannot presently be done.

@lilyball
Contributor

I'm a fan of 'a'u8 syntax (hence #9302) for char literals, which has the benefit of allowing other types too, e.g. 'a'u32 if you need a u32 instead of a u8.

@thestinger
Contributor Author

@kballard: the problem I have with that syntax is it not being clear what the corresponding (if any) string literals would be, like "foo"u32 or "foo"u64

@lilyball
Contributor

@thestinger String literals like that don't make sense. The only sensible string literal is the byte literal, and I don't think there's any need to use the same syntax for string-literal-as-[u8] vs char-literal-as-specified-integral-type, as they're distinct concepts.

@thestinger
Contributor Author

So b"foo" for &'static [u8] and 'a'u8, 'a'i64, etc. as new ways of declaring integers. I guess it would just fail to compile if the code point was out-of-range for the type.
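
The range check implied here would happen at compile time under this proposal. As a rough runtime analogue only, the same rule can be sketched in present-day Rust; `char_to_u8` is a hypothetical helper, not something proposed in this thread.

```rust
use std::convert::TryFrom;

/// Returns the character's code point as a u8 when it fits,
/// and None when it is out of range for the type.
fn char_to_u8(c: char) -> Option<u8> {
    u8::try_from(u32::from(c)).ok()
}
```

A suffixed literal such as 'a'u8 would accept exactly the characters for which this returns Some, and reject the rest with a compile error.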

@lilyball
Contributor

@thestinger Yeah, that's what I'm thinking is the most sensible.

@thestinger
Contributor Author

Added back the RFC tag. It's not clear that bytes!() is satisfactory (it may be once syntax extensions work in more places) or what we want as the syntax (or syntax extensions).

@SimonSapin
Contributor

As @huonw said earlier, I’d like something that is usable in patterns. This is currently invalid:

match c { 'a' as u8 .. 'z' as u8 => /* … */ }
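
For comparison, this is roughly how the same match reads with byte literals in pattern position. It is a sketch assuming present-day Rust syntax (including the `..=` range form, which differs from the `..` used above); `in_a_to_z` is a hypothetical name.

```rust
/// Reports whether the byte falls in the ASCII range 'a' through 'z',
/// using byte literals directly as the endpoints of a range pattern.
fn in_a_to_z(c: u8) -> bool {
    match c {
        b'a'..=b'z' => true,
        _ => false,
    }
}
```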

@SimonSapin
Contributor

+1 to @chris-morgan’s proposal: #4334 (comment)

@pnkfelix
Member

pnkfelix commented Feb 6, 2014

P-low, not 1.0 blocker.

@SimonSapin
Contributor

This should be a new-style RFC.

@huonw
Member

huonw commented May 6, 2014

Thanks @SimonSapin (rust-lang/rfcs#69).
