JavaScript identifiers not parsed correctly (function ident. with diacritics) shouldn't extend C-like #400

Closed
SphinxKnight opened this issue Nov 22, 2014 · 3 comments

Comments

@SphinxKnight

Accented characters (and other characters that are valid in JavaScript identifiers) are not handled correctly.
After looking at the code, it seems that JavaScript extends the C-like language definition (see components/prism-javascript.js). For C-like languages, a function identifier is matched with this pattern:

/[a-z0-9_]+\(/ig

This may be correct for C, where identifiers follow the pattern [a-zA-Z_][a-zA-Z0-9_]*, but it is not true for JavaScript identifiers (see https://es5.github.io/#x7.6).

I am able to reproduce the bug in http://prismjs.com/test.html by typing this snippet and choosing JavaScript as the language:

function créer(){
}
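
To make the failure concrete, here is a minimal sketch that runs the quoted C-like pattern against the snippet above. The second pattern is only an illustration and an assumption on my part: it uses Unicode property escapes (`\p{ID_Start}`, `\p{ID_Continue}`), which need the `u` flag and an ES2018 or later engine, and it is not the fix that was applied in this thread. The variable names are made up for the example.

```javascript
// Pattern quoted in the report for C-like function identifiers.
const clikeFunction = /[a-z0-9_]+\(/ig;

// The snippet from the report.
const source = 'function créer(){\n}';

// "é" is outside [a-z0-9_], so only the tail of the name is matched.
const clikeMatches = source.match(clikeFunction);
console.log(clikeMatches); // [ 'er(' ]

// Hypothetical alternative (not the fix from this thread): Unicode property
// escapes, which require the `u` flag and an ES2018+ engine.
const unicodeFunction = /[$_\p{ID_Start}][$\u200c\u200d\p{ID_Continue}]*\(/gu;
const unicodeMatches = source.match(unicodeFunction);
console.log(unicodeMatches); // [ 'créer(' ]
```

With the C-like pattern only `er(` is treated as the function name, which is why the highlighting breaks in the middle of `créer`.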
@Golmote
Contributor

Golmote commented Aug 19, 2015

@axelduch: Your regexp seems to tolerate too many characters. []^`( is definitely not a valid identifier, but it matches your regexp. ^^
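
The regexp being discussed is no longer in the thread, so the following is only a guess at how a character class can end up accepting `[]^`(`. One common cause is the range `A-z`, which also covers the ASCII characters that sit between `Z` and `a` (including `[`, `]`, `^` and the backtick). Both patterns below are hypothetical and exist only for this illustration.

```javascript
// Hypothetical pattern; the regexp Golmote refers to was deleted.
const tooPermissive = /^[A-z]+\($/;
// Explicit ranges for comparison.
const asciiLetters = /^[A-Za-z]+\($/;

const garbage = '[]^`(';
console.log(tooPermissive.test(garbage)); // true
console.log(asciiLetters.test(garbage));  // false
```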

@Golmote
Contributor

Golmote commented Aug 19, 2015

(This thread on Stack Overflow may help.)

@xldh

xldh commented Aug 20, 2015

Damn, you are right: my RegExp accepts total garbage characters :p. I'll delete my comment, as it is harmful.

@Golmote Golmote closed this as completed in 29e26dc Sep 3, 2015