Py2vs3 HTMLParser html.parser #4

Closed
wants to merge 3 commits into from

pbulsink

This imports HTMLParser on Python 2.x and html.parser on Python 3.x. On 3.x, html.parser is imported as HTMLParser so the rest of the code can use a single name, which reduces further incompatibility. See https://docs.python.org/2/library/htmlparser.html.
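The diff itself is not shown on this page, so the following is only a minimal sketch of the kind of manual version check described above, not the actual patch. It assumes the rest of the code refers to the module as HTMLParser:

```python
import sys

if sys.version_info[0] >= 3:
    # Python 3: the module was renamed to html.parser.
    # Binding it to the old name keeps the rest of the code unchanged.
    import html.parser as HTMLParser
else:
    # Python 2: the module still has its original name.
    import HTMLParser

# Either way, the parser class is reachable under the same dotted name.
parser_class = HTMLParser.HTMLParser
```

With this in place, call sites written as HTMLParser.HTMLParser work on both interpreter lines.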

@mcs07
Owner

mcs07 commented Oct 10, 2016

Nice catch, thanks. Elsewhere I've used six for handling Python 2/3 compatibility, so for consistency I've used six.moves.html_parser instead of your manual version check.

And I just discovered it's a bit more complicated: we only use HTMLParser for its unescape method, which is undocumented and apparently deprecated. So on Python 3.4+ we should use html.unescape instead.
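As a rough illustration of the approach described here (not necessarily what commit 8d30d20 does), one way to pick an unescape function is to prefer html.unescape and fall back to the HTMLParser method only on older interpreters. This sketch uses plain try/except imports rather than six so that it has no third-party dependency; that substitution is an assumption made for the example:

```python
try:
    # Python 3.4+: documented, supported function.
    from html import unescape
except ImportError:
    # Older interpreters: fall back to the undocumented
    # HTMLParser.unescape method mentioned above.
    try:
        from html.parser import HTMLParser  # Python 3.0-3.3
    except ImportError:
        from HTMLParser import HTMLParser  # Python 2
    unescape = HTMLParser().unescape

# Entities are converted back to the characters they stand for.
print(unescape("Fish &amp; chips &lt;3"))  # prints: Fish & chips <3
```

Callers then use the single name unescape regardless of interpreter version.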

@mcs07 mcs07 closed this in 8d30d20 Oct 10, 2016
mcs07 added a commit that referenced this pull request Oct 10, 2016
HTML unescape py2/3 compat - fixes #4
@pbulsink pbulsink deleted the py2vs3-HTMLParser-html.parser branch October 11, 2016 16:01
@mcs07 mcs07 added the bug label Feb 2, 2017