fs: allow WHATWG URL and file: URLs as paths #10739

Closed
wants to merge 1 commit into from

Conversation

jasnell
Member

@jasnell jasnell commented Jan 11, 2017

Updates the fs module APIs to allow 'file://' URL strings and WHATWG URL objects using 'file:' protocol to be passed as the path.

For example:

const fs = require('fs');
const URL = require('url').URL;
const myURL = new URL('file:///C:/path/to/file');
fs.readFile(myURL, (err, data) => {});

// or
// EDIT(addaleax): this was removed from this PR
fs.readFile('file:///C:/path/to/file', (err, data) => {});
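
As a rough end-to-end check of the URL-object form shown above, the sketch below writes a temporary file and reads it back by passing a WHATWG URL object to fs. It assumes a Node.js version that already includes this change. Note that `url.pathToFileURL()` is a later addition to Node.js and is not part of this PR, and that the `roundTrip` helper and the temp-file names are made up for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { URL, pathToFileURL } = require('url');

// Hypothetical helper (not from this PR): write `text` to a temp file,
// then read it back by handing fs a file: URL object instead of a string.
function roundTrip(text) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'file-url-'));
  const file = path.join(dir, 'example.txt');
  fs.writeFileSync(file, text);
  try {
    // pathToFileURL builds an absolute file: URL for the current platform.
    const fileURL = pathToFileURL(file);
    return {
      isURL: fileURL instanceof URL,
      protocol: fileURL.protocol,
      contents: fs.readFileSync(fileURL, 'utf8'),
    };
  } finally {
    // Clean up the temp file and directory regardless of the outcome.
    fs.unlinkSync(file);
    fs.rmdirSync(dir);
  }
}

console.log(roundTrip('hello from a file: URL'));
```

Building the URL from a real path avoids hard-coding a drive letter, so the same sketch should behave the same way on Windows and on POSIX systems.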

On Windows, file: URLs with a hostname convert to UNC paths, while file: URLs with drive letters convert to local absolute paths:

file://hostname/a/b/c => \\hostname\a\b\c
file:///c:/a/b/c => c:\a\b\c

On all other platforms, file: URLs with a hostname are unsupported and will result in a throw:

file://hostname/a/b/c => throw!
file:///a/b/c => /a/b/c
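
The platform rules above can be sketched as a single function. This is an illustration under stated assumptions, not the code in this PR: `sketchPathFromFileURL` and its `isWindows` flag are hypothetical names, and the percent-decoding step is simplified (for example, it does not reject encoded path separators).

```javascript
const { URL } = require('url');

// Hypothetical helper illustrating the mapping described in the PR text.
// `isWindows` is passed explicitly so both branches can be tried anywhere.
function sketchPathFromFileURL(input, isWindows) {
  const url = input instanceof URL ? input : new URL(input);
  if (url.protocol !== 'file:') {
    throw new TypeError('only file: URLs are supported');
  }
  // Simplified decoding; a real implementation needs stricter checks.
  const pathname = decodeURIComponent(url.pathname);

  if (isWindows) {
    const winPath = pathname.replace(/\//g, '\\');
    if (url.hostname !== '') {
      // file://hostname/a/b/c => \\hostname\a\b\c (UNC path)
      return `\\\\${url.hostname}${winPath}`;
    }
    // file:///c:/a/b/c => c:\a\b\c (drop the slash before the drive letter)
    if (!/^\\[a-zA-Z]:/.test(winPath)) {
      throw new TypeError('file URL path must be absolute');
    }
    return winPath.slice(1);
  }

  if (url.hostname !== '') {
    // file://hostname/a/b/c => throw on non-Windows platforms
    throw new TypeError('file URL host is not supported on this platform');
  }
  // file:///a/b/c => /a/b/c
  return pathname;
}

console.log(sketchPathFromFileURL('file://hostname/a/b/c', true));
console.log(sketchPathFromFileURL('file:///c:/a/b/c', true));
console.log(sketchPathFromFileURL('file:///a/b/c', false));
```

Later Node.js releases expose `url.fileURLToPath()` for this conversion; it is generally a better choice than a hand-written helper like the one above.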

The documentation for the fs API is intentionally not updated in this commit because the URL API is still considered experimental and is not officially documented at this time.

Note that file: URLs are required by spec to always be absolute paths from the file system root.

This is a semver-major commit because it changes error handling on the fs APIs.

Refs: #10703

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines
Affected core subsystem(s)

fs, url

@jasnell jasnell added fs Issues and PRs related to the fs subsystem / file system. whatwg-url Issues and PRs related to the WHATWG URL implementation. labels Jan 11, 2017
@nodejs-github-bot nodejs-github-bot added dont-land-on-v4.x fs Issues and PRs related to the fs subsystem / file system. url Issues and PRs related to the legacy built-in url module. labels Jan 11, 2017
@jasnell
Member Author

jasnell commented Jan 11, 2017

@jasnell jasnell added semver-major PRs that contain breaking changes and should be released in the next major version. and removed dont-land-on-v4.x labels Jan 11, 2017
@joyeecheung
Member

I think the url label can be removed here? (Also, can you take a look at nodejs/github-bot#115 so that the bot will stop labeling these PRs with url?)

@jasnell
Member Author

jasnell commented Jan 11, 2017

heh, yeah, the bot seems to be a bit aggressive about labeling things these days

@jasnell jasnell removed url Issues and PRs related to the legacy built-in url module. dont-land-on-v7.x labels Jan 11, 2017
@jasnell
Member Author

jasnell commented Jan 11, 2017

New CI: https://ci.nodejs.org/job/node-test-pull-request/5811/
(previous one had an error on windows)

lib/fs.js Outdated
@@ -211,6 +213,7 @@ fs.access = function(path, mode, callback) {
throw new TypeError('"callback" argument must be a function');
}

path = getPathFromURL(path);
Member

WHATWG URL parsing percent-encodes a null byte when it is not in the hostname ('\u0000' -> '%00', e.g. new URL('file://hostname/a/b/c\u0000')), which bypasses the null check, so it might be worth adding a test for this. Is it possible for getPathFromURL to put null bytes in a path that did not have one? If not, maybe the nullCheck can be performed before getPathFromURL.

Member Author

good point. it's not just about null bytes, the path could contain any number of pct-encoded characters. This should run the path through a decode before returning... will fix that

Member

Yes, that can be a problem for files with non-ASCII names... it might be worth adding a test for something like 'file://hostname/文件.txt'.

Member

Also, the hostname can be rewritten by the domain-to-ASCII (Punycode) step too, as in 'file://你好你好/a/b'.
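
The encoding concerns raised in this thread can be observed directly with the WHATWG URL parser. The sketch below is only an illustration: `hasNullByte` is a hypothetical stand-in for the fs null check, `inspectFileURL` is a made-up helper, and the null byte is placed mid-path because the URL parser strips leading and trailing control characters from its input.

```javascript
const { URL } = require('url');

// Hypothetical stand-in for the null-byte check that fs applies to paths.
function hasNullByte(str) {
  return str.includes('\u0000');
}

// Made-up helper: report what the parser does to the path and hostname,
// and what a null check would see before and after percent-decoding.
function inspectFileURL(input) {
  const url = new URL(input);
  const decoded = decodeURIComponent(url.pathname);
  return {
    hostname: url.hostname,
    rawPath: url.pathname,
    decodedPath: decoded,
    nullInRaw: hasNullByte(url.pathname),
    nullInDecoded: hasNullByte(decoded),
  };
}

// The null byte is percent-encoded, so a check on the raw path misses it.
console.log(inspectFileURL('file:///a/b\u0000c'));
// Non-ASCII path characters are percent-encoded as UTF-8 bytes.
console.log(inspectFileURL('file://hostname/文件.txt'));
// Non-ASCII hostnames go through domain-to-ASCII conversion.
console.log(inspectFileURL('file://你好你好/a/b'));
```

This suggests the order matters: a null check that runs on the encoded pathname sees only '%00', so the check has to happen on the decoded path (or the decoder has to reject such input).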

function isFileUrl(path) {
  if (typeof path !== 'string') return false;
  if (path.length < 7) return false;
  if ((path[0].codePointAt(0) | 0x20) !== 102) return false; // f F
Member

Why not path.codePointAt(0)? Or path.startsWith('file://')?
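
To make the trade-off in this question concrete, here is a small sketch comparing the bit trick with a plainer prefix test. Both function names are hypothetical. One caveat worth noting: `String.prototype.startsWith` is case-sensitive, so matching `FILE://` as well needs a lowercase comparison (or a case-insensitive regular expression).

```javascript
// ORing an ASCII letter's code with 0x20 sets the "lowercase" bit,
// so both 'F' (70) and 'f' (102) compare equal to 102.
function firstCharIsF(path) {
  return (path.codePointAt(0) | 0x20) === 102;
}

// Plainer alternative: compare the lowercased 7-character prefix.
function startsWithFileScheme(path) {
  if (typeof path !== 'string' || path.length < 7) return false;
  return path.slice(0, 7).toLowerCase() === 'file://';
}

console.log(firstCharIsF('File:///a'));           // true
console.log(startsWithFileScheme('FILE:///a/b')); // true
console.log('FILE:///a/b'.startsWith('file://')); // false (case-sensitive)
console.log(startsWithFileScheme('/a/b/c'));      // false
```

The character-code version avoids allocating a substring, which is presumably why it was written that way; whether that matters in practice would need a benchmark.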

} else {
return path;
}
if (url.protocol !== 'file:')
Member

Maybe this can be moved into the else if (path instanceof URL) block, because a path that starts with file:// (and so passes isFileUrl) will always have the file: protocol.

@jasnell
Member Author

jasnell commented Jan 11, 2017

@jasnell
Member Author

jasnell commented Jan 12, 2017

sigh... lol... obvious that last run was less than successful ;-) ... will investigate later on tonight

imyller added a commit to imyller/meta-nodejs that referenced this pull request Mar 2, 2017
    Notable changes:

    * deps:
        * update V8 to 5.5 (Michaël Zasso) [#11029](nodejs/node#11029)
        * upgrade libuv to 1.11.0 (cjihrig) [#11094](nodejs/node#11094)
        * add node-inspect 1.10.4 (Jan Krems) [#10187](nodejs/node#10187)
        * upgrade zlib to 1.2.11 (Sam Roberts) [#10980](nodejs/node#10980)
    * lib: build `node inspect` into `node` (Anna Henningsen) [#10187](nodejs/node#10187)
    * crypto: Remove expired certs from CNNIC whitelist (Shigeki Ohtsu) [#9469](nodejs/node#9469)
    * inspector: add --inspect-brk (Josh Gavant) [#11149](nodejs/node#11149)
    * fs: allow WHATWG URL objects as paths (James M Snell) [#10739](nodejs/node#10739)
    * src: support UTF-8 in compiled-in JS source files (Ben Noordhuis) [#11129](nodejs/node#11129)
    * url: extend url.format to support WHATWG URL (James M Snell) [#10857](nodejs/node#10857)

    PR-URL: nodejs/node#11185

Signed-off-by: Ilkka Myller <ilkka.myller@nodefield.com>
@sam-github
Contributor

@addaleax said:

Should this come with docs changes?

but I didn't see any follow up comments. Why is this not documented? I just found it while browsing the fs source.

@addaleax
Member

@sam-github Quoting the PR description (I think this had been added later):

The documentation for the fs API is intentionally not updated in this commit because the URL API is still considered experimental and is not officially documented at this time

@sam-github
Contributor

I see. We can't document that fs APIs take a url.URL object when url.URL itself is undocumented. Fair enough.

@addaleax
Member

To be fair, url.URL has been documented since this PR landed: https://nodejs.org/api/url.html#url_the_whatwg_url_api. I’ll open a good first contribution issue asking for docs for this feature.

anchnk pushed a commit to anchnk/node that referenced this pull request May 4, 2017
Update fs module documentation adding WHATWG file URLS support for
relevant fs functions/classes.

Fixes: nodejs#12341
Refs: nodejs#10739
jasnell pushed a commit that referenced this pull request May 4, 2017
Update fs module documentation adding WHATWG file URLS support for
relevant fs functions/classes.

PR-URL: #12670
Fixes: #12341
Ref: #10739
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Timothy Gu <timothygu99@gmail.com>
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
anchnk pushed a commit to anchnk/node that referenced this pull request May 6, 2017
Update fs module documentation adding WHATWG file URLS support for
relevant fs functions/classes.

Fixes: nodejs#12341
Refs: nodejs#10739
dobriai pushed a commit to dobriai/electron-oauth2-3legged that referenced this pull request Apr 21, 2018
Looks like there is some junk with Windows paths. To put it simply, the
following call is an Identity operation under Linux, but leaves a
leading slash under Windows:

url.parse( url.format( { pathname: path.join(process.cwd(), 'got_access_token.html'), protocol: 'elif:', slashes: true } )).pathname

On Windows you get '/E:/some/path/...' - note the leading slash.

Some discussions here:
  nodejs/node#10703
  nodejs/node#10739

Dunno! I just fixed it by hand.
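
The leading slash described in this commit message also shows up with the WHATWG URL parser, because a URL pathname always starts with '/'. A minimal sketch of the kind of hand fix mentioned above follows; `stripLeadingSlashBeforeDrive` is a hypothetical helper, not the code from that commit.

```javascript
const { URL } = require('url');

// Hypothetical hand fix: drop the slash that precedes a Windows drive letter.
// Paths without a drive letter are returned unchanged.
function stripLeadingSlashBeforeDrive(pathname) {
  return /^\/[a-zA-Z]:/.test(pathname) ? pathname.slice(1) : pathname;
}

const parsed = new URL('file:///E:/some/path/got_access_token.html');
console.log(parsed.pathname);
// → /E:/some/path/got_access_token.html
console.log(stripLeadingSlashBeforeDrive(parsed.pathname));
// → E:/some/path/got_access_token.html
console.log(stripLeadingSlashBeforeDrive('/home/user/file.html'));
// → /home/user/file.html
```

In Node.js versions that provide it, `url.fileURLToPath()` handles this case (and percent-decoding) without a manual workaround.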
kapouer added a commit to kapouer/bundledom that referenced this pull request Jun 10, 2020
Labels
fs Issues and PRs related to the fs subsystem / file system. semver-minor PRs that contain new features and should be released in the next minor version. whatwg-url Issues and PRs related to the WHATWG URL implementation.