-
Notifications
You must be signed in to change notification settings - Fork 29.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fs: improve cpSync
performance
#53541
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
'use strict'; | ||
|
||
const common = require('../common'); | ||
const fs = require('fs'); | ||
const path = require('path'); | ||
const tmpdir = require('../../test/common/tmpdir'); | ||
tmpdir.refresh(); | ||
|
||
const bench = common.createBenchmark(main, { | ||
n: [1, 100, 10_000], | ||
}); | ||
|
||
function main({ n }) { | ||
tmpdir.refresh(); | ||
const options = { force: true, recursive: true }; | ||
const src = path.join(__dirname, '../../test/fixtures/copy'); | ||
const dest = tmpdir.resolve(`${process.pid}/subdir/cp-bench-${process.pid}`); | ||
bench.start(); | ||
for (let i = 0; i < n; i++) { | ||
fs.cpSync(src, dest, options); | ||
} | ||
bench.end(n); | ||
} |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -155,7 +155,7 @@ | |||||
const src = nextdir(); | ||||||
mkdirSync(src, mustNotMutateObjectDeep({ recursive: true })); | ||||||
writeFileSync(join(src, 'foo.js'), 'foo', 'utf8'); | ||||||
symlinkSync('foo.js', join(src, 'bar.js')); | ||||||
symlinkSync(join(src, 'foo.js'), join(src, 'bar.js')); | ||||||
|
||||||
const dest = nextdir(); | ||||||
mkdirSync(dest, mustNotMutateObjectDeep({ recursive: true })); | ||||||
|
@@ -171,7 +171,7 @@ | |||||
const src = nextdir(); | ||||||
mkdirSync(src, mustNotMutateObjectDeep({ recursive: true })); | ||||||
writeFileSync(join(src, 'foo.js'), 'foo', 'utf8'); | ||||||
symlinkSync('foo.js', join(src, 'bar.js')); | ||||||
symlinkSync(join(src, 'foo.js'), join(src, 'bar.js')); | ||||||
|
||||||
const dest = nextdir(); | ||||||
mkdirSync(dest, mustNotMutateObjectDeep({ recursive: true })); | ||||||
|
@@ -216,9 +216,9 @@ | |||||
symlinkSync(dest, join(src, 'link')); | ||||||
cpSync(src, dest, mustNotMutateObjectDeep({ recursive: true })); | ||||||
assert.throws( | ||||||
() => cpSync(src, dest, mustNotMutateObjectDeep({ recursive: true })), | ||||||
Check failure on line 219 in test/parallel/test-fs-cp.mjs GitHub Actions / test-asan
Check failure on line 219 in test/parallel/test-fs-cp.mjs GitHub Actions / test-linux
|
||||||
{ | ||||||
code: 'ERR_FS_CP_EINVAL' | ||||||
code: 'ERR_FS_CP_EEXIST' | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
AFAICT if symlink in src points to location in dest, Directory structure before copying: ├── copy_15
│ └── link -> node/test/.tmp.0/copy_16
├── copy_16
│ └── link -> node/test/.tmp.0/copy_16 Directory structure after performing regular ├── copy_15
│ └── link -> node/test/.tmp.0/copy_16
├── copy_16
│ ├── copy_15
│ │ └── link -> node/test/.tmp.0/copy_16
│ └── link -> node/test/.tmp.0/copy_16 It's not really related to this particular PR, but i guess creating cyclic symlink like this should be allowed. |
||||||
} | ||||||
); | ||||||
} | ||||||
|
@@ -234,7 +234,7 @@ | |||||
symlinkSync(src, join(dest, 'a', 'c')); | ||||||
assert.throws( | ||||||
() => cpSync(src, dest, mustNotMutateObjectDeep({ recursive: true })), | ||||||
{ code: 'ERR_FS_CP_SYMLINK_TO_SUBDIRECTORY' } | ||||||
{ code: 'ERR_FS_CP_EEXIST' } | ||||||
); | ||||||
} | ||||||
|
||||||
|
@@ -398,7 +398,7 @@ | |||||
writeFileSync(join(dest, 'a', 'c'), 'hello', 'utf8'); | ||||||
assert.throws( | ||||||
() => cpSync(src, dest, mustNotMutateObjectDeep({ recursive: true })), | ||||||
{ code: 'EEXIST' } | ||||||
{ code: 'ERR_FS_CP_EEXIST' } | ||||||
); | ||||||
} | ||||||
|
||||||
|
@@ -416,19 +416,6 @@ | |||||
assert.strictEqual(srcStat.mtime.getTime(), destStat.mtime.getTime()); | ||||||
} | ||||||
|
||||||
// It copies link if it does not point to folder in src. | ||||||
{ | ||||||
const src = nextdir(); | ||||||
mkdirSync(join(src, 'a', 'b'), mustNotMutateObjectDeep({ recursive: true })); | ||||||
symlinkSync(src, join(src, 'a', 'c')); | ||||||
const dest = nextdir(); | ||||||
mkdirSync(join(dest, 'a'), mustNotMutateObjectDeep({ recursive: true })); | ||||||
symlinkSync(dest, join(dest, 'a', 'c')); | ||||||
cpSync(src, dest, mustNotMutateObjectDeep({ recursive: true })); | ||||||
const link = readlinkSync(join(dest, 'a', 'c')); | ||||||
assert.strictEqual(link, src); | ||||||
} | ||||||
|
||||||
// It accepts file URL as src and dest. | ||||||
{ | ||||||
const src = './test/fixtures/copy/kitchen-sink'; | ||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this correct?
afaict, this will create hard links in the destination directory to the original files. so, while it will appear upon inspecting the filesystem that the files have been copied and the timestamps have been preserved, the files will not have actually been copied, just linked as directory entries in the filesystem to the same file.
this means if you change any of the files in the source directory subsequently, the "files" in the destination will also change. i haven't looked at existing code in detail but am pretty sure this is not how it works with this flag.
i am also guessing without checking that this will only work if source and destination are on same filesystem? 🤔
if i am indeed correct, i guess simplest thing to do for now would be to use the slow path if "preserveTimestamps" is set?
from a quick google, i'm not sure there is an efficient way to copy and preserve timestamps without having to touch each file in a syscall and it just doesn't seem that there is an option rn in std::filesystem to emulate the current behaviour. happy to be wrong though - not a C++ expert.
https://en.cppreference.com/w/cpp/filesystem/copy_options
https://en.wikipedia.org/wiki/Hard_link
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i put some benches using the test/fixtures/copy directory from node.js and the same options as above here if anyone wants to try them out - should be easy to get them working on linux or macos.
https://github.com/just-js/lo-bench/blob/main/fs/README.md
some things to note:
on linux core i5, using /dev/shm to reduce filesystem/disk overhead
on macos m1 mini, using a ram disk
also, for the same workload:
so, without testing with other directory structures and sizes/depths but assuming we see similar results, we could assert the following:
another couple of things to note
let me know if i got anything wrong - i ran through this pretty quickly, but it should be easy to reproduce those results using the link above. 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good.
For now, I'll split this pull-request into multiple changes, and try to optimize the existing implementation before moving it into full C++.
#53612
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a different approach: #53614