Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Rewrite lib/copy/ncp.js #374

Closed
wants to merge 15 commits into from
Closed

[WIP] Rewrite lib/copy/ncp.js #374

wants to merge 15 commits into from

Conversation

manidlou
Copy link
Collaborator

Refs: #292.

This is the first draft of reconsidering/rewriting lib/copy/ncp.js.

As @RyanZim mentioned before:

Once rewritten, it may be small enough to inline in lib/copy/copy.js;

It make sense to me too. So, it is wrapped in one file now lib/copy/copy.js.

A few notes:

Please feel free to point out any issues that you can think of with these changes.

@coveralls
Copy link

Coverage Status

Coverage decreased (-1.1%) to 84.337% when pulling 54ebe52 on manidlou:rewrite-copy into 42656df on jprichardson:master.

@manidlou
Copy link
Collaborator Author

Hmm! coverage decreased! I will look at it and try to add tests for the parts that are not covered yet.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.2%) to 85.663% when pulling cac1696 on manidlou:rewrite-copy into 42656df on jprichardson:master.

Copy link
Collaborator

@RyanZim RyanZim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, LGTM. Couple of nits below. Also a couple of questions for @jprichardson on aspects I'm not sure about.

@manidlou This needs to be rebased on top of the latest master to avoid undoing work done on move's tests.

fs.writeFileSync(path.join(src, FILES[0]), dat0)
fs.writeFileSync(path.join(src, FILES[1]), dat1)
fs.writeFileSync(path.join(src, FILES[2]), dat2)
fs.writeFileSync(path.join(src, FILES[3]), dat3)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use outputFile here instead of ensureFile and writeFile.

})
})

it(`should copy the directory successfully when dest is 'srcsrc/dest'`, done => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the test's code, the description should say: when dest is 'src_src/dest' (missing underscore).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right. In fact, description is correct, but dest path in test's code should change to match the description.

@@ -159,6 +159,7 @@ describe('fs-extra', () => {
})
})

/* It is redundant. See above, we already tested for this case.
Copy link
Collaborator

@RyanZim RyanZim Mar 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just remove this test.

const filter = /.html$|.css$/i

fse.copy(srcFile1, destFile1, filter, (err) => {
assert(!err)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use assert.ifError(err) here.


options.filter = options.filter || function () { return true }

// Warn about using preserveTimestamps on 32-bit node
if (options.preserveTimestamps && process.arch === 'ia32') {
console.warn(`fs-extra: Using the preserveTimestamps option in 32-bit node is not recommended;\n
see https://github.com/jprichardson/node-fs-extra/issues/269`)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could port this to ES6 template strings.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RyanZim would you please say what you mean exactly? is it referring to the extra \n in the warn?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, nevermind.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that \n necessary as with template strings we can do multi-line like normally?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@manidlou We are already using template strings (I didn't notice that when I commented here). The \n is to make a double linebreak.

Just ignore this line comment.

lib/copy/copy.js Outdated
if (typeof options === 'function' && !callback) {
callback = options
const DEST_NOENT = -1
const DEST_EXISTS = 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly could/should use ES6 Symbols here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of using Symbols here as one of their use cases is for unique constants.

lib/copy/copy.js Outdated
if (err) return cb(err)
if (options.preserveTimestamps) {
return utimes(dest, srcStat.atime, srcStat.mtime, cb)
} else return cb()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need else here.

lib/copy/copy.js Outdated
if (isSrcSubdir(src, res)) return cb(new Error(`Cannot copy directory '${src}' into itself '${res}'`))
if (src === res) return cb()
return cpDir(src, dest, options, cb)
} else return cb()
Copy link
Collaborator

@RyanZim RyanZim Mar 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what case would this code path be taken? Just wondering.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you mean line 141 } else return cb(), I guess it is not needed and should be removed as we check for all possible conditions.

lib/copy/copy.js Outdated
return cb()
}).catch(err => {
return cb(err)
})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be shortened to

})).then(() => cb())
.catch(cb)

lib/copy/copy.js Outdated
}
// if src points to dest
if (resolvedSrcPath === dest) return cb()
return copy(resolvedSrcPath, dest, options, cb)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit hairy on expected behavior for symlinks; @jprichardson is this correct?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I am not 100% sure neither. So, @jprichardson we definitely need some advice here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RyanZim I guess you are right. I don't have a good feeling about return copy(resolvedSrcPath, dest, options, cb) neither. I recall I used it because of the ability to handle weird cases like when src is a symlink and dest exists (either regular or another symlink) and is a kid of src as well.

return fs.symlink(resolvedSrcPath, dest, cb) correctly throws EEXIST: file already exists, symlink 'src' -> 'dest' for those cases.

Now, I think we should decide what we want to do here for these two cases when src is a symlink:

  • dest exists and is a regular file/directory
  • dest exists and is a symlink (file/directory)

Do we want to throw an error and notify user that dest exists?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now, I think we should decide what we want to do here for these two cases when src is a symlink:

What is the existing behavior?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the existing behavior?

Currently, when src is a symlink, this is what happens for the following cases:

  • dest exists and is a regular file or directory

    it throws Error: EINVAL: invalid argument, readlink 'dest', and this is because the existing behavior assumes when src is a symlink, dest is also a symlink (https://github.com/jprichardson/node-fs-extra/blob/master/lib/copy/ncp.js#L182).

  • dest exists and is a symlink

    since overwrite is the default behavior, it removes dest then copies src (creates a symlink in dest that points to src).

Now, for the case 2, it works fine except if src is a kid of dest. In this case, since it removes dest, we end up losing src and consequently a broken symlink is created.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.6%) to 81.34% when pulling 436a587 on manidlou:rewrite-copy into b5c72dc on jprichardson:master.

Copy link
Collaborator

@RyanZim RyanZim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry so long in getting to this.

One old type-check in two places, other than that, LGTM.

lib/copy/copy.js Outdated
} else if (res === DEST_EXISTS) {
if (isSrcSubdir(src, dest)) return cb(new Error(`Cannot copy directory '${src}' into itself '${dest}'`))
return cpDir(src, dest, options, cb)
} else if (res && typeof res !== 'number') { // dest exists and is a link
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This typeof res !== 'number' is obsolete since you are using symbols.

Copy link
Collaborator Author

@manidlou manidlou Mar 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I just noticed that for both cases where dest exists (regular and symlink), onDir() should check for options.overwrite and proceed its operation accordingly; the same as onFile().

Edit

Is that correct because I am not 100% sure if that should be the correct behavior for copying direcotory? 😕 Since it is mentioned in docs that overwrite would apply to both file and directory. Then, this algorithm wouldn't be consistent with docs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@manidlou Historically, overwrite only is checked for files. There's no need to nuke an entire directory if we just need to add one file to it. I'd leave it that way.

lib/copy/copy.js Outdated
// if src points to dest
if (resolvedSrcPath === dest) return cb()
return copy(resolvedSrcPath, dest, options, cb)
} else if (resolvedDestPath && typeof resolvedDestPath !== 'number') { // dest is a link
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again here also, when dest exists (regular and symlink), should go with the same pattern of checking options.overwrite and operates accordingly.

lib/copy/copy.js Outdated
} else if (options.errorOnExist) {
return cb(new Error(dest + ' already exists'))
} else return cb()
} else return cb()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This last else basically means dest exists and is a symlink. I believe the same pattern that is used for the case DEST_EXISTS, which means checking for options.overwrite and operates accordingly, should be applied for this case as well. Is that right?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure. @jprichardson ?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed @manidlou

lib/copy/copy.js Outdated
} else if (res === DEST_EXISTS) {
if (isSrcSubdir(src, dest)) return cb(new Error(`Cannot copy directory '${src}' into itself '${dest}'`))
return cpDir(src, dest, options, cb)
} else if (res && typeof res !== 'number') { // dest exists and is a link
Copy link
Collaborator Author

@manidlou manidlou Mar 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I just noticed that for both cases where dest exists (regular and symlink), onDir() should check for options.overwrite and proceed its operation accordingly; the same as onFile().

Edit

Is that correct because I am not 100% sure if that should be the correct behavior for copying direcotory? 😕 Since it is mentioned in docs that overwrite would apply to both file and directory. Then, this algorithm wouldn't be consistent with docs.

lib/copy/copy.js Outdated
// if src points to dest
if (resolvedSrcPath === dest) return cb()
return copy(resolvedSrcPath, dest, options, cb)
} else if (resolvedDestPath && typeof resolvedDestPath !== 'number') { // dest is a link
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again here also, when dest exists (regular and symlink), should go with the same pattern of checking options.overwrite and operates accordingly.

fs.mkdirsSync(dest)
const subdir = path.join(TEST_DIR, 'dest', 'subdir')
fs.mkdirsSync(subdir)
fs.writeFileSync(path.join(subdir, 'file.txt'), 'some data')
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be changed to use fs.outputFileSync().

it('should not copy and return', done => {
src = path.join(TEST_DIR, 'src', 'somefile.txt')
fs.ensureFileSync(src)
fs.writeFileSync(src, 'some data')
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. It will be changed to use fs.outputFileSync() instead.

it('should not copy and return', done => {
src = path.join(TEST_DIR, 'src', 'srcfile.txt')
fs.ensureFileSync(src)
fs.writeFileSync(src, 'src data')
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

it('should not copy and return', done => {
dest = path.join(TEST_DIR, 'dest', 'somefile.txt')
fs.ensureFileSync(dest)
fs.writeFileSync(dest, 'some data')
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

@RyanZim
Copy link
Collaborator

RyanZim commented Apr 5, 2017

@manidlou ping?

@manidlou
Copy link
Collaborator Author

manidlou commented Apr 5, 2017

@RyanZim I asked a question a few comments above regarding copying when src is a symlink, the case that we are not sure about the correct behavior yet, that I've been waiting for your and @jprichardson's answers. Would you please take a look at it and give me your ideas about it? 😃

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.2%) to 86.751% when pulling b9834f9 on manidlou:rewrite-copy into 096a8e1 on jprichardson:master.

@manidlou
Copy link
Collaborator Author

So, following our discussion on copying symlinks when src is a dest's kid and vice versa, to me for this case (which I guess is relatively a rare case) it is reasonable to throw error and let user decide on the action. Although we can deliberately handle these cases, but that makes it overly complex while it think it is not worth it for this case. I tried to find a short and yet descriptive error message for this case. I came up with ${dest} exists and contains ${src}.

@jprichardson @RyanZim what do you think then?

Copy link
Collaborator

@RyanZim RyanZim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think overall LGTM

@jprichardson Please review carefully though, I might have missed something.

Copy link
Collaborator

@RyanZim RyanZim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this looks OK, @jprichardson ?

@manidlou
Copy link
Collaborator Author

manidlou commented May 8, 2017

What about noRecurseOnFailedFilter? do you think it is reasonable enough to keep it or it's better to remove it for the sake of simplicity?

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.7%) to 86.242% when pulling cecd184 on manidlou:rewrite-copy into 096a8e1 on jprichardson:master.

@RyanZim
Copy link
Collaborator

RyanZim commented Jun 20, 2017

@jprichardson Thoughts on this?

Repository owner deleted a comment from coveralls Jun 29, 2017
Repository owner deleted a comment from coveralls Jun 29, 2017
Repository owner deleted a comment from coveralls Jun 29, 2017
Repository owner deleted a comment from coveralls Jun 29, 2017
Repository owner deleted a comment from coveralls Jun 29, 2017
Repository owner deleted a comment from coveralls Jun 29, 2017
@RyanZim RyanZim mentioned this pull request Sep 5, 2017
@manidlou
Copy link
Collaborator Author

@jprichardson @RyanZim will you please take a look at https://github.com/manidlou/benchmark-fs-extra-copy?

I add a quick summary here too:

I've been working on an alternative implementation for copy() that uses recursive function (I couldn't find a better name) instead of Promise for copying a directory to see if we can get a faster execution. In addition, I organized the code into bunch of small functions as they can be optimized by v8 particularly when copying directories with lots of files and subdirectories as the same function needs to be called repeatedly. You've probably seen these posts or alike before: A tour of V8: Crankshaft, the optimizing compiler and NodeJS : A quick optimization advice.

So, I created a benchmark to compare the running time of these implementations:

  • this new recursive function
  • Promise
  • current (existing)

to see which one runs faster particularly on large input sizes, when a src directory has lots of items.

So, when src length is up to ~6000, they run pretty much the same. However, on roughly > 6000 the new recursive function runs significantly faster than other implementations.

line chart comparison, range: 110-8190

line chart comparison, range: 110-111110

any thoughts please?

@RyanZim
Copy link
Collaborator

RyanZim commented Oct 18, 2017

@manidlou Sorry so terribly long in getting back (again). If you're comfortable with the recursive method; by all means add it. Just glancing at it, it doesn't look bad, and better perf is always good.

@RyanZim
Copy link
Collaborator

RyanZim commented Oct 18, 2017

BTW @manidlou, please don't let me just let stuff slide like this; if I don't reply within 7 days, you're welcome to ping me (I forget stuff really easily; if I'm not responding, I could be busy, but most of the time, I just forgot about it).

@manidlou
Copy link
Collaborator Author

@RyanZim besides perf, overall I personally have a better feeling about recursive than Promise. So, if you don't mind, since it's been a long time that this PR been open, just for saving a bit of time resolving conflicts manually, I'll open a new PR for recursive implementation and close this one. Of course, if you think we need to keep this open, please don't hesitate to reopen it! 😄

@manidlou manidlou closed this Oct 19, 2017
@manidlou manidlou mentioned this pull request Oct 19, 2017
@manidlou manidlou deleted the rewrite-copy branch October 19, 2017 10:41
@RyanZim RyanZim added this to the 5.0.0 milestone Nov 9, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

handle copying between identical files
4 participants