Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rethink recursive flag (and design more generally) for fs.rmdir #34278

Closed
CxRes opened this issue Jul 9, 2020 · 22 comments
Closed

Rethink recursive flag (and design more generally) for fs.rmdir #34278

CxRes opened this issue Jul 9, 2020 · 22 comments
Labels
fs Issues and PRs related to the fs subsystem / file system.

Comments

@CxRes
Copy link

CxRes commented Jul 9, 2020

Is your feature request related to a problem? Please describe.
We (@me and @RyanZim) have concerns regarding API direction for fs.rmdir flags.

Up till now node's fs API was modeled closely to POSIX. The new flags for fs.rmdir, in particular, the recursive flag is a break from this tradition. It is especially odd that rmdir can delete a file. Other difficulties include not knowing if you are deleting a file/folder that does not already exist because errors are not reported in recursive mode (at least users should be able to opt out of this behaviour).

We refer you to our discussion at fs-extra which builds on fs (subsequent to the PR being closed).
jprichardson/node-fs-extra#785

Describe the solution you'd like
A fs.rm consistent with POSIX rm. This would mean fs.rm will have the recursive semantics instead of fs.rmdir which will remain a lower level function.

Describe alternatives you've considered
One alternative would be to implement recursive semantics in user land, though I personally feel that we should use/expose OS API's maximally through node. We are open to other suggestions. This is a good time to address this issue while the feature is still experimental!

@richardlau
Copy link
Member

I have vague recollections that something similar may have been mentioned during the recent OpenJS World Collaborator summit during the Tooling group's session cc @nodejs/tooling

@richardlau
Copy link
Member

cc @nodejs/fs as well

@richardlau richardlau added the fs Issues and PRs related to the fs subsystem / file system. label Jul 9, 2020
@RyanZim
Copy link
Contributor

RyanZim commented Jul 9, 2020

@CxRes Thanks for opening this, I've been meaning to do this, but didn't make it a priority. I agree with @CxRes on his points, but would just like to add a bit more clarity.

For me the biggest problem with the current system is that recursive is a magic flag that does a lot of things:

  1. It enables file/directory recursive deletion, similar to rm -r (this is consistent with its naming; if this was all it did, I'd be mostly fine with it).
  2. It does not throw ENOENT if the path does not exist, somewhat analogous to rm -f.
  3. It adds retry logic, and enables the retryDelay option.

Each of these things could well be their own separate option, and the error-swallowing and retry options would be equally useful for non-recursive fs.rmdir() and fs.unlink(). To be clear, I am not advocating for adding error-swallowing and retry options to plain rmdir and unlink; I am merely pointing out that it would be logically consistent to do so. In reality, both rmdir and unlink are simple POSIX methods (until recursive was added), and I think it's logically consistent that they stay this way.

For that reason, rimraf logic should be its own method. I don't care much about naming; rm is what @CxRes suggested; has the potential disadvantage of being associated with plain rm, when it's actually closer to rm -rf; I think removeTree (or something similar) was also suggested in past discussions, before it was decided to make it an option. I will throw out my vote against naming it fs.rmrf, since it's non-intuitive for those not familiar with *nix shell. Also, being a maintainer of the npm module fs-extra, I hope fs.remove isn't chosen, since that will create a naming conflict with the existing fs-extra method and make my life a bit painful (regardless of the merits of that name).

@Trott
Copy link
Member

Trott commented Jul 10, 2020

cc @bcoe

@iansu
Copy link
Contributor

iansu commented Jul 10, 2020

I made a PR #28208 to add rimraf to Node.js as part of the Tooling Group. Originally we had it in a new module and named the function rmtree. The new module idea was not well received so I moved it into the fs module under the same name.

I was opposed to adding it as a flag to rmdir and I agree that it is confusing that rmdir also deletes files. In the shell rmdir and rm are two different things. Using a flag also make feature detection more difficult. I was not involved with the PR that ultimately made it into core.

The Tooling Group is discussing making more fs APIs recursive so I agree that it would be nice to reach some kind of consensus on this and establish a pattern.

@boneskull
Copy link
Contributor

a few thoughts:

  • mkdir recursive also deviates from the 1:1 syscall.

  • an allowRetries flag which defaults to true on win32 seems reasonable

  • if a user expects the directory to already be empty, then the recursive flag isn’t needed. I know you know this, but I can count the times in my long and uneventful career when I’ve needed to remove a tree of empty directories on one hand. that’s the most common thing people want to do, which is why it removes files.

  • there’s nothing explicitly stated in the docs that the functions in fs are intended to mirror syscalls (unless my reading comprehension really is that bad). if we want to do that, then we should make it explicit. but first we should absolutely ask ourselves why we’re doing it other than theoretical purity

  • @RyanZim I don’t understand if rmdir (as it stands) is problematic for fse. It didn’t sound like that was the main reason the original PR was closed—is that correct?

  • renaming it to “prune” or “rmtree” might be ok. rm could work as well, if it works on non-directories. we should then implement all of gnu coreutils 🙂

  • imo, trying to adhere closely to posix-isms is not helpful given most people using node day-to-day use it on windows (afaik). just because that’s the way we’ve done it before

@CxRes
Copy link
Author

CxRes commented Jul 10, 2020

@boneskull Thanks for chiming in!!! Just to clarify a couple of points in response to your thoughts:

  • Re file deletion: its (at least in my mind) counter-intuitive that fs.rmdir deletes a file at the root (not that it deletes files inside some directory along with that directory) ie fs.rmdir runs successfully to delete no directory at all. Its more confusing than wrong per se!

  • I agree with you that adhering to POSIX conventions may not be the best way ahead - I think of it is as more of convenient crutch to circumvent the difficult discussions of API design where often there are no right answers. And I implicitly picked it up while making my initial suggestion. But I believe that any solution must allow users to efficiently reap the full benefits of the underlying OS APIs and not unnecessarily force operations to user land.

@Trott
Copy link
Member

Trott commented Jul 10, 2020

Not saying it's wise or unwise, or even something we've continued to actually do, but the fs docs do say:

The fs module provides an API for interacting with the file system in a
manner closely modeled around standard POSIX functions.

@coreyfarrell
Copy link
Member

Not saying it's wise or unwise, or even something we've continued to actually do, but the fs docs do say:

The fs module provides an API for interacting with the file system in a
manner closely modeled around standard POSIX functions.

In the case of the default options I think this is still (mostly?) true. I don't have any objection to accepting an option to deviate from the normal mode of operation. I also don't object to moving recursive functions to separate names, though this will increase the number of functions exported by fs and fs/promises. One concern I have is that when we propose recursive variants of additional functions we'll face resistance to further expansion of the namespaces.

@RyanZim
Copy link
Contributor

RyanZim commented Jul 10, 2020

@RyanZim I don’t understand if rmdir (as it stands) is problematic for fse. It didn’t sound like that was the main reason the original PR was closed—is that correct?

rmdir as it stands is not problematic for fse; the discussion just clarified in my mind that the recursive option is a clumsy, unintuitive design, for anyone using Node.

I didn't highlight this well in my earlier post; to be perfectly clear, you can delete a file with fs.rmdir(file, { recursive: true }). Now that's unintuitive; but you're incentivised to write code like that, because fs.unlink will throw if the file doesn't exist, and will throw if your antivirus happens to be scanning the file at the time (retries fix this).

@boneskull
Copy link
Contributor

@Trott I stand corrected. maybe that should be removed as it’s no longer true.

there’s unfortunately nowhere else to put these recursive functions as adding a new builtin module (which is what we originally wanted to do, I think, and @iansu mentioned) would be a nonstarter for unrelated reasons.

@iansu
Copy link
Contributor

iansu commented Jul 10, 2020

We were just discussing this issue in the Tooling Group meeting today. We do agree that being able to use fs.rmdir(file, { recursive: true }) to directly delete a file seems like a bug. That's something that should be easy to fix. If we were to do that would that alleviate some or all of the concerns people have raised here?

@RyanZim
Copy link
Contributor

RyanZim commented Jul 11, 2020

Correction on my earlier post, I said:

It [the recursive option] adds retry logic, and enables the retryDelay option.

This is partially untrue. Retry logic is off by default (unlike rimraf's behavior), by virtue of maxRetries being 0 by default. However, the maxRetries & retryDelay options have no effect without recursive enabled, so retrying without recursive is impossible.

@boneskull
Copy link
Contributor

if unlink can fail for the same reason on win32, perhaps that would want a maxRetries flag too. but I don’t see anything particularly heinous about an option that needs a second option to do anything. silently ignoring it is maybe a problem, but it just depends how much type validation node wants to be doing.

@bcoe
Copy link
Contributor

bcoe commented Jul 11, 2020

@RyanZim @CxRes I think it would be fairly disruptive to remove the recursive option at this point as, although listed as experimental in the docs, this feature has been in the wild for several node versions and I expect has quite a bit of adoption.

We originally explored introducing rimraf as its own method and could not reach consensus, @cjihrig unblocked the logjam on the discussion by putting forward a candidate PR with the functionality as it is today. tldr; both approaches were explored.

It does not throw ENOENT if the path does not exist, somewhat analogous to rm -f.

I think if it helped node-fs-extra with adoption, it would be worth us considering changing this behavior? such that we throw on ENOENT?

EDITED: basically, given it's experimental, I think we should be open to considering breaking changes to rimdir recursive, e.g., error handling, I'm less convinced we should split it off of rmdir, just in the name of matching POSIX.

@RyanZim
Copy link
Contributor

RyanZim commented Jul 11, 2020

Started composing this before @iansu commented; hence the slight disconnect.


I watched the discussion about this at the @nodejs/tooling meeting today (recording: https://www.youtube.com/watch?v=ndCLeX40g6Q); a few comments:

  1. The point was brought up about why this was not raised sooner. I was not aware of the movement to add recursive to rmdir while it was in progress, had I been aware, I would have contributed to the discussion then. I saw it after the fact, but wasn't convinced it was fundamentally broken enough to justify re-hashing an already long discussion. The fs-extra issue with @CxRes clarified in my mind what the problems were, and justified raising my concerns. Wish I could have raised it earlier, but that's where the facts stand.

  2. I am personally not opposed to having rmdir deviate from strict POSIX behavior. My issue is not that rmdir has a recursive flag, my issue is that recursive flag does a lot more than make rmdir recursive. I tend to agree with @coreyfarrell that POSIX-named functions should generally be close to POSIX by default, though.

  3. It was mentioned on the call that the fact that recursive rmdir can delete a file sounds like a bug. I agree.

  4. The fact that the recursive option makes rmdir ignore ENOENT also buggy IMO.

  5. If we want to fix recursive rmdir, we should not allow it to delete files; and we should not ignore ENOENT or it should be an option, off by default. Retry options should probably work without recursive too (more on that at the end of this post).

  6. As was pointed out, though, npm stats overwhelmingly show that what people actually want from a tooling perspective is rimraf. A thing that takes any path you throw at it, removes the path if at all possible, and doesn't error out if at all possible. Just like rm -rf. They want to ensure a path does not exist. If we merely fix recursive rmdir, we improve little from a tooling perspective; the ecosystem will most likely create a modern "rimraf" that looks something like this:

    function rimrafSync(path) {
      try {
        fs.rmdirSync(path, { recursive: true, maxRetries: 3 })
      } catch (error) {
        if (error.code === 'ENOENT') return
        if (error.code === 'ENOTDIR') fs.unlinkSync(path) // maybe retry logic here too?
        else throw error
      }
    }

    Simpler than the current rimraf, sure; but won't remove the need to install a new module.

  7. Now, it could be argued that most of the time, you know whether you're deleting a file or a directory. That's possibly true. Even then, with a fixed recursive mkdir, you'll still have to do something like fs.rmdirSync(path, { recursive: true, force: true, maxRetries: 3 }). That's a fair-sized function call! Also, this will necessitate adding a force option (and perhaps retry functionality) to fs.unlink, for consistency; unless we go the route of requiring userland to try/catch wrap all their calls to both rmdir & unlink, which is even more verbose for the developer.

  8. As we've seen, fixing recursive rmdir is an option, but it has its drawbacks. The alternative is to deprecate/remove recursive rmdir, and add the behavior as a separate method. This is my preferred solution. IMO the function's defaults should align with rimraf's current behavior; how many options you'd want to add to turn off various aspects is debatable (FWIW, the original fs-extra issue that gave rise to this thread was a debate about adding an option to fs-extra's fork of rimraf to disable the swallowing of ENOENT; rimraf doesn't support this).

  9. The objection raised to moving it to its own method is that we would break some people's code. I agree, but I'd like to point out that properly fixing recursive mkdir will also break a non-trivial amount of code, since ENOENTs will start throwing. (Dunno how many people are deleting files with it at this point.) Personally, I'd just say "we're breaking stuff anyhow, so let's totally break stuff and move forward afresh," but I fully understand why others may disagree.

  10. Another way to handle this would be to rename the recursive option to something that properly carries the idea of making rmdir recursive and swallowing ENOENT (e.g. recursiveForce; we can probably find a better name than that). Personally not a fan of this approach, since it's just as breaking as moving the logic to a separate function. If we're breaking anyhow, I prefer a separate function since that gives us the chance to delete files with the same method.

Those are my thoughts; things don't have to be completely done my way, but at bare minimum, we need to stop deleting files with recursive rmdir, and think seriously about whether swallowing ENOENT as an implicit consequence of recursive option makes much sense (IMO, it doesn't).

Retry logic is less of an issue; should maybe consider porting it to plain rmdir and/or unlink, but that's another discussion for another thread. (that discussion potentially opens another can of worms about trying to put graceful-fs's EMFILE retry logic in core).

@CxRes
Copy link
Author

CxRes commented Jul 11, 2020

I was listening to the @nodejs/tooling meeting as well. One suggestion, that seems to have been not discussed is the possibility of splitting fs into namespaces (rather than staring an entirely new module). For example, you may have a fs.tree that keeps together recursive functions (even if a few functions get aliased for legacy reasons).

Also what would be a good way to get in touch with the @nodejs/tooling group? There is another inconsistency issue that's driving me nuts that falls squarely in that domain.

@Trott
Copy link
Member

Trott commented Jul 11, 2020

what would be a good way to get in touch with the @nodejs/tooling group?

One possibility is to open an issue in the https://github.com/nodejs/tooling repository.

@coreyfarrell
Copy link
Member

  1. As was pointed out, though, npm stats overwhelmingly show that what people actually want from a tooling perspective is rimraf.

@RyanZim That's possible but I don't think we can say this based on npm stats alone. A project using rimraf does not show that they need/want rimraf over recursive fs.rmdir, in many cases they simply need to support Node.js <12.10.0.

@RyanZim
Copy link
Contributor

RyanZim commented Jul 14, 2020

At the moment, recursive rmdir is basically the same as rimraf, without old version support. My point was that we have no evidence that people want a "fixed" recursive rmdir.

@coreyfarrell
Copy link
Member

@RyanZim sorry for misunderstanding. rimraf also supports glob patterns where recursive rmdir does not so I thought that was what you were talking about.

bcoe added a commit to bcoe/node-1 that referenced this issue Oct 1, 2020
bcoe added a commit that referenced this issue Oct 3, 2020
Refs: #34278

PR-URL: #35171
Reviewed-By: Christopher Hiller <boneskull@boneskull.com>
Reviewed-By: Jiawen Geng <technicalcute@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Michael Dawson <midawson@redhat.com>
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
danielleadams pushed a commit that referenced this issue Oct 6, 2020
Refs: #34278

PR-URL: #35171
Reviewed-By: Christopher Hiller <boneskull@boneskull.com>
Reviewed-By: Jiawen Geng <technicalcute@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Michael Dawson <midawson@redhat.com>
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
@bcoe
Copy link
Contributor

bcoe commented Oct 11, 2020

After discussions with the tooling group, and with the TSC, we ultimately did decide to introduce fs.rm, which is modeled after POSIX's rm.

We are however, not opting to remove the rmdir/recursive behavior immediately, as this would be too disruptive to users. Instead we're:

  1. removing the experimental language from rmdir/recursive, to indicates folks should expect bug fixes (and feel comfortable using the feature).
  2. we've introduced fs.rm which we can direct users towards instead of rmdir/recursive, if they want permissive behavior, i.e., for it not to throw on missing paths, or file paths.
  3. we've added a "docs only" deprecation, pointing folks towards this fs.rm.
  4. we're in the process of adding a runtime deprecation method, to indicate to users that they should stop relying on the permissive form of rmdir/recursive.
  5. in node@16, we intend to throw on nonexistent paths, and file paths, when using the fs.rmdir method with recursive=true.

@bcoe bcoe closed this as completed Oct 11, 2020
andersk added a commit to andersk/node that referenced this issue Nov 16, 2020
This was missed in commit 35b17d9.

Refs: nodejs#34278
Refs: nodejs#35171

Signed-off-by: Anders Kaseorg <andersk@mit.edu>
nodejs-github-bot pushed a commit that referenced this issue Nov 19, 2020
This was missed in commit 35b17d9.

Refs: #34278
Refs: #35171

Signed-off-by: Anders Kaseorg <andersk@mit.edu>

PR-URL: #36131
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Ben Coe <bencoe@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
codebytere pushed a commit that referenced this issue Nov 22, 2020
This was missed in commit 35b17d9.

Refs: #34278
Refs: #35171

Signed-off-by: Anders Kaseorg <andersk@mit.edu>

PR-URL: #36131
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Ben Coe <bencoe@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
BethGriggs pushed a commit that referenced this issue Dec 10, 2020
This was missed in commit 35b17d9.

Refs: #34278
Refs: #35171

Signed-off-by: Anders Kaseorg <andersk@mit.edu>

PR-URL: #36131
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Ben Coe <bencoe@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
BethGriggs pushed a commit that referenced this issue Dec 10, 2020
This was missed in commit 35b17d9.

Refs: #34278
Refs: #35171

Signed-off-by: Anders Kaseorg <andersk@mit.edu>

PR-URL: #36131
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Ben Coe <bencoe@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
BethGriggs pushed a commit that referenced this issue Dec 15, 2020
This was missed in commit 35b17d9.

Refs: #34278
Refs: #35171

Signed-off-by: Anders Kaseorg <andersk@mit.edu>

PR-URL: #36131
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Ben Coe <bencoe@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
joesepi pushed a commit to joesepi/node that referenced this issue Jan 8, 2021
Refs: nodejs#34278

PR-URL: nodejs#35171
Reviewed-By: Christopher Hiller <boneskull@boneskull.com>
Reviewed-By: Jiawen Geng <technicalcute@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Michael Dawson <midawson@redhat.com>
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fs Issues and PRs related to the fs subsystem / file system.
Projects
None yet
Development

No branches or pull requests

8 participants