Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Security of node_modules and package.json Resolution #43192

Closed
arhart opened this issue May 23, 2022 · 17 comments
Closed

Improve Security of node_modules and package.json Resolution #43192

arhart opened this issue May 23, 2022 · 17 comments
Labels
feature request Issues that request new features to be added to Node.js.

Comments

@arhart
Copy link
Contributor

arhart commented May 23, 2022

What is the problem this feature will solve?

It is easy to accidentally allow another user to influence what code node loads and executes. Details can be found at HackerOne reports 1564437 (CommonJS module loading), 1564444 (ECMAScript module resolution), and 1564445 (package.json). While these behaviors are documented, the security implications are easy to overlook.

Insecure patterns around these features are in common use. For example, any combination of the following two circumstances is likely to be insecure:

  1. Attempting to require or dynamically import a module which is not guaranteed to exist in the expected location, for example by doing ANY of these:
    1. Intentionally attempting to load optional dependencies or optional peer dependencies which are not installed
    2. Forgetting to install dependencies before running a node script
    3. Creating a typo when developing a node script
  2. Running a node script when some parent directory might contain a node_modules directory that might be under control of another user, for example by doing ANY of these:
    1. Running a node script on Windows (any authenticated user is likely to have permission to create C:\node_modules, D:\node_modules, etc.)
    2. Running a node script from under a shared directory (ex: /tmp or a network share with similar permissions)
    3. Running a node script from under a home directory where an account might get created with a sibling home directory called "node_modules"

What is the feature you are proposing to solve the problem?

I suggest a multi-phase approach which can be modified by further discussion and feedback. Because there are security implications, I'd like improvements to be made as soon as possible. Because of both security and compatibility implications, a cautious approach is also desirable. Here are some potential phases:

  1. Discuss it publicly
    1. Are there other, similar security concerns that should be considered?
    2. Are there ideas for better approaches for addressing these concerns?
  2. Clearly document the existence of security concerns around these common features. There is precedent for documenting and emphasizing security warnings
    1. CommonJS module loading Loading from node_modules folders and require
    2. ECMAScript module resolution Terminology around bare specifier resolution and Resolution algorithm
    3. Program entry point
    4. Determining module system
  3. Document in more detail what the security concerns are and what steps a developer, packager, or user can take.
    1. Perhaps create a section of the documentation to discuss this and other security considerations in more detail. Perhaps it could be a sibling to Usage and example
    2. Should we document what files are loaded by relative paths and what files are loaded by absolute paths? Perhaps explicitly state that if modules are moved out from under node while it is searching for or loading them, it explicitly does not define what happens?
  4. Update behavior and corresponding documentation. There is precedent for security reasons for shipping breaking changes with an option to opt-in to the insecure behavior. See 10.19.0 during Active LTS

Proposed behavior update:

  1. Look for node_modules and package.json as immediate siblings of the scripts in question, rather than starting there and walking arbitrarily far toward the root.
  2. Add an --insecure-loader option to re-enable the old behavior.

Questions:

  1. Combined with symbolic links, hard links, and junctions, does this give package managers the needed flexibility?
  2. Is assuming that ./node_modules and ./package.json are trusted a reasonable burden to place on someone wanting to run ./script.js?

What alternatives have you considered?

  1. I initially reported these through the Node.js security processes, but they were considered "a long-standing and widely-known (or at least far from newly-discovered) problem". I was encouraged to discuss them publicly in the issue tracker.
  2. Disabling searching for node_modules and package.json relative to the running script is perhaps simpler and safer, but would presumably require some replacement to support existing use cases.
  3. A config file in a user's home directory could specify trusted search paths, but if we can find a workable solution without adding additional configuration files, that avoids some complexity.
  4. A config file could specify a ceiling for search paths, but again that introduces additional configuration.
  5. We could completely stop searching for node_modules and only look where a package.json file instructs, but that would require adding instructions to package.json and very nearly make package.json mandatory.

Edit: Replaced "from" with "from under" to clarify that the script need not be immediately in a shared directory or immediately in a home directory.

@arhart arhart added the feature request Issues that request new features to be added to Node.js. label May 23, 2022
@Trott
Copy link
Member

Trott commented May 23, 2022

@nodejs/modules

@guybedford
Copy link
Contributor

guybedford commented May 23, 2022

Node.js already has a policy manifest system for clearly restricting and defining what code may be loaded, and even contextualizing which parent scopes permit that loading.

Are there any specific security concerns here which policy does not currently address?

Better education and tooling workflows around policies is definitely something to consider here from a security perspective.

@Macias131987

This comment was marked as spam.

@Trott

This comment was marked as resolved.

@arhart
Copy link
Contributor Author

arhart commented May 24, 2022

@guybedford Thanks for pointing out the policy manifest system. I wasn't aware of it, and it looks interesting. My (very early) impression is that it is not a suitable replacement for the following reasons:

  1. It is still experimental, and thus "Use of the feature is not recommended in production environments."
  2. It sounds complicated and unlikely to be used for every script. The opening section recommends involving multiple user ids. node_modules are used by code with all levels of process around them such as
    1. users just learning to program
    2. quick, one-off scripts for converting data from one format to another
    3. production code with formal processes around design, test, and deployment
  3. I don't see a way to say "allow any resource, but don't search for it above the starting directory".
  4. I don't see anything that prevents searching for a package.json
  5. It seems a lot to ask authors of every one-off script it include
    1. the script itself
    2. a policy.json
    3. a package.json
  6. It doesn't provide a more-secure default than what we have today

@bmeck
Copy link
Member

bmeck commented May 24, 2022

I'm glad to see someone taking interest in these topics. We should discuss the needs prior to solutions hence the leaning towards seeing what needs policies currently do fill and thanks for a list of concerns about them.

  1. Per experimental status: this could be changes, but it is mostly a lack of overall contributors that has me not pushing towards moving it out of experimental, the overall feature has been fairly stable for years. The only thing really of note missing these days is bringing path patterns like in imports and exports fields in package.json.

  2. Multiple user ids and OS level protections are required for general sandboxing needs, the docs are being quite honest about needing more security that node cannot provide for full secure sandboxing. This is mentioned in several docs that spread around the security minded contributes as well as the new effort to formalize a threat model.

  3. "don't search" is a bit different from "don't allow" (important for type:"module" for example). If you don't want to allow something above a directory you can just use a scope with integrity: true but lacking cascade: true. Also, to be clear Node by default uses realpaths to avoid a variety of concerns so searching within a directory is a bit hard when using things like pnpm that link to outside of the directory or various shared node_modules/ workflows.

  4. package.json are subject to allow listing in the policy just like other files loaded. Perhaps we could enhance the docs.

  5. I'd agree, but right now package.json is not required for using scripts or policies. You can use node_modules/ without a package.json even, it is just rare.

  6. Due to back compat, more secure defaults will likely need to be extremely careful about what they do, or be opt-in.


Per the proposal:

Look for node_modules and package.json as immediate siblings of the scripts in question, rather than starting there and walking arbitrarily far toward the root.

How does this deal with things in bin/foo.js or lib/foo.js. It likely would have to do some kind of delegation. This is brought up some in https://docs.google.com/document/d/1y_o9WjagX-TYUcxqfvecYi4x65v9bsSd6Sb-P3Cc4F4/edit .

Add an --insecure-loader option to re-enable the old behavior.

Likely this would need to default to being used unless we could make a big break to the ecosystem.


Per alternatives:

I initially reported these through the Node.js security processes, but they were considered "a long-standing and widely-known (or at least far from newly-discovered) problem". I was encouraged to discuss them publicly in the issue tracker.

thanks for bringing up discussion on this!

Disabling searching for node_modules and package.json relative to the running script is perhaps simpler and safer, but would presumably require some replacement to support existing use cases.

Yup, even this proposal changes various workflows to no longer be supported and that needs more discussion.

A config file in a user's home directory could specify trusted search paths, but if we can find a workable solution without adding additional configuration files, that avoids some complexity.

Home directory is problematic because not all programs have the same acceptable privileges. Any configuration needs to be localized to an application.

A config file could specify a ceiling for search paths, but again that introduces additional configuration.

Additional configuration isn't a fatal aspect in security. It would be good to know if the amount of configuration is unacceptable / why / how to mitigate.

We could completely stop searching for node_modules and only look where a package.json file instructs, but that would require adding instructions to package.json and very nearly make package.json mandatory.

This likely is incompatible with peerDependencies and some hoisting various package managers do.


Take away:

  1. It seems really nice to have the idea of preventing searching outside of a directory which is a new feature not currently covered by policies.
  2. I'm very skeptical of coming with the solution so suddenly given it would break existing workflows and is being proposed as a default rather than opt-in.
  3. It seems like a great enhancement to have a better way to specify policy not with a file but with alternative mechanisms (can already be a data: URL but that might be too arcane).

@arhart
Copy link
Contributor Author

arhart commented May 25, 2022

Thank you for the thoughtful response. I'd like to see the node ecosystem as a whole improve and have confidence that my projects, specifically, aren't accidentally vulnerable.

  1. "don't search" is a bit different from "don't allow" (important for type:"module" for example). If you don't want to allow something above a directory you can just use a scope with integrity: true but lacking cascade: true. Also, to be clear Node by default uses realpaths to avoid a variety of concerns so searching within a directory is a bit hard when using things like pnpm that link to outside of the directory or various shared node_modules/ workflows.

Can you expand on this -- both the difference between "don't search" and "don't allow" as well as how integrity: true without cascade: true would work? Maybe an example could clarify.

  1. package.json are subject to allow listing in the policy just like other files loaded. Perhaps we could enhance the docs.

Thanks. My impression from the docs was that policies only applied to require() and import, but package.json can load before the module system is even determined.

  1. Due to back compat, more secure defaults will likely need to be extremely careful about what they do, or be opt-in.

Certainly! At the same time, I find the current defaults surprisingly insecure. Requiring opt-in places the burden on each such application or library, but more-secure defaults could fix most of that in one motion.


Per the proposal:

Look for node_modules and package.json as immediate siblings of the scripts in question, rather than starting there and walking arbitrarily far toward the root.

How does this deal with things in bin/foo.js or lib/foo.js. It likely would have to do some kind of delegation. This is brought up some in https://docs.google.com/document/d/1y_o9WjagX-TYUcxqfvecYi4x65v9bsSd6Sb-P3Cc4F4/edit .

I initially read this as asking about nested structures within dependencies, but it could also apply to structures within a top level project. I had envisioned links solving this. Package managers might possibly be able to automatically adapt nested structures within dependencies. Requiring projects to do the same seems like too much. Such links would be scattered throughout a project. They would be at paths not touched by a package manager, clutter the project, and likely break other tools. Also, projects can simultaneously be standalone entities and dependencies of larger projects.

Add an --insecure-loader option to re-enable the old behavior.

Likely this would need to default to being used unless we could make a big break to the ecosystem.

Yes, and the proposal would break more than I see package managers being able to automatically fix.


A config file could specify a ceiling for search paths, but again that introduces additional configuration.

Additional configuration isn't a fatal aspect in security. It would be good to know if the amount of configuration is unacceptable / why / how to mitigate.

For things like checking file hashes or sophisticated sandboxing, I feel that requiring additional configuration is reasonable.

For something as simple as require('bar') loading something entirely unexpected (even as documented), I think we need a better default. I don't think I have one.


Take away:

  1. It seems really nice to have the idea of preventing searching outside of a directory which is a new feature not currently covered by policies.
  2. I'm very skeptical of coming with the solution so suddenly given it would break existing workflows and is being proposed as a default rather than opt-in.
  3. It seems like a great enhancement to have a better way to specify policy not with a file but with alternative mechanisms (can already be a data: URL but that might be too arcane).
  1. Could you elaborate on how this idea compares / contrasts with what is found in policies? Having read the policies documentation again, I'm still pretty shaky on what it does / does not do.
  2. While I'm sensitive to breaking existing workflows, I see this as a widespread local privilege escalation vulnerability. I found the security consequences of the status quo surprising, and I think other developers might as well.
  3. How easy could we make this (or another) opt-in mechanism? If it's nearly as easy as upgrading node, then it's not a better default, but it could get close.

Things I'd like to see in a proposed solution:

  1. It should be easy to use correctly without opening applications to arbitrary code execution (the current behavior fails badly for optional peer dependencies and other scenarios).
  2. It should be sufficiently compatible to not require opt-in. If we can't get this, the opt-in should be as easy as possible.
  3. It should avoid confusion. For example, resolving require('bar') in ./node_modules/foo.js differently when foo.js is loaded from node ./foobar.js rather than node ./node_modules/foo.js might be confusing.
  4. It should be simple and robust when node processes create other node processes, possibly with non-node-processes in between.
    1. This probably excludes relying exclusively on command-line options and possibly environment variables as well.
    2. An opt-in from package.json might work here, but since node only looks for the nearest parent package.json and dependencies tend to already have one, I'm skeptical.
    3. Maybe an opt-in from something like a policy.json could help. Could we avoid re-introducing an analogous problem with policy.json files up to the root?

@arhart
Copy link
Contributor Author

arhart commented Jun 6, 2022

The following policy.json file roughly sets a ceiling for dependencies. "./" is relative to the location of the policy file. A file within "./" may request any dependency. The normal search is performed, but the integrity check for files found within "./" succeeds and the integrity check for files found outside of "./" fails.

{
  "scopes": {
    "./": {
      "integrity": true,
      "dependencies": true
    }
  }
}

It is possible to pass --experimental-policy via the NODE_OPTIONS environment variable. That doesn't allow a policy to be enforced across processes that spawn child processes after scrubbing the environment, but it allows some degree of policy without passing an additional option to every node invocation.

@arhart
Copy link
Contributor Author

arhart commented Jun 7, 2022

@guybedford @bmeck policies protect from the outside -- whatever calls node passes --expirimental-policy or sets an equivalent value in the NODE_OPTIONS environment variable. Are you aware of work on something similar from the inside -- allowing a script to require('policy').addPolicy(...) or something similar?

There's a suggestion of referencing a policy from package.json#policy in this linked document. Is that progressing?

@bmeck
Copy link
Member

bmeck commented Jun 8, 2022

require('policy').addPolicy(...) style APIs do not work well with ESM since ESM will link prior to allowing any such API to be called/enforced. No programmatic API is planned, you can spin up a child process if needing to do that for now at least.

There's a suggestion of referencing a policy from package.json#policy in this linked document. Is that progressing?

There is currently no effort to fund that policy work directly since the Node.js project is relying on volunteer work for now. If you would like to work on such a feature; I think such a feature wouldn't be too hard to implement but likely needs the design refined further by testing in the wild.

@arhart
Copy link
Contributor Author

arhart commented Jun 8, 2022

require('policy').addPolicy(...) style APIs do not work well with ESM since ESM will link prior to allowing any such API to be called/enforced. No programmatic API is planned, you can spin up a child process if needing to do that for now at least.

@bmeck When you say "link", is that a technical term here distinct from when the code runs?

I suspect using such an API with ESM would require indirection. For policy available synchronously (such as that embedded directly in source), a single import at the top of the program-entry-point would run before other imports. For policy only available asynchronously, dynamic import() could be used to ensure the imports run in the required order.

@bmeck
Copy link
Member

bmeck commented Jun 8, 2022

is that a technical term here distinct from when the code runs?

Yes, all imports have their source text loaded and exports/imports bound prior to any evaluation in the graph. So even if there is a top level import at the entry point to a program all dependencies of the program are already loaded in the default ESM workflow.

@bmeck
Copy link
Member

bmeck commented Jun 8, 2022

I'd note that this linking is required by ESM spec, not Node.js itself and is required for various validations around early errors, circular dependencies, and now import assertions.

@arhart
Copy link
Contributor Author

arhart commented Jun 11, 2022

@bmeck thanks. I see what you mean about linking.

A config file could specify a ceiling for search paths, but again that introduces additional configuration.

Additional configuration isn't a fatal aspect in security. It would be good to know if the amount of configuration is unacceptable / why / how to mitigate.

It seems really nice to have the idea of preventing searching outside of a directory which is a new feature not currently covered by policies.

Returning to this idea, maybe whenever we walk toward the root looking for a node_modules directory or a package.json file, we could also look for a node_ceiling file. If such a file is found, it could indicate to stop walking toward the root. I'm playing with code to see what would need to be touched. I've identified the following so far:

  1. getPackageScopeConfig() in esm/resolve.js walks toward the root looking for package.json. It's already a while loop with a break when the root is reached. Adding a ceiling check could also break.
  2. packageResolve() in esm/resolve.js walks toward the root looking for node_modules/${packageName}/package.json. The while condition checks for root. A ceiling check before updating packageJSONUrl could break.
  3. readPackageScope() in cjs/loader.js walks toward the root calling readPackage() which looks for package.json. A while condition checks (separatorIndex > rootSeparatorIndex). A ceiling check could break or return false.
  4. Module._nodeModulePaths() in cjs/loader.js generates paths progressively closer to the root ending with \node_modules or /node_modules. There are Windows and non-Windows versions. _nodeModulePaths() returns a paths array; it doesn't perform IO, and it seems good to keep it that way. Module._resolveFilename() calls Module._findPath(). _findPath() is called elsewhere, including run_main.js and .eslintrc.js. It's also monkey-patchable, which is done in .eslintrc.js. If _findPath() is an appropriate location, a ceiling check could be added, but it is more complicated. I have some code which looks for a ceiling until a path to node_modules at the root is found. It also considers a root to have been found when absoluteRequest is true.

Thoughts on this approach? I don't mind sharing the code, but it's not cleaned up for a pull-request yet.

arhart added a commit to arhart/node that referenced this issue Jul 13, 2022
Stop searching for package.json or node_modules when a node_ceiling
file is found.

Refs: nodejs#43192
Refs: nodejs#43368
@github-actions
Copy link
Contributor

github-actions bot commented Dec 9, 2022

There has been no activity on this feature request for 5 months and it is unlikely to be implemented. It will be closed 6 months after the last non-automated comment.

For more information on how the project manages feature requests, please consult the feature request management document.

@github-actions github-actions bot added the stale label Dec 9, 2022
@arhart
Copy link
Contributor Author

arhart commented Jan 6, 2023

Closing. I'd still like to see this. Maybe a proposal for ambient configuration will gain traction and make this easier.

@arhart
Copy link
Contributor Author

arhart commented Jan 6, 2023

That didn't mark it closed.

@github-actions github-actions bot removed the stale label Jan 6, 2023
@bnoordhuis bnoordhuis closed this as not planned Won't fix, can't repro, duplicate, stale Jan 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request Issues that request new features to be added to Node.js.
Projects
Status: Pending Triage
Development

No branches or pull requests

6 participants