Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vendor official NPM modules #28062

Closed
benjamingr opened this issue Jun 4, 2019 · 29 comments
Closed

Vendor official NPM modules #28062

benjamingr opened this issue Jun 4, 2019 · 29 comments

Comments

@benjamingr
Copy link
Member

benjamingr commented Jun 4, 2019

tl;dr; core vendoring @nodejs/package_name modules.

Hey, for detecting esm syntax as well as other stuff we have discussed vendoring official Node.js modules on the @nodejs NPM space which to my understanding the org owns.

I would like us to try this and I think it could be a really interesting thing to explore. Ideally this would allow us to keep core small

Is this something people are interested in? Should we do this?

@nodejs/collaborators @nodejs/tsc

@MylesBorins
Copy link
Member

Is this different from #21551?

Based on current conversations at TC39 for a standard library and within wasi I think we might want to consider using the nodejs: namespace for built ins.

Are you instead suggesting we publish to @nodejs/ and then vendor in those modules?

How would you imagine the two different methods for builtins working together?

@benjamingr
Copy link
Member Author

If I understand correctly "core modules go under a namespace" means we publish our own core modules under @nodejs

This is about publishing official modules to NPM rather than maintaining that code in the node repo and publishing it with every installation of Node.js

@SMotaal
Copy link

SMotaal commented Jun 5, 2019

@MylesBorins I think we're discussing two separate dimensions of namespaces:

  1. NPM namespace until that changes is roughly implied by npm i @nodejs/… which leads to a package folder node_modules/@nodejs/…

  2. Builtin module namespaces, while that takes shape, is roughly implied by import 'nodejs:…' which could very well be the only way to safely signal to Node.js to load (or even download) the official @nodejs for that.

Technically there is nothing preventing a user today from having import from '@nodejs/not-official' work today — thus the advantage of the proprietary implementation-specific builtin syntax.


A note worth making here is that something like unpkg.com/@nodejs/… opens up a world of possibilities for long-standing browser interoperability feats.

@MylesBorins
Copy link
Member

So I think it is important for us to figure out what namespace mechanism we want to use for built-ins before deciding to use the npm scope. We likely do not want to conflate the two.

  • If we use @nodejs/ internally rather than nodejs:, we likely don't want to use it for ecosystem modules as well, I believe it would cause confusion
  • If we use @nodejs/ for modules published to a registry and nodejs: for builtins I think we would need to be really careful about what we put where and how we educate folks on which to use
    • would we have both nodejs:fs and @nodejs/fs?

Definitely think this is a very interesting space to explore... @nodejs/${builtin} being published as a standalone alternative to using the nodejs:${builtin} could offer us the ability of folks using more modern versions of our APIs in older versions of node or in other environments via polyfills.

That being said, we need to be very thoughtful about this... I can see this exploding in complexity very quickly and getting very confusing for people about where to get APIs and what are the best practices.

perhaps this would be a good thing to do some ecosystem outreach to library authors, application developers, and educators.

@SMotaal
Copy link

SMotaal commented Jun 5, 2019

Certainly a lot to consider.

How does everyone feel about the though that a builtin being more like the platform provided versus third-party ones, so that @nodejs/ and nodejs: are both different representations of Node.js platform modules, harmonized by the runtime?

So the runtime can determine what is in it's builtin core and what is an extensibility feature, albeit one of the finite set of which builtin core is fully aware of and knows how to properly bootstrap.

@joyeecheung
Copy link
Member

joyeecheung commented Jun 5, 2019

AFAIK this issue is proposing @nodejs/ npm modules - the major difference between this and something like #21551, as I understand, is that they have to be installed from a registry (for now), can be versioned separately from the binary and can bump the major version with less hassle. Something like readable-stream already practically exist, this is just proposing using a new namespace for future modules to avoid conflicts with other npm modules.

On the other hand nodejs: and the standard library proposal is still more about adding builtins that users get out of the box from the binary.

I personally don't see anything blocking @nodejs/ now. It just means the module is maintained in this organization, similar to what is implied by other npm namespace. nodejs: is new, is not settled, is not going to happen soon, and is not yet understood by the ecosystem while @nodejs/ is just following an existing convention. If anything happens in the ES standard, that definitely has to work with the existing @something/ mechanism, not the other way around.

@MylesBorins
Copy link
Member

@joyeecheung I think the only challenge is are we 100% we want to use nodejs: for builtins. In #2155 we had objections and the current implementation uses @nodejs/... and I personally think it would be very confusing if @nodejs/ was used for both

@joyeecheung
Copy link
Member

joyeecheung commented Jun 5, 2019

@MylesBorins For this proposal, I think our challenge is to make sure we 100% will not use @nodejs/ for builtins? So that basically comes down to if anyone objects to a situation like this:

  • @nodejs/ for "offical" npm modules
  • nodejs: or something else for future builtins.

If we have a new module before the ES proposal enters stage 4 (I think that's very likely for quic), we may need to settle down on a home-grown prefix like nodejs:quic before it is coined in the proposal. But even then, the downside would be that we'll need to deal with 3 builtin specifiers (unprefixed, nodejs:quic, standard one) and need to provide transition from others to the standard one. I don't think that's too difficult to deal with, and there would probably be plenty of time for the transition to happen, but others might think otherwise.

@cjihrig
Copy link
Contributor

cjihrig commented Jun 5, 2019

Are we expecting users to install and use, for example, @nodejs/fs? If so, how would reaching into the bindings/internals be handled? We've put in a lot of work over the past few years to hide these things for a reason.

@joyeecheung
Copy link
Member

@cjihrig Good point, we probably need to make a decision on whether @nodejs/something is able to reach into internals or not. I think it would be a bit error-prone if we modify the user land module loader for this kind of special access - if we don't do that, this won't be able to address the issues for #27808 but may still be sufficient for something like whatwg-stream

@cclauss
Copy link
Contributor

cclauss commented Jun 5, 2019

The message that I got was that Node.js only "vendors in" dependencies. While I am not a fan of vendoring in, it was the strong consensus that "One of the goals of the build has always been to have as few external dependencies as possible, to decrease friction for builders and maintenance hassle for maintainers.". Are we moving away from that?

@joyeecheung
Copy link
Member

joyeecheung commented Jun 5, 2019

@cclauss I think that's a bit off-topic. This proposal is about maintaining npm modules which, IIUC,

  • Will not be maintained in this repo
  • Will not be included in the build

So having those does not affect our build or dependency management because they are not dependencies. Take readable-stream for an example.

@ljharb
Copy link
Member

ljharb commented Jun 5, 2019

It would be pretty bad i think if npm-installed modules somehow had superpowers; if it needs access to something that a userland module can’t access, then i don’t think it belongs published to npm.

@SMotaal
Copy link

SMotaal commented Jun 5, 2019

@ljhard — I think of a few pitfalls of one pipeline that could be improved here:

  1. Affording space to expand (or carve out) official but "auxiliary" features to the platform (I personally think back to the days of MatLab and its frameworks if that helps)

  2. Separating out noise of asymmetrically evolving "auxiliary" features offered by the platform for highly specialized tasks.

  3. Clearly drawing the boundaries of what is core and what is sometimes utilized by core for such specialized tasks.

So I can see this leading to a better alignment of versioning where core tracks with V8, without the noise generated from specialized tasks (not to mention the extra deadweight).

That said, the only decision for today is simply, do we want to experiment with this given the opportunity to do so with something that has no implications of its own yet?

@ljharb
Copy link
Member

ljharb commented Jun 5, 2019

I'm not advocating one pipeline, nor more than one - i'm saying that things published to npm should not get privileged access to internals.

@SMotaal
Copy link

SMotaal commented Jun 5, 2019

I was thinking that they would not — ie they would be bootstrapped by core only if they are verified, which imho is equally reliable from a safeguarding standpoint to the current monolithic distributions includes (like /usr/local/lib/node_modules/npm) — obviously minus the additional bootstrapping bit.

// package exposes something like leaving it out of the fold until called
export function bootstrap({x, y, z}): void

@ljharb
Copy link
Member

ljharb commented Jun 5, 2019

It's javascript on disk, what would "verified" even mean? the npm that ships with node doesn't have any privileged access, for example.

@joyeecheung
Copy link
Member

joyeecheung commented Jun 5, 2019

There is policy that we can use to verify user land modules, but with the entire CJS module loader and its internals e.g. the cache exposed and heavily mokey-patched in the ecosystem, I can imagine all sorts to tricks to bypass the checks if we allow any special access in the user land loader when policy is unused.

@bmeck
Copy link
Member

bmeck commented Jun 5, 2019

Indeed, policies are not meant to stop mutation but rather to ensure only whitelisted code can be loaded prior to mutation. Locking down core to prevent mutation would be a different feature akin to --frozen-intrisics. As such they are not likely to be suitable. A more robust solution would likely to be embedding the modules in the executable entirely similar to what we do with 'internal/'

@SMotaal
Copy link

SMotaal commented Jun 5, 2019

@joyeecheung I think it is important to take smaller steps and only ever consider privileged access after the not so much battles first iron out — There is no urgency for effectively addressing the issues but certainly the vigilance in discussing longer term concerns is a very good indication that everyone at least thought that far ahead for different reasons.

embedding the modules in the executable entirely

@bmeck I think from personal experience working on platform-enhanced code, and since one big aspect of being a published module here is that it potentially imports outside of Node.js… I was thinking that we can separate a package into a symbolic interface that can be even embedded in core (that which does not require too many updates) but bulk of the package with all implementation distributes separately.

In a sense, you can bootstrap using the browser-specific interface or when in Node.js, core will safely bootstrap such packages only when the installation verifies in some manner TBD.

@mcollina
Copy link
Member

mcollina commented Jun 9, 2019

I don’t think we can publish core subsystems to npm any time soon. Specifically this would require:

  1. porting all internals C++ to napi, possibly having a performance hit.
  2. create a release pipeline for all internal modules
  3. figure our a release strategy for our OpenSSL updates on npm

Therefore, I’m -1, it’s just too much work and I’m not sure it would benefit the project.

Note that I’m actually maintaining the only module that we are exposing on npm, readable-stream, which maintainance is complex. This module is currently extracted from node core and several regexp-based transformations are applied to it to make sure everything works in older node versions.

@addaleax
Copy link
Member

addaleax commented Jun 9, 2019

I don’t think we can publish core subsystems to npm any time soon. Specifically this would require:

I don’t think we can publish all core subsystems to npm soon. There’s definitely good candidates imo, esp. for non-C++-modules, e.g. util.inspect() (which has grown to amazing complexity), console, repl, etc.

Note that I’m actually maintaining the only module that we are exposing on npm, readable-stream, which maintainance is complex. This module is currently extracted from node core and several regexp-based transformations are applied to it to make sure everything works in older node versions.

I think it’s a good sign that the dependencies that we vendor in tend to make a lot less trouble. I’m still a fan of vendoring in streams, too.

@SMotaal
Copy link

SMotaal commented Jun 9, 2019

@mcollina I think @addaleax really speaks to the origin where this unfolds from, and that was something that Node.js would be better off giving the end user, but was not something that would make the cut to be core.

It benefits the project to minimize complexity, and sometimes something not provided by Node.js directly just because it was not core enough creates complexities in the ecosystem.

So, does this mean we are never getting to the scenario you are concerned about?! Likely not that particular one imho, and not any if never try on more tenable things almost begging to happen.

Sounds fair?

@mcollina
Copy link
Member

I think it’s a good sign that the dependencies that we vendor in tend to make a lot less trouble. I’m still a fan of vendoring in streams, too.

That's likely a good candidate to start with anyway.

@mhdawson
Copy link
Member

If I understand correctly I have a few questions about transitions.

If a module is external then moved to the nodejs org, we would then publish under @nodejs/org? This would mean end user code needs to be updated before we could stop publishing updates under the old name as well?

Is the main advantage of using @nodejs/X discoverability and implied trust as a module managed by the Node.js org or are there other advantages (assuming from the discussion that they won't have any special abilities versus other modules published to npm).

@bmeck
Copy link
Member

bmeck commented Jun 11, 2019 via email

@SMotaal
Copy link

SMotaal commented Jun 11, 2019

assuming from the discussion that they won't have any special abilities versus other modules published to npm

@mhdawson It seems fair to start there, yes, so such packages needing any platform APIs puts them equally on the table for everyone (ie it is safe to expose any relevant APIs or not being the only deciding factor)

If these are not embedded, wouldn't this add another search path on disk?

@bmeck can you give more context?

@bmeck
Copy link
Member

bmeck commented Jun 11, 2019

If the resources are not embedded in the binary, they live on disk somewhere. That location could vary and cannot be embedded in the binary. Therefore, when loading these vendored resources, node has to generate the location of the resources on disk somehow (in addition to integrity checking the resources)

@benjamingr
Copy link
Member Author

Hey, this went stale a while back. I think this can be closed with a reopen / new issue once/if discussion picks up :]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests