Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add zip file support #1985

Merged
merged 4 commits into from
Aug 10, 2022
Merged

add zip file support #1985

merged 4 commits into from
Aug 10, 2022

Conversation

evanw
Copy link
Owner

@evanw evanw commented Feb 4, 2022

This PR enables esbuild to transparently traverse into .zip files as if they were directories. This is relevant to Yarn which stores packages in .zip files and then monkey-patches the JavaScript file system API so other JS code doesn't need to know about it. That doesn't work for esbuild because it's written in Go, so Yarn needs an esbuild plugin to do this which is single-threaded and also doesn't handle all of the cases esbuild handles. This change to add .zip file support to esbuild seems to work fine, and it's not a lot of code to maintain.

I do still need to figure out what the lifecycle of the zip file reader is though. Right now they aren't closed. I also need to check that this works on Windows, which is always needs special-casing when it comes to path separators. And ideally this would have some test coverage.

@arcanis
Copy link

arcanis commented Feb 4, 2022

I don't know if you'd find that too specific to be in the core, but Yarn also makes a little change on the filesystem paths: if you have a path like /path/to/project/__virtual__/<segment>/<n>/path/to/file.js, we take /path/to/project, apply N times the .. operation (usually 0), and append path/to/file.js (the segment is completely ignored). So for instance:

/project/.yarn/__virtual__/something/0/cache/lodash.zip
  -> /project/.yarn/cache/lodash.zip

/project/subfolder/__virtual__/something/1/hello/world.js
  -> /project/hello/world.js

We do this because, to allow that a single package is instantiated multiple times (required when a package has peer dependencies that are provided by multiple different ancestors), they must have different paths on the disk (otherwise Node will cache them). Symlinks don't work, because Node runs realpath on them, hence those virtual paths, where in practice we replace something by the hash of the package ancestor, thus providing a different path for each peer dependency set.

@evanw
Copy link
Owner Author

evanw commented Feb 4, 2022

I don't know if you'd find that too specific to be in the core, but Yarn also makes a little change on the filesystem paths: if you have a path like /path/to/project/__virtual__/<segment>/<n>/path/to/file.js, we take /path/to/project, apply N times the .. operation (usually 0), and append path/to/file.js (the segment is completely ignored).

Thanks for the extra context. I don't fully understand what you're saying because I don't know enough about how Yarn works. But I'm guessing you're saying this:

  • The onResolve callback for @yarnpkg/esbuild-plugin-pnp might map an import of lodash/add.js into /project/.yarn/__virtual__/something/0/cache/lodash.zip/add.js

  • The onLoad callback for @yarnpkg/esbuild-plugin-pnp then takes in /project/.yarn/__virtual__/something/0/cache/lodash.zip/add.js and turns it into /project/.yarn/cache/lodash.zip/add.js because the actual zip file is /project/.yarn/cache/lodash.zip not /project/.yarn/__virtual__/something/0/cache/lodash.zip

My idea was for esbuild to fully replace onLoad since that should be faster than using a plugin. It means files can be loaded in parallel because everything stays in Go. If my understanding above is correct then I think esbuild would have to interpret __virtual__ as well when it replaces onLoad, otherwise it wouldn't find the .zip files.

If Yarn did this __virtual__ substitution in onResolve instead of esbuild doing it, then that would break semantics because esbuild (like node) also needs modules to have separate paths for their identities to be separate. Otherwise those modules would be instantiated once instead of being instantiated multiple times, which might be desirable from a bundle size perspective but is incorrect from a bundling perspective because it could result in behavior changes to the code.

And if Yarn did this __virtual__ substitution in its own onLoad, that would defeat the goal of having esbuild handle onLoad for performance reasons.

/project/subfolder/__virtual__/something/1/hello/world.js

When does this form without .yarn come up? It seems the most "clean" to only do this path manipulation when .yarn is a parent directory.

@evanw
Copy link
Owner Author

evanw commented Feb 4, 2022

Another idea: Once esbuild understands paths inside .zip files, you could potentially use the new-ish resolve API to reuse esbuild's internal logic for path remapping using package.json and other sources of information. The @yarnpkg/esbuild-plugin-pnp plugin could just map the package prefix to a prefix inside a .zip file in onResolve and then call resolve for esbuild to take it from there. I think that should fix things like #1983 without needing Yarn's esbuild plugin to copy all of esbuild's behaviors.

So specifically an import path of lodash/add could be mapped into project/.yarn/__virtual__/something/0/cache/lodash.zip/add and then passed to esbuild's resolve function which would then figure out that it needs to be project/.yarn/__virtual__/something/0/cache/lodash.zip/add.js. The drawback of this would be that it would be slower because it involves more IPC traffic. But the benefit is that the plugin would automatically match all of esbuild's various behaviors. Obviously the optimal solution would just be for esbuild to do it all but it sounds like stabilizing the PnP data format is still a while away.

Not sure if this is a good idea or not because of the performance impact, but I'm putting it out there anyway. Actually I could potentially add an API to say "call resolve on the return value for me" for onResolve to eliminate the extra IPC overhead. Hmm.

@arcanis
Copy link

arcanis commented Feb 5, 2022

But I'm guessing you're saying this:

  • The onResolve callback for @yarnpkg/esbuild-plugin-pnp might map an import of lodash/add.js into /project/.yarn/__virtual__/something/0/cache/lodash.zip/add.js
  • The onLoad callback for @yarnpkg/esbuild-plugin-pnp then takes in /project/.yarn/__virtual__/something/0/cache/lodash.zip/add.js and turns it into /project/.yarn/cache/lodash.zip/add.js because the actual zip file is /project/.yarn/cache/lodash.zip not /project/.yarn/__virtual__/something/0/cache/lodash.zip

Yep, exactly! We do this work in the onLoad so that Node/Esbuild always see the files as different entities.

When does this form without .yarn come up? It seems the most "clean" to only do this path manipulation when .yarn is a parent directory.

In practice probably never, but we have a virtualFolder setting that lets users tweak the location. I imagine it's seldom used, but it's difficult to say...

Another idea: Once esbuild understands paths inside .zip files, you could potentially use the new-ish resolve API to reuse esbuild's internal logic for path remapping using package.json and other sources of information.

I agree it'd be really nice! The PnP API even has a function (resolveToUnqualified) that does just that, which is what we try to use when possible (for example that's what Webpack's enhanced-resolve use).

Note however that some context has to be preserved for the resolution to work, in particular for the exports field to be taken into account (exports only apply to bare identifier resolutions, not to relative/absolute paths; so if the PnP API was calling Esbuild's resolve API with an absolute path, then we'd need a way to explain that originally it was a package path, and thus that the exports field should be taken into account).

Repository owner deleted a comment from CarlosBellange Feb 6, 2022
@bryanjtc
Copy link

Any progress on this?

@evanw
Copy link
Owner Author

evanw commented Jul 22, 2022

This is sort of blocked waiting on Yarn to expose the necessary data, which from what I understand is blocked on releasing another major version of Yarn. See yarnpkg/berry#3591 specifically:

Write a formal spec for the PnP data files, so that 3rd-party resolvers (in particular native resolvers) can implement their own

@arcanis arcanis mentioned this pull request Jul 26, 2022
3 tasks
@arcanis
Copy link

arcanis commented Jul 27, 2022

@evanw I've started writing a formal spec at yarnpkg/berry#4671 - do you mind reviewing it whenever you have some time?

@evanw
Copy link
Owner Author

evanw commented Jul 27, 2022

Thanks for putting that together! Took a look. Some thoughts (keep in mind while reading that I don't use Yarn myself):

  • The specification doesn't mention __virtual__. Does that mean that it can be implemented correctly without needing to do anything related to __virtual__? If not, should the specification mention __virtual__ somewhere?

  • It sounds like pnpEnableInlining defaults to off which means this stuff won't work by default. Would it be ok to try to extract this information from .pnp.js/.pnp.cjs in that case? I'm pretty tempted because it would make a lot of scenarios just start working without any configuration changes (assuming the data is in the same format as the one in the specification, which is what it looks like). I tried extracting the data and it was pretty easy. I'm fine using the data in the JavaScript file as long as the format is documented so I know what it means, and as long as it's unlikely to undergo a breaking change in the future. I'm guessing that the specification is just stabilizing Yarn's internal format, which means it's now unlikely to change?

  • I'm guessing that this unusual tuple-style representation of maps is intended to make it easy to pass it to the new Map constructor from ES6. Things made more sense when I realized this. Before I realized this, it was pretty confusing to see lots of nested arrays everywhere. For example, I was wondering how many elements could be in each array and what they meant. TypeScript type declarations might help maybe? The examples are great too though, so thanks for those.

  • Is the iteration order over these maps important? If so, are there examples of cases where this matters?

  • What is the format of fallbackExclusionList? I have a guess, but it's probably good to put it in the specification. My guess is that it's unpacked via something like this:

    for (const [ident, references] of manifest.fallbackExclusionList)
      for (const reference of references)
        fallbackExclusionList.push({ ident, reference })

    To be clear, I'm not saying the format should be changed. Just documented.

  • For each registryPkg entry in manifest.packageRegistryData

    FWIW it took me a while to figure out what this means. After reading the whole thing a few times, I now think this means something like this:

    for (const [packageIdent, x] of manifest.packageRegistryData) {
      for (const [packageReference, y] of x) {
        const registryPkg = { packageIdent, packageReference, ...y }
        ...
      }
    }
  • For each fallbackLocator in manifest.packageRegistryData

    I think you mean in manifest.fallbackPool?

  • Let reference be the entry from parentPkg.packageDependencies referenced by ident
    If dependency is null, then

    I think you mean if reference is null?

  • The function GET_PACKAGE is never defined. After reading everything, I'm guessing it goes something like this:

    function GET_PACKAGE(manifest, { ident, reference }) {
      for (const [i, [r, data]] of manifest.packageRegistryData)
        if (i === ident && r === reference)
          return data
      return null
    }

    Obviously it's more efficient in reality since I assume these would be map lookups.

  • If url starts with registryPkg, then

    Starts with what? registryPkg.ident? registryPkg.reference? registryPkg.packageLocation? Something else? I'm guessing it's registryPkg.ident, but I'm flagging this because I found this confusing.

    It's also not clear what exactly "starts with means". For example, "react-dom" starts with "react". Is that supposed to match? Or is it supposed to be some sort of path-based "starts with" instead of string-based "starts with"?

  • What is supposed to happen if these GET_PACKAGE calls fail to find the package?

    Let parentPkg be GET_PACKAGE(manifest, parentLocator)
    Let fallbackPkg be GET_PACKAGE(manifest, fallbackLocator)
    Let dependencyPkg be GET_PACKAGE(manifest, {ident, reference})

  • What's supposed to happen if RESOLVE_VIA_FALLBACK returns null here?

    Set resolved to RESOLVE_VIA_FALLBACK(manifest, specifier) and return it

    Generally auditing everything to make sure failure paths are covered could be good. I might have missed some.

  • Let ident be the package scope and name from specifier

    I think I understand what this is saying, but you might want to be more explicit here and/or show some examples. I'm guessing this goes something like how it goes in node (copied below):

    1. If packageSpecifier does not start with "@", then
      1. Set packageName to the substring of packageSpecifier until the first
        "/" separator or the end of the string.
    2. Otherwise,
      1. If packageSpecifier does not contain a "/" separator, then
        1. Throw an Invalid Module Specifier error.
      2. Set packageName to the substring of packageSpecifier
        until the second "/" separator or the end of the string.
    3. If packageName starts with "." or contains "\" or "%", then
      1. Throw an Invalid Module Specifier error.

    It's ok if it's different in Yarn. But would be good to know if there are any characters that might cause the package scope and name to be considered invalid, for example.

  • This isn't strictly necessary, but it would make my life easier if there was either a big reference project that uses all of these features or some sort of easy-to-use test suite for this to help ensure that my implementation is correct.

I also tried implementing this specification on a branch here if it's interesting to you: https://github.com/evanw/esbuild/commits/yarn-pnp.

@merceyz
Copy link

merceyz commented Jul 27, 2022

That's some great feedback @evanw, could you make it in yarnpkg/berry#4671 instead/as well so readers of that PR can see it?

@evanw
Copy link
Owner Author

evanw commented Jul 27, 2022

My intent of posting a comment here was to provide one-off feedback instead of being drawn into the spec development process. You are welcome to do what you like with my feedback.

@merceyz
Copy link

merceyz commented Jul 27, 2022

Ah, that's totally fine, feel free to hide these three comments to not distract from the discussion 👍

arcanis added a commit to yarnpkg/berry that referenced this pull request Jul 28, 2022
@arcanis
Copy link

arcanis commented Jul 28, 2022

Thanks for the review! I've updated the document and corrected a few other places. A couple of important mistakes I fixed:

  • About fallbackPool - it's actually not the list of locators whose dependencies should act as fallbacks, but rather the fallback dependencies themselves (see the updated RESOLVE_VIA_FALLBACK). The dependencies of the top-level package (ie the one referenced by null, null in packageRegistryData) are however implicitly added to fallbackPool, overriding them¹

  • The dependencies listed in packageDependencies (and fallbackPool) are a map where the key is an ident, and the string is either a string (a reference) or a new [ident, reference] tuple. This structure only happens when a package is "aliased" to another (for example you write in your lockfile that foo resolves to bar@1.0.0 - in this case, the mapping in the manifest would be ["foo", ["bar", "npm:1.0.0"]]). Doesn't happen often, and probably could be deprecated, but I haven't investigated enough yet so I still preferred to write it into the spec. See step 12 of RESOLVE_TO_UNQUALIFIED .

  • In your case (a tool part of a project) the FIND_PNP_MANIFEST function should probably only be called once for the whole process. I clarified the function definition to make that more obvious.

Regarding your detailed feedbacks:

The specification doesn't mention virtual

I've added a section about this.

Would it be ok to try to extract this information from .pnp.js/.pnp.cjs in that case?

I think so, although we perhaps should add markers around the data string, to make sure it doesn't change? 🤔

I was wondering how many elements could be in each array and what they meant. TypeScript type declarations might help maybe?

As a workaround we have some types here

Is the iteration order over these maps important? If so, are there examples of cases where this matters?

I don't think so - we shouldn't have duplicate keys in those data, so order shouldn't matter. The only place where order matters is inside FIND_LOCATOR, hence the length check. While there can be two entries with the same packageLocation, they should have the exact same packageDependencies so it shouldn't have impact.

Starts with what? registryPkg.ident? registryPkg.reference? registryPkg.packageLocation? Something else? I'm guessing it's registryPkg.ident, but I'm flagging this because I found this confusing.

Starts with packageLocation - basically we're looking for the deepest package location that contains the given module url. Note that packageLocation always ends with a trailing /, you wouldn't have ./react but ./react/ (and thus it wouldn't match ./react-dom/).

What is supposed to happen if these GET_PACKAGE calls fail to find the package?

Clarified that the function cannot fail, because all packages referenced in the manifest are guaranteed to have a matching entry inside the packageRegistryData map.

What's supposed to happen if RESOLVE_VIA_FALLBACK returns null here?

Clarified that it returns undefined, which bubbles into RESOLVE_TO_UNQUALIFIED, which throws an exception

This isn't strictly necessary, but it would make my life easier if there was either a big reference project that uses all of these features or some sort of easy-to-use test suite for this to help ensure that my implementation is correct.

I've started writing tests here. I wrote them in JSON so it's easier for you to integrate into your own test runner; in our case, the code that runs this JSON is here: pnpStandardRunner.test.ts.


¹ This behaviour was how the fallback worked, before fallbackPool even existed, but we probably could deprecate it to only use the fallback pool ... still, that's how it currently works, so I wrote it as such in the spec.

arcanis added a commit to yarnpkg/berry that referenced this pull request Aug 4, 2022
* Writes a PnP spec

* Formalizes the algorithms

* Updates the PnP header

* Addresses feedbacks from evanw/esbuild#1985

* Starts writing reusable tests

* More tests, clarifications on FIND_PNP_MANIFEST

* Updates style for code segments

* Adds tests for aliases

* Fixes inadvertent default setting change

* Fixes tests

* Versions
arcanis added a commit to yarnpkg/berry that referenced this pull request Aug 8, 2022
* Writes a PnP spec

* Formalizes the algorithms

* Updates the PnP header

* Addresses feedbacks from evanw/esbuild#1985

* Starts writing reusable tests

* More tests, clarifications on FIND_PNP_MANIFEST

* Updates style for code segments

* Adds tests for aliases

* Fixes inadvertent default setting change

* Fixes tests

* Versions
@arcanis
Copy link

arcanis commented Aug 8, 2022

I just deployed the documentation to the website: https://yarnpkg.com/advanced/pnp-spec/

@evanw
Copy link
Owner Author

evanw commented Aug 9, 2022

Thank you for publishing the specification and the tests. I have updated my branch here, including incorporating the tests: https://github.com/evanw/esbuild/tree/yarn-pnp. Unfortunately many of the tests fail with the algorithm in the specification. I was able to get all tests to pass by deviating from the specification and following Yarn's implementation instead. I have the following additional feedback regarding the specification (sorry for another long post):

  1. I believe RESOLVE_VIA_FALLBACK(manifest, specifier) should be RESOLVE_VIA_FALLBACK(manifest, ident) since the function body uses ident instead of specifier

  2. The dependencyTreeRoots value is defined but never used. I'm assuming this means it can be ignored because it's irrelevant for implementing Yarn PnP?

  3. I think the data format for fallbackPool is incorrect. The specification has this example data:

    fallbackPool: [{
      name: "@app/monorepo",
      reference: "workspace:.",
    }],

    But the tests look like this:

    "fallbackPool": [
      ["test-2", "npm:1.0.0"],
      ["alias", ["test-1", "npm:1.0.0"]]
    ],

    The type definitions in the implementation also use triple-nested arrays instead of arrays of objects. I changed esbuild to follow the data format used by the implementation to fix the failing fallback-related tests.

  4. I suspect that the definition of FIND_LOCATOR is incorrect. Here is the definition:

    FIND_LOCATOR(manifest, moduleUrl)

    Note: The algorithm described here is quite inefficient. You should make sure to prepare data structure more suited for this task when you read the manifest.

    1. Let bestLength be 0
    2. Let bestLocator be null
    3. Let relativeUrl be the relative path between manifest and moduleUrl
      1. Note: Make sure it always starts with a ./ or ../
    4. If relativeUrl matches manifest.ignorePatternData, then
      1. Return null
    5. For each referenceMap value in manifest.packageRegistryData
      1. For each registryPkg value in referenceMap
        1. If registryPkg.discardFromLookup isn't true, then
          1. If registryPkg.packageLocation.length is greater than bestLength, then
            1. If relativeUrl starts with registryPkg.packageLocation, then
              1. Set bestLength to registryPkg.packageLocation.length
              2. Set bestLocator to the current registryPkg locator
    6. Return bestLocator

    However, here is the first manifest in the set of test cases:

    {
      "__info": [],
      "dependencyTreeRoots": [{
        "name": "root",
        "reference": "workspace:."
      }],
      "ignorePatternData": null,
      "enableTopLevelFallback": false,
      "fallbackPool": [],
      "fallbackExclusionList": [],
      "packageRegistryData": [
        [null, [
          [null, {
            "packageLocation": "./",
            "packageDependencies": [],
            "linkType": "SOFT"
          }]
        ]],
        ["root", [
          ["workspace:.", {
            "packageLocation": "./",
            "packageDependencies": [["test", "npm:1.0.0"]],
            "linkType": "SOFT"
          }]
        ]],
        ...
      ]
    }

    Doing a lookup for ./ in this map, which is what the first test does, gives a locator of [null, null] via the algorithm in the specification but ["root", "workspace:."] via real Yarn PnP. Thus the first test incorrectly fails because test is not found as a dependency.

    I tried digging around for the real implementation of FIND_LOCATOR and I'm assuming it's this:

    function findPackageLocator(location: PortablePath, {resolveIgnored = false, includeDiscardFromLookup = false}: {resolveIgnored?: boolean, includeDiscardFromLookup?: boolean} = {}): PhysicalPackageLocator | null {
      if (isPathIgnored(location) && !resolveIgnored) return null;
      let relativeLocation = ppath.relative(runtimeState.basePath, location);
      if (!relativeLocation.match(isStrictRegExp)) relativeLocation = `./${relativeLocation}` as PortablePath;
      if (!relativeLocation.endsWith(`/`)) relativeLocation = `${relativeLocation}/` as PortablePath;
      do {
        const entry = packageLocatorsByLocations.get(relativeLocation);
        if (typeof entry === `undefined` || (entry.discardFromLookup && !includeDiscardFromLookup)) {
          relativeLocation = relativeLocation.substring(0, relativeLocation.lastIndexOf(`/`, relativeLocation.length - 2) + 1) as PortablePath;
          continue;
        }
        return entry.locator;
      } while (relativeLocation !== ``);
      return null;
    }

    Specifically it uses the packageLocatorsByLocations map, which is populated like this:

    const packageLocatorsByLocations = new Map<PortablePath, {locator: PhysicalPackageLocator, discardFromLookup: boolean}>();
    data.packageRegistryData.forEach(([packageName, packageStoreData]) => packageStoreData.forEach(([packageReference, packageInformationData]) => {
      if ((packageName === null) !== (packageReference === null)) throw new Error(`Assertion failed: The name and reference should be null, or neither should`);
      const discardFromLookup = packageInformationData.discardFromLookup ?? false;
      const packageLocator: PhysicalPackageLocator = {name: packageName, reference: packageReference};
      const entry = packageLocatorsByLocations.get(packageInformationData.packageLocation);
      if (!entry) {
        packageLocatorsByLocations.set(packageInformationData.packageLocation, {locator: packageLocator, discardFromLookup});
      } else {
        entry.discardFromLookup = entry.discardFromLookup && discardFromLookup;
        if (!discardFromLookup) {
          entry.locator = packageLocator;
        }
      }
    }));

    This seems to indicate that a) iteration order matters after all for duplicate packageLocation values and b) that the "If registryPkg.packageLocation.length is greater than bestLength" condition isn't sufficient to capture what Yarn PnP is doing.

    Hopefully I'm correct that this is a deficiency of the specification and not a transcription error of mine. If it's a problem with the specification, then I leave it up to you how you want to resolve it. One suggestion could be to just specify the efficient algorithm Yarn actually uses instead of specifying the inefficient algorithm that Yarn doesn't use. I have made esbuild follow Yarn's implementation here (specifically packageLocatorsByLocations) and that fixed the test failures.

  5. I think the part regarding ignorePatternData is also incorrect. Here's the specification:

    1. Let relativeUrl be the relative path between manifest and moduleUrl

      1. Note: Make sure it always starts with a ./ or ../
    2. If relativeUrl matches manifest.ignorePatternData, then

      1. Return null

    This causes the test shouldn't go through PnP when trying to resolve dependencies from packages covered by ignorePatternData to fail. The problem is that the specification matches with ignorePatternData after inserting the ./ prefix but the implementation tests for ignorePatternData before inserting the ./ prefix:

    function findPackageLocator(location: PortablePath, {resolveIgnored = false, includeDiscardFromLookup = false}: {resolveIgnored?: boolean, includeDiscardFromLookup?: boolean} = {}): PhysicalPackageLocator | null {
      if (isPathIgnored(location) && !resolveIgnored) return null;
      let relativeLocation = ppath.relative(runtimeState.basePath, location);
      if (!relativeLocation.match(isStrictRegExp)) relativeLocation = `./${relativeLocation}` as PortablePath;
      ...
      return null;
    }

    I changed esbuild to follow the implementation to fix this test.

  6. I think the fallback logic in RESOLVE_TO_UNQUALIFIED is incorrect. Here's the specification:

    1. Let referenceOrAlias be the entry from parentPkg.packageDependencies referenced by ident

    2. If referenceOrAlias is undefined, then

      1. If manifest.enableTopLevelFallback is true, then

        1. If parentLocator isn't in manifest.fallbackExclusionList, then

          1. Set referenceOrAlias to RESOLVE_VIA_FALLBACK(manifest, ident)
    3. If referenceOrAlias is still undefined, then

      1. Throw a resolution error
    4. If referenceOrAlias is null, then

      1. Note: It means that parentPkg has an unfulfilled peer dependency on ident

      2. Throw a resolution error

    However, the test called should use the top-level fallback if a dependency is missing because of an unfulfilled peer dependency doesn't work with this algorithm. The problem is that parentPkg.packageDependencies returns null for referenceOrAlias which by the specification means no fallback should be attempted (since that only happens if referenceOrAlias is undefined). But the implementation checks for if (dependencyReference == null) { to do the fallback, which means the fallback should happen if dependencyReference is either null or undefined. I changed esbuild to follow the implementation here.

@arcanis
Copy link

arcanis commented Aug 9, 2022

I've opened yarnpkg/berry#4724 to address the problems you pointed. A few notes:

I have updated my branch here, including incorporating the tests: https://github.com/evanw/esbuild/tree/yarn-pnp

I tried it and it worked out of the box with lodash as dependency! However it failed with clipanion and typanion. Repro:

cd $(mktemp -d)
yarn init -2
yarn add typanion
echo "import 'typanion'" > index.js
~/esbuild/esbuild ./index.js --bundle

I think it has something to do with the zip filesystem - running yarn unplug typanion (which extracts the package inside a real folder in .yarn/unplugged) workarounds the issue.

I have also tried to run the build on the Yarn repo, but it failed to resolve the workspaces. If you want to try it:

~/esbuild/esbuild packages/yarnpkg-cli/sources/cli.ts --bundle

Doing a lookup for ./ in this map, which is what the first test does, gives a locator of [null, null] via the algorithm in the specification but ["root", "workspace:."] via real Yarn PnP. Thus the first test incorrectly fails because test is not found as a dependency. This seems to indicate that iteration order matters after all for duplicate packageLocation values.

That's a bug in the test; the null / null and the root@workspace:. entries should have the same packageDependencies, because they are the same package (they have the same packageLocation, so it wouldn't make sense for them to have different dependency sets).

I'm assuming this means it can be ignored because it's irrelevant for implementing Yarn PnP?

Correct - it's only useful at runtime for the PnP API consumers.

The problem is that the specification matches with ignorePatternData after inserting the ./ prefix but the implementation tests for ignorePatternData before inserting the ./ prefix

Indeed - I originally fixed that in the ignorePatternData description, but forgot to fix the function definition.

@arcanis

This comment was marked as outdated.

@arcanis
Copy link

arcanis commented Aug 9, 2022

Scratch my previous post - the reason is just that we changed the hook format in yarnpkg/berry#4320; before:

#!/usr/bin/env node
/* eslint-disable */

try {
  Object.freeze({}).detectStrictMode = true;
} catch (error) {
  throw new Error(`The whole PnP file got strict-mode-ified, which is known to break (Emscripten libraries aren't strict mode). This usually happens when the file goes through Babel.`);
}

function $$SETUP_STATE(hydrateRuntimeState, basePath) {
  return hydrateRuntimeState(JSON.parse('{\
    "__info": [\
      "This file is automatically generated. Do not touch it, or risk",\
      "your modifications being lost. We also recommend you not to read",\
      "it either without using the @yarnpkg/pnp package, as the data layout",\
      "is entirely unspecified and WILL change from a version to another."\
    ],\
    "dependencyTreeRoots": [\
      {\

After:

#!/usr/bin/env node
/* eslint-disable */

try {
  Object.freeze({}).detectStrictMode = true;
} catch (error) {
  throw new Error(`The whole PnP file got strict-mode-ified, which is known to break (Emscripten libraries aren't strict mode). This usually happens when the file goes through Babel.`);
}

const RAW_RUNTIME_STATE =
'{\
  "__info": [\
    "This file is automatically generated. Do not touch it, or risk",\
    "your modifications being lost."\
  ],\
  "dependencyTreeRoots": [\
    {\

@evanw
Copy link
Owner Author

evanw commented Aug 10, 2022

Thanks! I updated my branch, which now appears to be working correctly in all of these cases. So it seems like it's ready for people to try out. I'm planning to move forward here by landing this zip PR, then landing my branch, then releasing these changes as a new minor release.

@evanw evanw merged commit c58fe49 into master Aug 10, 2022
@evanw evanw deleted the zip branch August 10, 2022 01:24
merceyz pushed a commit to yarnpkg/berry that referenced this pull request Sep 21, 2022
* Writes a PnP spec

* Formalizes the algorithms

* Updates the PnP header

* Addresses feedbacks from evanw/esbuild#1985

* Starts writing reusable tests

* More tests, clarifications on FIND_PNP_MANIFEST

* Updates style for code segments

* Adds tests for aliases

* Fixes inadvertent default setting change

* Fixes tests

* Versions
merceyz pushed a commit to yarnpkg/berry that referenced this pull request Oct 5, 2022
* Writes a PnP spec

* Formalizes the algorithms

* Updates the PnP header

* Addresses feedbacks from evanw/esbuild#1985

* Starts writing reusable tests

* More tests, clarifications on FIND_PNP_MANIFEST

* Updates style for code segments

* Adds tests for aliases

* Fixes inadvertent default setting change

* Fixes tests

* Versions
merceyz pushed a commit to yarnpkg/berry that referenced this pull request Nov 16, 2022
* Writes a PnP spec

* Formalizes the algorithms

* Updates the PnP header

* Addresses feedbacks from evanw/esbuild#1985

* Starts writing reusable tests

* More tests, clarifications on FIND_PNP_MANIFEST

* Updates style for code segments

* Adds tests for aliases

* Fixes inadvertent default setting change

* Fixes tests

* Versions
merceyz pushed a commit to yarnpkg/berry that referenced this pull request Dec 20, 2022
* Writes a PnP spec

* Formalizes the algorithms

* Updates the PnP header

* Addresses feedbacks from evanw/esbuild#1985

* Starts writing reusable tests

* More tests, clarifications on FIND_PNP_MANIFEST

* Updates style for code segments

* Adds tests for aliases

* Fixes inadvertent default setting change

* Fixes tests

* Versions
merceyz pushed a commit to yarnpkg/berry that referenced this pull request Feb 1, 2023
* Writes a PnP spec

* Formalizes the algorithms

* Updates the PnP header

* Addresses feedbacks from evanw/esbuild#1985

* Starts writing reusable tests

* More tests, clarifications on FIND_PNP_MANIFEST

* Updates style for code segments

* Adds tests for aliases

* Fixes inadvertent default setting change

* Fixes tests

* Versions
merceyz pushed a commit to yarnpkg/berry that referenced this pull request Mar 16, 2023
* Writes a PnP spec

* Formalizes the algorithms

* Updates the PnP header

* Addresses feedbacks from evanw/esbuild#1985

* Starts writing reusable tests

* More tests, clarifications on FIND_PNP_MANIFEST

* Updates style for code segments

* Adds tests for aliases

* Fixes inadvertent default setting change

* Fixes tests

* Versions
merceyz pushed a commit to yarnpkg/berry that referenced this pull request May 1, 2023
* Writes a PnP spec

* Formalizes the algorithms

* Updates the PnP header

* Addresses feedbacks from evanw/esbuild#1985

* Starts writing reusable tests

* More tests, clarifications on FIND_PNP_MANIFEST

* Updates style for code segments

* Adds tests for aliases

* Fixes inadvertent default setting change

* Fixes tests

* Versions
merceyz pushed a commit to yarnpkg/berry that referenced this pull request Jun 1, 2023
* Writes a PnP spec

* Formalizes the algorithms

* Updates the PnP header

* Addresses feedbacks from evanw/esbuild#1985

* Starts writing reusable tests

* More tests, clarifications on FIND_PNP_MANIFEST

* Updates style for code segments

* Adds tests for aliases

* Fixes inadvertent default setting change

* Fixes tests

* Versions
arcanis added a commit to yarnpkg/example-repo-zipn that referenced this pull request Jul 3, 2023
* Writes a PnP spec

* Formalizes the algorithms

* Updates the PnP header

* Addresses feedbacks from evanw/esbuild#1985

* Starts writing reusable tests

* More tests, clarifications on FIND_PNP_MANIFEST

* Updates style for code segments

* Adds tests for aliases

* Fixes inadvertent default setting change

* Fixes tests

* Versions
merceyz pushed a commit to yarnpkg/berry that referenced this pull request Nov 14, 2023
* Writes a PnP spec

* Formalizes the algorithms

* Updates the PnP header

* Addresses feedbacks from evanw/esbuild#1985

* Starts writing reusable tests

* More tests, clarifications on FIND_PNP_MANIFEST

* Updates style for code segments

* Adds tests for aliases

* Fixes inadvertent default setting change

* Fixes tests

* Versions
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants