Implement the Yarn Plug'n'Play module resolution algorithm #2451

Merged
merged 2 commits into from Aug 10, 2022
16 changes: 16 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,21 @@
# Changelog

## Unreleased

**This release contains backwards-incompatible changes.** Since esbuild is before version 1.0.0, these changes have been released as a new minor version to reflect this (as [recommended by npm](https://docs.npmjs.com/cli/v6/using-npm/semver/)). You should either be pinning the exact version of `esbuild` in your `package.json` file or be using a version range syntax that only accepts patch upgrades such as `~0.14.0`. See the documentation about [semver](https://docs.npmjs.com/cli/v6/using-npm/semver/) for more information.

* Implement the Yarn Plug'n'Play module resolution algorithm ([#154](https://github.com/evanw/esbuild/issues/154), [#237](https://github.com/evanw/esbuild/issues/237), [#1263](https://github.com/evanw/esbuild/issues/1263), [#2451](https://github.com/evanw/esbuild/pull/2451))

[Node](https://nodejs.org/) comes with a package manager called [npm](https://www.npmjs.com/), which installs packages into a `node_modules` folder. Node and esbuild both come with built-in rules for resolving import paths to packages within `node_modules`, so packages installed via npm work automatically without any configuration. However, many people use an alternative package manager called [Yarn](https://yarnpkg.com/). While Yarn can install packages using `node_modules`, it also offers a different package installation strategy called [Plug'n'Play](https://yarnpkg.com/features/pnp/), which is often shortened to "PnP" (not to be confused with [pnpm](https://pnpm.io/), which is an entirely different, unrelated package manager).

Plug'n'Play installs packages as `.zip` files on your file system. The packages are never actually unzipped. Since Node doesn't know anything about Yarn's package installation strategy, you can no longer run your code with Node directly because it won't be able to find your packages. Instead, you need to run your code with Yarn, which applies patches to Node's file system APIs before running your code. These patches attempt to make zip files seem like normal directories. When running under Yarn, using Node's file system API to read `./some.zip/lib/file.js` automatically extracts `lib/file.js` from `./some.zip` at run-time as if it were a normal file. Other file system APIs behave similarly. However, these patches don't work with esbuild because esbuild is not written in JavaScript; it's a native binary executable that interacts with the file system directly through the operating system.

Previously the workaround for using esbuild with Plug'n'Play was to use the [`@yarnpkg/esbuild-plugin-pnp`](https://www.npmjs.com/package/@yarnpkg/esbuild-plugin-pnp) plugin with esbuild's JavaScript API. However, this wasn't great because the plugin needed to potentially intercept every single import path and file load to check whether it was a Plug'n'Play package, which has an unusually high performance cost. It also meant that certain subtleties of path resolution rules within a `.zip` file could differ slightly from the way esbuild normally works since path resolution inside `.zip` files was implemented by Yarn, not by esbuild (which is due to a limitation of esbuild's plugin API).

With this release, esbuild now contains an independent implementation of Yarn's Plug'n'Play algorithm (which is used when esbuild finds a `.pnp.js`, `.pnp.cjs`, or `.pnp.data.json` file in the directory tree). Creating additional implementations of this algorithm became possible because Yarn's package manifest format was recently documented: https://yarnpkg.com/advanced/pnp-spec/. This should mean that you can now use esbuild to bundle Plug'n'Play projects without any additional configuration (so you shouldn't need `@yarnpkg/esbuild-plugin-pnp` anymore). Bundling these projects should now happen much faster as Yarn no longer needs to be run at all. And path resolution rules within Yarn packages should now be consistent with how esbuild handles regular Node packages. For example, fields such as `module` and `browser` in `package.json` files within `.zip` files should now be respected.

Keep in mind that this is brand new code and there may be some initial issues to work through before esbuild's implementation is solid. Yarn's Plug'n'Play specification is also brand new and may need some follow-up edits to guide new implementations to match Yarn's exact behavior. If you try this out, make sure to test it before committing to using it, and let me know if anything isn't working as expected. Should you need to debug esbuild's path resolution, you may find `--log-level=verbose` helpful.

## 0.14.54

* Fix optimizations for calls containing spread arguments ([#2445](https://github.com/evanw/esbuild/issues/2445))
3 changes: 3 additions & 0 deletions internal/js_ast/js_ast.go
@@ -1771,6 +1771,9 @@ type AST struct {
ModuleScope *Scope
CharFreq *CharFreq

// This is internal-only data used for the implementation of Yarn PnP
ManifestForYarnPnP Expr

Hashbang string
Directive string
URLForCSS string
73 changes: 73 additions & 0 deletions internal/js_parser/js_parser.go
@@ -149,6 +149,10 @@ type parser struct {
loopBody js_ast.S
moduleScope *js_ast.Scope

// This is internal-only data used for the implementation of Yarn PnP
manifestForYarnPnP js_ast.Expr
stringLocalsForYarnPnP map[js_ast.Ref]stringLocalForYarnPnP

// This helps recognize the "await import()" pattern. When this is present,
// warnings about non-string import paths will be omitted inside try blocks.
awaitTarget js_ast.E
@@ -341,6 +345,11 @@ type parser struct {
isControlFlowDead bool
}

type stringLocalForYarnPnP struct {
value []uint16
loc logger.Loc
}

type injectedSymbolSource struct {
source logger.Source
loc logger.Loc
@@ -414,6 +423,17 @@ type optionsThatSupportStructuralEquality struct {
mangleQuoted bool
unusedImportFlagsTS config.UnusedImportFlagsTS
useDefineForClassFields config.MaybeBool

// This is an internal-only option used for the implementation of Yarn PnP
decodeHydrateRuntimeStateYarnPnP bool
}

func OptionsForYarnPnP() Options {
return Options{
optionsThatSupportStructuralEquality: optionsThatSupportStructuralEquality{
decodeHydrateRuntimeStateYarnPnP: true,
},
}
}

func OptionsFromConfig(options *config.Options) Options {
@@ -9463,6 +9483,18 @@ func (p *parser) visitAndAppendStmt(stmts []js_ast.Stmt, stmt js_ast.Stmt) []js_
}
}
}

// Yarn's PnP data may be stored in a variable: https://github.com/yarnpkg/berry/pull/4320
if p.options.decodeHydrateRuntimeStateYarnPnP {
if str, ok := d.ValueOrNil.Data.(*js_ast.EString); ok {
if id, ok := d.Binding.Data.(*js_ast.BIdentifier); ok {
if p.stringLocalsForYarnPnP == nil {
p.stringLocalsForYarnPnP = make(map[js_ast.Ref]stringLocalForYarnPnP)
}
p.stringLocalsForYarnPnP[id.Ref] = stringLocalForYarnPnP{value: str.Value, loc: d.ValueOrNil.Loc}
}
}
}
}

// Attempt to continue the const local prefix
@@ -14039,6 +14071,46 @@ func (p *parser) visitExprInOut(expr js_ast.Expr, in exprIn) (js_ast.Expr, exprO
e.Args[i] = arg
}

// Our hack for reading Yarn PnP files is implemented here:
if p.options.decodeHydrateRuntimeStateYarnPnP {
if id, ok := e.Target.Data.(*js_ast.EIdentifier); ok && p.symbols[id.Ref.InnerIndex].OriginalName == "hydrateRuntimeState" && len(e.Args) >= 1 {
switch arg := e.Args[0].Data.(type) {
case *js_ast.EObject:
// "hydrateRuntimeState(<object literal>)"
if arg := e.Args[0]; isValidJSON(arg) {
p.manifestForYarnPnP = arg
}

case *js_ast.ECall:
// "hydrateRuntimeState(JSON.parse(<something>))"
if len(arg.Args) == 1 {
if dot, ok := arg.Target.Data.(*js_ast.EDot); ok && dot.Name == "parse" {
if id, ok := dot.Target.Data.(*js_ast.EIdentifier); ok {
if symbol := &p.symbols[id.Ref.InnerIndex]; symbol.Kind == js_ast.SymbolUnbound && symbol.OriginalName == "JSON" {
arg := arg.Args[0]
switch a := arg.Data.(type) {
case *js_ast.EString:
// "hydrateRuntimeState(JSON.parse(<string literal>))"
source := logger.Source{KeyPath: p.source.KeyPath, Contents: helpers.UTF16ToString(a.Value)}
log := logger.NewStringInJSLog(p.log, &p.tracker, arg.Loc, source.Contents)
p.manifestForYarnPnP, _ = ParseJSON(log, source, JSONOptions{})

case *js_ast.EIdentifier:
// "hydrateRuntimeState(JSON.parse(<identifier>))"
if data, ok := p.stringLocalsForYarnPnP[a.Ref]; ok {
source := logger.Source{KeyPath: p.source.KeyPath, Contents: helpers.UTF16ToString(data.value)}
log := logger.NewStringInJSLog(p.log, &p.tracker, data.loc, source.Contents)
p.manifestForYarnPnP, _ = ParseJSON(log, source, JSONOptions{})
}
}
}
}
}
}
}
}
}

// Stop now if this call must be removed
if out.methodCallMustBeReplacedWithUndefined {
p.isControlFlowDead = oldIsControlFlowDead
@@ -16486,6 +16558,7 @@ func (p *parser) toAST(before, parts, after []js_ast.Part, hashbang string, dire
ApproximateLineCount: int32(p.lexer.ApproximateNewlineCount) + 1,
MangledProps: p.mangledProps,
ReservedProps: p.reservedProps,
ManifestForYarnPnP: p.manifestForYarnPnP,

// CommonJS features
UsesExportsRef: usesExportsRef,
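
The parser changes above recognize three call shapes in a `.pnp.cjs` or `.pnp.js` file: `hydrateRuntimeState(<object literal>)`, `hydrateRuntimeState(JSON.parse(<string literal>))`, and `hydrateRuntimeState(JSON.parse(<identifier>))`, where the identifier refers to a string recorded earlier. The following Go sketch restates that matching logic on a deliberately tiny expression tree. Every type and function in it (`Expr`, `Ident`, `Str`, `Obj`, `Dot`, `Call`, `extractManifest`, `validated`) is hypothetical and far simpler than esbuild's `js_ast`. It also uses `json.Valid` where esbuild uses its own `isValidJSON` and `ParseJSON`, and it does not check that `JSON` is an unbound symbol.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// A toy expression tree (hypothetical, not esbuild's js_ast package).
type Expr interface{}

type Ident struct{ Name string }
type Str struct{ Value string }
type Obj struct{ JSON string } // An object literal, kept as its source text
type Dot struct {
	Target Expr
	Name   string
}
type Call struct {
	Target Expr
	Args   []Expr
}

// validated returns text only if it is well-formed JSON.
func validated(text string) (string, bool) {
	if json.Valid([]byte(text)) {
		return text, true
	}
	return "", false
}

// extractManifest returns the manifest JSON text when call matches one of
// the three recognized shapes. stringLocals maps variable names to string
// values that were recorded while visiting earlier declarations.
func extractManifest(call Call, stringLocals map[string]string) (string, bool) {
	target, ok := call.Target.(Ident)
	if !ok || target.Name != "hydrateRuntimeState" || len(call.Args) < 1 {
		return "", false
	}
	switch arg := call.Args[0].(type) {
	case Obj:
		// "hydrateRuntimeState(<object literal>)"
		return validated(arg.JSON)

	case Call:
		// "hydrateRuntimeState(JSON.parse(<something>))"
		dot, ok := arg.Target.(Dot)
		if !ok || dot.Name != "parse" || len(arg.Args) != 1 {
			return "", false
		}
		if obj, ok := dot.Target.(Ident); !ok || obj.Name != "JSON" {
			return "", false
		}
		switch inner := arg.Args[0].(type) {
		case Str:
			return validated(inner.Value)
		case Ident:
			if value, ok := stringLocals[inner.Name]; ok {
				return validated(value)
			}
		}
	}
	return "", false
}

func main() {
	// The variable name and JSON contents are illustrative only.
	locals := map[string]string{"RAW_RUNTIME_STATE": `{"packageRegistryData": []}`}
	call := Call{
		Target: Ident{"hydrateRuntimeState"},
		Args: []Expr{Call{
			Target: Dot{Target: Ident{"JSON"}, Name: "parse"},
			Args:   []Expr{Ident{"RAW_RUNTIME_STATE"}},
		}},
	}
	fmt.Println(extractManifest(call, locals))
}
```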
31 changes: 31 additions & 0 deletions internal/js_parser/json_parser.go
@@ -187,3 +187,34 @@ func ParseJSON(log logger.Log, source logger.Source, options JSONOptions) (resul
p.lexer.Expect(js_lexer.TEndOfFile)
return
}

func isValidJSON(value js_ast.Expr) bool {
switch e := value.Data.(type) {
case *js_ast.ENull, *js_ast.EBoolean, *js_ast.EString, *js_ast.ENumber:
return true

case *js_ast.EArray:
for _, item := range e.Items {
if !isValidJSON(item) {
return false
}
}
return true

case *js_ast.EObject:
for _, property := range e.Properties {
if property.Kind != js_ast.PropertyNormal || property.Flags&(js_ast.PropertyIsComputed|js_ast.PropertyIsMethod) != 0 {
return false
}
if _, ok := property.Key.Data.(*js_ast.EString); !ok {
return false
}
if !isValidJSON(property.ValueOrNil) {
return false
}
}
return true
}

return false
}
162 changes: 162 additions & 0 deletions internal/logger/logger.go
@@ -1686,3 +1686,165 @@ func allowOverride(overrides map[MsgID]LogLevel, id MsgID, kind MsgKind) (MsgKin
}
return kind, true
}

// For Yarn PnP we sometimes parse JSON embedded in a JS string. This is a shim
// that remaps log message locations inside the embedded string literal into
// log messages in the actual JS file, which makes them easier to understand.
func NewStringInJSLog(log Log, outerTracker *LineColumnTracker, outerStringLiteralLoc Loc, innerContents string) Log {
type entry struct {
line int32
column int32
loc Loc
}

var table []entry
oldAddMsg := log.AddMsg

generateTable := func() {
i := 0
n := len(innerContents)
line := int32(1)
column := int32(0)
loc := Loc{Start: outerStringLiteralLoc.Start + 1}
outerContents := outerTracker.contents

for i < n {
// Ignore line continuations. A line continuation is not an escaped newline.
for {
if c, _ := utf8.DecodeRuneInString(outerContents[loc.Start:]); c != '\\' {
break
}
c, width := utf8.DecodeRuneInString(outerContents[loc.Start+1:])
switch c {
case '\n', '\r', '\u2028', '\u2029':
loc.Start += 1 + int32(width)
if c == '\r' && outerContents[loc.Start] == '\n' {
// Make sure Windows CRLF counts as a single newline
loc.Start++
}
continue
}
break
}

c, width := utf8.DecodeRuneInString(innerContents[i:])

// Compress the table using run-length encoding
table = append(table, entry{line: line, column: column, loc: loc})
if len(table) > 1 {
if last := table[len(table)-2]; line == last.line && loc.Start-column == last.loc.Start-last.column {
table = table[:len(table)-1]
}
}

// Advance the inner line/column
switch c {
case '\n', '\r', '\u2028', '\u2029':
line++
column = 0

// Handle newlines on Windows
if c == '\r' && i+1 < n && innerContents[i+1] == '\n' {
i++
}

default:
column += int32(width)
}
i += width

// Advance the outer loc, assuming the string syntax is already valid
c, width = utf8.DecodeRuneInString(outerContents[loc.Start:])
if c == '\r' && outerContents[loc.Start+1] == '\n' {
// Handle newlines on Windows in template literal strings
loc.Start += 2
} else if c != '\\' {
loc.Start += int32(width)
} else {
// Handle an escape sequence
c, width = utf8.DecodeRuneInString(outerContents[loc.Start+1:])
switch c {
case 'x':
// 2-digit hexadecimal
loc.Start += 1 + 2

case 'u':
loc.Start++
if outerContents[loc.Start] == '{' {
// Variable-length
for outerContents[loc.Start] != '}' {
loc.Start++
}
loc.Start++
} else {
// Fixed-length
loc.Start += 4
}

case '\n', '\r', '\u2028', '\u2029':
// This will be handled by the next iteration
break

default:
loc.Start += int32(width)
}
}
}
}

remapLineAndColumnToLoc := func(line int32, column int32) Loc {
count := len(table)
index := 0

// Binary search to find the previous entry
for count > 0 {
step := count / 2
i := index + step
if i+1 < len(table) {
if entry := table[i+1]; entry.line < line || (entry.line == line && entry.column < column) {
index = i + 1
count -= step + 1
continue
}
}
count = step
}

entry := table[index]
entry.loc.Start += column - entry.column // Undo run-length compression
return entry.loc
}

remapData := func(data MsgData) MsgData {
if data.Location == nil {
return data
}

// Generate and cache a lookup table to accelerate remappings
if table == nil {
generateTable()
}

// Generate a range in the outer source using the line/column/length in the inner source
r := Range{Loc: remapLineAndColumnToLoc(int32(data.Location.Line), int32(data.Location.Column))}
if data.Location.Length != 0 {
r.Len = remapLineAndColumnToLoc(int32(data.Location.Line), int32(data.Location.Column+data.Location.Length)).Start - r.Loc.Start
}

// Use that range to look up the line in the outer source
location := outerTracker.MsgData(r, data.Text).Location
location.Suggestion = data.Location.Suggestion
data.Location = location
return data
}

log.AddMsg = func(msg Msg) {
msg.Data = remapData(msg.Data)
for i, note := range msg.Notes {
msg.Notes[i] = remapData(note)
}
oldAddMsg(msg)
}

return log
}
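
The lookup half of `NewStringInJSLog` can be hard to follow in diff form, so here is a small standalone Go sketch of the same idea: a run-length-compressed table maps a (line, column) position inside the decoded string to a byte offset in the outer JS file. This is an approximation under stated assumptions, not the code above. The `entry` field names and the `remap` function are made up for this example, the search uses `sort.Search` instead of the hand-written loop, and a position that falls exactly on an entry boundary is resolved with that entry.

```go
package main

import (
	"fmt"
	"sort"
)

// entry says: the inner position (line, column) corresponds to byte offset
// "start" in the outer file, and the following columns map one-to-one until
// the next entry begins. The field names are hypothetical.
type entry struct {
	line, column, start int32
}

// remap converts an inner (line, column) into an outer byte offset using
// the last table entry that starts at or before that position.
func remap(table []entry, line, column int32) int32 {
	if len(table) == 0 {
		return 0
	}
	// Find the first entry that starts strictly after the position
	i := sort.Search(len(table), func(i int) bool {
		e := table[i]
		return e.line > line || (e.line == line && e.column > column)
	})
	if i > 0 {
		i--
	}
	e := table[i]
	return e.start + (column - e.column) // Undo run-length compression
}

func main() {
	// Outer source: "aAb" written with an escape as "a\u0041b" (quotes included).
	// The quote is at offset 0, "a" at 1, the escape occupies 2 through 7,
	// and "b" is at offset 8. Inner contents are "aAb" on line 1.
	table := []entry{
		{line: 1, column: 0, start: 1}, // Covers "a" and the start of the escape
		{line: 1, column: 2, start: 8}, // "b" comes after the 6-byte escape
	}
	fmt.Println(remap(table, 1, 0), remap(table, 1, 1), remap(table, 1, 2)) // prints "1 2 8"
}
```

Two entries are enough here because consecutive positions with the same offset difference share one entry, which is the compression that the table-building loop performs.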
31 changes: 31 additions & 0 deletions internal/resolver/resolver.go
@@ -691,6 +691,18 @@ func (r resolverQuery) finalizeResolve(result *ResolveResult) {
}

func (r resolverQuery) resolveWithoutSymlinks(sourceDir string, sourceDirInfo *dirInfo, importPath string) *ResolveResult {
// Find the parent directory with the Yarn PnP data
for info := sourceDirInfo; info != nil; info = info.parent {
if info.pnpData != nil {
if result, ok := r.pnpResolve(importPath, sourceDirInfo.absPath, info.pnpData); ok {
importPath = result // Continue with the module resolution algorithm from node.js
} else {
return nil // This is a module resolution error
}
break
}
}

// This implements the module resolution algorithm from node.js, which is
// described here: https://nodejs.org/api/modules.html#modules_all_together
var result ResolveResult
@@ -848,6 +860,7 @@ type dirInfo struct {
// All relevant information about this directory
absPath string
entries fs.DirEntries
pnpData *pnpData
packageJSON *packageJSON // Is there a "package.json" file in this directory?
enclosingPackageJSON *packageJSON // Is there a "package.json" file in this directory or a parent directory?
enclosingTSConfigJSON *TSConfigJSON // Is there a "tsconfig.json" file in this directory or a parent directory?
@@ -1176,6 +1189,24 @@ func (r resolverQuery) dirInfoUncached(path string) *dirInfo {
}
}

// Record if this directory has a Yarn PnP data file
if pnp, _ := entries.Get(".pnp.data.json"); pnp != nil && pnp.Kind(r.fs) == fs.FileEntry {
absPath := r.fs.Join(path, ".pnp.data.json")
if json := r.extractYarnPnPDataFromJSON(absPath, &r.caches.JSONCache); json.Data != nil {
info.pnpData = compileYarnPnPData(absPath, path, json)
}
} else if pnp, _ := entries.Get(".pnp.cjs"); pnp != nil && pnp.Kind(r.fs) == fs.FileEntry {
absPath := r.fs.Join(path, ".pnp.cjs")
if json := r.tryToExtractYarnPnPDataFromJS(absPath, &r.caches.JSONCache); json.Data != nil {
info.pnpData = compileYarnPnPData(absPath, path, json)
}
} else if pnp, _ := entries.Get(".pnp.js"); pnp != nil && pnp.Kind(r.fs) == fs.FileEntry {
absPath := r.fs.Join(path, ".pnp.js")
if json := r.tryToExtractYarnPnPDataFromJS(absPath, &r.caches.JSONCache); json.Data != nil {
info.pnpData = compileYarnPnPData(absPath, path, json)
}
}

return info
}
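
The control flow added to `resolveWithoutSymlinks` can be summarized as: find the nearest enclosing directory that has PnP data, let it rewrite the import path, and either continue with the normal node-style resolution or fail. The Go sketch below shows only that control flow. The `pnpData` contents, the `applyPnP` function, and `errNotFound` are hypothetical stand-ins: the real manifest is much richer than a name-to-directory map, and this toy version treats every import path as a package name.

```go
package main

import (
	"errors"
	"fmt"
)

// pnpData is a hypothetical stand-in for the compiled manifest. It maps a
// bare package name to the directory that the package resolves to.
type pnpData struct {
	packages map[string]string
}

// dirInfo mirrors only the fields needed for this sketch.
type dirInfo struct {
	absPath string
	parent  *dirInfo
	pnpData *pnpData
}

var errNotFound = errors.New("package not found in the PnP manifest")

// applyPnP finds the nearest enclosing directory with PnP data and asks it
// to rewrite importPath. With no PnP data anywhere, the path is returned
// unchanged so that regular resolution can continue.
func applyPnP(dir *dirInfo, importPath string) (string, error) {
	for info := dir; info != nil; info = info.parent {
		if info.pnpData == nil {
			continue
		}
		if rewritten, ok := info.pnpData.packages[importPath]; ok {
			return rewritten, nil // Continue with node-style resolution from here
		}
		return "", errNotFound // This is a module resolution error
	}
	return importPath, nil
}

func main() {
	root := &dirInfo{
		absPath: "/proj",
		pnpData: &pnpData{packages: map[string]string{
			"left-pad": "/proj/.yarn/cache/left-pad.zip/node_modules/left-pad",
		}},
	}
	src := &dirInfo{absPath: "/proj/src", parent: root}

	fmt.Println(applyPnP(src, "left-pad"))
	fmt.Println(applyPnP(src, "missing"))
}
```

Only the closest manifest is consulted, which matches the `break` after the first directory with `pnpData` in the change above.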
