Uses js-yaml for parsing #183

arcanis · 2019-05-19T10:42:26Z

My investigations revealed that the lockfile parsing was dramatically slow - more than 180ms for the lockfile in Yarn itself. Using js-yaml instead of our pegjs-generated parser makes it much more reasonable, in the range of ~30ms.

This diff uses js-yaml by default for the lockfile and yarnrc files, and falls back to the peg parser if it detects that those files start with # yarn lockfile v1 (because v1 used to write this header everywhere, including in the home yarnrc).
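The branching described above can be sketched roughly as follows. This is not the code from the diff: `Parser`, `looksLikeV1` and `parseWithFallback` are hypothetical names, the two parsers are passed in as plain functions standing in for js-yaml and the pegjs-generated parser, and scanning the whole leading comment block (rather than only the first line) is an assumption.

```typescript
// Hypothetical parser signature; js-yaml and the peg parser would both fit it.
type Parser = (source: string) => unknown;

// Header quoted in the PR description.
const V1_HEADER = "# yarn lockfile v1";

// Assumption: the header may follow other comment lines, so we walk the
// leading comment block and stop at the first line that is not a comment.
function looksLikeV1(source: string): boolean {
  for (const line of source.split(/\r?\n/)) {
    if (!line.startsWith("#")) return false;
    if (line.trim() === V1_HEADER) return true;
  }
  return false;
}

// Pick the parser up front instead of retrying after a failure.
function parseWithFallback(source: string, parseYaml: Parser, parseLegacy: Parser): unknown {
  return looksLikeV1(source) ? parseLegacy(source) : parseYaml(source);
}
```

Choosing the parser before parsing, rather than catching a js-yaml error and retrying, matters because some old-style inputs are also valid Yaml and would never raise an error.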

Some context:

Yaml and old-style files have overlapping and incompatible semantics: some strings can be parsed in both languages but produce different outputs. Because of this, we can't simply fall back to the peg parser when js-yaml fails to parse.
The js-yaml parser is fairly complex, and after spending a fair amount of time trying to make it compatible with the old-style syntax I don't think there's value in pursuing this path. Forking it would be easy infra-wise, but the required changes are non-trivial and it's likely we would get something wrong.
One significant issue with this approach is that old-style yarnrc files that have been written manually (such as the ones in repository roots) likely don't contain the version header and will thus be parsed as regular Yaml. We can likely try to support them in some capacity. For this we would need to check whether the result of parsing is a pure string (instead of an object). The question is: should we? It's a bit error-prone, as it will only detect problematic files at the top level.
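A minimal sketch of the top-level check mentioned in the last point, assuming the caller already has the value returned by a Yaml parser. `RootKind` and `classifyRoot` are hypothetical names, and the sample values in the comments are written by hand rather than produced by js-yaml.

```typescript
// Hypothetical classification of a parsed document root.
type RootKind = "object" | "probably-old-style" | "other";

function classifyRoot(parsed: unknown): RootKind {
  // A headerless old-style file is expected to come back from a Yaml
  // parser as one plain string rather than as a mapping (assumption).
  if (typeof parsed === "string") return "probably-old-style";
  if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) return "object";
  return "other";
}

// Hand-written stand-ins for parser output:
//   classifyRoot('registry "https://registry.yarnpkg.com"')  -> "probably-old-style"
//   classifyRoot({registry: "https://registry.yarnpkg.com"}) -> "object"
```

The check only looks at the root value, which is the limitation raised above: a file whose root is a mapping but whose nested values were misread as strings would still be classified as "object".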

olingern · 2019-05-19T15:29:50Z

packages/berry-parsers/sources/syml.ts

@@ -86,7 +89,11 @@ export function stringifySyml(value: any) {

 export function parseSyml(source: string) {
  try {
-    return parse(source.endsWith(`\n`) ? source : `${source}\n`);
+    try {
+      return safeLoad(source) as {[key: string]: any};


Is there a way to detect a yarn v1 lockfile vs a v2? A nested try catch seems like it would make it difficult to debug if parsing issues are opened.

arcanis · 2019-05-19

I don't think we can 🙁

The best we could do would be to always parse as strict Yaml, except when we detect a certain comment at the top of the file (like # yarn v1 and/or # THIS IS AN AUTOGENERATED FILE. DO NOT EDIT THIS FILE DIRECTLY, which was added everywhere back in v1). That doesn't seem very intuitive, however; I'd prefer the migration path to be as simple as possible.

arcanis · 2019-05-20

Welp, in the end I still have to resort to a check like this because the following is valid Yaml:

no-deps@*:
  version "1.0.0"
  resolved "https://registry.yarnpkg.com/no-deps/-/no-deps-1.0.0.tgz#39453512f8241e2d20307975e8d9eb6314f7bf61"
  integrity sha1-OUU1EvgkHi0gMHl16NnrYxT3v2E=

Except that it parses as an object whose value is a string ... 😣
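To make the mismatch concrete, here is a sketch of the two shapes involved for the entry above. Both values are written out by hand (neither parser is run), and `LegacyEntry`, `asYaml`, `asLegacy` and `entriesThatAreStrings` are hypothetical names used only for illustration.

```typescript
// Shape the old-style syntax intends for one lockfile entry (illustrative).
type LegacyEntry = {version: string; resolved: string; integrity: string};

const RESOLVED = "https://registry.yarnpkg.com/no-deps/-/no-deps-1.0.0.tgz#39453512f8241e2d20307975e8d9eb6314f7bf61";
const INTEGRITY = "sha1-OUU1EvgkHi0gMHl16NnrYxT3v2E=";

// What a Yaml parser is expected to return: the three fields collapse into
// a single plain scalar (assumption based on the comment above).
const asYaml: {[key: string]: unknown} = {
  "no-deps@*": `version "1.0.0" resolved "${RESOLVED}" integrity ${INTEGRITY}`,
};

// What the old-style parser is meant to return for the same text.
const asLegacy: {[key: string]: LegacyEntry} = {
  "no-deps@*": {version: "1.0.0", resolved: RESOLVED, integrity: INTEGRITY},
};

// Both roots are objects, so a root-only check cannot tell them apart;
// the difference only shows up one level down.
function entriesThatAreStrings(parsed: {[key: string]: unknown}): string[] {
  return Object.keys(parsed).filter(key => typeof parsed[key] === "string");
}
```

A per-entry heuristic like `entriesThatAreStrings` would misfire on yarnrc files, where string values are legitimate, which is one reason an explicit header check is the more predictable signal.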

arcanis · 2019-05-20T17:10:00Z

@Vlasenko @deini Can you review this diff (I've updated the PR summary with additional context)? I'm not entirely satisfied with this, but I don't see any other way to make a proper switch to Yaml 🙁

larixer · 2019-05-21T12:34:18Z

It should be okay if some hand-written files that were supposed to have Yaml syntax ended up with syntax errors that used to be forgiven. We face stricter error checking with all the tools in the JS ecosystem: think of tsc, eslint, etc. Code that seemed okay a while ago gets rejected by a new version of the tool because of stricter checks. The PR looks good to me.

arcanis · 2019-05-23T06:44:57Z

True, although the migration path isn't quite good enough yet, since even 1.16 still only parses a very small Yaml subset. I think I'll make 1.17 use js-yaml as well (as a fallback) so that once v2 is released people can migrate without too many issues.

arcanis · 2019-05-24T06:33:16Z

I've opened yarnpkg/yarn#7300 on the v1 trunk to add basic support for parsing Yaml files.

Zirro reviewed May 19, 2019

packages/berry-parsers/sources/syml.ts (outdated)

olingern reviewed May 19, 2019

arcanis added 3 commits May 19, 2019 21:00

Adds js-yaml

173f180

Optimizes a bit the lockfile parsing

5c6c036

Supports more cases

4d02b86

arcanis force-pushed the js-yaml branch from 16485ed to 4d02b86 Compare May 19, 2019 19:57

arcanis added 7 commits May 19, 2019 22:21

Supports more cases

c4182ba

Branches the parsing depending on the v1 tag

6653ef6

Updates the npm tests

d7c51ec

Fixes more tests

5eeb367

More tests fixed

7b9eae3

Other tests + documentation

1015653

Merge remote-tracking branch 'origin/master' into js-yaml

61e9200

Merge branch 'master' into js-yaml

0340687

arcanis mentioned this pull request May 24, 2019

Adds support for parsing Yaml files yarnpkg/yarn#7300

Merged

arcanis merged commit d403efb into master May 24, 2019

arcanis deleted the js-yaml branch May 24, 2019 06:32

arcanis mentioned this pull request Jun 24, 2019

[Enhancement] Improve the parser performances #162

Closed

paul-soporan mentioned this pull request Sep 1, 2022

feat!: parse yaml with custom nom-based parser written in rust #4807

Draft

14 tasks
