Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Uses js-yaml for parsing #183

Merged
merged 11 commits into from May 24, 2019
Merged

Uses js-yaml for parsing #183

merged 11 commits into from May 24, 2019

Conversation

arcanis
Copy link
Member

@arcanis arcanis commented May 19, 2019

My investigations revealed that the lockfile parsing was dramatically slow - more than 180ms for the lockfile in Yarn itself. Using js-yaml instead of our pegjs-generated parser makes it much more reasonable, in the range of ~30ms.

This diff uses js-yaml by default to the lockfile and yarnrc files, and fallbacks to the peg parser if it detects that those files start with # yarn lockfile v1 (because the v1 used to write this header everywhere, including in the home yarnrc).

Some context:

  • Yaml and old-style files have overlapping and incompatible semantics. Some strings that can be parsed in both languages will produce different outputs. Because of this, we can't simply fallback to the peg parser if js-yaml fails to parse.

  • The js-yaml parser is fairly complex, and after spending a fair amount of time trying to make it compatible with the old-style syntax I don't think there's value pursuing this path. Forking it would be easy infra-wise, but the required changes are non-trivial and it's likely we would get it wrong somehow.

  • One significant issue of this approach is that old-style yarnrc files that have been manually written (such as the ones in the repository roots) likely don't contain the version header and will thus be parsed as regular Yaml. We can likely try to support them in some capacity. For this we would need to check whether the result of parsing is a pure string (instead of an object). The question is: should we? It's a bit error-prone, as it will only detect problematic files at the top-level.

@@ -86,7 +89,11 @@ export function stringifySyml(value: any) {

export function parseSyml(source: string) {
try {
return parse(source.endsWith(`\n`) ? source : `${source}\n`);
try {
return safeLoad(source) as {[key: string]: any};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to detect a yarn v1 lockfile vs a v2? A nested try catch seems like it would make it difficult to debug if parsing issues are opened.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can 🙁

The best we could do would be to always parse in strict Yaml, except when we detect a certain comment at the top of the file (like # yarn v1 and/or # THIS IS AN AUTOGENERATED FILE. DO NOT EDIT THIS FILE DIRECTLY which was added everywhere back in the v1). That doesn't seem very intuitive however; I'd prefer the migration path to be as simple as possible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Welp, in the end I still have to resort to a check like this because the following is valid Yaml:

no-deps@*:
  version "1.0.0"
  resolved "https://registry.yarnpkg.com/no-deps/-/no-deps-1.0.0.tgz#39453512f8241e2d20307975e8d9eb6314f7bf61"
  integrity sha1-OUU1EvgkHi0gMHl16NnrYxT3v2E=

Except that it's an object whose value is a string ... 😣

@arcanis
Copy link
Member Author

arcanis commented May 20, 2019

@Vlasenko @deini Can you review this diff (I've updated the PR summary with additional context)? I'm not entirely satisfied about this, but I don't see any other way to make a proper switch to Yaml 🙁

@larixer
Copy link
Member

larixer commented May 21, 2019

It should be okay if some hand-written files were supposed to have yaml syntax but ended up having syntax errors that were forgiven. We face stricter error checking with all the tools in JS ecosystem, think of tsc, eslint, etc. The code that seems okay a while ago is rejected by new version of the tool, because of stricter checks. The PR looks good to me

@arcanis
Copy link
Member Author

arcanis commented May 23, 2019

True - although the migration path isn't quite good enough yet since even the 1.16 still only parse a very small Yaml subset. I think I'll make it use js-yaml as well in the 1.17 (as a fallback) so that once the v2 is released people can migrate without too much issues.

@arcanis arcanis merged commit d403efb into master May 24, 2019
@arcanis arcanis deleted the js-yaml branch May 24, 2019 06:32
@arcanis
Copy link
Member Author

arcanis commented May 24, 2019

I've opened yarnpkg/yarn#7300 on the v1 trunk to add basic support for parsing Yaml files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants