Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add stripIgnoredTokens feature to remove insignificant whitespace #1628

Closed
wants to merge 25 commits into from
Closed

Conversation

rybon
Copy link

@rybon rybon commented Dec 24, 2018

Solves #1523.

@facebook-github-bot
Copy link

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please sign up at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need the corporate CLA signed.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

@rybon rybon changed the title Added condense feature to printer to remove non-significant whitespace Add condense feature to printer to remove non-significant whitespace Dec 24, 2018
@facebook-github-bot
Copy link

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!

@rybon
Copy link
Author

rybon commented Dec 28, 2018

@IvanGoncharov can you review this one? Thanks.

Copy link

@jgcmarins jgcmarins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

@IvanGoncharov
Copy link
Member

@rybon Thanks for PR 👍 and Happy New Year 🎄

Thanks a lot for writting bunch of test cases.
Here is a few desing question to move this PR forward:


Proposed functionality doesn't work with AST so I don't see any benefits in having it as part of print. Moreover it's hard to use with printSchema:

const schema = buildClientSchema(introspection);
let sdl = printSchema(schema);
const ast = parse(sdl);
sdl = print(ast, { condense: true });

I think we should make 'condense' a standalone function inside utils.


RegExp are good for PoC but it's a big maintainance problem in the future.
Plus the are huge number of edge cases, e.g. simplest one is:

{
  compile(jsCode: "function () { console.log('Valid JS code inside GraphQL string'); }")
}

You just can't handle such use cases with RegExp and should use lexer to correctly implement this functionality:

const lexer = createLexer(
new Source(`{
#comment
field
}`),
);
const startToken = lexer.token;
let endToken;
do {
endToken = lexer.advance();
// Lexer advances over ignored comment tokens to make writing parsers
// easier, but will include them in the linked list result.
expect(endToken.kind).not.to.equal('Comment');
} while (endToken.kind !== '<EOF>');


Last issue for me is the name, I understand why you can't call it minify (names of variables, fragments and operations stay unchanged) but I personally think it's imposible to figure out what condense do just from its name.
Since you just remove ignored tokens how about renaming it to removeIgnoredTokens?

@rybon What do you think?

@rybon
Copy link
Author

rybon commented Jan 2, 2019

@IvanGoncharov Happy New Year to you too :-). Thanks for the review.

I've put this feature in the printer module as that is the module that 'prints' an AST to a string. But if that isn't the right place to put it and we should move it to utils instead, fine by me. I don't really have an opinion on that. I'd rather have something in place in this library than nothing.

I agree that RegExp is kinda hacky and not an ideal solution. There are indeed a number of edge cases that are very hard to catch with this approach. And if the GraphQL spec expands, one would need to expand the RegExp as well. I don't have any experience with ASTs and lexers, so I don't know how to implement this feature with that approach instead. But I'm willing to try it out. Would you (or anyone else) be willing to help me get started on this?

I'm not really attached to the name condense. Naming things is hard. I was trying to come up with a name that is descriptive of the feature (https://en.wiktionary.org/wiki/condense) and minify didn't quite feel right. But another name is fine by me.

@IvanGoncharov
Copy link
Member

IvanGoncharov commented Jan 2, 2019

@rybon Sorry, I accidentally screw this PR trying to add commit on top of it.

I don't have any experience with ASTs and lexers, so I don't know how to implement this feature with that approach instead. But I'm willing to try it out. Would you (or anyone else) be willing to help me get started on this?

I implemented it here: be79c4e
You just need to document and test it.
I can't reopen this PR :(
Please open a new one based on removeIgnoredTokens and make necessary changes.

@IvanGoncharov

This comment has been minimized.

@rybon

This comment has been minimized.

@IvanGoncharov IvanGoncharov reopened this Jan 2, 2019
@IvanGoncharov

This comment has been minimized.

@IvanGoncharov
Copy link
Member

@rybon I tested it based on your example:

const {
  stripIgnoredTokens
} = require('./dist/utilities/stripignoredtokens');

const result = stripIgnoredTokens(`
query SomeQuery($foo: String!, $bar: String) {
  someField(foo: $foo, bar: $bar) {
    a
    b {
      c
      d
    }
  }
} 
`);

console.log(result);
// query SomeQuery($foo:String!$bar:String){someField(foo:$foo bar:$bar){a b{c d}}}

@rybon
Copy link
Author

rybon commented Jan 2, 2019

Looks good.

I'll revert the changes in printer and move the tests to utilities.

@IvanGoncharov
Copy link
Member

@rybon Great 👍
Please also add tests for SDL.
Some introspection results are huge and storing them as condense SDL is valid use case.

@rybon
Copy link
Author

rybon commented Jan 2, 2019

@IvanGoncharov done. I've added tests for the Query document and SDL document.

@rybon rybon changed the title Add condense feature to printer to remove non-significant whitespace Add stripIgnoredTokens feature to remove non-significant whitespace Jan 2, 2019
@IvanGoncharov IvanGoncharov added this to the v14.2.0 milestone Jan 16, 2019
@langpavel
Copy link
Contributor

langpavel commented Jan 17, 2019

I like this idea, really.

But code coverage should be increased — especially new features should be deeply tested.

BTW can this be implemented for AST too? — like print is now?

I always have AST (processed, validated, etc) when I wish write it down for next tooling.

@rybon
Copy link
Author

rybon commented Jan 24, 2019

@langpavel OK fair enough. But keep in mind that I'm not very familiar with this codebase, so I would need some help to get that done. Could you help me out?

@langpavel
Copy link
Contributor

@rybon I think that all you need is altered copy of src/language/printer.js

@rybon
Copy link
Author

rybon commented Jan 24, 2019

That is where the implementation was originally done (see earlier commits). What needs to be done in that copy?

@langpavel
Copy link
Contributor

langpavel commented Jan 24, 2019

@rybon Ok, I see first commit.
Point is that you used regex replaces, why? All the source code is baked here from AST, so if you rewrite this functions (join, block, indent) and printDocASTReducer, you will have everythig for free.

BTW I like latest tiny implementation. But having only AST version make more sense to me.
@rybon you should cover code by tests, it's curious that you decreased code coverage.. 🤔

@IvanGoncharov What do you think?

@rybon
Copy link
Author

rybon commented Jan 24, 2019

The RegExp implementation was done because I wasn't aware of an alternative approach at the time. I didn't decrease coverage on purpose, but I see it decreased by 0.007%. Are we striving for 100% coverage in this project? If so, how do we achieve that (have every single line covered)?

OK, I can rewrite these functions, but they would have to be parameterized to deal with this use case. What should the implementation be in that case? I have no experience with ASTs.

@langpavel
Copy link
Contributor

langpavel commented Jan 24, 2019

@rybon let's wait for @IvanGoncharov.

OK, I can rewrite these functions, but they would have to be parameterized to deal with this use case. What should the implementation be in that case? I have no experience with ASTs.

Better way is create new printDocASTReducer, say printDenseASTReducer which will define custom serializers.

You can learn about AST here: astexplorer.net

@IvanGoncharov
Copy link
Member

IvanGoncharov commented Jan 24, 2019

@rybon Situation is like this I agree with the current implementation.
It contains some minor hack: testing token for value to distinguish between punctuation tokens and all other but it can be a fixed latter.

The real problem here is that graphql-js has a pretty high standard for tests.
Not only in coverage (ATM 98% and we slowly moving to 100%) but in testing functionality.
It's very easy to fully cover this function since it short and doesn't contain a lot of if statements.

The idea here is that it simply not enough to use tests for print they are designed for a totally different case in mind.
You can't just pass the output of print through stripIgnoredTokens and test result since stripIgnoredTokens is the standalone function and should be tested as such.
For example print never return tabs so how do you test that tabs are removed.
Or how we test for Unicode BOM which is also ignored token.

I'm fully supporting this functionality but currently I need to finish a couple of other long standing issues with this lib. I can promise that we will include it in upcoming 14.2.0. But I need some time to finish other projects that I already started.

@rybon
Copy link
Author

rybon commented Jan 24, 2019

OK understood. I was testing it from the perspective of an end user, in this case an app developer who just wants to cut down on the size of GraphQL strings in an app bundle or over the wire. In that case the implementation details (the call stack that gets invoked) don't matter that much, only the end result does.
I can understand and respect that a library maintainer may have a different perspective, where testing implementation details (every function and line in the call stack) and 100% coverage matter more. That wasn't apparent to me when I opened this PR.

@langpavel
Copy link
Contributor

langpavel commented Jan 24, 2019

@rybon What is not covered:

  • input handling:
    • input is string
    • input is Source
    • throws in other cases
  • ripping out comments from input

First is public API, second expected behavior. None of both is implementation detail test.
NOTE: Testing implementation details is useless and really bad and unwanted practice. This is not the case

@codecov-io
Copy link

codecov-io commented Jan 28, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@59d9f17). Click here to learn what that means.
The diff coverage is 99.23%.

Impacted file tree graph

@@           Coverage Diff            @@
##             master   #1628   +/-   ##
========================================
  Coverage          ?   98.6%           
========================================
  Files             ?     216           
  Lines             ?   13368           
  Branches          ?    1971           
========================================
  Hits              ?   13181           
  Misses            ?     187           
  Partials          ?       0
Impacted Files Coverage Δ
src/index.js 100% <ø> (ø)
src/__fixtures__/index.js 100% <100%> (ø)
src/utilities/index.js 100% <100%> (ø)
src/utilities/__tests__/stripIgnoredTokens-test.js 100% <100%> (ø)
src/utilities/stripIgnoredTokens.js 95.45% <95.45%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 59d9f17...b579351. Read the comment docs.

@rybon
Copy link
Author

rybon commented Jan 28, 2019

@langpavel done. Only line I can't seem to get covered is the continue; statement, could you help me with that?

@rybon
Copy link
Author

rybon commented Feb 22, 2019

@IvanGoncharov anything else that needs to be done for this PR to land?

@IvanGoncharov
Copy link
Member

@rybon Sorry for the delay.
Lexer implementation that I added previously was just a proof of concept and not really future-proof.
Also based on feedback #1523 I think we should support striping of indentation inside block string.
Plus even though current test do 100% percent coverage it is still based on print tests some of don't make sense for this functionality and missing many other checks.

I totally understand that adding this feature taking too much time so I stoped development of all other features until I merge this one. I'm working on it in my free time but I expect it will take a couple days at most.

P.S. I need to rebase some commit and also add new ones so to prevent stupid situation with github (like what I did with Lexer commit) I will open separate PR.

IvanGoncharov added a commit to IvanGoncharov/graphql-js that referenced this pull request Mar 26, 2019
Heavily based on work done by @rybon in graphql#1628.
Solves graphql#1523.
@rybon
Copy link
Author

rybon commented Mar 26, 2019

Just saw your PR (#1802). Great work! Thank you for your efforts, I really appreciate it.

@rybon rybon closed this Mar 26, 2019
IvanGoncharov added a commit to IvanGoncharov/graphql-js that referenced this pull request Mar 26, 2019
IvanGoncharov added a commit to IvanGoncharov/graphql-js that referenced this pull request Mar 26, 2019
IvanGoncharov added a commit to IvanGoncharov/graphql-js that referenced this pull request Mar 26, 2019
IvanGoncharov added a commit to IvanGoncharov/graphql-js that referenced this pull request Mar 26, 2019
IvanGoncharov added a commit that referenced this pull request Apr 3, 2019
Heavily based on work done by @rybon in #1628.
Solves #1523.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants