Support full unicode in parser #2404

Merged (16 commits) on Jul 14, 2021

@dondonz dondonz commented Jun 28, 2021

This PR implements the RFC to support full Unicode in the parser.

Key spec changes

  • GraphQL now supports a wider range of Unicode characters. SourceCharacter was expanded to include any Unicode code point that is neither a leading nor a trailing surrogate. Previously, only code points up to U+FFFF were included
  • Spec now includes guidance on Unicode surrogate pairs
  • (minor) GraphQL now allows certain control characters
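
As an illustration of the expanded SourceCharacter definition described above, here is a minimal Java sketch. It is not the code from this PR: the class and method names (`SourceCharacterSketch`, `isSourceCharacter`, `firstUnpairedSurrogate`) are hypothetical, and it assumes that "neither a leading nor trailing surrogate" is the only restriction on a code point.

```java
// Hypothetical helper for illustration only; not part of graphql-java.
final class SourceCharacterSketch {

    /** True if codePoint is a valid Unicode code point outside the surrogate range. */
    static boolean isSourceCharacter(int codePoint) {
        boolean surrogate = codePoint >= 0xD800 && codePoint <= 0xDFFF;
        return Character.isValidCodePoint(codePoint) && !surrogate;
    }

    /**
     * Walks the text by code point and returns the index of the first
     * unpaired surrogate, or -1 if every code point is a SourceCharacter.
     */
    static int firstUnpairedSurrogate(String text) {
        int i = 0;
        while (i < text.length()) {
            // codePointAt joins a well-formed surrogate pair into one code point;
            // an unpaired surrogate is returned as-is and fails the check below.
            int codePoint = text.codePointAt(i);
            if (!isSourceCharacter(codePoint)) {
                return i;
            }
            i += Character.charCount(codePoint);
        }
        return -1;
    }
}
```

Under these assumptions, U+1F4A9 is accepted (it was above the old U+FFFF limit), while a lone U+D83D is rejected.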

Key changes in this PR

This PR has two halves:

  1. UnicodeUtil, used by StringValueParsing, which handles braced escapes and escaped surrogate pairs
  2. ANTLR grammar changes, which expand the definition of a SourceCharacter to mean any Unicode code point that is neither a leading nor trailing surrogate
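
To make the first half concrete, the following is a rough sketch of how braced escapes and escaped surrogate pairs could be decoded. It is not the actual `UnicodeUtil` implementation. The class `UnicodeEscapeSketch` and its methods are hypothetical, it ignores every other kind of escape sequence, and it assumes that surrogate pairs are only accepted in the four-digit form while braced escapes must name a non-surrogate code point.

```java
// Hypothetical decoder for illustration only; not the UnicodeUtil from this PR.
final class UnicodeEscapeSketch {

    /** Decodes four-digit and braced Unicode escapes; all other characters are copied unchanged. */
    static String decode(String raw) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < raw.length()) {
            if (!startsEscape(raw, i)) {
                out.append(raw.charAt(i++));
                continue;
            }
            boolean braced = isBraced(raw, i);
            int end = escapeEnd(raw, i, braced);
            int codePoint = parseHex(raw, braced ? i + 3 : i + 2, braced ? end - 1 : end);
            i = end;
            if (!braced && isLeading(codePoint)) {
                // A leading surrogate must be followed by a four-digit trailing surrogate escape.
                if (!startsEscape(raw, i) || isBraced(raw, i)) {
                    throw new IllegalArgumentException("Leading surrogate without trailing surrogate");
                }
                int lowEnd = escapeEnd(raw, i, false);
                int low = parseHex(raw, i + 2, lowEnd);
                if (!isTrailing(low)) {
                    throw new IllegalArgumentException("Leading surrogate without trailing surrogate");
                }
                codePoint = Character.toCodePoint((char) codePoint, (char) low);
                i = lowEnd;
            } else if (codePoint > Character.MAX_CODE_POINT || isLeading(codePoint) || isTrailing(codePoint)) {
                throw new IllegalArgumentException("Invalid code point: " + Integer.toHexString(codePoint));
            }
            out.appendCodePoint(codePoint);
        }
        return out.toString();
    }

    /** True if a backslash followed by 'u' starts at index i. */
    private static boolean startsEscape(String raw, int i) {
        return i + 1 < raw.length() && raw.charAt(i) == '\\' && raw.charAt(i + 1) == 'u';
    }

    private static boolean isBraced(String raw, int i) {
        return i + 2 < raw.length() && raw.charAt(i + 2) == '{';
    }

    /** Returns the index just past the escape that starts at index i. */
    private static int escapeEnd(String raw, int i, boolean braced) {
        if (braced) {
            int close = raw.indexOf('}', i + 3);
            if (close < 0) throw new IllegalArgumentException("Unterminated braced escape");
            return close + 1;
        }
        if (i + 6 > raw.length()) throw new IllegalArgumentException("Truncated escape");
        return i + 6;
    }

    /** Parses ASCII hex digits in [from, to) and rejects anything else. */
    private static int parseHex(String raw, int from, int to) {
        if (from >= to) throw new IllegalArgumentException("Empty escape");
        int value = 0;
        for (int k = from; k < to; k++) {
            char c = raw.charAt(k);
            int digit = c < 128 ? Character.digit(c, 16) : -1;
            if (digit < 0 || value > Character.MAX_CODE_POINT) {
                throw new IllegalArgumentException("Invalid hex in escape");
            }
            value = value * 16 + digit;
        }
        return value;
    }

    private static boolean isLeading(int cp) { return cp >= 0xD800 && cp <= 0xDBFF; }

    private static boolean isTrailing(int cp) { return cp >= 0xDC00 && cp <= 0xDFFF; }
}
```

With this sketch, the escaped surrogate pair for U+D83D U+DCA9 and the braced escape for U+1F4A9 decode to the same string, while a lone leading surrogate escape is rejected. Note that the Java source above deliberately avoids writing a braced escape literally, even in comments, because `javac` rejects malformed Unicode escapes anywhere in a source file.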

References

RFC GitHub issue: graphql/graphql-spec#687
RFC spec text: graphql/graphql-spec#849
RFC JS implementation: graphql/graphql-js#3117
Previous PR: #2335


Want a Unicode fun fact? Groovy fails to compile if a source file contains a Unicode escape that is not exactly four hex digits. You'll even hit this compilation error in comments.

For example, this comment containing RFC text causes a compilation error:

For example the input `"\uD83D\uDCA9"` is a valid {StringValue} which represents the same Unicode text as `"\u{1F4A9}"`.

The fix is to add an extra backslash:

... which represents the same Unicode text as `"\\u{1F4A9}"`.

@andimarek andimarek added this to the 17.0 milestone Jul 5, 2021
@dondonz dondonz changed the title WIP: Support full unicode in parser Support full unicode in parser Jul 14, 2021
@andimarek andimarek merged commit 357c9bb into graphql-java:master Jul 14, 2021