Handle Unicode character casing and require Node.js 10 #62

Merged
merged 9 commits into from Apr 7, 2020

Contributor

@sverweij sverweij commented Mar 28, 2020

what & why

Adds handling of non-ASCII character casing.

After merging this PR, a string like розовый_пушистый-единороги will be converted to розовыйПушистыйЕдинороги (or to РозовыйПушистыйЕдинороги when pascal case is on), and Licorne époustouflante to licorneÉpoustouflante (or LicorneÉpoustouflante with pascal case on).
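To make the expected behavior concrete, here is a minimal sketch of that conversion. It is not the code in this PR: `sketchCamelCase` and its `pascalCase` option are hypothetical names used only for illustration, and the sketch assumes separators are limited to `_`, `.`, `-` and space and that the first character sits in the Basic Multilingual Plane.

```javascript
// Hypothetical helper for illustration only; not the library's implementation.
function sketchCamelCase(input, {pascalCase = false} = {}) {
  const result = input
    .trim()
    .toLowerCase()
    // Drop each run of separators and uppercase the letter, digit or
    // underscore that follows it. The `u` flag enables \p{...} escapes.
    .replace(/[_.\- ]+([\p{Alpha}\p{N}_]|$)/gu, (_, character) => character.toUpperCase());

  // Assumption: the first character is a single UTF-16 code unit.
  return pascalCase ? result.charAt(0).toUpperCase() + result.slice(1) : result;
}

console.log(sketchCamelCase('розовый_пушистый-единороги'));
// розовыйПушистыйЕдинороги
console.log(sketchCamelCase('розовый_пушистый-единороги', {pascalCase: true}));
// РозовыйПушистыйЕдинороги
console.log(sketchCamelCase('Licorne époustouflante'));
// licorneÉpoustouflante
```

`toUpperCase()` and `toLowerCase()` already handle non-ASCII letters, so the part that needs Unicode awareness is the regular expression that decides which character follows a separator.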

Fixes #60

choices

Explanation of choices that might seem non-obvious:

  • Replaces \w with [\p{Alpha}\p{N}_], which is as close as I could get to the original intent of \w (which MDN defines as [A-Za-z0-9_]).
  • Uses Unicode property escapes in most of the regular expressions, as suggested in #60 (Correctly handle Unicode characters). This implies Node.js >=10. I've updated package.json and the Travis config to reflect this (and added node stable to the build matrix).
  • Keeps most of the existing logic intact (if it ain't broken ...) except for this pattern in the preserveCamelCase function, shown first as it was and then as it is now:
- isLastCharLower && /[a-zA-Z]/.test(character) && character.toUpperCase() === character
+ isLastCharLower && /[\p{Lu}]/u.test(character)

Because the new regular expression already matches only uppercase characters, the character.toUpperCase() === character check would always yield true anyway (but holler if I missed something!)
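The snippet below is a small, standalone check of the two regex choices above, not code from the PR. The names `asciiWord`, `unicodeWord`, `oldCheck` and `newCheck` are made up for this illustration, and the `isLastCharLower` part of the condition is left out because it is identical on both sides.

```javascript
// \w without the `u` or `i` flags only covers [A-Za-z0-9_].
const asciiWord = /\w/;
// Unicode-aware stand-in: any alphabetic character, any number, or underscore.
const unicodeWord = /[\p{Alpha}\p{N}_]/u;

console.log(asciiWord.test('é'));   // false
console.log(unicodeWord.test('é')); // true
console.log(unicodeWord.test('_')); // true
console.log(unicodeWord.test('-')); // false

// Old and new versions of the uppercase test from preserveCamelCase.
const oldCheck = character => /[a-zA-Z]/.test(character) && character.toUpperCase() === character;
const newCheck = character => /[\p{Lu}]/u.test(character);

for (const character of ['A', 'a', 'É', 'П', '1']) {
  console.log(character, oldCheck(character), newCheck(character));
}
// A true true
// a false false
// É false true
// П false true
// 1 false false
```

For these samples the two checks agree on ASCII input, and only the new one recognizes uppercase letters outside ASCII.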

@sindresorhus
Owner

This looks great 👌

Can you add an example to the readme and index.d.ts where it camelcases Unicode? Maybe just use the example from one of the tests.

Can you also mention at the top of the readme that it correctly handles Unicode?

@sindresorhus sindresorhus changed the title Handle unicode character casing Handle Unicode character casing Apr 6, 2020
@sverweij
Contributor Author

sverweij commented Apr 6, 2020

Will do - expect an update tonight (21:00 EST - so in ~9h time)

@sindresorhus sindresorhus changed the title Handle Unicode character casing Handle Unicode character casing and require Node.js 10 Apr 7, 2020