add letters regex to match for more non ASCII chars #51
Hey, it's me again! :D
I realised that the letters rule only checks a very small set of characters, which causes issues in our product, which supports 14 languages. As a German, I cannot use umlauts (`ä`, `ö`, `ü`) or other special letters like `ß`. After debugging and reading up on regex, I decided to make another PR that includes some, but not all, of the charsets I would like to see in the package. I also updated the readme with the necessary information about the supported chars.

Regarding tests: first and foremost, I wanted to make sure that the general behaviour did not change, so I added negative tests for all the other rules. I then started to write complete checks for the other charsets (not included in the PR) and quickly realised that something like `letters(128)` took several seconds to finish (we should consider capping the count value here, maybe at 10 or so; everything above 5 sounds unreasonable to me, but who knows). And those 128 were only the Greek and Coptic alphabet. Not even starting with the CJK set... :D

Okay, let me know what you think!
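To make the two ideas above concrete, here is a rough sketch in Python, purely for illustration. It is not the package's code and not the exact ranges from this PR: `EXTRA_LETTER_RANGES`, `MAX_COUNT` and `has_letters` are hypothetical names, the Latin-1 ranges are just one possible extension, and the cap of 10 is only the value floated above.

```python
import re

# Assumed extension: the accented letters of the Latin-1 Supplement block.
# This is an illustrative choice, not the list of charsets added in the PR.
EXTRA_LETTER_RANGES = (
    "\u00C0-\u00D6"  # À to Ö (stops before the multiplication sign U+00D7)
    "\u00D8-\u00F6"  # Ø to ö, includes ß, ä and ö (stops before the division sign U+00F7)
    "\u00F8-\u00FF"  # ø to ÿ, includes ü
)
LETTER_RE = re.compile("[A-Za-z" + EXTRA_LETTER_RANGES + "]")

# Hypothetical upper bound for the count argument.
MAX_COUNT = 10


def has_letters(text: str, count: int = 1) -> bool:
    """Return True if text contains at least `count` letters.

    Counts above MAX_COUNT are clamped, so a call like has_letters(s, 128)
    behaves like has_letters(s, 10).
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    count = min(count, MAX_COUNT)
    # A single linear scan over the input, independent of how large count is.
    return len(LETTER_RE.findall(text)) >= count
```

With this sketch, `has_letters("äöüß", 4)` is true, while `has_letters("×÷")` is false because the two math signs are deliberately left out of the ranges.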