Added a test for identifier support across all languages #2371

Merged
merged 3 commits into from Jun 12, 2020

Conversation

RunDevelopment
Member

Motivated by this comment, I added a test that goes through all languages and checks that identifiers aren't broken.

What does "identifiers aren't broken" mean?
It means that any identifier (/[_a-zA-Z][_a-zA-Z0-9]*/) is tokenized either as a single token or not at all. For example, the identifier foo123 is broken if the language tokenizes the 123 part as a number. The test checks how each language handles identifiers like this one, among others. It also checks numbers.
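
To make the definition concrete, here is a minimal sketch. It does not use Prism; `tokenize` and `isBroken` are hypothetical helpers written for this illustration, and the two number patterns are assumed examples of a faulty and a fixed grammar rule.

```javascript
// Hypothetical mini-tokenizer, NOT Prism's implementation: it applies a single
// pattern and returns plain strings for unmatched text and { type, content }
// objects for matches.
function tokenize(text, type, pattern) {
  const tokens = [];
  const re = new RegExp(pattern.source, 'g');
  let last = 0;
  for (const match of text.matchAll(re)) {
    if (match[0].length === 0) continue; // ignore empty matches
    if (match.index > last) tokens.push(text.slice(last, match.index));
    tokens.push({ type, content: match[0] });
    last = match.index + match[0].length;
  }
  if (last < text.length) tokens.push(text.slice(last));
  return tokens;
}

// An identifier is intact if the whole string ends up as exactly one piece:
// either one plain string (not tokenized at all) or one token.
function isBroken(tokens) {
  return tokens.length !== 1;
}

const naiveNumber = /\d+/;       // also matches the digits inside foo123
const boundedNumber = /\b\d+\b/; // digits must stand on their own

console.log(isBroken(tokenize('foo123', 'number', naiveNumber)));   // true
console.log(isBroken(tokenize('foo123', 'number', boundedNumber))); // false
```

With the naive pattern, foo123 is split into the string foo and a number token 123, which is exactly the breakage described above.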

Why do we need this?
As pointed out in the comment, Markup templating (MT) assumes that its placeholders (which are identifiers) aren't broken up. If they are, MT will stop working. In the past, it caused this issue.
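
The failure mode can be sketched roughly as follows. This is not Markup templating's actual code: `makePlaceholder`, `hostTokenize`, and `canRestore` are hypothetical names, and the placeholder format is an assumption made for illustration.

```javascript
// Assumed placeholder shape; treat the exact format as an illustration.
function makePlaceholder(language, index) {
  return '___' + language.toUpperCase() + index + '___';
}

// Hypothetical host-language tokenizer with a single number rule.
function hostTokenize(text, numberPattern) {
  const pieces = [];
  const re = new RegExp(numberPattern.source, 'g');
  let last = 0;
  let m;
  while ((m = re.exec(text)) !== null) {
    if (m[0].length === 0) { re.lastIndex++; continue; } // skip empty matches
    if (m.index > last) pieces.push(text.slice(last, m.index));
    pieces.push({ type: 'number', content: m[0] });
    last = re.lastIndex;
  }
  if (last < text.length) pieces.push(text.slice(last));
  return pieces;
}

// The embedded code can only be swapped back in if the placeholder survives
// intact inside a single piece of the host tokenization.
function canRestore(pieces, placeholder) {
  return pieces.some(p => (typeof p === 'string' ? p : p.content).includes(placeholder));
}

const placeholder = makePlaceholder('php', 0);
const hostCode = '<p>' + placeholder + '</p>';

console.log(canRestore(hostTokenize(hostCode, /\d+/), placeholder));     // false
console.log(canRestore(hostTokenize(hostCode, /\b\d+\b/), placeholder)); // true
```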

How is this implemented?
The test is quite simple. It has three lists of identifiers and checks that none of those identifiers are broken in any given language. Because some languages don't have identifiers, you can selectively disable the test for a single class or for all classes of identifiers.
The test's error message explains what broken identifiers are and how to fix them. It also includes instructions for disabling the test.
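
The overall shape of such a test might look like the sketch below. Everything in it is an assumption for illustration: the identifier lists, the class names, the `disabled` option format, the toy grammars, and the stand-in tokenizer are not taken from the actual test file.

```javascript
// Hypothetical stand-in for a Prism-like tokenizer: all rules are searched from
// the current position and the earliest match wins (ties go to the earlier rule).
function tokenizeWithRules(text, rules) {
  const tokens = [];
  let pos = 0;
  while (pos < text.length) {
    let best = null;
    for (const [type, pattern] of Object.entries(rules)) {
      const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g';
      const re = new RegExp(pattern.source, flags);
      re.lastIndex = pos;
      const m = re.exec(text);
      if (m && m[0].length > 0 && (best === null || m.index < best.index)) {
        best = { type, index: m.index, content: m[0] };
      }
    }
    if (best === null) {
      tokens.push(text.slice(pos)); // nothing matches the rest
      break;
    }
    if (best.index > pos) tokens.push(text.slice(pos, best.index));
    tokens.push({ type: best.type, content: best.content });
    pos = best.index + best.content.length;
  }
  return tokens;
}

// Illustrative identifier classes (three lists, as described above).
const identifierClasses = {
  word: ['abc', 'word', 'foo1', 'foo123', 'foo123bar'],
  number: ['0', '1', '9', '123', '123456789'],
  template: ['___PHP0___', '___PHP123___'],
};

// Toy grammars standing in for real language definitions.
const languages = {
  good: { number: /\b\d+\b/ },
  bad: { number: /\d+/ },
  wordless: { number: /\d+/ },
};

// Per-language opt-outs: a list of classes, or true to skip every class.
const disabled = { wordless: ['word', 'template'] };

function findBrokenIdentifiers(lang) {
  const skip = disabled[lang];
  const broken = [];
  for (const [cls, identifiers] of Object.entries(identifierClasses)) {
    if (skip === true || (Array.isArray(skip) && skip.includes(cls))) continue;
    for (const id of identifiers) {
      // Intact means exactly one piece: one plain string or one token.
      if (tokenizeWithRules(id, languages[lang]).length !== 1) broken.push(id);
    }
  }
  return broken;
}

for (const lang of Object.keys(languages)) {
  console.log(lang, findBrokenIdentifiers(lang));
}
```

In this sketch, only the `bad` grammar reports broken identifiers; `wordless` passes because the classes it cannot support are disabled.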

(The limitation of the current implementation is that I only run Prism.tokenize on each identifier by itself. I don't test inside grammars because these are usually very specific to the parent pattern, so almost all failures there would be false positives.)

The actual changes to the languages are just boundary assertions. (I didn't just blindly throw some \b in there, though. I looked up the spec/docs of every language I didn't know.)
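
As a rough illustration of why the right boundary depends on the language, the sketch below compares three number patterns. The patterns and the hyphenated-identifier language are hypothetical examples, not patterns from this PR.

```javascript
// Three variants of a simple number pattern.
const unbounded = /\d+(?:\.\d+)?/;
const wordBounded = /\b\d+(?:\.\d+)?\b/;
// For a hypothetical language whose identifiers may contain hyphens
// (e.g. foo-123), \b is not enough because "-" is not a word character.
const hyphenAware = /(?<![\w-])\d+(?:\.\d+)?(?![\w-])/;

// Returns the matched text, or null if the pattern does not match.
function firstMatch(pattern, text) {
  const m = pattern.exec(text);
  return m ? m[0] : null;
}

console.log(firstMatch(unbounded, 'foo123'));    // '123' (breaks the identifier)
console.log(firstMatch(wordBounded, 'foo123'));  // null
console.log(firstMatch(wordBounded, 'foo-123')); // '123' (still breaks foo-123)
console.log(firstMatch(hyphenAware, 'foo-123')); // null
console.log(firstMatch(hyphenAware, 'x = 123')); // '123'
```

This is why a blanket \b is not always the correct fix: what counts as an identifier character has to come from each language's spec.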
In some cases, I even had to change existing test cases because they were wrong. Markdown changed the most because, at the time, I didn't know that foo_italic_ doesn't make anything italic. That's fixed now. For languages that had a faulty number pattern, I didn't create any new test files because this test now covers them.

Member

mAAdhaTTah left a comment

All of this makes sense to me. Thanks for adding!

@RunDevelopment RunDevelopment merged commit 48fac3b into PrismJS:master Jun 12, 2020
@RunDevelopment RunDevelopment deleted the identifier-test branch June 12, 2020 13:58