Be safe: encode ">" character too #2746
Although the standard says it is _Anything else_: https://www.w3.org/TR/html52/syntax.html#data-state
I know, I know, it makes the HTML grow from 1 byte to 4.
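For illustration, here is a minimal sketch of the kind of escaping this PR proposes. `encodeText` and `ENTITIES` are hypothetical names made up for this example, not Prism's actual code; the point is only that `>` is replaced alongside `&` and `<`, which turns 1 byte into the 4 bytes of `&gt;`.

```javascript
// Hypothetical helper for illustration only; this is NOT Prism's implementation.
// Maps each character that is escaped to its HTML entity.
const ENTITIES = { '&': '&amp;', '<': '&lt;', '>': '&gt;' };

function encodeText(text) {
  // A single pass over the string; every match is looked up in the table above.
  return text.replace(/[&<>]/g, (ch) => ENTITIES[ch]);
}

console.log(encodeText('a > b && c < d'));
// prints: a &gt; b &amp;&amp; c &lt; d
```

Because `&` is handled in the same pass as `<` and `>`, already-escaped input such as `&gt;` becomes `&amp;gt;` and is never unescaped by accident.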
😲 This repo contains built code?! And there is no auto-builder job in CI?
Regarding the PR:
Not only HTML but even the stricter XML allows unescaped
I'm sorry, but is that really all the motivation for this PR? (I don't mean this in a mean or passive-aggressive way.)
I did a quick benchmark and the small addition of this PR makes
Right now, I just don't see any good reasons for or against this change (well, maybe the 5% thing), so I will default to not adding new code to Prism's codebase. Are there more reasons for this PR?
No, there are none.
Thank you for your lengthy response!
Is the benchmark available? I'd be quite curious to know the split of time that Prism uses for the parsing/lexing vs. the actual HTML generation... for Highlight.js we find that the actual HTML generation seems to be a tiny fraction of the total time, and that changes/regressions/improvements there aren't actually felt in the overall time. And FYI: We went the other direction on this, escaping:
Yes, the benchmarking suite is implemented in #2153 (not merged yet). The entire suite is documented in
Prism first generates a token stream and then generates HTML from that. By pure coincidence, both of these steps take about the same amount of time. I'm pretty sure that we could further optimize the HTML generation to be even faster. But current browsers take about the same time to parse the generated HTML as Prism takes to generate it, so I don't think this is a priority rn.
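To make the two stages concrete, here is a toy sketch that tokenizes first, then builds HTML from the token stream, timing each step separately. Everything in it is an assumption for illustration: `tokenize`, `stringify`, `escapeHtml`, and `timeMs` are made-up names, the "grammar" only recognizes numbers, and the timer relies on Node's `process.hrtime.bigint()`. It is not Prism's code and not the benchmark suite from #2153.

```javascript
// Toy two-stage highlighter; all names and the number-only grammar are hypothetical.

// Stage 1: turn text into a token stream of plain strings and { type, content } objects.
function tokenize(text) {
  const tokens = [];
  const pattern = /\d+/g;
  let last = 0;
  let match;
  while ((match = pattern.exec(text)) !== null) {
    if (match.index > last) tokens.push(text.slice(last, match.index));
    tokens.push({ type: 'number', content: match[0] });
    last = match.index + match[0].length;
  }
  if (last < text.length) tokens.push(text.slice(last));
  return tokens;
}

// Escapes only "&" and "<" here; whether ">" should be added is what this PR discusses.
function escapeHtml(s) {
  return s.replace(/&/g, '&amp;').replace(/</g, '&lt;');
}

// Stage 2: turn the token stream into an HTML string.
function stringify(tokens) {
  return tokens
    .map((t) =>
      typeof t === 'string'
        ? escapeHtml(t)
        : `<span class="token ${t.type}">${escapeHtml(t.content)}</span>`
    )
    .join('');
}

// Runs fn once and reports the elapsed wall-clock time in milliseconds (Node only).
function timeMs(fn) {
  const start = process.hrtime.bigint();
  const result = fn();
  return { result, ms: Number(process.hrtime.bigint() - start) / 1e6 };
}

const source = 'x = 1 < 22\n'.repeat(10000);
const lexed = timeMs(() => tokenize(source));
const rendered = timeMs(() => stringify(lexed.result));
console.log(`tokenize: ${lexed.ms.toFixed(2)} ms, stringify: ${rendered.ms.toFixed(2)} ms`);
```

A single run like this is noisy; a real comparison would repeat each stage many times and use realistic grammars and inputs.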
Although the standard says it is Anything else
https://www.w3.org/TR/html52/syntax.html#data-state
... is valid HTML but looks misleading.