Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Be safe: encode ">" character too #2746

Closed
wants to merge 1 commit into from
Closed

Be safe: encode ">" character too #2746

wants to merge 1 commit into from

Conversation

szepeviktor
Copy link

Although standard says it is Anything else
https://www.w3.org/TR/html52/syntax.html#data-state

<span class="token operator">></span>

... is valid HTML but looks misleading.

@github-actions
Copy link

No JS Changes

Generated by 🚫 dangerJS against b30e79a

@szepeviktor
Copy link
Author

szepeviktor commented Jan 30, 2021

I know-I know, it makes HTML grow from 1 byte to 4.
(CDN :) charges)

@szepeviktor
Copy link
Author

szepeviktor commented Jan 30, 2021

😲

This repo contains built code?! And there is no auto-builder job in CI?

@RunDevelopment
Copy link
Member

RunDevelopment commented Jan 30, 2021

Regarding the PR:

Although standard says it is Anything else

Not only HTML but even the stricter XML allows unescaped > characters.

valid HTML but looks misleading

I'm sorry but is that really all the motivation for this PR? (I don't mean this in a mean or passive-aggressive way.)

I did a quick benchmark and the small addition of this PR makes Prism.highlight about 5% slower. Not much but still.

Right now, I just don't see any good reasons for or against this change (well, maybe the 5% thing), so I will default to not adding new code to Prism's codebase. Are there more reasons for this PR?

@szepeviktor
Copy link
Author

Are there more reasons for this PR?

No there are none.
I've just freaked out seeing >> in HTML 😱

@szepeviktor
Copy link
Author

Thank you for your lengthy response!

@szepeviktor szepeviktor deleted the patch-1 branch January 31, 2021 00:52
@joshgoebel
Copy link

I did a quick benchmark and the small addition of this PR makes Prism.highlight about 5% slower. Not much but still.

Is the benchmark available? I'd be quite curious to know the split of time that Prism uses for the parsing/lexing vs the actual HTML generation... for Highlight.js we find the actual HTML generation is seems to be a tiny fraction of the total time and that changes/regressions/improvements there aren't actually felt in the overall time.

And FYI: We went the other direction on this, escaping: & < > " '

@RunDevelopment
Copy link
Member

Is the benchmark available?

Yes, the benchmarking suite is implemented in #2153 (not merged yet). The entire suite is documented in benchmark.html.

split of time that Prism uses for the parsing/lexing vs the actual HTML generation

Prism first generates a token stream and then generated HTML from that. By pure coincidence, both of these steps take about the same amount of time.

I'm pretty sure that we could further optimize the HTML generation to be even faster. But current browsers take about the same time to parse the generated HTML as Prism takes to generate it, so I don't think this is a priority rn.

Example: Here is what the timeline of highlighting Prism Core's code (1k lines) with Prism looks like.

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants