Default configuration is unsafe, contrary to documentation #295

djramones · 2023-06-29T09:41:13Z

I'm using markdown-it-py on a new Django project and it's perfect, but I think the claims regarding security in the docs are quite misleading.

Both the PyPI project description and readthedocs home page say that this parser-renderer is "safe by default", but it looks like those statements were just copied over verbatim from the original JS project. The dedicated page on security further says HTML is disabled by default as a blanket protection, but on closer inspection that actually describes the original JavaScript markdown-it project, not this Python port, which switches the default configuration to unsafe full CommonMark compliance (meaning, HTML parsing is enabled, potentially allowing for injection-style attacks when used in web apps).

#56 is proof of this issue, as the reporter said "I wanted to go for markdown-it for security stance" but the code snippets show usage of the Python package without any deliberate configuration for safer parsing. To be fair, that is easy: just pass "js-default" to the constructor. But the security page in the docs does not mention this is required to actually get the benefits, and all the "safe by default" claims imply that no such effort is needed at all.

This is at least more of a documentation issue than an actual vulnerability in the code, and I can submit a PR for the necessary edits, but I'm not sure if I'm missing something. Why was the default config switched to full CommonMark compliance in the first place? The docs imply that this Python port aims to minimize differences with the JS version otherwise. Perhaps the motivations for the port mean that security is less of a priority? (That seems to be the case with Executable Books which appears to be concerned more with data science tooling than public-facing web applications, but I'm not too familiar with the organization.) If so, that is okay, but these priorities and design principles should be reflected accurately in the project pages and docs.

(I'm sorry in case my tone comes across as not nice; I'm actually grateful for this project! I just think the issue needs a bit more attention, and I believe in the importance of good documentation.)

The text was updated successfully, but these errors were encountered:

welcome · 2023-06-29T09:41:16Z

Thanks for opening your first issue here! Engagement like this is essential for open source projects! 🤗

If you haven't done so already, check out EBP's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.

If your issue is a feature request, others may react to it, to raise its prominence (see Feature Voting).

Welcome to the EBP community! 🎉

chrisjsewell · 2023-07-02T23:12:43Z

Heya, yeh it was just pragmatic really, that these days CommonMark is the standard that most people would expect to use.

But indeed happy to receive a PR to improve the documentation to note how to set "safe mode"

rlidwka · 2023-07-05T11:11:59Z

Heya, yeh it was just pragmatic really, that these days CommonMark is the standard that most people would expect to use.

Do people expect to write actual HTML in their markdown?

If they do, maybe sanitizing it by default (like comrak library in rust does, or github's cmark-gfm) might be a better option.

chrisjsewell · 2023-07-05T11:49:33Z

Do people expect to write actual HTML in their markdown?

Oh absolutely, if your use case is e.g. for writing your own documentation, blogs etc, then it is common that you want to "extend" Markdown, perhaps to improve the visual representation of your content.
In some respect this is essentially what https://mdxjs.com is and, as I mention in markdown-it/markdown-it#936, you may even essentially want to "embed" Markdown in HTML.

In this use-case, when you have a trusted source for the Markdown (yourself), then you don't really care about security.

I would think the use-case for security is when you are rendering Markdown from potentially "untrusted" sources.

Indeed, for GFM, part of the spec is https://github.github.com/gfm/#disallowed-raw-html-extension-, which is a santizer.
It's on the TODO to add for markdown-it.rs/markdown-it-pyrs (markdown-it-rust/markdown-it-plugins.rs#15), and yeh should probably add it here as well to the "gfm-like" config preset.

I think "commonmark" should remain exactly matching its spec, and then obviously the great thing of markdown-it, is that people can add their own plugins/rules to turn-off html parsing entirely, or perform sanitization as they please

djramones mentioned this issue Jul 3, 2023

📚 DOCS: Clarify docs regarding security configuration #296

Merged

chrisjsewell closed this as completed in #296 Jul 5, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Default configuration is unsafe, contrary to documentation #295

Default configuration is unsafe, contrary to documentation #295

djramones commented Jun 29, 2023 •

edited

welcome bot commented Jun 29, 2023

chrisjsewell commented Jul 2, 2023

rlidwka commented Jul 5, 2023

chrisjsewell commented Jul 5, 2023

Default configuration is unsafe, contrary to documentation #295

Default configuration is unsafe, contrary to documentation #295

Comments

djramones commented Jun 29, 2023 • edited

welcome bot commented Jun 29, 2023

chrisjsewell commented Jul 2, 2023

rlidwka commented Jul 5, 2023

chrisjsewell commented Jul 5, 2023

djramones commented Jun 29, 2023 •

edited