Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quadratic complexity on highly nested lists #1010

Open
4 tasks done
andersk opened this issue Jul 9, 2022 · 1 comment
Open
4 tasks done

Quadratic complexity on highly nested lists #1010

andersk opened this issue Jul 9, 2022 · 1 comment
Labels
🏁 area/perf This affects performance 🤞 phase/open Post is being triaged manually

Comments

@andersk
Copy link

andersk commented Jul 9, 2022

Initial checklist

Affected packages and versions

remark 14.0.2

Link to runnable example

No response

Steps to reproduce

$ 'time' node -e 'import("remark").then(({remark}) => remark().process("* ".repeat(8000) + "a"))'

RangeError: Maximum call stack size exceeded

35.51user 1.63system 0:27.79elapsed 133%CPU (0avgtext+0avgdata 182780maxresident)k
0inputs+0outputs (4major+870117minor)pagefaults 0swaps

Expected behavior

Ideally, remark should run in linear time (O(n) for input of length n). There are other Markdown parsers that achieve this.

When I reported this privately, the response indicated that security against computational complexity attacks is not considered to be goal of this project. If that’s the case, that should at least be noted in the Security section of the readme, similarly to the existing note about security against cross-site scripting attacks.

Actual behavior

remark runs in quadratic time (O(n²) for input of length n). It spends over 30 seconds of CPU time on this 16001 byte input before failing with RangeError: Maximum call stack size exceeded. The CPU time quadruples each time the input size is doubled. This means an application accepting Markdown input from the user can be tricked into spending enormous amounts of computation for relatively little input.

Runtime

Node v16

Package manager

npm 8

OS

Linux

Build and bundle tools

No response

@github-actions github-actions bot added 👋 phase/new Post is being triaged automatically 🤞 phase/open Post is being triaged manually and removed 👋 phase/new Post is being triaged automatically labels Jul 9, 2022
@ChristianMurphy ChristianMurphy added the 🏁 area/perf This affects performance label Jul 9, 2022
@ChristianMurphy
Copy link
Member

When a grammar is ambiguous, the time to build a parse tree will exceed linear time (or space) complexity, that is true of all parsers.
Polynomial time is unfortunate but at times necessary, hence why this does not meet the standard of "DDoS" or "a security vulnerability".

All that said, remark does strive to be fast, if a way to lower the complexity can be found, without breaking with the CommonMark standard, a pull request would be welcome.

@ChristianMurphy ChristianMurphy changed the title Denial of service via quadratic complexity on highly nested inputs Quadratic complexity on highly nested lists Jul 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏁 area/perf This affects performance 🤞 phase/open Post is being triaged manually
Development

No branches or pull requests

2 participants