Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BusBoy replacement? #297

Closed
2 tasks done
Uzlopak opened this issue Nov 25, 2021 · 24 comments · Fixed by #300
Closed
2 tasks done

BusBoy replacement? #297

Uzlopak opened this issue Nov 25, 2021 · 24 comments · Fixed by #300

Comments

@Uzlopak
Copy link
Contributor

Uzlopak commented Nov 25, 2021

Prerequisites

  • I have written a descriptive issue title
  • I have searched existing issues to ensure the issue has not already been raised

Issue

We should consider replacing busboy. It is actually quite buggy and has security issues. @mscdex does not really merge PRs, which would fix various bugs.

e.g.
mscdex/dicer#25
mscdex/dicer#22

Maybe we should write it completely new from scratch. The interfaces used by busboy seem to be a success, so we should keep the interfaces but replace the internals. Maybe with actual Streams instead of EventEmitter as baseclass.

Any thoughts?

@climba03003
Copy link
Member

An replacement means it should be able to provide 1:1 feature compatibility.

  1. stream handling for files - it should be the core of this module
  2. faster parsing for multipart? maybe

@Uzlopak
Copy link
Contributor Author

Uzlopak commented Nov 25, 2021

yes, I would also prefer a drop-in replacement

As far as i can see there are following streams to be potentially reimplemented:

StreamSearch
https://github.com/mscdex/streamsearch

HeaderParser
PartParser
Dicer
https://github.com/mscdex/dicer

Decoder (?)
UrlEncoded(?)
Multipart(?) + FileStream
BusBoy
https://github.com/mscdex/busboy

To solve this I would do a top to bottom approach. This means, first busboy and then all the way down to StreamSearch (if necessary).

@mcollina
Copy link
Member

Multipart is a complex spec, however it can be reimplemented.
Those libraries are complex and written in an early node era.

If you'd like to improve parsing speed, I think the only way is to move it to wasm.

@StarpTech
Copy link
Member

This reminds me to nodejs/node#38943

@mcollina
Copy link
Member

Totally

@Uzlopak
Copy link
Contributor Author

Uzlopak commented Nov 25, 2021

Actually it is not about performance, but about using the "new" nodejs streams and making it more resilient against malicious payloads.

Is there any interest in such a refactoring/reimplementation?

I would not invest time into this, if it will be denied because e.g. it is considered not worth it in the first place.

@mcollina
Copy link
Member

I would not land an implementation that is slower than busboy.

@Uzlopak
Copy link
Contributor Author

Uzlopak commented Nov 25, 2021

Do you expect that a streams implementation would be slower?

@Uzlopak
Copy link
Contributor Author

Uzlopak commented Nov 26, 2021

Do we have some benchmarks regarding Busboy?

@mcollina
Copy link
Member

Do you expect that a streams implementation would be slower?

I expect a stream.Readable / stream.Writable to be slower.

Here are the bench I know about:

https://github.com/mscdex/dicer/tree/master/bench

@kibertoad
Copy link
Member

@Uzlopak We've ended up forking busboy: https://github.com/fastify/busboy
If there are known existing issues with the old version of it, we would be more than grateful if you could help with fixing them!

@Uzlopak
Copy link
Contributor Author

Uzlopak commented Nov 27, 2021

@kibertoad
Oha. That was unexpected. Man I love this hands-on community.

I am Currently shopping with my wife, but if I am at home I would participate.

You should integrate npm package Dicer into the forked busboy to fix the crashing issue. Yesterday I tested the example mentioned in the Dicer GitHub issue/pr and could crash some public nodejs services.

@kibertoad
Copy link
Member

@Uzlopak I'm open to that. Could you do a PR?

@Uzlopak
Copy link
Contributor Author

Uzlopak commented Nov 27, 2021

Will do

@kibertoad
Copy link
Member

@Uzlopak Note that it would be preferable to first integrate it as-is, and then do a fix PR separately, with benchmark results before and after the change, since it is a performance-sensitive part of the code, so we have to tread carefully.

@Uzlopak
Copy link
Contributor Author

Uzlopak commented Nov 27, 2021

Could you add the benchmarks please?
I was kind of lost when I tried this on my local fork

@kibertoad
Copy link
Member

Sure, I'll embed dicer with benchmarks now.

@kibertoad
Copy link
Member

@Uzlopak Benchmarks were broken, fixed them. Finalizing the PR now, there is some weird brokenage: fastify/busboy#14

I have a date in like 40 mins, so if I can't fix by then, will have to return to it later in the evening.

@kibertoad
Copy link
Member

@Uzlopak I've landed the change, feel free to proceed. Oh, and please use mocha to write new tests. Already moved part of busboy tests to it, but didn't yet do the same for dicer.

@Uzlopak
Copy link
Contributor Author

Uzlopak commented Dec 4, 2021

I have to say: It was an exhausting week but we did I think a pretty good job on refactoring busboy and fixing the security issues.

I thank you @kibertoad for the good team work. I was always looking for the github notifications and was happy, when I saw that you made remarks or approved a PR.

Looking forward for the impact when other projects also pick up our fork for better security and better performance :)

@mcollina
Copy link
Member

mcollina commented Dec 4, 2021

Could we get a PR to update the reference to our new module here?

@kibertoad
Copy link
Member

@Uzlopak Working side by side with you was a great honour and an utter joy, and if you ever feel like joining fastify plugin team, I would be more than happy to +1 you on that.

@mcollina That's the plan. I'll do a bit of testing with draft PR tonight, then wrap up remaining repo housekeeping, then it's 1.0.0 release, then there will be a final PR for the fastify-multipart transition.

@kibertoad
Copy link
Member

@mcollina Should it be a semver minor or major, btw? Considering that fastify-multipart was Node 10 already, I don't think there are any breaking changes that we are aware of, but I guess there is always a possibility that we introduced some unintentionally.

@mcollina
Copy link
Member

mcollina commented Dec 4, 2021

Go for minor.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants