Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Emit buffer for fields #324

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

Sethlans
Copy link

Feature request #323

When dealing with a third-party tool I noticed that some of the part of the multipart were with a particular charset, and others were with utf-8.

Since busboy assumes that all fields are from the same charset, this leads to an issue when parsing.
I added a raw buffer to the data emitted for the fields to be able to directly work with the Buffer and parse it with the correct charset.

bb.on('field', (name, val, buffer, info) => {
      console.log(`Field [${name}]: value: %j`, val);
      console.log(`Field [${name}] raw buffer`, buffer)
    });

@mscdex
Copy link
Owner

mscdex commented Sep 22, 2022

I think a better and more backwards compatible solution would be to add a 'buffer' charset that returns the Buffer instead of converting to string.

@Sethlans
Copy link
Author

I think a better and more backwards compatible solution would be to add a 'buffer' charset that returns the Buffer instead of converting to string.

Uhm, I do not follow you.
This was the fast solution I implemented in our scenario
Maybe a config to switch from converted string to Buffer in the val parameter?

@mscdex
Copy link
Owner

mscdex commented Sep 22, 2022

Uhm, I do not follow you.

You can specify the default character set when you create the busboy instance. Typically you'd specify something like 'utf8' if you know all of your fields will contain UTF-8 characters. What I was proposing was adding support for a special character set called 'buffer' that gives you the raw Buffer objects instead of creating any JavaScript strings.

@mscdex
Copy link
Owner

mscdex commented Sep 22, 2022

Additionally regarding the backwards compatibility, while this solution may work for your use case, breaking everyone else's code by adding another parameter to event handlers is not a good idea. If this PR's changes were added as-is, everyone would suddenly be getting a Buffer instead of the info object containing the various pieces of information about the part.

@Sethlans
Copy link
Author

Additionally regarding the backwards compatibility, while this solution may work for your use case, breaking everyone else's code by adding another parameter to event handlers is not a good idea. If this PR's changes were added as-is, everyone would suddenly be getting a Buffer instead of the info object containing the various pieces of information about the part.

Yeah, you are absolutely right, I will rewrite this to ensure that the code will not break for everyone. Thanks for the suggestion

@Sethlans
Copy link
Author

Owever, there should be a config to force the buffer charset. The defCharset config is overwritten if in the headers we have the charset specified. something like forceFieldBuffer or similar in order to always output the buffer in the val parameter instead of the parsed text

@Sethlans
Copy link
Author

@mscdex in the end I added a new config.
If I would like to specify a charset this would be however overwritten if another charset is found inside the multipart body.
By adding a config field I will overwrite this behaviour and emit field as buffer only in that situation. By default the field is false to avoid breaking changes

@Sethlans
Copy link
Author

Hey @mscdex, I was just wondering if you had the time to review the latest changes I applied

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants