Failing to parse large base64 encoded image url in an img element's src #619

Open
hmaskat17 opened this issue May 8, 2023 · 6 comments

hmaskat17 commented May 8, 2023

To Reproduce

Step by step instructions to reproduce the behavior:

  1. Sanitize a string containing an img element whose src is a very large base64-encoded data URL
  2. Allow the img tag and its src attribute in the options
  3. The returned string contains the img element with its src value missing, i.e. the image is gone
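
The steps above can be sketched roughly as follows. This is a minimal sketch, not the reporter's actual code: the helper names `buildImgHtml`, `srcSurvives` and `compare` are hypothetical, the payload is a run of `A` characters rather than a real image, the 200,000-character length is an arbitrary stand-in for "very large", and `sanitizeHtml` is passed in as a parameter so the snippet runs without the package installed. Comparing a short data URL against a long one with the same options helps show whether length is really the deciding factor.

```javascript
// Hypothetical helper: build an <img> whose src is a base64 data URL
// with `length` payload characters ('A' is a valid base64 character).
function buildImgHtml(length) {
  const payload = 'A'.repeat(length);
  return `<img src="data:image/png;base64,${payload}">`;
}

// Options from step 2: allow the img tag and its src attribute.
// Assumption: no other options were set; the report does not list them.
const options = {
  allowedTags: ['img'],
  allowedAttributes: { img: ['src'] }
};

// Hypothetical helper: true if the sanitized output still contains the data URL prefix.
function srcSurvives(sanitizeHtml, length) {
  const output = sanitizeHtml(buildImgHtml(length), options);
  return output.includes('data:image/png;base64,');
}

// Run the same check on a tiny payload and on a very large one.
function compare(sanitizeHtml) {
  return {
    short: srcSurvives(sanitizeHtml, 16),
    long: srcSurvives(sanitizeHtml, 200000) // arbitrary "very large" length
  };
}
```

With the real package this would be called as `compare(require('sanitize-html'))`. If `short` also comes back `false`, the size of the URL is probably not what drops the attribute; in that case the `allowedSchemes` option, which controls which URL schemes are kept in attributes such as `src`, would be worth checking.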

Expected behavior

The returned string should contain the encoded image URL.

Describe the bug

The sanitizeHtml function seems to discard long base64-encoded data when sanitizing, so the returned string is missing the attribute value for the img, i.e. it contains an empty img element.

Details

Version of Node.js:
18

Server Operating System:
Windows

Additional context:
No error object is returned when parsing the image URL fails.

hmaskat17 added the bug label May 8, 2023

boutell (Member) commented May 8, 2023

How large is large?

This could be an upstream limitation of htmlparser2, but I'm not casting blame, as I'm not 100% sure why there would be any limit there either. There is definitely no "if bytes more than X, reject it" policy in sanitize-html.

hmaskat17 (Author) commented May 9, 2023

> How large is large?

The encoded data is 172 KB when copied into Notepad and contains 177,112 characters. I don't know whether that is large for a raw base64 image, but it is a long line of characters inside an HTML element. @boutell
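
For reference, the two numbers are consistent with each other, and the decoded image would be somewhat smaller than the encoded text. A rough calculation, assuming the 177,112 characters are all base64 payload (no `data:` prefix counted) and one byte per character; the function names are illustrative only:

```javascript
// Size of the encoded text in KiB, assuming one byte per character (ASCII).
function encodedKiB(charCount) {
  return charCount / 1024;
}

// Upper bound on the decoded size: every 4 base64 characters carry 3 bytes.
// Trailing '=' padding would reduce this by up to 2 bytes.
function maxDecodedBytes(charCount) {
  return Math.floor(charCount / 4) * 3;
}

const chars = 177112; // character count reported above
console.log(encodedKiB(chars).toFixed(2)); // 172.96
console.log(maxDecodedBytes(chars));       // 132834
```

So the text is just under 173 KiB, in line with the 172 KB figure, and the image itself would be at most about 130 KiB.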

boutell (Member) commented May 9, 2023

It doesn't seem unreasonable to me. Can you create a PR adding a failing unit test?

hmaskat17 (Author) commented

Just a heads-up: I will have to come back to this at a later date because of time constraints.

jzellis commented Jun 13, 2023

I'm seeing the same issue -- even after making sure img is an allowed tag and src is an allowed attribute for img, it still removes the src entirely when I sanitize it.

boutell (Member) commented Jun 13, 2023

Please provide a failing unit test in test/test.js so we can be sure we are talking about the same thing.

You can try out your tests with:

npm install
npm test
