Where do we sanitise the incoming markup? (Hooks implementation) #12

tanmayaBiswalOdiware · 2023-04-19T08:08:33Z

Writing DOM.purify inside renderer, right before returning the parsed markup leads to problems like showing [object Object] for things like bold and italics.

    const renderer = {
        paragraph(text) {
            console.log("text =", text)
            // let cleanHTML = DOMPurify.sanitize(text, { USE_PROFILES: { html: true } });
            return <MathJax key={uniqueId()}>{text}</MathJax>
            // return <span>hahah lol</span>
        },
        html(text) {
            let cleanHTML = DOMPurify.sanitize(text, { USE_PROFILES: { html: true } });
            return <span key={"title-span"} dangerouslySetInnerHTML={{ __html: cleanHTML }}></span>;
        }
    }

Where do you suggest we purify the input?

sibiraj-s · 2023-04-19T11:52:33Z

I have never been in a situation to sanitize markdown files. But If you ask me, I would heavily recommend to sanitize the content before storing in DB, processing it in client side is not recommended, as it can be bypassed on way or other.

And to your question. your code looks fine, and should work fine. Can you please verify the output of the sanitize function once (though it shouldn't be an issue).

And please provide a reproducible example in stackblitz/codesandbox maybe. Thanks

tanmayaBiswalOdiware · 2023-04-20T05:18:52Z

Well, there is sanitization happening on the server side as well. But the best practice is usually to sanitize them on both ends (I can just no implement it and tell my overlords that it been done though😂 but I digress).

Anyway, as far as the code shown above goes - it works fine for the HTML, but it doesn't work for some other type of input like bold, italics and so on. I am going to post a sandbox soon. Let me see if it can be reproduced.

Thanks for the prompt reply and hard work!

sibiraj-s · 2023-04-20T05:29:21Z

Ah, I get it now. I'll see why you are referring to now. I didn't understand fully earlier.

You must return a valid react node. See here . I'll also look into this once.

tanmayaBiswalOdiware · 2023-04-20T05:37:22Z

Yea, that's the problem. Sanitizing the argument passed into renderer messes up the React Node, and hence a weird output is received. That was my diagnosis. For now, I am essentially running the sanitization in useEffect where I am getting my data from API call.

sibiraj-s · 2023-04-20T05:41:47Z

I have just tried it and it works fine as expect 🤔 . Make sure the implementation is for the renderer.

This is the code i used

strong(text){
  const clean = DOMPurify.sanitize(text as string)
  return <CustomComponent>{clean}</CustomComponent>
}

You can refer to some examples here from storybook stories

To debug further. a reproducible stackblitz will help.

tanmayaBiswalOdiware · 2023-04-20T07:26:04Z

Ok, I have got the Sandbox. Maybe I am doing something wrong.

sibiraj-s · 2023-04-20T07:45:01Z

I think I see whats happening whats happening. So the elements like bold, italics are parsed first and converted to react element before passing to paragraph.

This react elements are just objects, which when passed to purifier, is converted to object object.

I tried differently before, hence I couldn't see this.

tanmayaBiswalOdiware · 2023-04-20T07:51:26Z

Yep, I figured as much. So even if I give the renderer a separate method for processing bold and italics, it still passes them into paragraph and the result is basically the same.

If I have inline prop as true, everything is sent into text method. And even in that case, the bold and italics are processed first and then sent into text method.

Anyways, I am in no hurry, you can have a little fun figuring this out in the weekend or the next😄

tanmayaBiswalOdiware · 2023-04-20T07:56:49Z

I have seen the native marked package handle Sanitize at some point in the rendering cycle. Maybe that will give you the clue. I was thinking of switching to the native package but then I thought that will be hassle, let's try hitting you up instead, maybe you will figure it out.

sibiraj-s · 2023-04-26T05:56:45Z

marked doesn't sanitize. Ref: https://marked.js.org/#usage

tanmayaBiswalOdiware · 2023-04-26T07:00:33Z

Yea, marked doesn't sanitize. But it has a way to handle sanitization at some point in the rendering cycle.
Ref: https://marked.js.org/using_pro#hooks

See the second example of them using postprocess hook.

Also, since marked just outputs the basic HTML, it can be directly sent through sanitization.
But with React, that can't be the case. So having access to the hook would be good.

sibiraj-s · 2023-04-26T08:28:54Z

Oh didn't know that. Thanks.

Yeah it is simple in their case as it is just a simple string of html. Where are the parser here converts to React components. I don't see an option here, even a window to do something.

tanmayaBiswalOdiware · 2023-04-26T08:32:44Z

I wonder if React elements can be sanitized. That didn't pop into my brain until now haha.

But from your point of view, there is no way to introduce hooks into this package?

sibiraj-s · 2023-04-26T08:32:45Z

I did not originally plan to add option to implement sanitization or provide some hooks to pre/post process stuffs, but I am open to suggestions for the implementation.

sibiraj-s · 2023-04-26T08:35:49Z

Why not just pass the content through dom-purify first then send it to marked-react. I believe dom-purify won't remove anything that is related to markdown formatting. 🤔

did you try that and created any problems?

sibiraj-s · 2023-04-26T08:38:07Z

tanmayaBiswalOdiware · 2023-04-26T08:41:18Z

That is in fact what I am doing right now. It looks fine with the content I have tested.

It's just that the processing takes place in two separate places, you know. Once I am sanitizing else where, and then later on in the return I push the data in the Marked.

It works, but they happen separately. So yes, I was just nitpicking but it would have been sweet if all of those could happen altogether. If you get what I mean.

That is why I wrote, it's not urgent, but it definitely will make the package better overall.

sibiraj-s · 2023-04-26T08:43:49Z

Cool. In that case, I'll leave this open for sometime, may be some idea might popup overtime or someone could suggest a better solution to make this seamless.

tanmayaBiswalOdiware · 2023-04-26T08:45:38Z

Sure, that would be good. I do not have a lot of experience with developing packages but I will try to think of something if we can somehow implement hooks (and even tokenizer but I just don't understand tokenizer myself haha)

tanmayaBiswalOdiware changed the title ~~Where do we sanitise the incoming markup?~~ Where do we sanitise the incoming markup? (Hooks implementation) Apr 26, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Where do we sanitise the incoming markup? (Hooks implementation) #12

Where do we sanitise the incoming markup? (Hooks implementation) #12

tanmayaBiswalOdiware commented Apr 19, 2023

sibiraj-s commented Apr 19, 2023

tanmayaBiswalOdiware commented Apr 20, 2023

sibiraj-s commented Apr 20, 2023

tanmayaBiswalOdiware commented Apr 20, 2023

sibiraj-s commented Apr 20, 2023 •

edited

tanmayaBiswalOdiware commented Apr 20, 2023

sibiraj-s commented Apr 20, 2023

tanmayaBiswalOdiware commented Apr 20, 2023

tanmayaBiswalOdiware commented Apr 20, 2023

sibiraj-s commented Apr 26, 2023

tanmayaBiswalOdiware commented Apr 26, 2023 •

edited

sibiraj-s commented Apr 26, 2023

tanmayaBiswalOdiware commented Apr 26, 2023

sibiraj-s commented Apr 26, 2023

sibiraj-s commented Apr 26, 2023

sibiraj-s commented Apr 26, 2023

tanmayaBiswalOdiware commented Apr 26, 2023 •

edited

sibiraj-s commented Apr 26, 2023

tanmayaBiswalOdiware commented Apr 26, 2023

Where do we sanitise the incoming markup? (Hooks implementation) #12

Where do we sanitise the incoming markup? (Hooks implementation) #12

Comments

tanmayaBiswalOdiware commented Apr 19, 2023

sibiraj-s commented Apr 19, 2023

tanmayaBiswalOdiware commented Apr 20, 2023

sibiraj-s commented Apr 20, 2023

tanmayaBiswalOdiware commented Apr 20, 2023

sibiraj-s commented Apr 20, 2023 • edited

tanmayaBiswalOdiware commented Apr 20, 2023

sibiraj-s commented Apr 20, 2023

tanmayaBiswalOdiware commented Apr 20, 2023

tanmayaBiswalOdiware commented Apr 20, 2023

sibiraj-s commented Apr 26, 2023

tanmayaBiswalOdiware commented Apr 26, 2023 • edited

sibiraj-s commented Apr 26, 2023

tanmayaBiswalOdiware commented Apr 26, 2023

sibiraj-s commented Apr 26, 2023

sibiraj-s commented Apr 26, 2023

sibiraj-s commented Apr 26, 2023

tanmayaBiswalOdiware commented Apr 26, 2023 • edited

sibiraj-s commented Apr 26, 2023

tanmayaBiswalOdiware commented Apr 26, 2023

sibiraj-s commented Apr 20, 2023 •

edited

tanmayaBiswalOdiware commented Apr 26, 2023 •

edited

tanmayaBiswalOdiware commented Apr 26, 2023 •

edited