How do I only allow tagged html #645

Open
ViteOrder opened this issue Feb 1, 2024 · 7 comments

Comments

@ViteOrder

ViteOrder commented Feb 1, 2024

I'm sanitizing SEO tags, and want to only allow meta tags.

So I use these options

{
  allowedTags: ['meta'],
  allowedAttributes: {
    meta: ['*']
  },
}

This prevents all other tags from being used, but it doesn't prevent untagged text. So if a user sent a plain string of text, it would be rendered on the page.

How do I fix this?

@ViteOrder ViteOrder changed the title from "How do I only allow untagged html" to "How do I only allow tagged html" on Feb 2, 2024
@BoDonkey
Contributor

BoDonkey commented Feb 2, 2024

I'm confused. Can you provide an example where you show user input, expected behavior, and actual behavior?

@boutell
Member

boutell commented Feb 2, 2024

It sounds like you want to discard the text of tags that are not allowed, as opposed to just stripping the tags themselves. Normally in HTML sanitization it makes more sense to just strip the tags because this preserves as much user content as is allowed. Also, what should happen to an allowed tag inside a disallowed tag?

In principle an option to completely discard the content of a disallowed tag is possible, but we should think about whether it makes sense.

Perhaps what you really want is to be able to list specific tags that should be discarded along with their contents, while others would be tolerated. For instance, if you don't want people using the <b> tag, that usually doesn't mean you want the text inside to be deleted.

@boutell
Member

boutell commented Feb 2, 2024

You might also look at the existing transform options.

@ViteOrder
Author

> I'm confused. Can you provide an example where you show user input, expected behavior, and actual behavior?

A user sends this as their SEO tags

<title>My post</title>
<meta name="description" content="A post about a thing">

The title tag isn't allowed, so it gets sanitized to this

My post
<meta name="description" content="A post about a thing">

which gets rendered visually as text


Doing what @boutell was talking about would keep people from accidentally doing this, but wouldn't stop them from doing it on purpose

@boutell
Member

boutell commented Feb 2, 2024 via email

@ViteOrder
Author

Found a solution that works for me. It's a bit silly but ¯\_(ツ)_/¯

body {
  font-size:0;
}
body > * {
  font-size:1rem;
}

@gkumar9891
Contributor

@ViteOrder you can use this to discard disallowed tags along with the text inside them

{
   allowedTags: ['meta'],
   allowedAttributes: {
     meta: ['*']
   },
   disallowedTagsMode: 'completelyDiscard'
}
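
That option deals with text inside disallowed tags. Text that is not wrapped in any tag at all is a separate case (the "on purpose" scenario raised earlier in the thread), and nothing here confirms how the option treats it. A minimal sketch of one way to cover it, assuming the input has already been run through the sanitizer: a post-processing step that keeps only the `<meta>` tags and drops everything else. The helper name `keepOnlyMetaTags` and the regex are hypothetical, not part of sanitize-html.

```javascript
// Hypothetical helper, NOT part of sanitize-html. It assumes its input has
// already been sanitized, so the only markup left is allowed <meta> tags.
function keepOnlyMetaTags(sanitizedHtml) {
  // Match each <meta ...> tag. Quoted attribute values are consumed as a
  // unit, so a '>' inside a quoted value does not end the match early.
  const metaTag = /<meta\b(?:"[^"]*"|'[^']*'|[^'">])*>/gi;
  const tags = sanitizedHtml.match(metaTag) || [];
  // Anything that is not a <meta> tag (such as loose text) is dropped.
  return tags.join('\n');
}

// The sanitized output reported earlier in the thread.
const sanitized = 'My post\n<meta name="description" content="A post about a thing">';
console.log(keepOnlyMetaTags(sanitized));
// prints: <meta name="description" content="A post about a thing">
```

This is meant as a filter over sanitizer output, not a replacement for sanitizing; on raw, unsanitized input a regex like this should not be trusted.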
