Textorizer

Sanitize and 'clean' HTML for safe consumption in a plain-text format.

  var plainText = Textorize.HtmlToPlainText("<span>I contain html</span><p>convert me</p>");
  // plainText == "I contain html\nconvert me\n"

Converts HTML input to a safe plain-text representation that contains no HTML. Content inside style and script tags is removed completely, and HTML entities are explicitly converted to their Unicode characters. Invalid HTML is handled on a best-effort basis to produce a reasonably equivalent plain-text output.

Keep in mind the following equivalence:

Textorize(input) == Textorize(HtmlEncode(Textorize(input)))
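
The equivalence above can be checked with a small helper. This is only a sketch under a few assumptions: `Textorize(...)` in the formula is read as shorthand for `Textorize.HtmlToPlainText(...)`, `HtmlEncode` is taken to be `System.Net.WebUtility.HtmlEncode` (the README does not say which encoder is meant), and `RoundTripCheck` and `IsStable` are hypothetical names that are not part of the library. The converter is passed in as a delegate so the snippet compiles without the package.

```csharp
using System;
using System.Net;

// Hypothetical helper, not part of Textorizer.
public static class RoundTripCheck
{
    // Returns true when convert(input) == convert(HtmlEncode(convert(input))).
    // Assumption: WebUtility.HtmlEncode stands in for the README's HtmlEncode.
    public static bool IsStable(Func<string, string> convert, string input)
    {
        string once = convert(input);                  // first pass: HTML to plain text
        string encoded = WebUtility.HtmlEncode(once);  // re-encode the plain text as HTML
        string twice = convert(encoded);               // second pass over the encoded text
        return string.Equals(once, twice, StringComparison.Ordinal);
    }
}
```

With the package referenced, a call such as `RoundTripCheck.IsStable(Textorize.HtmlToPlainText, "<p>a &amp; b</p>")` would be expected to return `true` if the equivalence holds for that input.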

For more examples, see the test suite.

Install

Package Manager Console

PM> Install-Package Textorizer

.NET CLI Console

> dotnet add package Textorizer

License

Dual licensed

MIT

https://opensource.org/licenses/MIT

Unlicense

https://opensource.org/licenses/Unlicense
