Skip to content
This repository has been archived by the owner on Jan 29, 2024. It is now read-only.

HtmlEscapeValue shouldn't create dom elements #1843

Open
eliranek1 opened this issue Apr 26, 2018 · 11 comments
Open

HtmlEscapeValue shouldn't create dom elements #1843

eliranek1 opened this issue Apr 26, 2018 · 11 comments

Comments

@eliranek1
Copy link

Subject of the issue

htmlEscapeValue creates elements in order to html encode a value. This is a hack that we can work around as it bloats blink_gc memory and is less performant.

Your environment

  • version of angular-translate
    2.8.1
  • version of angular
    1.5.5
  • which browser and its version
    Chrome 65

Steps to reproduce

  1. Open chrome dev tools
  2. Open the more tools menu and open the Performance memory. This will show you the dom count of created dom elements. Note: this is different than elements that are attached to the DOM itself.
  3. Run a translation that will hit htmlEscapeValue
  4. Watch the number of dom elements increase. Old generation garbage collection will pick up most of the items but sometimes they will remain

Expected behaviour

No extra dom elements should get created

Actual behaviour

It creates a div.

###Proposed Change###
var regex = /[\u00A0-\u9999<>&]/gim;
var htmlEscapeValue = function (value) {
return value.replace(regex, ['&#', "$&".charCodeAt(0),';'].join(''));
};

Related SO question:
https://stackoverflow.com/questions/18749591/encode-html-entities-in-javascript

###Impact###
Perf:
Running through 14k different strings to get translated takes about 145ms on baseline, with the proposed fix it goes down to 30ms.

Memory:
Running about 1500 translations on our app showed a decrease of about 5MB in private bytes. 2.2MB of which were from blink_gc which is Chromes dom element allocator.

@eliranek1
Copy link
Author

eliranek1 commented Apr 26, 2018

Making this change will affect the end result of some special html encoded entities by turning them into numbers.
For example: > will change from &gt; to &#36;

@knalli
Copy link
Member

knalli commented Apr 26, 2018

Phew, well.

First of all thank you reporting and getting in touch. Although I get your point, I have a problem of parsing with a regex HTML code for some critical steps like escaping.

  • One fast run is good, but a complete suite running in different environments is better.
  • Is this really covering all bad and fancy html input vectors?

While we talking, you can test yourself if this working using a custom strategy using the corresponding provider: https://angular-translate.github.io/docs/#/api/pascalprecht.translate.$translateSanitizationProvider

@eliranek1
Copy link
Author

Thanks for the quick reply, Jan.

Do you have a way of testing a "complete suite running in different environments"? I assume by different environments you mean browsers, correct? I would be happy to test this out - is there a list of supported browsers somewhere?

This will convert all unicode chars with a charcode of 0000-9999. I believe this should cover well over the html characters.

I'll play around with the trnaslateSanitizationProvider and let you know if I need help.

@knalli
Copy link
Member

knalli commented Apr 26, 2018

@eliranek1 Different environments are browsers in our case, effectively, yes. We have no explicit list of browsers, but have very moderate support which is also caused by the supported AngularJS versions.

Regarding the actual performance, something like https://www.measurethat.net

But both depends on the input.

@eliranek1
Copy link
Author

I created a quick perf test on measurethat.net please check out the results in the meantime. Looks to be almost 4x faster:
139k Ops/Sec vs 420k Ops/Sec

https://www.measurethat.net/Benchmarks/ShowResult/11560

@eliranek1
Copy link
Author

The results above were from chrome 66.

Edge 41:
Element Creation: 37k Ops/Sec
Regex: 813k Ops/Sec

Firefox 59:
Element Creation: 331k Ops/Sec
Regex: 1,580k Ops/Sec

IE11:
Element Creation: 11k Ops/Sec
Regex: 874k Ops/Sec

@eliranek1
Copy link
Author

@knalli I'm not quite sure what you want me to do with the translateSanitizationProvider. Can you please provide a little more detail?

I created a test app that does two things:

  1. Has a compare function (button) where I downloaded a giant list of characters and then encoded each character with both the current and the proposed function. Then I compared the strings, if they were different I decoded each and compared the values to make sure they both resolved to the same char again. This test did not find any issues.
  2. It has two more buttons to run through the list of chars + a random number to avoid chromes optimization engine and it writes the time it took to the console.

I have attached this test app for your review.
translation.zip

@knalli
Copy link
Member

knalli commented Apr 26, 2018

@knalli I'm not quite sure what you want me to do with the translateSanitizationProvider. Can you please provide a little more detail?

@eliranek1 This allows you adding custom strategies for escaping. A example is in my showcase.

@knalli
Copy link
Member

knalli commented Apr 26, 2018

Please provide demos in a fully working manner, we have already blueprints for this:

  1. Please open this plnkr base.
  2. Fork it (at the top left).
  3. Ensure the version of AngularJS and angular-translate is correct.
  4. Ensure all angular-translate plugins are available; reduce and remove anything you can.
  5. Write a minimal as possible demo for your specific issue.
  6. Save and freeze. Provide us the final link to your demo.

@knalli
Copy link
Member

knalli commented Apr 26, 2018

And thank you for the perf demos already. As I said already, I'm still unsure about the coverage of all bad input. I simply not trust the pattern yet, needs more time for this. ;)

@vmukhachev
Copy link

  var htmlEscapeValue = function (value) {
    if (value == null) {
      return null;
    }
    return value.replace(/&/g, "&amp;")
                .replace(/"/g, "&quot;")
                .replace(/</g, "&lt;")
                .replace(/>/g, "&gt;");    
  };

it is only needed to escape 4 characters, well double quote is only needs to be escaped in attributes.
This simple change makes angular translate faster.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants