Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance: improve objectWithoutPropertiesLoose on V8 #16357

Merged
merged 5 commits into from Apr 4, 2024

Conversation

romgrk
Copy link
Contributor

@romgrk romgrk commented Mar 16, 2024

I've been doing some benchmarks and I've seen _objectWithoutPropertiesLoose pop up a few times. I looked at the implementation and noticed it could be improved a bit, allocating Object.keys() to iterate through it is a bit inefficient. I have changed the implementation to use for-in and .hasOwnProperty. Benchmarks: https://jsben.ch/lSR7E

I'm not familiar with your codebase, I have tentatively used "core-js-pure/features/object/has-own.js" but I'm not sure if that corresponds to Object.prototype.hasOwnProperty.

Q                       A
Fixed Issues?
Patch: Bug Fix?
Major: Breaking Change?
Minor: New Feature?
Tests Added + Pass? Yes
Documentation PR Link
Any Dependency Changes?
License MIT

@babel-bot
Copy link
Collaborator

babel-bot commented Mar 16, 2024

Build successful! You can test your changes in the REPL here: https://babeljs.io/repl/build/56543

@liuxingbaoyu
Copy link
Member

Please change packages\babel-helpers\src\helpers.ts, then run make build, packages/babel-runtime-corejs3/helpers/esm/objectWithoutPropertiesLoose.js will be automatically generated.

@romgrk
Copy link
Contributor Author

romgrk commented Mar 16, 2024

Thanks, I've updated the PR.

@JLHwung
Copy link
Contributor

JLHwung commented Mar 16, 2024

On macOS Safari the initial is faster than updated:

image

https://jsbench.me/1hltu3eit5/1

I migrated the benchmark to jsbench.me because the other one pops up click baits.

@armano2
Copy link
Contributor

armano2 commented Mar 17, 2024

sadly those benchmarks are not valid, as v8 is going to optimize some of this code

if we are looking at performance for in is slower than for

for approach is generally fastest in most scenarios as it can be "easier" optimized (that may vary from browser compiler version to version)

let o = null
o=_objectWithoutPropertiesLoose(o1, exclude)
console.log(o)

try running it without the extra loop


there seem to not be measurable time difference between both pieces of code

@romgrk
Copy link
Contributor Author

romgrk commented Mar 17, 2024

On macOS Safari the initial is faster than updated:

Weid, JSC is usually happy with for-in and hasOwnProp. I'll try to do some digging. But I do think that optimizing for V8 is preferable, the difference both ways is in the 15-20% range, but chrome has a 65% traffic share at the moment.

sadly those benchmarks are not valid, as they v8 is going to optimize some of this code

Do you have a link to any relevant discussion?

try running it without the extra loop [...] there seem to not be measurable time difference between both pieces of code

What do you mean by without the loop? Do you have a better benchmark you could share to shed light on the measurements you see?

Wrt to the benchmark I have, I added a loop to ensure the function is optimized asap. This helper is what { x, ...rest } destructures to, which is a common pattern in many codebases so it's reasonable to assume the function is optimized by the engine in real world scenarios. I have also added multiple objects (o1 to o4) to ensure that the engine can't specialize for one particular object shape, which also wouldn't be the case outside of micro-benchmarks.

The other aspect that I didn't like is the Object.keys() allocation, which hides parts of its cost in the delayed garbage collector calls that are unlikely to be reflected in micro-benchmarks running in a tight CPU-bound loop. Unless escape analysis comes in play.

@romgrk
Copy link
Contributor Author

romgrk commented Mar 17, 2024

I've built a small benchmark to test on each engine, I can replicate the results you see with JSC. Despite the slowdown on JSC, I'd still argue that this implementation is better as the gains for V8 are more drastic, and more V8 users also means we should give more weight to that segment. Note that with this new implementation, V8 and JSC perform roughly the same, whereas the old one was considerably slower on V8 than on JSC.

But if you have more details on those upcoming V8 changes, please do share, I'd be happy to learn more.

### node: v21.7.1 ###
CASE: initial: #####################################............. 75.43% (2305 ops) (0.4338 ms/op)
CASE: updated: ################################################## 100% (3056 ops) (0.3272 ms/op)

### bun: 1.0.31 ###
CASE: initial: ################################################## 100% (3798 ops) (0.2633 ms/op)
CASE: updated: ############################################...... 88.92% (3377 ops) (0.2961 ms/op)

### gjs: 1.78.4 ###
CASE: initial: ################################################## 100% (1226 ops) (0.8157 ms/op)
CASE: updated: #################################################. 98.37% (1206 ops) (0.8292 ms/op)

@liuxingbaoyu
Copy link
Member

Thanks!
The performance changes look good to me, except for a bit of code size advice.

helpers.objectWithoutPropertiesLoose = helper("7.0.0-beta.0")`
  export default function _objectWithoutPropertiesLoose(source, excluded) {
    if (source == null) return {};
    var target = {};
    for (var key in source) {
      if ({}.hasOwnProperty.call(source, key)) {
        if (excluded.indexOf(key) < 0) target[key] = source[key];
      }
    }
    return target;
  }
`;

@romgrk
Copy link
Contributor Author

romgrk commented Mar 17, 2024

I have integrated one change from the snippet above, moving var key inside the for loop. I'm not sure if you're also proposing to remove the newlines, but I've kept the same style as other helpers in the file, which don't seem harmful as they would be minified away. The change from Object.prototype.hasOwnProperty to {}.hasOwnPoperty is not equivalent, {} allocates a new object on the heap, I feel like it might be better to avoid that.

For more details on the Object.keys()/escape analysis/garbage collection: I've digged a bit, neither top-tier optimizer (turbofan for V8 and FTL for JSC) is able to optimize away the Object.keys() call in the original implementation. Which means it is indeed creating temporary arrays that need to be GC'ed, and that means the benchmark for the old implementation doesn't fully reflect the real cost, and thus the real-world performance loss on JSC is likely to be less than what is reflected in them.

@romgrk
Copy link
Contributor Author

romgrk commented Apr 3, 2024

Some corrections, I've inspected JSC's output further and I think I was wrong when I said it didn't do escape analysis. I've added colors to the disassembly and I think it is able to optimize the Object.keys() away, which would explain the superior performance. This means that the benchmarks I showed above are likely to accurately reflect the real-world results.

Expand disassembly

image

I also previously wrote that chrome's market share is 65%, but including edge in the mix means V8 has a 70% usage currently, so I'd still recommend merging this one.

You have all the information, but if you don't feel like merging the PR feel free to close it.

Copy link
Member

@liuxingbaoyu liuxingbaoyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your detailed explanation!

In fact, we currently have terser enabled for most complex helpers, which compresses Object.prototype.hasOwnProperty into {}.hasOwnProperty. I'm surprised this affects performance.

https://jsbench.me/8jluk7693l/1

I did a simple test and it seems it doesn't affect performance in v8.

@romgrk
Copy link
Contributor Author

romgrk commented Apr 3, 2024

Yeah it's possible that V8 is able to perform ellision on that value through escape analysis. It's also possible that it's not, and the {} allocations need to be later cleared by the GC. AFAIK, allocating a value in a hot page is less expensive than the later GC costs. A micro-benchmark wouldn't show that cost though, because the CPU is running constantly it's less likely that the GC will kick-in. Can't say for sure without more research though, so by default I usually avoid allocations.

@nicolo-ribaudo
Copy link
Member

I don't like market-share-based arguments (if we optimize for V8, then it gives even more advantage to V8 leading to it having a higher market share). However, given that the performance improvement here is higher than the perf regression in SM/JSC, I'm fine with this PR.

@nicolo-ribaudo nicolo-ribaudo changed the title Performance: improve objectWithoutPropertiesLoose Performance: improve objectWithoutPropertiesLoose on V8 Apr 4, 2024
@nicolo-ribaudo nicolo-ribaudo merged commit e96a05d into babel:main Apr 4, 2024
51 checks passed
@nicolo-ribaudo nicolo-ribaudo added area: perf PR: Performance 🏃‍♀️ A type of pull request used for our changelog categories labels Apr 4, 2024
@romgrk
Copy link
Contributor Author

romgrk commented Apr 4, 2024

I don't love having the ecosystem that much in the hands of google either, but JSC is already vastly more performant than V8 anyway, and is still more performant than V8 for this particular case even with the change.

@romgrk romgrk deleted the perf-object-loose branch April 4, 2024 16:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: perf PR: Performance 🏃‍♀️ A type of pull request used for our changelog categories
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants