Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Access the .next() method once, at the beginning, in the iteration protocol #976

Closed
kmiller68 opened this issue Aug 16, 2017 · 21 comments
Closed

Comments

@kmiller68
Copy link
Contributor

kmiller68 commented Aug 16, 2017

Note: all of the following applies to Arrays, Strings, Sets, and Maps but I'm using Arrays as an example for simplicity.

Edit: I'm withdrawing my first proposal in favor of the second. I'm leaving striked out for posterity.

Since %ArrayIteratorPrototype%.next does not specify any attributes, it is currently not-writable but is configurable. From an implementation standpoint, the fact that the next function can change each iteration of a loop makes it substantially harder to optimize for-of loops over Arrays. In particular, this makes writing optimizations harder. While it's not impossible to work around it substantially complicates the implementation.

I would propose changing the %ArrayIteratorPrototype%.next property to be non-configurable and non-writable. I would argue that changing the %ArrayIteratorPrototype%.next property adds little value as all syntactic iteration of Arrays would provide an opaque object anyway. A much simpler way to change the iteration of Arrays is to just change the Symbol.iterator property. In fact, I don't believe there is any loss of functionality by making %ArrayIteratorPrototype%.next property non-configurable and non-writable.

Another, perhaps more controversial, option would be to change how the syntactic iteration works to fetch the next property exactly once, before starting the loop. This is more likely to be a breaking change but would likely have the benefit of speeding up start up times for iterators. Like before I don't think any functionality is limited by getting the next property exactly once since it's possible to simulate changing the next function inside the next function itself.

Thoughts?

@erights
Copy link

erights commented Aug 16, 2017

Both are potentially breaking changes. Of the two, I strongly prefer the second. Existing monkey-patches (aka shims, polyfills) of primordials are likely broken by the first and not broken by the second. There monkey patches generally do a static initialize-time replacement and do not change the methods dynamically, or indeed at all after initialization.

Note that the second resembles the proxy optimization that Tom and I proposed, that everyone claimed would not help implementations optimize.

@kmiller68
Copy link
Contributor Author

kmiller68 commented Aug 16, 2017

@erights I totally agree that both are potentially breaking changes that we cannot do if things actually break. I actually prefer the [[get]] of the next property once more than freezing the next property too. Also, that option works better in the general case for an interpreter.

@bmeurer
Copy link

bmeurer commented Aug 17, 2017

Besides the issue of potentially breaking existing code (which is probably not a big deal in this case), I generally like the idea. I suppose every engine has to jump through some hoops to guard against reconfiguration of the next methods at the moment (or just don't optimize aggressively).

@annevk
Copy link
Member

annevk commented Aug 17, 2017

cc @bzbarsky

@allenwb
Copy link
Member

allenwb commented Aug 17, 2017

I should point out that these issues were all discussed within TC39 during the development of ES6 and the current design represents the consensus decisions. At a minimum any proposal to change the design should review the past discussion (see, meeting notes and es-discuss)

@kmiller68
Copy link
Contributor Author

@allenwb As far as I can tell from greping es-discuss / meeting notes for "next" there's no mention of the next method being swizzled mid-stream. The closest thing are code examples happen to show the current semantics but they appear in unrelated discussions to what's being proposed here. There are, however, lots of discussions about the next methods returning new objects but that's unrelated. Do you recall a specific conversation about either of these proposals?

@allenwb
Copy link
Member

allenwb commented Aug 17, 2017

I don't recall details of any specific conversation. It probably would have been about ArrayIterator prototype rather than specifically about the next methods. Might also have been on bugs.ecmascrpt.org (currently offline, but I recently heard that there is a chance of it coming back).

The pros/cons of mutable built-in prototypes have been discussed many times over the last 20 years. The conclusion has always been to maintain consistency with the status quo established by the early design of JS. In particular we (eg, the TC39 consensus) has generally be to avoid special case deviations from established patterns if they are solely motivated by speculation about performance or ease of optimization. (imagine if all sorts of accommodations and irregularities had been made in 2005 motivated by the state of the art of JS engines at that time).

In this particular case, how is the the optimization hazard of a mutable %ArrayIteratorPrototype% any different from the hazard of a mutable Array.prototype or String.prototype method used in a hot while loop or for that matter any prototype access that an implementation may wish to optimize. At the highest level of optimization complex statements such as for-of are decomposed into primitive operations to which the actual optimization analysis/techniques are applied. User defined objects and the built-ins are always mutable by default so optimizations must always guard against and deal with mutations that invalidate caches or observations that drive dynamic specializations.

A similar related argument: for-of may be used with user defined iterators as well as built-in automatically generated iterators. Why should the built-in iterators be handled any differently from user defined iterators. Why should they perform better? Rather than adding special case runtime checks to see if an iterator is an Array Iterator isn't it better for an implementation to develop techniques that work for all iterators.

@allenwb
Copy link
Member

allenwb commented Aug 17, 2017

@kmiller68 said:

Since %ArrayIteratorPrototype%.next does not specify any attributes, it is currently not-writable but is configurable.

This is incorrect. Clause 17 says:

Every other data property described in clauses 18 through 26 and in Annex B.2 has the attributes { [[Writable]]: true, [[Enumerable]]: false, [[Configurable]]: true } unless otherwise specified.

%ArrayIteratorPrototype%.next is a data property defined in clause 22.1.5.2.1 and does not explicitly state any deviations from the normal built-in method/data property attribute settings.

@ajklein
Copy link
Contributor

ajklein commented Aug 17, 2017

@allenwb your argument above seems addressed at one of @kmiller68's proposed changes (making next non-configurable) but not the other (only fetching next once at the start of a for-of loop). I don't believe there's any deep JS history about that question; do you recall discussion about that issue?

An advantage of that approach is that it also makes it easier to optimize the author-defined iterators (addressing your concern in your last paragraph). There's also some precedent in ES2015 for such behavior, e.g., the Map and Set constructors, which only lookup the set or add methods once before looping over the passed-in iterable.

@allenwb
Copy link
Member

allenwb commented Aug 17, 2017

@kmiller68 said:

As far as I can tell from greping es-discuss / meeting notes for "next" there's no mention of the next method being swizzled mid-stream.

Pretty much everything in JS is swizzle-able across any operation that has the potential of invoking a function (pretty much any operation, given the existence of Proxy). This was generally understood by TC39 members during the development of ES2015 and since the possibility of swizzling is the norm it would be unremarkable for it to not be explicitly mentioned. If there was a discussion (and I pretty sure there was) it would of been about "should we try to prevent such swizzling" rather `should we enable such swizzling".

@ajklein I don't see how making it non-configurable helps since as I pointed out above, %ArrayIteratorPrototype%.next is writable.

So how does changing the attributes of %ArrayIteratorPrototype%.next make it easier to optimize author-defined iterators?

the Map and Set constructors, which only lookup the set or add methods once before looping over the passed-in iterable.

Ah, but they lookup next in the iterator each time through the loop. Admittedly, that is an internal inconsistency. Generally, for built-ins such as these constructors I assumed that they might be implemented in a manner that was opaque to aggressive interprocedural optimization. So I did the obvious optimizations at the spec. levels. I probably didn't do it for next because it would have required inlining two levels of iterator related abstract operations which would have generally obscured what was going on.

@saambarati
Copy link

saambarati commented Aug 17, 2017

@allenwb The conversation I think we aught to have is about whether or not we look up "next" once or on every iteration. Your previous comments seem to be speaking more to the proposal of making "next" non-configurable, non-writable, on builtin iterator prototypes. I agree that such a change would generally be bad as it would be inconsistent with the rest of JS. But I think looking up "next" only once is not bad, and worth a serious discussion, even if it were implicit in earlier conversations.

@kmiller68
Copy link
Contributor Author

I agree with Saam. At this point I think the consensus is that everyone prefers the look up of next once proposal over changing the builtin next properties. Are you opposed to considering the lookup next once spec change @allenwb?

Ah, but they lookup next in the iterator each time through the loop. Admittedly, that is an internal inconsistency. Generally, for built-ins such as these constructors I assumed that they might be implemented in a manner that was opaque to aggressive interprocedural optimization. So I did the obvious optimizations at the spec. levels. I probably didn't do it for next because it would have required inlining two levels of iterator related abstract operations which would have generally obscured what was going on.

I think I would have disagreed with making the optimization of pulling next out of the loop if it didn't match what for-of and spread did, for internal consistency reasons. FWIW, if we did the lookup next once change in for-of / spread I would also change the Map and Set constructors. I made a rough diff of what the spec changes could look like here.

@erights
Copy link

erights commented Aug 17, 2017 via email

@kmiller68
Copy link
Contributor Author

kmiller68 commented Aug 17, 2017

@erights By "swizzle" I meant change the value of a builtin method. Specifically, changing the iterator's next property during a for-of loop to alter the way the next value is produced for the next iteration of the loop.

@allenwb
Copy link
Member

allenwb commented Aug 17, 2017

I agree with Saam. At this point I think the consensus is that everyone prefers the look up of next once proposal over changing the builtin next properties. Are you opposed to considering the lookup next once spec change @allenwb?

Yes I oppose such a change.

  1. It is a breaking change. Consider:
let itr = {
  [Symbol.iterator]() {return this}, 
  next() {
   console.log("original next");
   return {value:undefined};
 }
};
let ctr=0;
for (let x of itr){
    if (ctr++ >= 5) itr.next = () => {
     console.log("replacement next");
     return {done:true}
    }
}
console.log("loop done");

Currently (I tested it) the loop runs 6 times, outputting "original next" 5 times and "replacement next" once. With the suggest spec. change this is an infinite loop, outputting "original next" each iteration.

  1. Regardless of whether this particular code snippet is good style or not very useful, it is understandable with only a basic understanding of iterables and the for-of statement. Basic understanding: for-of calls the next method of the iterator before each loop iteration and terminates if the next returns an object whose done property has the value true. The proposed change would make it impossible to understand the above code within having a much deeper understanding of detailed for-of semantics (the next method of the iterator is only accessed once and its value is cached until the loop terminates.

  2. It is a refactoring hazard that must be considered anytime a for-of loop is transformed to/from any other control abstraction (perhaps an application defined iteration function) that uses an iterator.

  3. Are there any measurements that suggest that this change would make a difference in any real, non-synthetic benchmark code.

  4. Multiple property lookups of the next method should be optimizable using a PIC. It is the same situation as any other property access on an object with a stable shape and stable method values. Implementations should focus on optimizing the general case of such property accesses rather than worry about this one specific occurrence.

The above is basically a sample of the sort of thing that was discussed many times about many different details of the existing language design. Basically: keep things internally consistent, don't be tricky, do the obvious thing, leave optimization to the engines.

@kmiller68
Copy link
Contributor Author

kmiller68 commented Aug 18, 2017

1 . We make breaking changes all the time. There is a policy of making them assuming there is sufficient value. Now, you might disagree there is sufficient value here but I have heard much evidence to the contrary. I know web developers have told me they are not permitted by style rules to use iterators in their code because it regresses their page load time v.s. indexed loops. Keep in mind the place where using iterators makes a major difference is not in the highest tier of the optimizing compiler, there is practically no difference there. In fact, in JSC we will not even allocate the iterator object or the result object most of the time. The place where this does matter is in the interpreter and/or non-optimizing JIT, which is where most of the execution time on page load is spent. As far as I know, almost every major website regularly tracks the performance of their page load time.

2 . By the same token, the way that Map/Set constructors were implemented then would be just as bad... Also, I highly doubt anyone would reasonably consider changing the next function on each iteration of the for-of loop so this feels like a red herring. Additionally, if for-of does not observe the fact that they changed the next function then why would anyone assume that it should change on each iteration? The only reason they would is because we define it that way. That's not how you defined the Map/Set constructors, for example.

4 . My goal here is to improve the performance of iterators in the interpreter not in optimizing compliers. While the spec should not favor builtin iterators at the expense of custom iterators, the reality is that builtin objects are use disproportionately in most JS applications. Any VM that did not make special optimizations for builtin operations, not even just limited to the interpreter, would be at a practical disadvantage to one that does. Keep in mind that all VMs try to optimize custom operations as much as possible. For what it's worth, I agree that in the long run it's actually to a VM's advantage to try to build a general solution to most problems. My goal here, ultimately, is to make iterating Arrays/Strings/Sets/Maps in our interpreter as fast as possible. Right now, without special optimizations in the JSC interpreter, iterating an Array/String is 6-7x slower than using an indexed loop. In the case of Arrays/Strings, ideally, it would be faster to use an iterator than using an indexed loop. Not making some change here makes that task much harder and takes time from making other, more general, optimizations.

Overall, I agree with your perspective that it's important to keep the language as internally consistent as possible. I disagree, however, that consistency should be the only factor that matters. I actually don't even think this change is internally inconsistent; @ajklein already has provided examples where in the JS spec differs from your claim of consistency. I think it's important for TC-39 to operate under the idea that as long as developer economics are not appreciably impacted it's valuable to enable as many optimizations outside optimizing compilers as possible. It's also important to make sure the language is as useful as possible. Historically, in the case of iterators, the death sentence for them has been that they are much slower than an indexed loop.

@bterlson
Copy link
Member

I think this should be advanced as a proposal or needs-consensus PR. Any volunteers to write one up?

@zloirock
Copy link

zloirock commented Sep 1, 2017

I think that making something non-configurable / non-writable can cause a bad precedent which will cause a serious problem for polyfills in the future. If you wanna optimize for-of - you could check not only that iterator is an instance of ArrayIterator / [Symbol.iterator] of iterable is Array#values, but also that .next iterator method is original ArrayIterator#next method / ArrayIterator#next without changes. 1 additional property access is not something critical.

@ljharb
Copy link
Member

ljharb commented Sep 1, 2017

I would much prefer the property be accessed once and cached, than made unchangeable - for polyfill reasons particularly.

I think the likelihood of "changing the next method mid-iteration" is significantly low, because it's so unergonomic to do so.

Is there any data as to who would be broken by either change? My assumption is that "grab .next once" won't actually break any real code, even if it breaks contrived examples.

@littledan littledan changed the title Making %ArrayIteratorPrototype%.next non-configurable/non-writable Access the .next() method once, at the beginning, in the iteration protocol Sep 1, 2017
@erights
Copy link

erights commented Sep 2, 2017

Again reusing ideas from tvcutsem/es-lab#21 , a similar change that would be almost unobservable is:

If the .next property is a non-writable non-configurable data property and either
* it is an own property, or
* it is inherited by a non-extensible inheritance chain that does not override it,
then the for-of loop safely fetches it only once.

This change would only be observable if the involved objects are proxies are extremely exotic. Otherwise it could not be observed, and so should break essentially no code.

hubot pushed a commit to WebKit/WebKit-http that referenced this issue Sep 23, 2017
https://bugs.webkit.org/show_bug.cgi?id=175653

Reviewed by Saam Barati.

JSTests:

Change test to match the new iteration behavior.

* stress/spread-optimized-properly.js:

Source/JavaScriptCore:

This patch speculatively makes a change to the iteration protocall to fetch the next
property immediately after calling the Symbol.iterator function. This is, in theory,
a breaking change, so we will see if this breaks things (most likely it won't as this
is a relatively subtle point).

See: tc39/ecma262#976

* builtins/IteratorHelpers.js:
(performIteration):
* bytecompiler/BytecodeGenerator.cpp:
(JSC::BytecodeGenerator::emitEnumeration):
(JSC::BytecodeGenerator::emitIteratorNext):
(JSC::BytecodeGenerator::emitIteratorNextWithValue):
(JSC::BytecodeGenerator::emitDelegateYield):
* bytecompiler/BytecodeGenerator.h:
* bytecompiler/NodesCodegen.cpp:
(JSC::ArrayPatternNode::bindValue const):
* inspector/JSInjectedScriptHost.cpp:
(Inspector::JSInjectedScriptHost::iteratorEntries):
* runtime/IteratorOperations.cpp:
(JSC::iteratorNext):
(JSC::iteratorStep):
(JSC::iteratorClose):
(JSC::iteratorForIterable):
* runtime/IteratorOperations.h:
(JSC::forEachInIterable):
* runtime/JSGenericTypedArrayViewConstructorInlines.h:
(JSC::constructGenericTypedArrayViewFromIterator):
(JSC::constructGenericTypedArrayViewWithArguments):

LayoutTests:

Change test to match the new iteration behavior.

* js/sequence-iterator-protocol-2-expected.txt:
* js/sequence-iterator-protocol-2.html:


git-svn-id: http://svn.webkit.org/repository/webkit/trunk@222421 268f45cc-cd09-0410-ab3c-d52691b4dbfc
bterlson added a commit that referenced this issue Oct 16, 2017
As discussed in #976, this PR changes the how iteration works in the spec. Previously, the next method would be fetched off the iterator object on each iteration. This PR changes that to happen exactly once at the start of the protocol.

Some places in the spec used to pass a Record around with the iterator and the completion status and others just pass the iterator object. This has been simplified to make every point of iteration pass a Record containing the iterator, completion status, and now the next method.
@bterlson
Copy link
Member

bterlson commented Jul 9, 2018

This is addressed by #988/#1021.

ryanhaddad pushed a commit to WebKit/WebKit that referenced this issue Dec 22, 2020
https://bugs.webkit.org/show_bug.cgi?id=175653

Reviewed by Saam Barati.

JSTests:

Change test to match the new iteration behavior.

* stress/spread-optimized-properly.js:

Source/JavaScriptCore:

This patch speculatively makes a change to the iteration protocall to fetch the next
property immediately after calling the Symbol.iterator function. This is, in theory,
a breaking change, so we will see if this breaks things (most likely it won't as this
is a relatively subtle point).

See: tc39/ecma262#976

* builtins/IteratorHelpers.js:
(performIteration):
* bytecompiler/BytecodeGenerator.cpp:
(JSC::BytecodeGenerator::emitEnumeration):
(JSC::BytecodeGenerator::emitIteratorNext):
(JSC::BytecodeGenerator::emitIteratorNextWithValue):
(JSC::BytecodeGenerator::emitDelegateYield):
* bytecompiler/BytecodeGenerator.h:
* bytecompiler/NodesCodegen.cpp:
(JSC::ArrayPatternNode::bindValue const):
* inspector/JSInjectedScriptHost.cpp:
(Inspector::JSInjectedScriptHost::iteratorEntries):
* runtime/IteratorOperations.cpp:
(JSC::iteratorNext):
(JSC::iteratorStep):
(JSC::iteratorClose):
(JSC::iteratorForIterable):
* runtime/IteratorOperations.h:
(JSC::forEachInIterable):
* runtime/JSGenericTypedArrayViewConstructorInlines.h:
(JSC::constructGenericTypedArrayViewFromIterator):
(JSC::constructGenericTypedArrayViewWithArguments):

LayoutTests:

Change test to match the new iteration behavior.

* js/sequence-iterator-protocol-2-expected.txt:
* js/sequence-iterator-protocol-2.html:


Canonical link: https://commits.webkit.org/193717@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@222421 268f45cc-cd09-0410-ab3c-d52691b4dbfc
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants