Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: initial async generator support #1560

Merged
merged 8 commits into from Aug 5, 2018
Merged

Conversation

aearly
Copy link
Collaborator

@aearly aearly commented Jul 10, 2018

Closes #1551

We have to create a special eachOfLimit implementation for this, but once fully tested, all collections methods should fall into place.

@aearly aearly added this to the 3.0 milestone Jul 10, 2018
@aearly aearly added the feature label Jul 10, 2018
running -= 1;
if (err) return handleError(err)

if (err === false) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid calling replenish int his case and go straight to calling the complete callback?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I should return in this block.


function replenish() {
//console.log('replenish')
if (running >= limit || awaiting) return
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you end up with more than one generator running asynchronously with this code? If I understand the code correctly, replenish only gets called

  1. to spawn the generator when asyncEachOfLimit is called
  2. whenever a generator completes
  3. after iterateeCallback is called

Do we not need to loop limit - running inside of replenish to call several generators at once?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'd only end up with more than one iteratee running if your iteratee takes longer to process than your generator takes to generate. For example, in the tests, I have the generator ticking every 1 ms, but have the iteratee take 5 ms.

Async generators are like streams -- you get a linear sequence of items. You can't await multiple items at the same time (and if you did, you'd get the same item multiple times, as you do when you call .then() on the same promise multiple times). This is why we can't loop in replenish -- we'd end up calling the iteratee with the same item multiple times.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see thanks for explaining, I haven't used generators before.

So for my understanding, the replenish inside of the generator's then function is to spawn the asynchronous processes up until limit are running and the replenish inside of the iterateeCallback is to trigger additional items to run if the limit was reached?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't await multiple items at the same time (and if you did, you'd get the same item multiple times...)

Are you sure of this, as is right now we have the possibility of calling next multiple times before the promise resolves if iteratee is synchronous. In such a case will things break?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is why we can't loop in replenish -- we'd end up calling the iteratee with the same item multiple times.

Sorry, I may be misunderstanding something, but I thought calls to next are queued and resolved separately. I'm looking at the Async iterators and async iterables section of the proposal.

That being said, I think we should still probably avoid looping in replenish as we still have to wait for earlier promises to resolve.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, you're right, I was confusing .next() with .then(). We can call .next() as many times as we need to, perhaps up to the concurrency limit.

if (running >= limit || awaiting) return
//console.log('replenish awaiting')
awaiting = true
generator.next().then(({value, done: iterDone}) => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you have a generator that has no items?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind dumb question, I see how this works

//console.log('done')
return callback(null);
}
replenish()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you have to avoid calling replenish when done === true or have replenish check if done === true.

Lets say hypothetically we have a limit of 2 and the final 2 iteratees are running. When one of these iteratees resolves, running = 1 and done = true but we'll still replenish and call the generator.next().then(). This has the possibility of entering a race condition, if the other iteratee resolves in the time between the then callback is called, you can actually end up calling callback twice

if (running >= limit || awaiting) return
//console.log('replenish awaiting')
awaiting = true
generator.next().then(({value, done: iterDone}) => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to what @megawac was saying, I think we need a if (done) return; line at the very top of this callback. If one of the previous iteratees cancelled or errored, we shouldn't be invoking the iteratee with successive items.

For example, if the first iteratee starts and we start waiting for the second item. If the first iteratee cancels before the then callback is called with the second item, when it is called, we would still invoke the iteratee with the second item.

@@ -15,6 +17,9 @@ export default (limit) => {
if (!obj) {
return callback(null);
}
if (isAsyncGenerator(obj)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a minor thing, but should this be expanded to support async iterators in general? Something similar to how we currently have getIterator except with Symbol.asyncIterator.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good point, didnt know about that detail of the spec.

replenish()
}

function handleError(err) {

This comment was marked as resolved.


function replenish() {
//console.log('replenish')
if (running >= limit || awaiting || done) return
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, I experimented with calling replenish() multiple times, awaiting mutiple .next()s at the same time, but there's some grey area in how it could work. Since you don't know the length of the iterator up front, you have to set some arbitrary limits in the pure parallel case, e.g. 10. Whenever you call .next() N times synchronously, you end up calling next() and receiving {done: true} N times spuriously at the end. We don't need to await multiple .next()s at the same time, because as far as I know, there will be a strict linear sequence of yields in the generator, meaning it's pointless to await more than one. It leaves the job of backpressure to the generator implementation too.

@megawac
Copy link
Collaborator

megawac commented Jul 22, 2018

@aearly can we update our eslint config to be consistent on semi colons please

@aearly
Copy link
Collaborator Author

aearly commented Jul 30, 2018

Okay, I've added support for async iterables. Implementation was pretty simple, hard part is getting all the various babel stuff to be happy.

@aearly
Copy link
Collaborator Author

aearly commented Jul 30, 2018

Also, I'd want to do a linter change in a separate PR. Going to be a big ugly diff if we switch.

@aearly aearly changed the title [wip] initial async generator support feat: initial async generator support Aug 5, 2018
@aearly aearly merged commit 61268c5 into master Aug 5, 2018
@aearly aearly deleted the async-generator-support branch August 5, 2018 20:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants