Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use new Poll class for filebrowser contents polling. #6305

Merged
merged 7 commits into from May 11, 2019

Conversation

ian-r-rose
Copy link
Member

Uses the Poll class for the filebrowser refresh scheduling.

References

Fixes #6157.

Code changes

Internal changes to polling logic of the filebrowser. Replaces some crude backoff logic with the smarter Poll.

User-facing changes

Some differences in how polling backs off upon failures to connect, but the user would have to be looking very hard to see it.

Backwards-incompatible changes

None

@ian-r-rose ian-r-rose requested a review from afshin May 4, 2019 02:31
@jupyterlab-dev-mode
Copy link

Thanks for making a pull request to JupyterLab!

To try out this branch on binder, follow this link: Binder

@ian-r-rose ian-r-rose added this to the 1.0 milestone May 4, 2019
let date = new Date().getTime();
if (date - this._lastRefresh > refreshInterval) {
return this.refresh();
}
Copy link
Contributor

@jasongrout jasongrout May 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of this logic can go away with the new Poll class, I think. It's sufficient to just call the update method. When we want to have the poll immediately refresh, we call the poll's refresh method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good point. It also does some logic around a minimum refresh interval, which I tried to keep intact.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We used to have a minimum interval option in Poll, but it didn't handle manual refreshes (it assumed a manual refresh was always immediate). Perhaps we add a minimum refresh interval to Poll for this debouncing. I think it would be pretty straightforward - the poll refresh() method would just schedule using logic like here.

So this logic isn't really a true debounce, in that it doesn't schedule a refresh according to MIN_REFRESH, it simply ignores requests if they are too soon after the last tick. That's perhaps a little frustrating - you click and click and click, and if you stop clicking before MIN_REFRESH, it's like you never clicked at all.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, it's a bit more complicated for Poll, since we really want to debounce the callback (and every time we are about to call the callback, we check to see if we should go into standby, etc.). So that means the Poll should really be storing the last time it called the callback and make sure it's not calling the callback more often than it should.

@@ -104,7 +104,23 @@ export class FileBrowserModel implements IDisposable {
};
window.addEventListener('beforeunload', this._unloadEventListener);
this._scheduleUpdate();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something should be done with _scheduleUpdate too - that should probably be going away with the conversion to using Poll.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the refresh() method may need some work too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_scheduleUpdate has some logic around minimum refresh intervals, which I was endeavoring to leave intact for now. This is triggered upon fileChanged signals. I'm not aware of the history of that logic, but had thought that maybe there was some debouncing that was necessary.

If not, I'm happy to simplify this further, was just starting out with a more conservative approach.

if (date - this._lastRefresh > MIN_REFRESH) {
void this.refresh();
} else {
this._requested = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, so it actually was doing a debounce, in that if we tried to schedule an update too soon, it would note it and follow up with an update at some point in the future.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it was doing that.

Note, however, that the user clicking refresh() circumvents the polling all-together, so they get immediate feedback.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, the original intent was for the auto-poll to not happen too soon after a user or programmatically triggered poll.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we should be okay with deleting the logic entirely, then. If a user clicking on the refresh button calls the poll refresh, the poll knows about the refresh and its interval starts over again.

@jasongrout
Copy link
Contributor

jasongrout commented May 4, 2019

Is "refresh" the user-facing function to refresh the browser? If so, I think we should change refresh to call the poll refresh, and move the current refresh logic to the poll's callback. Then I think we can delete _scheduleUpdate and all other logic dealing with the last refresh time, unless we need to keep track of the last refresh time for other reasons (like to display to the user somewhere when the last time things were refreshed...perhaps that would be good tooltip info on the refresh button).

@blink1073
Copy link
Member

I think if we wanted to know the last time it was run that could be a property of the Poll object, but we aren't currently using that info iirc.

@ian-r-rose
Copy link
Member Author

Okay, sounds to me like the consensus is to completely remove the special logic and use a vanilla poll. If we find we still want some of the richer behavior around minimum poll intervals or storing the last-polled-time, we can add them to Poll.

@@ -239,8 +228,7 @@ export class FileBrowserModel implements IDisposable {
* Force a refresh of the directory contents.
*/
refresh(): Promise<void> {
this._lastRefresh = new Date().getTime();
return this.cd('.');
return this._poll.refresh();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably want to refresh, then return the poll's tick promise. The refresh just schedules the refresh, the actual action happens at the end of the tick.

Copy link
Contributor

@jasongrout jasongrout May 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just make an async function like the tests:

await this._poll.refresh();
await this._poll.tick;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks, I had misunderstood the function signature.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed making the refresh return the tick promise, but then it would be inconsistent with the other scheduling functions like start, etc. It would be convenient, though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should at least make a note in the refresh method about this subtlety.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like Poll has some logic around pending requests in which new calls to refresh() supersede older ones. That is not really what we want here, since calls to FileBrowserModel.refresh() can depend on a previous cd (which is what #5224 was fixing). I suspect this is a case that is not really intended to be addressed by the Poll functionality, where invocations of its actions are more-or-less memory-less.

We can get around this by not calling poll.refresh in model.refresh, and relying on the custom logic in cd around pending paths, but it's not particularly clear.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Roughly, the order I think should be (IIRC):

  • Call poll.refresh
  • poll.refresh calls schedule async
  • poll.schedule resolves the current poll state (awaiting current tick promise resolve, which should happen after every other thing that was waiting on the current tick)
  • poll.schedule then requests an animation frame to run the poll action (but does not await it).
  • poll.schedule returns

At this point the await poll.refresh should return, and you await the poll.tick promise. The animation frame triggers, which awaits the poll action, then poll.schedules a 'resolved' state for the normal poll interval, and as part of the poll schedule, it resolves the poll.tick promise you were awaiting. The poll.tick promise is resolved with the poll instance, so you can check the poll state (which should be the 'resolved' state that was scheduled).

Does that help?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so (await poll.tick()).state.payload should be the value returned from your factory function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Roughly, the order I think should be (IIRC):

* Call poll.refresh

This was happening as expected.

* poll.refresh calls schedule async

Also happening as expected.

* poll.schedule resolves the current poll state (awaiting current tick promise resolve, which should happen after every other thing that was waiting on the current tick)

I don't think this is occurring as described, in particular, the resolution of the current tick does not seem to wait on the factory function:

// Emit ticked signal, resolve pending promise, and await its settlement.
this._ticked.emit(this.state);
pending.resolve(this);
await pending.promise;

* poll.schedule then requests an animation frame to run the poll action (but does not await it).

Agreed.

* poll.schedule returns

Agreed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is occurring as described, in particular, the resolution of the current tick does not seem to wait on the factory function:

The poll action happens in the execute, and after the factory is run and the result resolves, then schedule is called to schedule a resolve state. That's why the tick happens after the factory runs.

@ian-r-rose
Copy link
Member Author

Ooh, kind-of-confusing test from #5224 paying off.

@jasongrout
Copy link
Contributor

jasongrout commented May 4, 2019

Don't await poll.refresh in filebrowser model refresh, as it cancels previous invocations of refresh, breaking chained directory changes.

The poll design is such that you should always await the tick to ensure the last action was accomplished, otherwise things can preempt it (for example, since promise chains use microtasks, which are scheduled before tasks like requestAnimationFrame, doing: await poll.refresh(); await poll.refresh(); cancels the first poll refresh).

Was it still having an issue when you awaited the poll.tick promise in the filebrowser refresh method?

@ian-r-rose
Copy link
Member Author

Yes, awaiting the poll.tick was not having the intended effect, and was resulting in out-of-order directory changes in a test designed to catch that.

this._scheduleUpdate();
this._startTimer();
this._poll = new Poll({
factory: () => this.cd('.'),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the problem is really that this factory function should be async. It's not waiting for the cd to finish.

Copy link
Contributor

@jasongrout jasongrout May 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that was an incorrect comment. It is returning a promise, so that should be good.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is async, since this.cd('.') returns a promise. In some of my testing it was explicitly marked as async, which made no difference.

@jasongrout
Copy link
Contributor

Was the difference that https://github.com/jupyterlab/jupyterlab/pull/5224/files#diff-d1e4e18fcba11094ca3a885e7d8b9d3aR273 just returned ['src'], that the first refresh was canceled? That sounds like better behavior to me.

@jasongrout
Copy link
Contributor

So I think a consequence of using poll for the model refresh is that you can call refresh all you want in a task, and only the last one will trigger a server request in an upcoming task.

@ian-r-rose
Copy link
Member Author

No, the expected behavior is that it is first in the root directory, then in src, so paths = ['', 'src']. When awaiting poll.tick we get paths = ['src', 'src']

@jasongrout
Copy link
Contributor

No, the expected behavior is that it is first in the root directory, then in src, so paths = ['', 'src']. When awaiting poll.tick we get paths = ['src', 'src']

Why is ['', 'src'] better?

I'm a bit surprised you get two 'src's, unless something in cd causes you to spill to another task.

@ian-r-rose
Copy link
Member Author

The original problem of #4009 was that our initial fetch of the directory contents was called in the constructor and not awaited. That fetch could then come in after subsequent cds and result in a wrong directory content. #5224 was an attempt to specifically handle a slow initial fetch, but using poll.refresh circumvents that catch.

@ian-r-rose
Copy link
Member Author

So ['', 'src'] is better because it means that the asynchronous initialization of the browser model was able to complete. We may want to handle it differently now, but this was a really subtle race condition to figure out, and we want to make sure that #4009 doesn't regress.

@jasongrout
Copy link
Contributor

So ['', 'src'] is better because it means that the asynchronous initialization of the browser model was able to complete.

I see your latest commit changes this. What is your reasoning now?

* #### Notes
* The returned promise resolves after the tick is scheduled, but before
* the polling action is run. To wait until after the poll action executes,
* await the `poll.tick` promise: `await poll.refresh(); await poll.tick;`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member

@afshin afshin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thank you @ian-r-rose. Looks good.

👍

@ian-r-rose
Copy link
Member Author

I'd like to take one more pass at figuring out the "out-of-order-cd" test before merging.

@afshin
Copy link
Member

afshin commented May 8, 2019

@blink1073: I think if we wanted to know the last time it was run that could be a property of the Poll object, but we aren't currently using that info iirc.

Since the poll emits a signal every tick, a client can keep track of the last tick that matches whatever criterion the client cares about, e.g., if the tick was successful (resolved) or reconnected or whatever else.

Also, Poll#state.timestamp shows the poll's time signature at the time of the last Poll#ticked signal.

@jasongrout
Copy link
Contributor

Also, Poll#state.timestamp shows the poll's time signature at the time of the last Poll#ticked signal.

Another way to think about it is state.timestamp is set when the current tick was scheduled. state.interval gives the intended time until the tick happens.

@jasongrout
Copy link
Contributor

@afshin, where did you get the time machine?
Screen Shot 2019-05-08 at 10 44 44 AM

@ian-r-rose
Copy link
Member Author

Okay, I think this is good to go.

@jasongrout I took a closer look at the cd('.') logic, and did some manual testing, and I think you are right that this behavior is better. It allows previous requests to be superseded by the more recent cd in a way that means we are generally less aggressive about hitting the backend, as the Poll class was intended to do.

@jasongrout
Copy link
Contributor

+29, -62 lines! Hooray for cleaning out code!

@jasongrout jasongrout merged commit f233cb5 into jupyterlab:master May 11, 2019
@jasongrout
Copy link
Contributor

Code looks good, thanks! I also tested this in binder and it seemed to work well there too.

@lock lock bot added the status:resolved-locked Closed issues are locked after 30 days inactivity. Please open a new issue for related discussion. label Aug 7, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Aug 7, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
pkg:filebrowser status:resolved-locked Closed issues are locked after 30 days inactivity. Please open a new issue for related discussion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor filebrowser model to use Poll class instead of ad hoc setInterval poll.
4 participants