Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Rework kernel status #4724

Closed
wants to merge 24 commits into from
Closed

Conversation

jasongrout
Copy link
Contributor

@jasongrout jasongrout commented Jun 14, 2018

Related to work in and built on top of #4697.

@jasongrout jasongrout added this to the Beta 3 milestone Jun 14, 2018
@jasongrout
Copy link
Contributor Author

Related issue about notifying on kernel status change: #4748

@jasongrout
Copy link
Contributor Author

jasongrout commented Jul 19, 2018

@minrk and I talked more about this today. Here are some thoughts:

The kernel states are conflating the connection status with the kernel status. Actually, there are three sources for status here: the client (mostly dealing with connection state), the notebook server (dealing with kernel management, like a restarting state), and the kernel itself.

Perhaps we need two states: the connection state and the kernel state. When we are in the disconnected state, the kernel state is unknown.

Another thing: on restart, don't create a new websocket connection to the kernel. Keep the existing connection, and we'll receive the restarting event.

Here are some scenarios:

New Kernel start

  1. Start at connection disconnected, state unknown
  2. Request a kernel and initiate a websocket connection - connection -> connecting
  3. Websocket connection set up: connection -> connected
  4. Send a request_kernel_info to cache the kernel info.
  5. Receive the kernel info reply, cache the kernel info, thus receiving several kernel state messages such as busy, idle, so that the state ends up on idle.
  6. Send the pending messages queued up while the kernel was not connected.
  7. Resolve the kernel ready promise, signaling that the kernel information is cached and the kernel is ready for messages

Kernel connection to existing kernel
Exactly like the new kernel start, but we don't request a kernel start, but just initiate a websocket connection to the url we already know.

Kernel restart

Don't disconnect the kernel in this entire process.

  1. Request a kernel restart from the rest api. Set a restart_requested flag that we requested a restart. Don't set the state to restarting, since we might have more iopub messages coming through, especially when the notebook flushes the iopub buffer.
  2. Wait until the notebook's restarting kernel status comes through. Change the kernel state to restarting. If the requested_restart flag is set, don't notify the user (since the user requested a restart). If the flag is not set, we have an automatic restart, so notify the user somehow.
  3. Set kernel.isReady to false, and make a new ready() promise. Start queuing messages here?
  4. The kernel starting status might come through, make that a status? Since it's supposed to be an optional status, perhaps we shouldn't make it a recognized value?
  5. On kernel restart (e.g., the first message that comes through that uses a different kernel session id?), send a kernel info request.
  6. Just as in kernel start set the kernel to ready and resolve the ready promise

@jasongrout
Copy link
Contributor Author

A big question: do we want the kernel status / signal to be only some specific values, or do we want to acknowledge kernels can have their own optional status? Do we want the kernel's status to just be a string? Or do we want to ignore status values we don't understand?

@blink1073
Copy link
Member

Set kernel.isReady to false, and make a new ready() promise. Start queuing messages here?

Sounds good

The kernel starting status might come through, make that a status? Since it's supposed to be an optional status, perhaps we shouldn't make it a recognized value?

It doesn't look like it is optional here. # The kernel will publish state 'starting' exactly once at process startup.

On kernel restart (e.g., the first message that comes through that uses a different kernel session id?), send a kernel info request.

Sounds good

Do we want the kernel's status to just be a string? Or do we want to ignore status values we don't understand?

I'd say forward them all since kernel-specific extensions may want that information

@jasongrout
Copy link
Contributor Author

Perhaps also the connected state should just be a boolean flag, rather than also having a 'connecting' state signaling when we are actively trying to connect.

@blink1073
Copy link
Member

I think connection state should be: connecting, connected, disconnected.

@blink1073
Copy link
Member

We are disconnected if we explicitly disconnect or give up on retries.

@jasongrout
Copy link
Contributor Author

I think connection state should be: connecting, connected, disconnected.

I guess we want to make sure the user knows through some UI element that we are trying to connect, so I think you're right that it makes sense to make that a specific state.

@jasongrout
Copy link
Contributor Author

It doesn't look like it is optional here. # The kernel will publish state 'starting' exactly once at process startup.

See the open PR to jupyter_client, based on conversations with @minrk about kernel status: jupyter/jupyter_client#388

@ellisonbg
Copy link
Contributor

@jasongrout are you still wanting this to get into Monday's 0.34 release?

@blink1073 blink1073 modified the milestones: 0.34, 0.35 Aug 13, 2018
@blink1073 blink1073 modified the milestones: 0.35, 1.0 Sep 5, 2018
@jasongrout
Copy link
Contributor Author

It doesn't look like it is optional here. # The kernel will publish state 'starting' exactly once at process startup.

By the way, we were also discussing lots of updates to that document: jupyter/jupyter_client#388

@jasongrout
Copy link
Contributor Author

jasongrout commented Nov 29, 2018

In the kernel restart scenario in #4724 (comment), we should switch steps 2 and 3, so that a user can listen for the 'restarting' status and in that handler, await the kernel ready promise. So the revised steps look like:

Kernel restart

Don't disconnect the kernel in this entire process.

  1. Request a kernel restart from the rest api. Set a restart_requested flag that we requested a restart. Don't set the state to restarting, since we might have more iopub messages coming through, especially when the notebook flushes the iopub buffer.
  2. Wait until the notebook's restarting kernel status comes through.
  3. Set kernel.isReady to false, and make a new ready() promise. Start queuing messages here?
  4. Change the kernel state to restarting. If the requested_restart flag is set, don't notify the user (since the user requested a restart). If the flag is not set, we have an automatic restart, so notify the user somehow.
  5. On kernel restart (e.g., the first message that comes through that uses a different kernel session id?), send a kernel info request.
  6. Just as in kernel start set the kernel to ready and resolve the ready promise

@jasongrout
Copy link
Contributor Author

jasongrout commented Nov 29, 2018

Some further simplifications based on conversations with @saulshanabrook, @blink1073:

  1. Get rid of the public kernel isReady value and ready promise. Instead, if you want to do something on kernel startup, you can just listen for the kernel status to change to restarting (at which point you can send a message and it will be queued).
  2. we discussed removing the connection status (implying it from an unknown kernel state), but decided to keep it because it really is different information, and this nicely paves the way to a app-wide connection status when we eventually have a single websocket for all server communication.

So now the scenarios look like:

New Kernel start

  1. Start at connection disconnected, state unknown
  2. Request a kernel and initiate a websocket connection - connection -> connecting
  3. Websocket connection set up: connection -> connected
  4. Send a request_kernel_info to cache the kernel info.
  5. Receive the kernel info reply, cache the kernel info, thus receiving several kernel state messages such as busy, idle, so that the state ends up on idle.
  6. Send the pending messages queued up while the kernel was not connected.

Kernel connection to existing kernel

Exactly like the new kernel start, but we don't request a kernel start, but just initiate a websocket connection to the url we already know.

Kernel restart

Don't disconnect the kernel in this entire process.

  1. Request a kernel restart from the rest api. The kernel object that sends this message is also responsible to add the kernel id a static singleton set kernelsRestarting, perhaps in the Kernel base class. Don't set the state to restarting, since we might have more iopub messages coming through, especially when the notebook flushes the iopub buffer.
  2. When the notebook's restarting kernel status message comes, change the kernel state to restarting. Clear any pending messages and start storing new pending messages. Anyone can check the kernelsRestarting set to see if this is a kernel restart the user requested, and if it isn't, notify the user (perhaps in each client that is connected to the kernel?) that the kernel has restarted, so state has been cleared.
  3. Immediately do steps 4-6 of the new kernel process above (send a kernel info request synchronously so it's the first queued message).

Execute code when kernel is reset

Immediately send the kernel message (which will be queued if necessary). Also, connect to the kernel status, and send the kernel message when the status changes to restarting.

@jasongrout
Copy link
Contributor Author

@saulshanabrook, @blink1073, can you look at the new writeup at #4724 (comment) ?

@jasongrout
Copy link
Contributor Author

As you can see, I've dug deeper and deeper into how we expose kernel status and other information, including peeling back the session and clientsession objects we have. There were a lot of really confusing dependencies between those concepts.

Here's another thought: how about we get rid of the concept of sessions entirely, at least for most users. They're not extremely useful to us as an abstraction. We would probably still use the concept of sessions in the background to maintain a mapping between kernels and documents, but what is exposed to the user on a document context would be just a kernel, plain and simple. Or perhaps a kernel container of some kind that was very lightweight (i.e., had a kernel attribute, and a kernelChanged signal, or something). Behind the scenes, if you wanted to associate document paths with kernels, you could, and then those would be persisted to the server, but it wouldn't be necessary for things to use sessions.

@saulshanabrook
Copy link
Member

Here's another thought: how about we get rid of the concept of sessions entirely, at least for most users.

I am 100% in favor of removing unneeded abstractions! Happy to walk through this with you tomorrow.

@jasongrout
Copy link
Contributor Author

To elaborate more: let's make kernels first-class objects in JLab. Like you should be able to start a kernel with no attached activity, and a kernel should (optionally?) be able to live beyond any document associated with it.

…eturn undefined if the object was not found.

This makes it more consistent with standard Javascript practices, and preserves the errors for true errors.
@jasongrout jasongrout modified the milestones: 1.0, Future Jan 26, 2019
jasongrout added a commit to jasongrout/jupyterlab that referenced this pull request Sep 21, 2019
This simplified logic quite a bit.

Most of these ideas came from previous work in jupyterlab#4724
jasongrout added a commit to jasongrout/jupyterlab that referenced this pull request Sep 21, 2019
This simplified logic quite a bit.

Most of these ideas came from previous work in jupyterlab#4724
@jasongrout
Copy link
Contributor Author

This is being superseded by #7252

@jasongrout jasongrout closed this Oct 3, 2019
@lock lock bot added the status:resolved-locked Closed issues are locked after 30 days inactivity. Please open a new issue for related discussion. label Nov 2, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Nov 2, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
maintenance pkg:services status:resolved-locked Closed issues are locked after 30 days inactivity. Please open a new issue for related discussion. status:Work in Progress
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants