Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adding exception handling for room start tasks #33

Merged
merged 1 commit into from
Apr 26, 2024

Conversation

jzhang20133
Copy link
Collaborator

@jzhang20133 jzhang20133 commented Apr 26, 2024

Sometimes, we have observed issues that task group in websocket_server is no longer active and user will be stuck after that and no longer able to access files and have to restart nb servers to unblock. _task_group instance variable in websocket_server will become inactive when one of its tasks of starting room or its children task group task fail with exception.

One place where an exception can occur is in the _broadcast_update method, which is a key task in children task group of _task_group in websocket_server. In this PR, the exception is handled and logged to prevent the parent task from crashing while still displaying the issue that will allow us better handle them in the future.

Resolving open source issues:
jupyterlab/jupyter-collaboration#290
jupyterlab/jupyter-collaboration#245

@jzhang20133 jzhang20133 added the enhancement New feature or request label Apr 26, 2024
@jzhang20133 jzhang20133 force-pushed the yroom-exception branch 2 times, most recently from 19c131b to 87eaafa Compare April 26, 2024 17:45
pycrdt_websocket/yroom.py Outdated Show resolved Hide resolved
@Zsailer
Copy link
Member

Zsailer commented Apr 26, 2024

Nice work @jzhang20133! Looks good to me!

Let's merge and iterate from here.

@Zsailer Zsailer merged commit 3764238 into jupyter-server:main Apr 26, 2024
6 checks passed
Copy link

welcome bot commented Apr 26, 2024

Congrats on your first merged pull request in this project! 🎉
congrats
Thank you for contributing, we are very proud of you! ❤️

@davidbrochart
Copy link
Collaborator

Thanks @jzhang20133.
This breaks the context manager API, as the streams are not created there anymore after they have been moved. I opened #35.

Comment on lines +154 to +158
try:
self._task_group.start_soon(self.ystore.write, update)
self.log.debug("Writing Y update to YStore")
except Exception as exception:
self._handle_exception(exception)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that a follow-up to this PR would be to not handle exceptions here, but have a YStore have an optional exception handler, and do the handling there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, @davidbrochart. I've created the following issue to track: #36

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants