Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Send ProcessCompleted message when job is cancelled #8260

Merged
merged 4 commits into from May 13, 2024

Conversation

freddyaboulton
Copy link
Collaborator

@freddyaboulton freddyaboulton commented May 10, 2024

Description

If a job is cancelled, the server and the client will keep the /queue/data stream open forever.

Notice that in the first image, the multiple heartbeats gets sent after the job is cancelled (indicating the channel was kept open for at least 45 seconds post cancellation) while in the second message the process_completed and close_stream messages are sent after cancellation. This was tested on the fake_diffusion demo.

Main

Screenshot 2024-05-10 at 3 36 47 PM

This PR

Screenshot 2024-05-10 at 3 35 08 PM

🎯 PRs Should Target Issues

Before your create a PR, please check to see if there is an existing issue for this change. If not, please create an issue before you create this PR, unless the fix is very small.

Not adhering to this guideline will result in the PR being closed.

Tests

  1. PRs will only be merged if tests pass on CI. To run the tests locally, please set up your Gradio environment locally and run the tests: bash scripts/run_all_tests.sh

  2. You may need to run the linters: bash scripts/format_backend.sh and bash scripts/format_frontend.sh

@gradio-pr-bot
Copy link
Contributor

gradio-pr-bot commented May 10, 2024

🪼 branch checks and previews

Name Status URL
Spaces ready! Spaces preview
Website ready! Website preview
🦄 Changes detected! Details

Install Gradio from this PR

pip install https://gradio-builds.s3.amazonaws.com/bfa0dfd7aee551bec5cc1dcf2f0e6e22a9c57b46/gradio-4.31.0-py3-none-any.whl

Install Gradio Python Client from this PR

pip install "gradio-client @ git+https://github.com/gradio-app/gradio@bfa0dfd7aee551bec5cc1dcf2f0e6e22a9c57b46#subdirectory=client/python"

@gradio-pr-bot
Copy link
Contributor

gradio-pr-bot commented May 10, 2024

🦄 change detected

This Pull Request includes changes to the following packages.

Package Version
gradio patch
  • Maintainers can select this checkbox to manually select packages to update.

With the following changelog entry.

Send ProcessCompleted message when job is cancelled

Maintainers or the PR author can modify the PR title to modify this entry.

Something isn't right?

  • Maintainers can change the version label to modify the version bump.
  • If the bot has failed to detect any changes, or if this pull request needs to update multiple packages to different versions or requires a more comprehensive changelog entry, maintainers can update the changelog file directly.

@freddyaboulton freddyaboulton added t: fix A change that implements a fix v: patch A change that requires a patch release labels May 10, 2024
@freddyaboulton freddyaboulton marked this pull request as ready for review May 10, 2024 21:57
# If the function is a cancel function, 'predictions' are the ids of
# the event in the queue that has been cancelled. We need these
# so that the server can put the ProcessCompleted message in the event stream
# Cancel events have no output components, so we need to return early otherise the output
Copy link
Collaborator

@aliabid94 aliabid94 May 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got confused by what you were saying here

Suggested change
# Cancel events have no output components, so we need to return early otherise the output
# Cancel events have no output components, so we need to return early to return the ids

@@ -738,6 +738,21 @@ async def predict(
fn_index_inferred=fn_index_inferred,
root_path=root_path,
)
if ( # noqa: SIM102
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when I ran ruff, it didn't need this noqa statement? what does it do here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It raised an error because the two if statements can be combined into 1 but I think that would make the code harder to read.

Copy link
Collaborator

@aliabid94 aliabid94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed and tested and LGTM, thanks freddy! I hadn't really explored the cancel logic in the backend before, and found the design a bit confusing. When we cancel, the frontend makes a call to both

  1. /reset and
  2. as a regular event listener whose function is defined by get_cancel_function.
    Unless I'm missing something, I think we should remove 2 entirely and handle all the logic needed via 1. Not only does it save a request, but some of the logic is done in a weird way in 1 because it's not being handled by gradio itself but some pre-defined logic. For example, using task names to store and track data for cancelling is weird, but necessary right now. We also have to hack postprocessing specifically for the is_cancel_event case. But if we used the /reset endpoint to handle that logic, gradio can track events directly and I think it would be much simpler and cleaner.

@freddyaboulton
Copy link
Collaborator Author

Thanks for the review @aliabid94 ! Let me merge this in and then we can refactor in subsequent PR.

@freddyaboulton freddyaboulton merged commit 7e976fd into main May 13, 2024
7 checks passed
@freddyaboulton freddyaboulton deleted the close-stream-on-cancel branch May 13, 2024 21:43
@pngwn pngwn mentioned this pull request May 13, 2024
@@ -1656,10 +1669,18 @@ def validate_outputs(self, fn_index: int, predictions: Any | list[Any]):

async def postprocess_data(
self, fn_index: int, predictions: list | dict, state: SessionState | None
):
) -> Any:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see why you set it to Any to avoid type hinting entirely with the list | dict, but would be great to fix in a refactor soon

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
t: fix A change that implements a fix v: patch A change that requires a patch release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants