Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix progress logging for parallel builds #9096

Closed
wants to merge 2 commits into from

Conversation

peterbell10
Copy link
Contributor

@peterbell10 peterbell10 commented Apr 13, 2021

Feature or Bugfix

  • Bugfix

Purpose

Currently, the progress bar increases when each parallel task is submitted for processing. However, that doesn't necessarily mean any progress has actually been made yet. The result is the progress bar very quickly jumps to 100% and stalls at

waiting for workers ...

This updates the progress bar as individual tasks are joined, for long-running builds this creates a meaningfully sense of progress. This both creates a nicer user-experience for users, and also prevents CI jobs from timing out from lack of activity.

Currently, the progress bar increases when each parallel task is submitted for
processing. However, that doesn't necessarily mean any progress has actually
been made yet. The result is the progress bar very quickly jumps to 100% and stalls at

> waiting for workers ...

This updates the progress bar as individual tasks are joined, for
long-running builds this creates a meaningfully sense of progress.
sphinx/util/__init__.py Outdated Show resolved Hide resolved
# Update progress message for each chunk completed
nonlocal chunks_complete
chunks_complete += 1
log_status_message(display_chunk(docs), summary, 'purple', chunks_complete,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding a new option to control the timing of messages (ex. before, after, both) to status_iterator() instead? I think the concept of status_iterator() is separating progress messages from the main code. But this refactoring makes them coupled. It would be nice if we can keep them less coupled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your suggestion doesn't relate to the issue because the for loop doesn't wait until the tasks are completed. The entire loop can finish before even a single task has.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tk0miya in case I wasn't clear, the status_iterator here only measures the progress of submitting tasks into the queue with add_task. make_chunks can create more than nproc chunks, and add_task doesn't block when there are more than nproc tasks submitted. It just adds them to a queue ready to be run. So, there is no connection whatsoever between the status_iterator progress and actual progress of building the documentation. i.e. the progress messages are pretty much useless.

For writing, there is some serialised work executed inside of the for loop which is measured by the status_iterator. However, it's still true that none of the parallelised work is represented in the progress messages. So there can still be huge delays between 100% progress and actually finishing the build step.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

separating progress messages from the main code

The logic in the PR looks good to me. Progress messages cannot be decoupled completely from doing the actual work - there has to be some signal that says "chunk is done, please send out a new progress message".

@tk0miya is there a chance this can be merged in the near future? This will be quite helpful for SciPy. We have a very heavy doc build (~20 minutes) that sometimes times out on CI because no output is received.

If there's some extra validation or other activity that we can do to help move this forward, please let us know!

@tk0miya tk0miya added this to the 4.1.0 milestone Apr 25, 2021
@tk0miya tk0miya changed the base branch from 3.x to 4.x April 25, 2021 16:49
@tk0miya tk0miya modified the milestones: 4.1.0, 4.2.0 Jul 10, 2021
@tk0miya tk0miya modified the milestones: 4.2.0, 4.3.0 Sep 12, 2021
@peterbell10
Copy link
Contributor Author

@tk0miya is it possibly this PR is just getting bumped each release because it's mislabelled as a "refactor". This isn't a refactor and does meaningfully fix the bug of progress messages not actually tracking the number of build jobs completed.

@peterbell10 peterbell10 changed the title Improve progress logging on parallel builds Fix progress logging for parallel builds Sep 30, 2021
@tk0miya tk0miya modified the milestones: 4.3.0, 4.4.0 Nov 9, 2021
@tk0miya tk0miya modified the milestones: 4.4.0, 4.5.0 Jan 15, 2022
@danwos
Copy link
Contributor

danwos commented Mar 24, 2022

Would be great to have this.
We need to use the parallel build for all projects, as they are quite huge, and build time is > 30min with 4 cores.

In some scenarios, there is no update in the console for 20min, which let some developers think that the build has stopped and they stop the process :(

@danwos
Copy link
Contributor

danwos commented Mar 24, 2022

@tk0miya Anything I can do here to finalize it?

@tk0miya tk0miya modified the milestones: 4.5.0, 5.0.0 Mar 27, 2022
tk0miya added a commit to tk0miya/sphinx that referenced this pull request May 15, 2022
…lle build is wrong

The value of progress bar (reading and writing) should be increased just after
each task is finished.
tk0miya added a commit to tk0miya/sphinx that referenced this pull request May 15, 2022
…lle build is wrong

The value of progress bar (reading and writing) should be increased just after
each task is finished.
@tk0miya
Copy link
Member

tk0miya commented May 15, 2022

Sorry for the very very very late response. I posted #10454 as a candidate of the yet-another implementation for this problem. It does not touch the implementation of status_iterator(). I think it keeps main code simple. Could you let me know your opinion. Thanks,

tk0miya added a commit to tk0miya/sphinx that referenced this pull request May 15, 2022
…lle build is wrong

The value of progress bar (reading and writing) should be increased just after
each task is finished.
tk0miya added a commit to tk0miya/sphinx that referenced this pull request May 15, 2022
…lle build is wrong

The value of progress bar (reading and writing) should be increased just after
each task is finished.
tk0miya added a commit that referenced this pull request May 22, 2022
…lel_build

Fix #9096: sphinx-build: the value of progress bar for paralle build is wrong
@tk0miya
Copy link
Member

tk0miya commented May 22, 2022

Finally, I determined to merge #10454 instead of this. I believe it is based on this PR. So I must say thank you for all :-)

@tk0miya tk0miya closed this May 22, 2022
@rgommers
Copy link
Contributor

Thanks a lot @tk0miya, great to have a fix for this!

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jun 22, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants