New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix progress logging for parallel builds #9096
Conversation
Currently, the progress bar increases when each parallel task is submitted for processing. However, that doesn't necessarily mean any progress has actually been made yet. The result is the progress bar very quickly jumps to 100% and stalls at > waiting for workers ... This updates the progress bar as individual tasks are joined, for long-running builds this creates a meaningfully sense of progress.
6d201d4
to
ad58d88
Compare
# Update progress message for each chunk completed | ||
nonlocal chunks_complete | ||
chunks_complete += 1 | ||
log_status_message(display_chunk(docs), summary, 'purple', chunks_complete, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about adding a new option to control the timing of messages (ex. before, after, both) to status_iterator()
instead? I think the concept of status_iterator()
is separating progress messages from the main code. But this refactoring makes them coupled. It would be nice if we can keep them less coupled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your suggestion doesn't relate to the issue because the for loop doesn't wait until the tasks are completed. The entire loop can finish before even a single task has.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tk0miya in case I wasn't clear, the status_iterator
here only measures the progress of submitting tasks into the queue with add_task
. make_chunks
can create more than nproc
chunks, and add_task
doesn't block when there are more than nproc
tasks submitted. It just adds them to a queue ready to be run. So, there is no connection whatsoever between the status_iterator
progress and actual progress of building the documentation. i.e. the progress messages are pretty much useless.
For writing, there is some serialised work executed inside of the for loop which is measured by the status_iterator
. However, it's still true that none of the parallelised work is represented in the progress messages. So there can still be huge delays between 100% progress and actually finishing the build step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
separating progress messages from the main code
The logic in the PR looks good to me. Progress messages cannot be decoupled completely from doing the actual work - there has to be some signal that says "chunk is done, please send out a new progress message".
@tk0miya is there a chance this can be merged in the near future? This will be quite helpful for SciPy. We have a very heavy doc build (~20 minutes) that sometimes times out on CI because no output is received.
If there's some extra validation or other activity that we can do to help move this forward, please let us know!
@tk0miya is it possibly this PR is just getting bumped each release because it's mislabelled as a "refactor". This isn't a refactor and does meaningfully fix the bug of progress messages not actually tracking the number of build jobs completed. |
Would be great to have this. In some scenarios, there is no update in the console for 20min, which let some developers think that the build has stopped and they stop the process :( |
@tk0miya Anything I can do here to finalize it? |
…lle build is wrong The value of progress bar (reading and writing) should be increased just after each task is finished.
…lle build is wrong The value of progress bar (reading and writing) should be increased just after each task is finished.
Sorry for the very very very late response. I posted #10454 as a candidate of the yet-another implementation for this problem. It does not touch the implementation of |
…lle build is wrong The value of progress bar (reading and writing) should be increased just after each task is finished.
…lle build is wrong The value of progress bar (reading and writing) should be increased just after each task is finished.
…lel_build Fix #9096: sphinx-build: the value of progress bar for paralle build is wrong
Finally, I determined to merge #10454 instead of this. I believe it is based on this PR. So I must say thank you for all :-) |
Thanks a lot @tk0miya, great to have a fix for this! |
Feature or Bugfix
Purpose
Currently, the progress bar increases when each parallel task is submitted for processing. However, that doesn't necessarily mean any progress has actually been made yet. The result is the progress bar very quickly jumps to 100% and stalls at
This updates the progress bar as individual tasks are joined, for long-running builds this creates a meaningfully sense of progress. This both creates a nicer user-experience for users, and also prevents CI jobs from timing out from lack of activity.