Avoid zombie processes on parallel build fail #11923
Merged
Subject: Terminate processes correctly when parallel build fails
Feature or Bugfix
Purpose
On our build servers, we have had recurring (every 1-4 weeks) and hard-to-reproduce issues with Sphinx builds that do not terminate. Since applying the change in this PR, we have seen no recurrence of the error.
Detail
Without this change, this error message used to appear in our build logs at rare times:
I assume that when one of the threads dies, it cannot be joined and the build hangs.
Sadly, I lost most of the logs and the description of the original issue. All I know is that, to the best of our knowledge, this change fixed the issue for us.
I apologize for the lack of documentation, but I hope this can be merged nonetheless.
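For illustration, here is a minimal sketch of the general pattern, not the actual change in this PR: join each worker inside `try`, and terminate and reap whatever is left in `finally`, so that a failed join cannot leave children behind. The names `spawn_worker`, `join_all`, and `failing_join` are hypothetical, and the sketch uses `subprocess` only to stay self-contained.

```python
import subprocess
import sys


def spawn_worker(seconds):
    # Hypothetical worker: a child Python process that just sleeps.
    return subprocess.Popen(
        [sys.executable, "-c", f"import time; time.sleep({seconds})"]
    )


def join_all(workers, join_one):
    """Join each worker; always terminate and reap whatever is left."""
    try:
        for worker in workers:
            join_one(worker)
    finally:
        # Runs on success and on failure alike.
        for worker in workers:
            if worker.poll() is None:  # still running
                worker.terminate()
            worker.wait()  # reap the child so no zombie remains


def failing_join(worker):
    # Simulates the failure case: joining raises instead of returning.
    raise RuntimeError("worker died")
```

Calling `join_all(workers, failing_join)` still raises the `RuntimeError`, but every child has been terminated and waited on before the exception propagates.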
Relates
`finally` to terminate parallel processes #10952, but it did not cover this instance of `_join_one`.
cc @cielavenir