[SPARK-41497][CORE] Fixing accumulator undercount in the case of the retry task with rdd cache

### What changes were proposed in this pull request?

As described in [SPARK-41497](https://issues.apache.org/jira/browse/SPARK-41497), when a task with an RDD cache fails after it has successfully cached the block data, the retry task loads the data from the cache. Because the first task attempt failed and the retry does not recompute the partition, the registered accumulators are never updated.

The general idea of the fix in this PR is to add a visibility status for RDDBlocks: an RDDBlock becomes visible only after one of the tasks that generated it succeeds, which guarantees that the accumulators have been updated. The changes are:

1. In `BlockManagerMasterEndpoint`, add `visibleRDDBlocks` to record the RDDBlocks that are visible, and `tidToRddBlockIds` to track the RDDBlocks generated by each taskId, so that the visibility status can be updated based on task status.
2. In `BlockInfoManager`, add `visibleRDDBlocks` to track the visible RDDBlocks in the block manager. Once an RDDBlock is visible, the master asks the BlockManagers holding the block to update its visibility status.
3. In `RDD` getOrCompute, recompute the partition to update the accumulators if the cached RDDBlock is not visible, even if the cached data exists, and report the taskId and RDDBlock relationship to `BlockManagerMasterEndpoint`.
4. When a task finishes successfully, ask `BlockManagerMasterEndpoint` to mark its blocks as visible and broadcast the visibility status to the `BlockManagers` holding the cached data.

### Why are the changes needed?

Bug fix.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Adding new UT.

Closes #39459 from ivoson/SPARK-41497.

Authored-by: Tengfei Huang <tengfei.h@gmail.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
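The visibility rule described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code from this commit: the field names `visibleRDDBlocks` and `tidToRddBlockIds` come from the description, while `VisibilityTracker`, `ToyJob`, `reportCached`, `isVisible`, `taskFinished`, and `runAttempt` are hypothetical names introduced here, and the in-memory map stands in for the real block managers and RPC.

```scala
import scala.collection.mutable

// Toy model only: names echo the commit message, but this is not Spark's code.
final case class RDDBlockId(rddId: Int, splitIndex: Int)

// Hypothetical stand-in for the bookkeeping kept on the master side.
final class VisibilityTracker {
  // Blocks produced by at least one successful task attempt.
  private val visibleRDDBlocks = mutable.HashSet.empty[RDDBlockId]
  // Blocks cached by each in-flight task attempt, keyed by task id.
  private val tidToRddBlockIds =
    mutable.HashMap.empty[Long, mutable.HashSet[RDDBlockId]]

  // Record that a task attempt cached a block that is not yet visible.
  def reportCached(taskId: Long, blockId: RDDBlockId): Unit =
    if (!visibleRDDBlocks.contains(blockId)) {
      tidToRddBlockIds
        .getOrElseUpdate(taskId, mutable.HashSet.empty[RDDBlockId]) += blockId
    }

  def isVisible(blockId: RDDBlockId): Boolean =
    visibleRDDBlocks.contains(blockId)

  // Forget the attempt; promote its blocks to visible only if it succeeded.
  def taskFinished(taskId: Long, succeeded: Boolean): Unit =
    tidToRddBlockIds.remove(taskId).foreach { blocks =>
      if (succeeded) visibleRDDBlocks ++= blocks
    }
}

// Hypothetical driver-side view: one cache, one accumulator.
final class ToyJob {
  val tracker = new VisibilityTracker
  private val cache = mutable.HashMap.empty[RDDBlockId, Seq[Int]]
  // Accumulator value as seen by the driver; failed attempts never reach it.
  var driverAccumulator = 0L

  def runAttempt(taskId: Long, blockId: RDDBlockId, succeeds: Boolean): Unit = {
    var localUpdates = 0L
    // Cached data is trusted only when the block is also visible.
    val usable = cache.contains(blockId) && tracker.isVisible(blockId)
    if (!usable) {
      val data = Seq(1, 2, 3)        // "compute" the partition
      localUpdates += data.size      // accumulator update made while computing
      cache.getOrElseUpdate(blockId, data)
      tracker.reportCached(taskId, blockId)
    }
    if (succeeds) driverAccumulator += localUpdates
    tracker.taskFinished(taskId, succeeds)
  }
}
```

In this model, a first attempt that caches the block and then fails leaves `driverAccumulator` at 0 and the block invisible. A retry therefore recomputes the partition and, on success, raises the accumulator to 3 and makes the block visible. A later attempt that reads the now-visible block is served from the cache and leaves the accumulator unchanged.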
Showing 12 changed files with 489 additions and 26 deletions.