New blob storage scheme avoiding large base dir count #13884
What type of PR is this?
Bug fix
What does this PR do? Why is it needed?
This PR changes the naming scheme for blob file storage. Currently the scheme is a single blobs directory which contains a subdir for each unique block root, within which we store blobs by their index number, eg:
blobs/0xff09895e2ff6232ac1ef6f08b1940e0177aaed5d8b682a30574a1cbd3ec6b487/0.ssz
The problem with this approach is that it leads to a large directory entry for the blobs dir, which can cause issues with older filesystems. With this PR, an extra subdir is added to group block root directories by their first byte, eg the directory:
blobs/0xff09895e2ff6232ac1ef6f08b1940e0177aaed5d8b682a30574a1cbd3ec6b487
is renamed to:
blobs/0xff/0xff09895e2ff6232ac1ef6f08b1940e0177aaed5d8b682a30574a1cbd3ec6b487
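As a rough illustration of the new layout, the mapping from a block root to its directory could be sketched as below. The function name newBlobDir and its arguments are hypothetical and are not taken from this PR's code; the sketch only assumes that directory names are 0x-prefixed hex strings, as in the example above.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// newBlobDir maps a hex-encoded block root (for example "0xff09...") to its
// directory under the grouped layout: <base>/<0x + first byte>/<root>.
// Hypothetical helper for illustration only.
func newBlobDir(base, root string) (string, error) {
	// "0x" plus two hex characters are needed to read the first byte.
	if len(root) < 4 || root[:2] != "0x" {
		return "", fmt.Errorf("unexpected block root directory name %q", root)
	}
	return filepath.Join(base, root[:4], root), nil
}

func main() {
	root := "0xff09895e2ff6232ac1ef6f08b1940e0177aaed5d8b682a30574a1cbd3ec6b487"
	dir, err := newBlobDir("blobs", root)
	if err != nil {
		panic(err)
	}
	// Blobs are still stored by index inside the block root directory.
	fmt.Println(filepath.Join(dir, "0.ssz"))
}
```

Grouping by the first byte caps the top-level blobs directory at 256 entries, with the block root directories spread across those groups.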
In order to minimize changes to the storage code and reduce the risk of new bugs, we perform a one-time migration of the legacy structure during the blob cache warm-up, which runs at node startup. This migration creates the appropriate containing directory for each subdir in the old format (eg 0xff in the above example) and calls Rename to move the existing subdir into the new enclosing directory. On most systems this should be an atomic syscall and fairly cheap.
Which issue(s) does this PR fix?
Fixes #13880
Other notes for review
We're keeping this as a draft PR while we assess whether it makes sense to release before a minor version bump. We would prefer to wait for a version bump because this change is not backwards compatible: the new directory structure won't be understood by nodes running previous releases. It appears the dir_nlink ext4 feature flag is enabled by default on modern ext4 systems, so the large-directory problem should only affect users running older kernels.