Optimize performance for scanning trees in partial clones #5699
Right now, if the user is using partial clone, our call to `git ls-tree` against HEAD is expensive because `git ls-tree` needs to download each blob, which it does incrementally instead of all at once.

If we're scanning the tree from HEAD, then we can avoid this expense by running `git ls-files` with a pattern that matches only LFS files. This makes the operation much cheaper, since we avoid needing to download blobs for many of those objects. We can format the data so that it matches the pattern we expect for `git ls-tree`, which lets us avoid modifying most of the calls and continue to let things function in the same way. Do so, but limit our changes to Git 2.42.0 and newer, since the `objecttype` argument is new in that version.

In addition, let's fix some tests which rely on an invalid assumption about how we discover and process LFS files.
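
As a rough illustration of the command selection described above, here is a minimal Python sketch; it is not the code in this pull request. The helper names (`parse_git_version`, `scan_command`), the exact flags, and the `:(top,attr:filter=lfs)` pathspec are assumptions made for the example. The format string uses the `git ls-files --format` fields to reproduce the layout of `git ls-tree -l` output.

```python
import re

# Minimum Git version assumed to support %(objecttype) in `git ls-files --format`.
MIN_GIT_VERSION = (2, 42, 0)

# Same layout as `git ls-tree -l`: mode, type, object name, padded size, TAB, path.
LS_TREE_LIKE_FORMAT = (
    "%(objectmode) %(objecttype) %(objectname) %(objectsize:padded)%x09%(path)"
)

# Assumption: LFS-tracked paths are selected with attribute pathspec magic.
LFS_PATHSPEC = ":(top,attr:filter=lfs)"


def parse_git_version(text):
    """Extract (major, minor, patch) from `git version` output."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError("unrecognized git version string: %r" % text)
    return tuple(int(part or 0) for part in match.groups())


def scan_command(ref, git_version):
    """Return the argv used to list tree entries for `ref` (hypothetical helper)."""
    if ref == "HEAD" and git_version >= MIN_GIT_VERSION:
        # Fast path: only entries matching the LFS pathspec are listed.
        # Note that ls-files reads the index rather than the HEAD tree itself.
        return [
            "git", "ls-files", "--cached", "--full-name", "-z",
            "--format=" + LS_TREE_LIKE_FORMAT, "--", LFS_PATHSPEC,
        ]
    # Fallback: the existing recursive ls-tree scan, used for other refs
    # and for Git versions older than 2.42.0.
    return ["git", "ls-tree", "-r", "-l", "-z", "--full-tree", ref]
```

Because both branches emit records in the same shape, a single parser can consume either output, which is what keeps most callers unchanged.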