Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce allocations when parsing compact index #6971

Merged

Conversation

segiddins
Copy link
Member

This still allocates a ton (a string for each line, plus a bunch of
splits into arrays), but it helps a bit when Bundler has to go through
dependency resolution.

==> memprof.after.txt <==
Total allocated: 194.14 MB (2317172 objects)
Total retained:  60.81 MB (593164 objects)

==> memprof.before.txt <==
Total allocated: 211.97 MB (2404890 objects)
Total retained:  62.85 MB (640342 objects)

What was the end-user or developer problem that led to this PR?

What is your fix for the problem, implemented in this PR?

Make sure the following tasks are checked

@martinemde
Copy link
Member

I fixed the linting errors so this can be merged.

@@ -95,7 +95,12 @@ def checksum_for_file(path)
# because we need to preserve \n line endings on windows when calculating
# the checksum
SharedHelpers.filesystem_access(path, :read) do
SharedHelpers.digest(:MD5).hexdigest(File.read(path))
File.open(path, "rb") do |f|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about to move this into shared helper directly and just call it here? There is no need to complicate this class with logic for "streamed" MD5 IMHO.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense to me.

...I wonder if we could use Gem::Package::DigestIO here to calculate the checksum as we write the file rather than performing an additional step. It would replace copy_file so might cancel out the benefit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@simi @segiddins I got a DigestIO based solution working over at #7039.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think since it's only used here, it's better if we leave it. There's nothing else doing this pattern in bundler that I could find.

@martinemde martinemde force-pushed the segiddins/reduce-allocations-when-parsing-compact-index branch from 85566af to f389aa1 Compare October 7, 2023 20:36
This still allocates a ton (a string for each line, plus a bunch of
splits into arrays), but it helps a bit when Bundler has to go through
dependency resolution.

```
==> memprof.after.txt <==
Total allocated: 194.14 MB (2317172 objects)
Total retained:  60.81 MB (593164 objects)

==> memprof.before.txt <==
Total allocated: 211.97 MB (2404890 objects)
Total retained:  62.85 MB (640342 objects)
```
@martinemde martinemde force-pushed the segiddins/reduce-allocations-when-parsing-compact-index branch from f389aa1 to c68b41b Compare October 7, 2023 20:39
@martinemde martinemde merged commit 740cf90 into master Oct 8, 2023
92 checks passed
@martinemde martinemde deleted the segiddins/reduce-allocations-when-parsing-compact-index branch October 8, 2023 04:17
@martinemde
Copy link
Member

Based on testing in #7039, this is a perfectly fine solution that shaves about 30MB of memory off a larger bundle lock run. We can follow up if there's additional improvements to be made.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants