Earthly has overhead in simple cases #2049
Comments
VERSION 0.6
hello:
    FROM alpine:3.15
    COPY data .
    RUN cat data | md5sum > my_output
    SAVE ARTIFACT my_output AS LOCAL my_output

This takes approx. 7 seconds on my Ubuntu machine. However, when I run it under earthly-v0.6.6, it takes approx. 1.3 seconds. Between v0.6.6 and v0.6.7, there was an update to buildkit: 8d4eb6b24e7253afd621f91f2589221bfd1dabdf..60bb7fe66d316609854ee3c89466df464044ef8f and an update to fsutil: |
|
here's buildkit logs from v0.6.7
|
This might be showing up in #2307 (comment) where failed commands each introduce a 5 second delay. |
I've identified a fix for this. I'm not super confident it's a solution, though; it may just be addressing a symptom of a bug elsewhere in the system. Though a one-line fix will do the trick, I thought my two-line fix was "better".
The cached times are insanely different. The "cpu" times look even more dramatic, but the 'real' times are what we care about, and they are still very substantial (13s -> 2s for building earthly, and 2m -> 18s for running its tests).

I had always suspected (per my previous comment) that the part responsible for the delay was this delay. Changing that value is an easy way to validate that. One fix would be to decrease that, but it's not a very good solution. Removing the timeout altogether is not an option (without other changes). My solution (which yielded those results above) was to change this small block to rrjjvv/buildkit@7bb02cd

With the addition of extra logging, I found zero occurrences of the requested session id ever actually being in the map. Since it appeared destined to always fail, it seemed like the best option. From what I could gather, that only a new session can release that

Since the pertinent code doesn't seem to deviate from upstream buildkit in any significant way, my only guesses are 1) something within earthly is artificially stopping the creation of a new session that would unblock that

The only thing stopping me from creating a PR is not fully knowing if this is a true solution or just a band-aid. (The fact that my fix is in the buildkit fork, and around code that doesn't deviate too much from upstream, is another reason I think the real fix may actually lie elsewhere.) If the latter, hopefully it's a pointer in the direction of a permanent/correct fix! |
Great find, @rrjjvv! It would appear that this slowness happens in the upstream moby/buildkit source too. When no sessionID is specified on the llb.Local call, the localSourceHandler will do a quick return at https://github.com/moby/buildkit/blob/b3e8c63a48ad8c015f5631fc1947945b229b3919/source/local/local.go#L92
however if I try I will have to dig in deeper to see if there's a mistake in how we use SessionID. |
As you probably saw (or will see), the

After taking another peek, my guess isn't that you're using the SessionIDs wrong per se (but maybe you are), but more that they started being used in ways that you didn't anticipate. Prior to moby/buildkit@b3e8c63, the SessionID wasn't referenced at all for local snapshots (nor was that session->client map referenced).

Anecdotally, the "shape" of ids that are present in the map, i.e. as a result of new sessions, is quite a bit different from that of the ids giving us problems. The ones in the map are relatively short and all lowercase, whereas the cache-miss ids are noticeably longer and mixed case. So that does hint at incorrect usage, but it's just a hint.

Just glad I could help! |
I have a potential fix under #2413 which runs my initial repro case in approximately 1 second. I also tried a
and
|
This is now complete & released 🎉 |
Example: https://gist.github.com/lynaghk/0f9b66cf889398a7c73b01f39beaffee
This takes 7 seconds when it's fully cached. (Toast, for example, takes 800ms in the same situation)