Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Very slow image import/export during WITH DOCKER #1268

Closed
konradmalik opened this issue Oct 7, 2021 · 4 comments · Fixed by #2266
Closed

Very slow image import/export during WITH DOCKER #1268

konradmalik opened this issue Oct 7, 2021 · 4 comments · Fixed by #2266
Labels
type:performance Performance improvement request or issue

Comments

@konradmalik
Copy link

Hey,
Recently I started to use Earthly for some of my projects.
So far, the majority of stuff works great, but the biggest drawback seems to be performance.
Everything runs a little bit slower, which is expected and completely fine. However, the worst impact that we see is for docker-compose based integration tests.

Specifically, in the example below:

             context | transferred 1 file(s) for context /tmp/earthly-docker-load290768814 (72 MB, 1 file/dir stats)
             context | transferred 1 file(s) for context /tmp/earthly-docker-load850151696 (564 MB, 1 file/dir stats)
             context | transferred 1 file(s) for context /tmp/earthly-docker-load498375883 (623 MB, 1 file/dir stats)
             context | transferred 1 file(s) for context /tmp/earthly-docker-load533876533 (1.3 GB, 1 file/dir stats)
               +test | --> WITH DOCKER RUN --privileged docker run --network=pider-tests --entrypoint="" scraper ./wait-for-it.sh azurite:10000 -- poetry run python -m pytest --numprocesses $PYTEST_PARALLELISM --max-worker-restart 0 --verbose --cov=scraper tests
               +test | Loading images...
               +test | Loaded image: scraper:latest
               +test | Loaded image: mcr.microsoft.com/azure-storage/azurite:3.14.2
               +test | Loaded image: registry.gitlab.com/pider/detectron:0.0.6
               +test | Loaded image: registry.gitlab.com/pider/tohu:0.0.6
               +test | ...done

I have one image that is built via earthly and three others that need to be pulled and used in the docker-compose file I provided (dependencies for the integration test).
The problem is, that the import/export of all those images takes much longer (2, 3 times. Depending on the machine it may be 5-10 mins) than the running of the tests itself.
"Standard" approach with plain docker-compose is much, much faster.

This is the command I run for the reference:

    WITH DOCKER \
        --compose tests.yaml \
        --load scraper=+tests
        RUN docker run --network=pider-tests --entrypoint="" scraper \
            ./wait-for-it.sh azurite:10000 -- poetry run python -m \
            pytest --numprocesses $PYTEST_PARALLELISM --max-worker-restart 0 --verbose tests/integration
    END

Problems I noticed:

  • seems like all the images are first loaded into the earthly buildkit context and then transferred (loaded) into the internal dind instance
  • maybe the above could be optimized somehow, perhaps done in a one-step instead of two?
  • images I use are pretty large (1.3 GB the biggest one), but I think this is by no means a "too large" image, and even bigger images are very common, so this will influence the majority of the users

Any ideas on how to debug, optimize or improve this?
I'll take all:

  • workarounds (except running plain docker-compose commands just for the integration tests :) )
  • suggestions where to look in earthly code and try to come up with some ideas/PRs
@vladaionescu vladaionescu added the type:performance Performance improvement request or issue label Oct 8, 2021
@vladaionescu
Copy link
Member

Hi @konradmalik - this is a known issue. However, I don't think it had been previously documented - so thanks for opening this.

Workarounds

None that I know of. But you could try using WITH DOCKER in a LOCALLY context. This will use the Docker daemon of the host, which may or may not improve the speed. More likely that it doesn't.

Future improvements

There are a few ways to improve the performance of this in the future. It's a bit complex, hence that's why it hasn't been done just yet.

1. Optimizing the images transfer

The most bang for the buck is probably achieved by optimizing the transfer of images from Buildkit to the inner Docker daemon. Currently, this relies on passing image .tar files from Buildkit to the host and then from the host to the inner Docker. Ideally, we use a local registry and docker pull operations instead, as any duplicate layers are not transferred twice. We have done something like this with local image exports (#500 (comment)).

2. Parallelizing the client-side more

There are cases where a WITH DOCKER can stop anything else from executing even though it should take place in parallel.

The original issue is described here: #888. This has been addressed for parallel BUILDs but it hasn't yet been addressed for parallel WITH DOCKER --load ... --load .... That could be improved as continued work as part of that issue. This is the relevant TODO.

In any case, even without the above, enabling the feature provided by #888 could still be beneficial. I just wrote instructions here how: #888 (comment).

3. Future locally workaround

It's possible to improve the LOCALLY workaround even further by performing image exports as part of LOCALLY WITH DOCKER using the same mechanism as the other local image exports (feature introduced in #500). This may end up being the fastest especially if most layers can be reused between runs. If the local Docker already has the layers, there is no need for a transfer.

This has the disadvantages of LOCALLY though. So it can only remain a workaround.

In any case, this is probably a bit easier to implement (but it's still pretty gnarly if first-time contributing given that it's so cross-cutting). This can reuse the work done in #500. The export of individual images is done here: https://github.com/earthly/earthly/blob/main/earthfile2llb/withdockerrunlocal.go#L100. It uses a DockerBuilderFun, which is constructed here https://github.com/earthly/earthly/blob/main/builder/builder.go#L135, which uses a tar-based export implementation https://github.com/earthly/earthly/blob/main/builder/builder.go#L634. This uses the Buildkit tar exporter. To use the local registry approach we'd need to use the Earthly exporter and set it to only export a specific image (maybe even tell it to export multiple images in parallel since WITH DOCKERs often use multiple images). The Earthly exporter uses metadata to inform of what needs to be exported. The image(s) need to be added as refs and then metadata such as this https://github.com/earthly/earthly/blob/main/builder/builder.go#L252 will tell the exporter which image to export via local registry.

@vladaionescu
Copy link
Member

Another thought for optimizing the image transfer:

Besides using a local registry for passing in the images to the inner Docker, we could additionally use the stargz snapshotter in the inner Docker, to lazily download bits and pieces from the loaded images as they are actually needed. Because the download operation is local, the lazy download should be almost unnoticeable. This should reduce the delay caused by loads+transfer of bytes to near zero. One wrinkle though... I'm not sure how re-tagging might work (when downloading from the local registry, the image will have a local address pre-pended to the image tag, which requires re-tagging to get it to the right image name as required by the user).

@vladaionescu
Copy link
Member

vladaionescu commented Mar 18, 2022

The latest release of Earthly improves the performance a bit as it takes care of most of item 2 in the comment above. This is behind a feature flag as it's experimental, however: VERSION --parallel-load 0.6

@vladaionescu
Copy link
Member

Item 1 is being GA'd in the next release. Item 2 has been GA'd for some time already.

vladaionescu added a commit that referenced this issue Oct 13, 2022
…prominent (#2266)

Closes #1268

Note that the feature flag `--use-registry-with-docker` is backward
compatible, hence we can enable it retroactively for all versions, now
that we have confidence in the way it functions.

For earthly-in-earthly used in our integration tests, `WITH DOCKER`
doesn't work correctly with the embedded registry, due to the use of
`NETWORK_MODE=host`. In such cases, we force the old behavior via the
override feature flags functionality.

Co-authored-by: Vlad A. Ionescu <vladaionescu@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:performance Performance improvement request or issue
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

2 participants