Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove apk go requirement when possible #2318

Merged
merged 18 commits into from
Feb 19, 2023

Conversation

echoix
Copy link
Collaborator

@echoix echoix commented Feb 6, 2023

Fixes #2315 (continued optimization)

Installing go from apk is like installing gcc or build requirements. Removing this package install completely will at least remove at least 384 MB in the usr/bin/lib/go folder (for the ci_light flavor). Go programs are very portable, they are statically compiled and a file for the appropriate OS type and CPU arch will run without problems. Thus, if a Go linter binary is available, it is better to use it directly.

Furthermore, if when I checked at a linter, they had a published docker image of a single file, and it was kept up to date by some CI (so it won't get not updated later on), I changed the installation method to a pair of dockerfile "FROM" and "COPY" instructions. Building with this is way faster, especially with buildx that is now used, since these can be done in parallel. A further optimisation should be done by using a dockerfile frontend (adding # syntax=docker/dockerfile:1 as the first line in the dockerfiles, starting with the # character) in order to use COPY --link. This would enable the buildkit to not invalidate the rest of the cache is this changes, and it could, if possible, rebase an image without pulling, pushing or building layers from the registry. See https://docs.docker.com/engine/reference/builder/#benefits-of-using---link. Adding that magic comment should be done in another PR later on. In the meantime, the new copy instructions that I added include the full file path on destination in order to make sure that there won't be overwriting problems when using COPY --link, since the destination handling has a little difference. I confirm by inspecting the images with dive that it is a file and not a folder on the destination.

Proposed Changes

  1. Remove apk go package from go descriptor, installing revive from Docker image by copying binary.
  2. Install shellcheck from Docker image by copying binary.
  3. Remove apk go package from action descriptor, installing actionlint Docker image by copying binary.
    • Shellcheck and pyflakes are dependencies of actionlint.
    • pyflakes is installed by apk: py3-pyflakes
    • Shellcheck is installed by Docker image by copying binary. It is added again directly inside the action descriptor. Duplicating the from and COPY operations adds a 0B layer, and has no real impact, except for me making sure that any future use of the actionlint in a flavor will have the dependencies generated.
    • (Super-Linter installs it the same way as I figured out to do here)
  4. Install checkmake from Docker image by copying binary.
    •    # make is included in the mrtazz/checkmake:latest image 
         # (DIGEST:sha256:eb6919b20b22d1701a976856e4a224627df0a74b118246101fb6cf5c2e03049f)
         # It may not be a real dependency, like their pandoc mention in the README,
         # that is not included in the docker image they provide.
  5. Remove apk go package from go descriptor:
  6. Remove apk go install for repository dustilock, installing dustilock by copying binary from a Docker build stage since no container image is published but supports it.
  7. Use COPY --link when copying a single file in order to not invalidate caches (it adds them in a layer that is independent of the preceding, and can thus be kept the same if not changed). That implied adding a # syntax=docker/dockerfile:1 at the first line of the Dockerfiles.
    See a test image built to illustrate that duplicating from and copy --link isn't that bad.
    image

To be continued, so it is not ready to merge yet, there are missing some linters with Go to change, like dustilock that doesn't include any prebuilt binaries.

Readiness Checklist

Author/Contributor

  • Add entry to the CHANGELOG listing the change and linking to the corresponding issue (if appropriate)
  • If documentation is needed for this change, has that been included in this pull request

Reviewing Maintainer

  • Label as breaking if this is a large fundamental change
  • Label as either automation, bug, documentation, enhancement, infrastructure, or performance

@echoix echoix self-assigned this Feb 6, 2023
@echoix echoix requested a review from nvuillam as a code owner February 6, 2023 05:24
@echoix
Copy link
Collaborator Author

echoix commented Feb 6, 2023

There was a test failure in my fork with revive version. However, I don't see why it could fail, as the latest image, and my temporary tag that was pushed to the registry seems to give the same result:
image

@nvuillam
Copy link
Member

@echoix for some reason revive does not return a version, that's why it fails :/

image

@nvuillam
Copy link
Member

About the rest of the PR... if you could use latest tags of docker images, it would prevent to have dozens of PRs at each version upgrade, and let auto-upgrade-linters job do his nightly work with just a merge from us after ^^

@echoix
Copy link
Collaborator Author

echoix commented Feb 11, 2023

About the rest of the PR... if you could use latest tags of docker images, it would prevent to have dozens of PRs at each version upgrade, and let auto-upgrade-linters job do his nightly work with just a merge from us after ^^

@nvuillam How do you keep track of what tool version is in which MegaLinter release then, if it is only tagged "latest"? Doesn't it mean if you run docker build on the same Dockerfile at two different times apart (weeks) with the repo at the same commit we won't get the same image? (or start including a broken version without us checking it, or worse, start including a vulnerable version, ie: supply chain attacks) Isn't the dependabot there to help us allow when we want each linter to change version?

@nvuillam
Copy link
Member

@echoix new versions arrive in beta :)

About vulnerable versions... we have OX Security itself :)

@nvuillam
Copy link
Member

nvuillam commented Feb 11, 2023

And about trace, auto-update-linters job (trigger at each new commit in main, + schduled daily) calls for the version during unit tests and store the result in a JSON, which itself is reused for documentation :)

@echoix
Copy link
Collaborator Author

echoix commented Feb 11, 2023

I encountered something really strange when simply pulling the go flavour image. (Trying to understand why the revive call to version bugs)

failed to register layer: ApplyLayer exit status 1 stdout: stderr: lchown /node-deps/node_modules/character-parser/LICENSE: invalid argument
image

I never heard about that and, seems related to rootless docker (gitpod is running my environment in a container, on a shared machine). moby/moby#43576 Did you ever see something like this?

@echoix
Copy link
Collaborator Author

echoix commented Feb 11, 2023

About vulnerable versions... we have OX Security itself :)

Where do we see that they monitor, or keep monitoring? Relying on someone else seems a little easy.

And about trace, auto-update-linters job (trigger at each new commit in main, + schduled daily) calls for the version during unit tests and store the result in a JSON, which itself is reused for documentation :)

So we document, but the code is not stateful?

@nvuillam
Copy link
Member

nvuillam commented Feb 11, 2023

@echoix there is a dashboard on ox.security, where I monitor all my repos (not only MegaLinter)
They are a third party, but supply chain attack is the heart of their business and I trust them :)
You should try the free tier to see what I'm talking about :)

image

About that... @llaville please could you add 2FA auth to your Github account ? It would remove the last high severity issue found :)

The code is stateful within an official released version of MegaLinter, we don't call docker images while linting, we just use them when building then copy their executables within MegaLinter image

The safest way to use MegaLinter is indeed to use a released version, not the beta one (even if the beta one is also monitored)

@echoix
Copy link
Collaborator Author

echoix commented Feb 11, 2023

The new beta image of a few minutes ago still jams, but not the released one (latest is still v6.19.0):
image

@nvuillam
Copy link
Member

nvuillam commented Feb 11, 2023

image

Latest beta image is ok on my computer, and I never saw such error

But since yesterday, all images are built using docker-build-push action, so maybe something changed somewhere :/

@echoix
Copy link
Collaborator Author

echoix commented Feb 11, 2023

image
Nope, main oxsecurity/megalinter:beta doesn't work while oxsecurity/megalinter:v6.19.0 does.

You are on a windows platform, so the UID stuff doesn't seem to affect.
Its a permission thing that we need to check in what files are included. moby/moby#43576 (comment)

@echoix
Copy link
Collaborator Author

echoix commented Feb 11, 2023

But since yesterday, all images are built using docker-build-push action, so maybe something changed somewhere :/

I imagine it might have hidden something on our side.
Users that run CI using containers on shared machine might stumble on this I imagine.

@echoix
Copy link
Collaborator Author

echoix commented Feb 11, 2023

Ok, lets file a real bug to look at this. Until we don't understand what it is, I consider it blocking the release. It's not normal.

@nvuillam
Copy link
Member

Would this solution work on your context ? moby/moby#43576 (comment)

We always knew that MegaLinter has some issues on some contexts because of its root user :/

@echoix
Copy link
Collaborator Author

echoix commented Feb 13, 2023

🦙 MegaLinter status: ⚠️ WARNING

Descriptor Linter Files Fixed Errors Elapsed time
✅ BASH bash-exec 6 0 0.01s
✅ BASH shellcheck 6 0 0.14s
✅ BASH shfmt 6 0 0 0.05s
✅ COPYPASTE jscpd yes no 2.75s
✅ DOCKERFILE hadolint 114 0 15.36s
✅ JSON eslint-plugin-jsonc 21 0 0 1.9s
✅ JSON jsonlint 19 0 0.25s
✅ JSON npm-package-json-lint yes no 0.66s
✅ JSON v8r 21 0 12.42s
⚠️ MARKDOWN markdownlint 309 2 230 6.06s
✅ MARKDOWN markdown-link-check 309 0 5.64s
✅ MARKDOWN markdown-table-formatter 309 2 0 16.93s
✅ OPENAPI spectral 1 0 1.75s
⚠️ PYTHON bandit 183 47 2.07s
✅ PYTHON black 183 0 0 3.35s
✅ PYTHON flake8 183 0 1.89s
✅ PYTHON isort 183 0 0 0.47s
✅ PYTHON mypy 183 0 7.22s
✅ PYTHON pylint 183 0 12.09s
⚠️ PYTHON pyright 183 244 18.75s
✅ REPOSITORY checkov yes no 29.77s
⚠️ REPOSITORY devskim yes 61 1.41s
✅ REPOSITORY dustilock yes no 3.26s
✅ REPOSITORY git_diff yes no 0.04s
✅ REPOSITORY secretlint yes no 9.32s
✅ REPOSITORY syft yes no 0.98s
✅ REPOSITORY trivy yes no 20.18s
✅ SPELL cspell 745 0 18.8s
✅ SPELL misspell 566 2 0 0.58s
✅ XML xmllint 3 0 0 0.03s
✅ YAML prettier 81 2 0 2.58s
✅ YAML v8r 23 0 57.33s
✅ YAML yamllint 82 0 1.15s

See detailed report in MegaLinter reports

You could have same capabilities but better runtime performances if you request a new MegaLinter flavor.

MegaLinter is graciously provided by OX Security

@echoix
Copy link
Collaborator Author

echoix commented Feb 13, 2023

🦙 MegaLinter status: ⚠️ WARNING

Descriptor Linter Files Fixed Errors Elapsed time
✅ BASH bash-exec 6 0 0.01s
✅ BASH shellcheck 6 0 0.14s
✅ BASH shfmt 6 0 0 0.37s
✅ COPYPASTE jscpd yes no 2.58s
✅ DOCKERFILE hadolint 114 0 14.24s
✅ JSON eslint-plugin-jsonc 21 0 0 2.01s
✅ JSON jsonlint 19 0 0.23s
✅ JSON v8r 21 0 12.55s
⚠️ MARKDOWN markdownlint 309 0 230 6.18s
✅ MARKDOWN markdown-link-check 309 0 5.17s
✅ MARKDOWN markdown-table-formatter 309 0 0 14.66s
✅ OPENAPI spectral 1 0 1.63s
⚠️ PYTHON bandit 183 47 1.99s
✅ PYTHON black 183 0 0 4.24s
✅ PYTHON flake8 183 0 1.72s
✅ PYTHON isort 183 0 0 0.85s
✅ PYTHON mypy 183 0 6.19s
✅ PYTHON pylint 183 0 10.39s
⚠️ PYTHON pyright 183 246 15.42s
✅ REPOSITORY checkov yes no 26.57s
✅ REPOSITORY git_diff yes no 0.37s
✅ REPOSITORY secretlint yes no 11.48s
✅ REPOSITORY trivy yes no 26.43s
✅ SPELL cspell 745 0 17.56s
✅ SPELL misspell 566 0 0 0.9s
✅ XML xmllint 3 0 0 0.39s
✅ YAML prettier 81 2 0 2.76s
✅ YAML v8r 23 0 56.69s
✅ YAML yamllint 82 0 1.22s

See detailed report in MegaLinter reports

MegaLinter is graciously provided by OX Security

@echoix
Copy link
Collaborator Author

echoix commented Feb 13, 2023

So theres a test failure with golangci-lint. I wasn't exactly sure if it was safe to remove the GOROOT and GOPATH env since there wasn't go anymore, but it seems that it might need it still. Especially once I read the related issue mentionned in their Dockerfile
https://github.com/golangci/golangci-lint/blob/dd45e44838ed636965c9d2fd1b1abb78f3b7cca9/build/alpine.Dockerfile#L17-L27

And maybe that linter would still need go installed, even if it comes with binaries. To be continued...

@echoix
Copy link
Collaborator Author

echoix commented Feb 19, 2023

If this passes, it should be ready to merge once I add the changelog entry, I forgot. For now, golangci-lint really needs go installed, so it must be present. I added it in the apk install key for that linter only.

I tested locally with the following commands (used a filter here since the tests are wayyyyyy too long to run.

docker buildx build --pull --rm -f "flavors/go/Dockerfile" -t megalinter-go:test-echoix .
TEST_KEYWORDS_TO_USE="go_golangci_lint"
docker run -e TEST_CASE_RUN=true -e OUTPUT_DETAIL=detailed -e TEST_KEYWORDS="${TEST_KEYWORDS_TO_USE}" -e MEGALINTER_VOLUME_ROOT="." -v "/var/run/docker.sock:/var/run/docker.sock:rw" -v $(pwd):/tmp/lint megalinter-go:test-echoix

and

TEST_KEYWORDS_TO_USE="go"
docker run -e TEST_CASE_RUN=true -e OUTPUT_DETAIL=detailed -e TEST_KEYWORDS="${TEST_KEYWORDS_TO_USE}" -e MEGALINTER_VOLUME_ROOT="." -v "/var/run/docker.sock:/var/run/docker.sock:rw" -v $(pwd):/tmp/lint megalinter-go:test-echoix

@echoix echoix changed the base branch from dev/remove-apk-go to main February 19, 2023 00:26
@echoix
Copy link
Collaborator Author

echoix commented Feb 19, 2023

@nvuillam I think it is done!

@echoix
Copy link
Collaborator Author

echoix commented Feb 19, 2023

There are some changes in the build.py that were required for hadolint

Copy link
Member

@nvuillam nvuillam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great :)
Let's see how it decreases the docker mage sizes ^^

@nvuillam nvuillam merged commit fb924c7 into oxsecurity:main Feb 19, 2023
@echoix
Copy link
Collaborator Author

echoix commented Feb 19, 2023

It's sad that golangci-lint really needs go. Without it, it would be even smaller, since it was only really removed for flavors: documentation, dotnet, java, javascript, php, python, ruby, rust, salesforce, security, swift, terraform. Oh, finally it might not be as bad...

@bdovaz
Copy link
Collaborator

bdovaz commented Feb 19, 2023

@echoix I am a little concerned that you have changed the installation of several linters to be through the docker image. Have you checked if with that change we lose compatibility in #2273? The fact of installing them in through bash scripts gives that possibility in most cases but installing through docker images not because not all publish the version in arm64, most are in amd64 only....

@echoix
Copy link
Collaborator Author

echoix commented Feb 19, 2023

My rationale was to have the current state working as best as it could, and that linter specific overrides can be added when it will be a blocker for arm64 images.
What I made sure before changing them to a Docker image was that their repos had their docker images CI-released so they are published at the same time.

The linters changed that way were written in Go, so they can be compiled to arm64 super easily.
If ever a PR to the linter's repo to add these won't be merged in time, we could add an override with the following install (like dustilock or revive):

    install:
      dockerfile:
          # The golang image used as a builder is a temporary workaround 
        - |
          FROM golang:alpine as revive
          RUN GOBIN=/usr/bin go install github.com/mgechev/revive@latest
        - COPY --link --from=revive /usr/bin/revive /usr/bin/revive

If we would want, we could use that for all platforms, since it would work, but there are faster ways than compiling in docker as the target platform.

actionlint is released with platforms linux/amd64,linux/arm64
shellcheck is released with platforms linux/amd64,linux/arm/v6,linux/arm64
checkmake is released with platforms linux/amd64 -> so are the github releases, unofficial docker images exist, but we could go without it.

These are the only ones which the installation method "changed".
For dustilock, it was build in a RUN command, it is still built, but is now built in a stage then copied (that repo is not in good shape/active, but they have a dockerfile and showed how it could be done).
For revive, it was build in a RUN command, it is still built, but is now built in a stage then copied.

Thats pretty much it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants