Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace eval with xargs for parsing BUILDKITE_COMMAND #165

Merged
merged 2 commits into from Aug 29, 2018

Conversation

lox
Copy link
Contributor

@lox lox commented Aug 17, 2018

From the code:

  # Because we receive $BUILDKITE_COMMAND as a single string, it needs to be tokenized as a shell
  # would to respect things like quoted strings. Previously we used eval to do this with some
  # careful control around preventing glob expansion, but this ends up being difficult to get just
  # right and also very risky. The new implementation uses xargs and printf to tokenize the string,
  # which is portable and much harder to shoot ourselves in the foot 🎉

Turns out xargs does a smashing job of tokenizing shell strings. So much safer than eval ☠️.

This introduces a breaking change, where previously parameters could contain escaped shell commands and variables (e.g $$LLAMAS) and they would be evaluated at run time by virtue of being dumped into eval. An example of where folks use this is something like:

steps:
  - command: rake
    plugins:
      docker-compose#2.5.1:
        run: app
        volumes:
          - "$$PWD/thing:/thing"

The problem is that you could also do something like:

steps:
  - command: rake
    plugins:
      docker-compose#2.5.1:
        run: app
        volumes:
          - "$(rm -rf /var/lib/buildkite-agent):/thing"

Which is less desire-able.

This removes that vector, but also removes the flexibility that it provided around runtime env interpolation.

Also closes #164.

Copy link
Contributor

@toolmantim toolmantim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I don’t fully grok how this pattern works. Maybe the xargs/printf pattern just needs a bit of documenting?


while IFS= read -rd '' token; do
[[ -n "$token" ]] && run_params+=("$token")
done < <(xargs printf '%s\0' <<< "$BUILDKITE_COMMAND")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t fully grok how this works. What’s the /0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first interesting bit is that xargs tokenizes quoted strings, which was news to me:

# echo "blah 'a long string'" | xargs -n1 echo token
token blah
token a long string

Then the next bit of magic is getting those tokens into bash so that we can loop over them. Because quoted strings can contain any normal delimiters like newlines or semi-colons or whatever, we use null bytes \0 to delimit our tokens so that we can loop over them. But because bash is c-based, it's slightly allergic to null bytes, because they tend to mean the end of strings, so to produce the null bytes we use printf to output the tokens with a \0 at the end.

Then to read them, we set IFS (the field separator that is normally used for word-splitting) to empty and then use the -d '' param in the read built-in, which will split on those null bytes!

So the end result is to tokenize the $BUILDKITE_COMMAND into arguments, respecting single and double quotes and then iterate over them and put them in an array as strings. 🎉

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an excellent story that belongs in the code!

Because quoted strings can contain any normal delimiters like newlines or semi-colons or whatever, we use null bytes \0 to delimit our tokens so that we can loop over them.

Is this because it's impossible for people the contents of the args themselves to contain null bytes, so we can use them to split?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this because it's impossible for people the contents of the args themselves to contain null bytes, so we can use them to split?

Yup correct, because if there was a \0 in there it would cause the string to terminate.

@lox
Copy link
Contributor Author

lox commented Aug 19, 2018

So one major thing with this PR is that it will mean there is no accidental (or intentional) env interpolation in arguments. For instance $PWD in volumes.

To me, this is really important. I'd like to opt-in to those things rather than have them happen by accident, as there can be serious security impact.

@toolmantim
Copy link
Contributor

So one major thing with this PR is that it will mean there is no accidental (or intentional) env interpolation in arguments. For instance $PWD in volumes.

This is a good change. Probably worthy of a major version bump.

How do we let people do intentional job-time env interpolation?

@lox
Copy link
Contributor Author

lox commented Aug 20, 2018

How do we let people do intentional job-time env interpolation?

Good question, I don't have a good answer for that. Personally, I think it's a bad idea in the current architecture.

@toolmantim
Copy link
Contributor

I think as soon as we have a good answer for that, we should merge this and do a major version bump.

@lox
Copy link
Contributor Author

lox commented Aug 28, 2018

I don't think there is a good answer holistically, I think it needs to be solved case-by-case.

@lox
Copy link
Contributor Author

lox commented Aug 28, 2018

My take would be, merge this, major version bump and then address issues as they come up. I think not eval-ing shell code in plugin parameters is pretty important.

@toolmantim
Copy link
Contributor

toolmantim commented Aug 29, 2018

Cool cool.

The only use case I can remember off the top of my head was people using $$ with the volumes option to mount relative volume paths without having to touch their docker-compose.yml — but that's mostly covered in #156.

In the release notes for the new major version, perhaps we link to this PR to explain what the breaking change is, and update the PR description to mention a little more what the effects of this update is? "This removes support for using $$ in your pipeline.yml config, as a way of evaluating arguments and commands at agent run-time" type thing?

steps:
  - command: rake
    plugins:
      docker-compose#2.5.1:
        run: app
        volumes:
          - "$$PWD/thing:/thing"

And if people are using the $$ for something they need to solve, once it's removed the advice will be to switch to using their own shell script and running docker-compose themselves.

👍🏼

🚀

@lox
Copy link
Contributor Author

lox commented Aug 29, 2018

I've updated PR words @toolmantim, lemme know what you think.

@toolmantim
Copy link
Contributor

🚀

@jufemaiz
Copy link
Contributor

I'm back @lox and @toolmantim !

Back with ARG evaluation needs for docker-compose build to access private dependency repos.

Has BK been able to resolve this issue?

@toolmantim
Copy link
Contributor

@jufemaiz hello! I'm not quite sure what you're asking sorry. Is it to pass Docker-style ARG values into the build process?

@toolmantim toolmantim deleted the use-xargs-for-tokenizing branch July 14, 2020 06:37
@jufemaiz
Copy link
Contributor

Looking for BK Best Practice for a docker-compose build where we need ssh key during the build phase (hence ARG). Ideally private_ssh_key cat-ed and made available as an ARG.

@toolmantim
Copy link
Contributor

Are you using the Elastic CI Stack? If so, the ssh mount type option using Buildkit (enabled by adding # syntax=docker/dockerfile:experimental to the top of your Dockerfile) should work:
https://docs.docker.com/develop/develop-images/build_enhancements/#using-ssh-to-access-private-data-in-builds

@jufemaiz
Copy link
Contributor

Yep I am. Heading down the rabbit hole.

docker/compose#7046 Indicates that the new ssh flag is still not released as part of docker-compose 😞

I’ve managed to get it working via known ssh key location, copying down and setting as an ENV to be passed in via interpolation I’d the ENS in a compose file.

You don’t happen to have a hybrid solution between the docker plungin and docker-compose, whereby the build step is performed by the docker plugin but the runs are executed by compose plugin?

@jufemaiz
Copy link
Contributor

As an FYI for those coming across this, I needed to add a pre-command hook that cated the contents of the key to an ENV to make it accessible to the docker-compose build stage as an ARG.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Escaping environment variables with spaces?
3 participants