Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add "additionalArchives" config option #1123

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

pendo324
Copy link
Contributor

Adds a new config option that allows users to include an "extra" archive in the CIDATA that gets mounted on OS boot.

This allows users to supply their own local packages or package repositories for use with provisioning scripts that run before the host filesystem is available (such as the dependency provisioning scripts proposed in #1105).

With 9p becoming the default at some point in the future, users wouldn't have to use an extra archive for this purpose since the filesystem should be available without any additional dependency installation (like fuse/sshfs), but it would allow parity between 9p and sshfs users.

There's an argument to be made as to whether this should allow users to upload n extra archives or just 1. I'm currently in favor of just 1 because it can contain arbitrary data and users can decide what to do with it.

More details in #1093

@@ -31,6 +31,7 @@ type LimaYAML struct {
UseHostResolver *bool `yaml:"useHostResolver,omitempty" json:"useHostResolver,omitempty"` // DEPRECATED, use `HostResolver.Enabled` instead
PropagateProxyEnv *bool `yaml:"propagateProxyEnv,omitempty" json:"propagateProxyEnv,omitempty"`
CACertificates CACertificates `yaml:"caCerts,omitempty" json:"caCerts,omitempty"`
ExtraArchive *File `yaml:"extraArchive,omitempty" json:"extraArchive,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ExtraArchive *File `yaml:"extraArchive,omitempty" json:"extraArchive,omitempty"`
ExtraArchives []File `yaml:"extraArchives,omitempty" json:"extraArchives,omitempty"`

for supporting multi-arch

if err := y.ExtraArchive.Digest.Validate(); err != nil {
return err
}
eaBuf, err := os.ReadFile(y.ExtraArchive.Location)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to support URL too (see how nerdctl-full-* archive are downloaded)

@@ -196,6 +196,13 @@ containerd:
# vim was not installed in the guest. Make sure the package system is working correctly.
# Also see "/var/log/cloud-init-output.log" in the guest.

# Adds an additional archive to the CIDATA image which is mounted on boot
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs more explanation (how to access the archive from the provisioning scripts?)

@pendo324
Copy link
Contributor Author

Thanks for the review @AkihiroSuda. I just pushed a new revision which adds more information to the defaults.yaml example, and supports multi-arch and remote downloads by refactoring and reusing the ensureNerdctlArchiveCache code from start.go.

@jandubois
Copy link
Member

I'm sorry, I'm too busy this week to do a proper review, but I just noticed that this PR uses extraArchives, but the volume one uses additionalDisks. I would prefer to use the same prefix, as the concept is basically the same.

I would change this PR to use additionalArchives; @AkihiroSuda WDYT?

@pendo324
Copy link
Contributor Author

I'm sorry, I'm too busy this week to do a proper review, but I just noticed that this PR uses extraArchives, but the volume one uses additionalDisks. I would prefer to use the same prefix, as the concept is basically the same.

I would change this PR to use additionalArchives; @AkihiroSuda WDYT?

additionalArchives makes sense to me to keep it consistent. Will wait to hear back from Akihiro before changing.

@AkihiroSuda
Copy link
Member

Yes, additional seems better for consistency

@pendo324
Copy link
Contributor Author

Renamed in latest revision. Also updated some comments to be clearer.

@AkihiroSuda
Copy link
Member

Please squash commits and update the PR title

@AkihiroSuda AkihiroSuda added this to the v0.14 (tentative) milestone Nov 2, 2022
@pendo324 pendo324 changed the title add "extraArchive" config option add "additonalArchive" config option Nov 2, 2022
@pendo324 pendo324 changed the title add "additonalArchive" config option add "additonalArchives" config option Nov 2, 2022
AkihiroSuda
AkihiroSuda previously approved these changes Nov 2, 2022
Copy link
Member

@AkihiroSuda AkihiroSuda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@ningziwen
Copy link
Contributor

There is a typo in PR title: "additionalArchives" instead of "additonalArchives"

@AkihiroSuda AkihiroSuda changed the title add "additonalArchives" config option add "additionalArchives" config option Nov 5, 2022
@jandubois
Copy link
Member

There's an argument to be made as to whether this should allow users to upload n extra archives or just 1. I'm currently in favor of just 1 because it can contain arbitrary data and users can decide what to do with it.

I think it would be preferable to support multiple additional archives:

For example I may want to create an instance template that pulls various other archives via URLs from e.g. Github releases pages. This way I can take advantage of the Lima downloader and caching mechanism to fetch the files just once, and then bundling them automatically on instance start.

If I have to combine all the releases manually, I have to download the files myself, and repackage them. And when any one is being updated, I have to do this all over again.

And when I want to share the template with somebody else, I also need to provide a script to download and repackage the individual releases.

So I think it would be better to design the lima.yaml options for this to support multiple additional archives (in addition to supporting different architectures for each). Changing is later would just become a compatibility hassle.

@AkihiroSuda
Copy link
Member

AkihiroSuda commented Nov 12, 2022

So I think it would be better to design the lima.yaml options for this to support multiple additional archives (in addition to supporting different architectures for each).

Will this be [][]File?
Seems kinda complicated, but I don't see other options.

@jandubois
Copy link
Member

jandubois commented Nov 13, 2022

Will this be [][]File?
Seems kinda complicated, but I don't see other options.

I'm afraid so, but I also don't see any alternative. I agree it is ugly when you only need a single file.

Part of the complication comes from the fact that we support multiple sources for the same arch, so we can't use a map keyed by the archname instead.

BTW, I would like to be able to specify a "target name" for the additional archives, so it can be made available under a different name than the basename of the download URL.

additionalArchives:
- name: helm.tgz
  - location: https://get.helm.sh/helm-v3.10.2-linux-amd64.tar.gz
    arch: x86_64
  - location: https://get.helm.sh/helm-v3.10.2-linux-arm64.tar.gz
    arch: aarch64

That way I can write a provisioning script that just refers to /mnt/lima-cidata/helm.tgz without having to bother about the arch1. And the script doesn't have to be changed if the archive is updated to point to a newer version of the package.

So I guess we could make this a map[string][]File:

additionalArchives:
  helm.tgz:
  - location: https://get.helm.sh/helm-v3.10.2-linux-amd64.tar.gz
    arch: x86_64
  - location: https://get.helm.sh/helm-v3.10.2-linux-arm64.tar.gz
    arch: aarch64

That would prevent us from adding additional keys to the Archive, but I think that is fine. The only thing I could think of adding would be some automated unpacking or filtering options, and I don't think we want to go there.

To avoid conflicts with filenames used by Lima it would probably be best to copy them into a subdirectory of the cidata volume, e.g. /mnt/lima-cidata/archives/helm.tgz.

Footnotes

  1. Yes, I know that helm releases use the arch name in the directory name inside the tarball, e.g. linux-arm64/helm, but this can still be handled with wildcarding.

@AkihiroSuda
Copy link
Member

Yes, map[string][]File seems better

@afbjorklund
Copy link
Contributor

afbjorklund commented Dec 27, 2022

Seems to be missing "digest" ? Better use the same setup as the containerd archives ?

        "containerd": {
            "system": false,
            "user": true,
            "archives": [
                {
                    "location": "https://github.com/containerd/nerdctl/releases/download/v1.1.0/nerdctl-full-1.1.0-linux-amd64.tar.gz",
                    "arch": "x86_64",
                    "digest": "sha256:5440c7b3af63df2ad2c98e185e06a27b4a21eea334b05408e84f8502251d9459"
                },
                {
                    "location": "https://github.com/containerd/nerdctl/releases/download/v1.1.0/nerdctl-full-1.1.0-linux-arm64.tar.gz",
                    "arch": "aarch64",
                    "digest": "sha256:3b613a1be5a24460c44bb93a3609b790ada94e06efd1a86467d45bec7da8b449"
                }
            ]
        },

But it is there in the code, since it is using File

@pendo324
Copy link
Contributor Author

pendo324 commented Jan 14, 2023

Thanks for all the feedback. I just rebased and switched additionalArchives to use a map[string]string instead of just a list. I didn't add a new struct to hold the values, since its just a simple KVP for now (name => location), can definitely change that though

AkihiroSuda
AkihiroSuda previously approved these changes Jan 14, 2023
Copy link
Member

@AkihiroSuda AkihiroSuda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

# from a provisioning script.
# additionalArchives:
# archive.tgz:
# - location: "/path/to/additional.amd64.tar.gz"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# - location: "/path/to/additional.amd64.tar.gz"
# - location: "/path/to/additional.x86_64.tar.gz"

or s/aarch64.tar.gz/arm64.tar.gz/ below

@AkihiroSuda
Copy link
Member

@jandubois LGTY?

@AkihiroSuda
Copy link
Member

@lima-vm/maintainers LGTY?

@@ -28,6 +28,7 @@ func newHostagentCommand() *cobra.Command {
hostagentCommand.Flags().StringP("pidfile", "p", "", "write pid to file")
hostagentCommand.Flags().String("socket", "", "hostagent socket")
hostagentCommand.Flags().String("nerdctl-archive", "", "local file path (not URL) of nerdctl-full-VERSION-linux-GOARCH.tar.gz")
hostagentCommand.Flags().StringToString("additional-archive", map[string]string{}, "local file path (not URL) of an arbitrary archive to add to CIDATA")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a thought, why not move the downloading of these archive to hostagent ?
This way, we don't need these specific args here.

One advantage i see in this way is, hostagent can just rely on the yaml config and do necessary stuffs.
Also this model provides flexibility if we want to ignore failures.

From start.go, we can only do the necessary things to bootstrap the hostagent itself (creating & downloading disks).

@AkihiroSuda AkihiroSuda removed this from the v0.15.1 milestone Apr 5, 2023
@AkihiroSuda
Copy link
Member

Needs rebase

Signed-off-by: Justin Alvarez <alvajus@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants