Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OCI fallback #159

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

OCI fallback #159

wants to merge 2 commits into from

Conversation

aidy
Copy link
Contributor

@aidy aidy commented Feb 27, 2024

connects #121

This isn't a full implementation of a OCI registry client at this point, and doesn't support auth etc. Only intended as a last resort fallback for now, and as such I've not wired it in to be selectable by host via command line flags.

if err != nil {
continue
}
image, err := remote.Image(img, remote.WithContext(ctx))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe Image will resolve manifest lists (multi-arch images) to a platform-specific manifest. The default is linux/amd64. Is this the desired behaviour?

Looking at the selfhosted client, I think it will return the tag.SHA and tag.Tag for the manifest list, it won't resolve down to a particular platform.

Based on that, I think this should be a remote.Get and then a check on desc.MediaType to figure out whether it's an image or an index. If it's an image then we can call remote.Image and extract the arch, OS and created time.

You can see here in the cosign codebase how they perform a similar check: https://github.com/sigstore/cosign/blob/main/pkg/oci/remote/remote.go#L71-L95.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, yes - good call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this isn't relevant now that we're not looking to retrieve metadata in this first pass.

Comment on lines +15 to +17
type Client struct{}

func New() (*Client, error) {
return &Client{}, nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could construct a *remote.Puller here and use it for the remote operations in Tags. This will cache auth across different calls, reducing the potential number of requests made for each operation.

See:
https://github.com/google/go-containerregistry/blob/main/pkg/v1/remote/puller.go

Copy link
Contributor Author

@aidy aidy Apr 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I had this in mind for the future in building this out to be a fully fledged client in it's own right, but I didn't want to do it here as a first pass - mostly I try to work in incremental value, rather than hitting too many things at once.

}

bareTags, err := remote.List(rpo, remote.WithContext(ctx))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

nit: rm line

func (c *Client) Tags(ctx context.Context, host, repo, image string) ([]api.ImageTag, error) {
src := fmt.Sprintf("%s/%s/%s", host, repo, image)
rpo, err := name.NewRepository(src)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

nit: rm line

Comment on lines 47 to 49
// Only fetch metadata for the highest version
// Could be very slow otherwise if there are a lot of tags, and
// we only really care about the highest version.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgive my ignorance of version-checker here, but is this definitely true? Isn't it possible that there will be some custom sorting of the []api.ImageTag this function returns that will require full details? For instance, finding the latest image by Timestamp?

The cost of fetching the config for every tag may just be the cost of doing business here?

I suppose if we want to make it more efficient, we could cache details by digest? That way we'd only ever have to fetch the config for each digest once?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, looking at it harder, I missed that there's a use-sha option, where it checks against the latest sha by date. I kinda dismissed that as something that people might want to do, because the latest by date might often be a hotfix against an older version. But I guess there is a use-case for if people are using latest or something - although then I'd expect them to either flush their pods regularly, or have some CI around it.

The cost of fetching the config for every tag may just be the cost of doing business here?

I tested against the fluentbit repo, I think it hadn't finished after an hour. I think if we were talking a few minutes I'd accept the cost, but potentially for repos with large numbers of tags and api rate limits, we could be talking days.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, in that case, maybe we just don't fetch the config at all?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The prom metrics output wants to give you the sha of the latest version.

I think the best thing to do here is to fetch the metadata for the latest semver, and for any currently deployed tags. Still a gap for empty tags, but that seems like an edge case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we load the highest semver, and any non-semver tags - that'd be easier to implement, hopefully should be a reasonably small set, and I think would cater for all use cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a chat with @ribbybibby and think it's better to skip the metadata for now. There's a few different ways in which we consider a version to be "current", and it's not trivial to pull that logic in here.

Copy link
Member

@ribbybibby ribbybibby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments.

Two other things worth mentioning:

  1. Auth. By default, ggcr will try to find credentials in the environment, for instance in ~/.docker/config.json. I reckon we should document this, so users are aware that they can make creds available like that.
  2. Tests. It's pretty straightforward and powerful to use the the ggcr registry for testing. You can see an example of that in paranoia.

aidy added 2 commits May 1, 2024 12:20
OCI registries don't provide a way to retrieve metadata in the same call
as listing tags.

This means that you have to do a separate API request for each tag if
you want the metadata, which could be very slow for a large number of
tags.

As such, preferably use the selfhosted client as a fallback, but add a
fallback fallback to the oci handler for registries that are
incompatible with the selfhosted API implementation.
As we might not have retrieved a SHA (dependent on registry type), don't
try to include one if it's empty.
@davidcollom davidcollom added this to the v1 release milestone May 16, 2024
@hawksight hawksight added the enhancement New feature or request label May 23, 2024
@aidy
Copy link
Contributor Author

aidy commented May 23, 2024

I'll probably look to add tests in a separate PR - As there's not currently a comprehensive testing framework, I think bringing tests into this PR might make it difficult to review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants