
no image found in manifest list for architecture amd64, variant "", OS linux #6457

Closed
ccoager opened this issue Dec 18, 2022 · 6 comments


ccoager commented Dec 18, 2022

What happened?

The issue started when I cross-built images for another architecture (arm64).

# crictl images
E1218 14:40:16.522236 1283636 remote_image.go:136] "ListImages with filter from image service failed" err="rpc error: code = Unknown desc = choosing image instance: no image found in manifest list for architecture amd64, variant \"\", OS linux" filter="&ImageFilter{Image:&ImageSpec{Image:,Annotations:map[string]string{},},}"
FATA[0000] listing images: rpc error: code = Unknown desc = choosing image instance: no image found in manifest list for architecture amd64, variant "", OS linux

The error message also appears in the journald log for the kubelet service every few seconds.

What did you expect to happen?

I expect to be able to list images using 'crictl images' without getting an error message.

How can we reproduce it (as minimally and precisely as possible)?

Build images for another architecture using buildah and qemu-user-static. Afterward, neither CRI-O nor crictl can list images.

Anything else we need to know?

Notably, I can still list images using Podman without errors, e.g. 'podman images'. This leads me to believe that the underlying storage is not corrupt and that this is a bug in crio/crictl.

CRI-O and Kubernetes version

$ crio --version
crio version 1.25.1
Version:        1.25.1
GitCommit:      unknown
GitCommitDate:  unknown
GitTreeState:   clean
BuildDate:      2022-12-08T18:23:29Z
GoVersion:      go1.19
Compiler:       gc
Platform:       linux/amd64
Linkmode:       dynamic
BuildTags:
  apparmor
  seccomp
  containers_image_ostree_stub
  exclude_graphdriver_btrfs
  exclude_graphdriver_devicemapper
  containers_image_openpgp
LDFlags:          -s -w -X github.com/cri-o/cri-o/internal/pkg/criocli.DefaultsPath="" -X github.com/cri-o/cri-o/internal/version.buildDate=2022-12-08T18:23:29Z
SeccompEnabled:   true
AppArmorEnabled:  true
Dependencies:
$ kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short.  Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.4", GitCommit:"872a965c6c6526caa949f0c6ac028ef7aff3fb78", GitTreeState:"clean", BuildDate:"2022-11-09T13:36:36Z", GoVersion:"go1.19.3", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.4", GitCommit:"872a965c6c6526caa949f0c6ac028ef7aff3fb78", GitTreeState:"clean", BuildDate:"2022-11-09T13:29:58Z", GoVersion:"go1.19.3", Compiler:"gc", Platform:"linux/amd64"}

OS version

# On Linux:
$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.1 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.1 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

$ uname -a
Linux gaia 5.15.0-56-generic #62-Ubuntu SMP Tue Nov 22 19:54:14 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Additional environment details (AWS, VirtualBox, physical, etc.)

This is a physical machine running vanilla Kubernetes. I also use this machine to cross-build images for my Raspberry Pis running arm64.
ccoager added the kind/bug label Dec 18, 2022
haircommander (Member) commented:

@mtrmac @nalind @giuseppe @vrothberg is there a knob that cri-o isn't setting correctly?

vrothberg (Member) commented:

I wish the question were narrower; I don't want to browse through the ListImages call stack looking for potential errors. Podman uses libimage, which does the following: https://github.com/containers/common/blob/main/libimage/runtime.go#L544
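For illustration, here is a minimal sketch of the general shape of such a listing, assuming only the public containers/storage API. This is not the actual libimage code, just a guess at why 'podman images' keeps working: the storage records are enumerated directly, and no manifest or per-arch instance is resolved during the listing itself, so a "sparse" manifest list cannot break it.

package sketch

import (
	"github.com/containers/storage"
)

// listImageIDs enumerates images straight from containers/storage.
// No manifest or config inspection happens here, so an image whose
// manifest list lacks a matching amd64 entry still shows up; any
// per-image error can be surfaced later, when a caller inspects.
func listImageIDs(store storage.Store) ([]string, error) {
	images, err := store.Images() // storage records only
	if err != nil {
		return nil, err
	}
	ids := make([]string, 0, len(images))
	for _, img := range images {
		ids = append(ids, img.ID)
	}
	return ids, nil
}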

mtrmac (Contributor) commented Dec 23, 2022

I think the relevant CRI-O code is:

imageFull, err := ref.NewImage(svc.ctx, systemContext)

This uses an @image-id image reference, which does not specify which of the manifest blobs to use. I think it could be the case that resolving that reference ends up choosing a manifest list instead of a per-arch manifest. CRI-O then calls OCIConfig and Inspect, which apply to a single per-arch variant and therefore require choosing a per-arch variant from the manifest list; that choice (broadly correctly) wants to match the current run-time architecture.
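To make that path concrete, here is a sketch of the failing sequence as described above. It is not the actual CRI-O code; the function wiring and the names store and sys are assumptions.

package sketch

import (
	"context"

	istorage "github.com/containers/image/v5/storage"
	"github.com/containers/image/v5/types"
	"github.com/containers/storage"
	ocispec "github.com/opencontainers/image-spec/specs-go/v1"
)

// inspectByID resolves an @image-id storage reference and inspects it.
func inspectByID(ctx context.Context, store storage.Store, sys *types.SystemContext, imageID string) (*ocispec.Image, error) {
	ref, err := istorage.Transport.ParseStoreReference(store, "@"+imageID)
	if err != nil {
		return nil, err
	}
	img, err := ref.NewImage(ctx, sys) // may resolve to a manifest *list*
	if err != nil {
		return nil, err
	}
	defer img.Close()
	// If the stored manifest is a list with no entry matching the
	// runtime platform, instance selection fails here with
	// "no image found in manifest list for architecture amd64, ...".
	return img.OCIConfig(ctx)
}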

c/image doesn’t understand that the manifest list is “sparse” and that it has to choose the architecture that matches the rest of the storage.Image (and several options might match simultaneously, e.g. if Zstd variant support is added).

At a first guess, c/image should contain some kind of dual of the multiArchImageMatchesSystemContext logic, for references where s.id != "" && s.named == "" (and in other cases?). I’m not at all sure about that; I have never studied how podman manifest records data.
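As a sketch of what such a dual might look like (an assumption, not a statement of c/image internals; in particular the "manifest-<digest>" big-data key mirrors how I understand the storage transport records per-instance manifests): pick the instance whose manifest is actually present for this storage.Image, instead of the one matching the runtime platform.

package sketch

import (
	"fmt"

	"github.com/containers/image/v5/manifest"
	"github.com/containers/storage"
	digest "github.com/opencontainers/go-digest"
)

// chooseStoredInstance picks the manifest-list instance whose per-arch
// manifest is actually stored for the image, rather than the one that
// matches the current platform.
func chooseStoredInstance(store storage.Store, imageID string, listBlob []byte) (digest.Digest, error) {
	list, err := manifest.ListFromBlob(listBlob, manifest.GuessMIMEType(listBlob))
	if err != nil {
		return "", err
	}
	for _, instance := range list.Instances() {
		// Assumed key layout: each instance's manifest is saved
		// under the big-data key "manifest-<digest>".
		if _, err := store.ImageBigData(imageID, "manifest-"+instance.String()); err == nil {
			return instance, nil
		}
	}
	return "", fmt.Errorf("no manifest-list instance present in storage for image %s", imageID)
}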


Turning that error into a warning and silently ignoring images that can’t be inspected might be an option for CRI-O; on the other hand, it would make the kubelet’s understanding of the set of images, and possibly recovery from corrupt images, harder.
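A sketch of that mitigation, reusing the hypothetical inspectByID from the earlier sketch (again an assumption, not CRI-O's actual ListImages):

package sketch

import (
	"context"

	"github.com/containers/image/v5/types"
	"github.com/containers/storage"
	ocispec "github.com/opencontainers/image-spec/specs-go/v1"
	"github.com/sirupsen/logrus"
)

// listInspectable skips images that fail inspection instead of failing
// the whole listing; the cost is that the kubelet simply never sees
// those images.
func listInspectable(ctx context.Context, store storage.Store, sys *types.SystemContext) ([]*ocispec.Image, error) {
	stored, err := store.Images()
	if err != nil {
		return nil, err
	}
	var configs []*ocispec.Image
	for _, img := range stored {
		cfg, err := inspectByID(ctx, store, sys, img.ID) // from the sketch above
		if err != nil {
			// Today this error aborts the entire ListImages call.
			logrus.Warnf("Skipping uninspectable image %s: %v", img.ID, err)
			continue
		}
		configs = append(configs, cfg)
	}
	return configs, nil
}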

github-actions (bot) commented:

A friendly reminder that this issue had no activity for 30 days.

github-actions bot added the lifecycle/stale label Jan 23, 2023
github-actions (bot) commented:

Closing this issue since it had no activity in the past 90 days.

github-actions bot added the lifecycle/rotten label Apr 24, 2023
github-actions bot closed this as not planned Apr 24, 2023
mtrmac (Contributor) commented Apr 24, 2023

Filed containers/image#1929 for this so that the knowledge is not completely lost.
