including git attributes in "vendor" makes archive checksum change #6387
Comments
Previously seen in cri-o: Containerd 1.4.x didn't vendor k8s.io, so it didn't have this problem (seen after the change to 1.5.x). It would be better if the .gitattributes file were removed and the fields filled out when vendoring. |
Also affects fedora and alpine etc https://src.fedoraproject.org/rpms/containerd/blob/rawhide/f/sources
https://github.com/alpinelinux/aports/blob/master/community/containerd/APKBUILD
and all other similar checksums https://github.com/containerd/containerd/archive/refs/tags/v1.5.8.tar.gz
|
This issue will block minikube from releasing; I would appreciate any visibility on it. |
@thaJeztah @AkihiroSuda What do you think about just removing the .gitattributes file? |
Could you please fill in the steps to reproduce? |
Make a containerd package, wait some weeks, and watch every build fail due to a checksum mismatch (see above)? It would also fail in the other Linux distributions, if they hadn't cached the old archive (with a different checksum). It also doesn't serve any purpose, because it puts the containerd commit where the k8s.io commit should go.
// Base version information.
//
// This is the fallback data used when version information from git is not
// provided via go ldflags. It provides an approximation of the Kubernetes
// version for ad-hoc builds (e.g. `go build`) that cannot get the version
// information from git.
//
// If you are looking at these fields in the git tree, they look
// strange. They are modified on the fly by the build process. The
// in-tree values are dummy values used for "git archive", which also
// works for GitHub tar downloads.
//
// When releasing a new Kubernetes version, this file is updated by
// build/mark_new_version.sh to reflect the new version, and then a
// git annotated tag (using format vX.Y where X == Major version and Y
// == Minor version) is created to point to the commit that updates
// pkg/version/base.go
var (
// TODO: Deprecate gitMajor and gitMinor, use only gitVersion
// instead. First step in deprecation, keep the fields but make
// them irrelevant. (Next we'll take it out, which may muck with
// scripts consuming the kubectl version output - but most of
// these should be looking at gitVersion already anyways.)
gitMajor string = "" // major version, always numeric
gitMinor string = "" // minor version, numeric possibly followed by "+"
// semantic version, derived by build scripts (see
// https://git.k8s.io/community/contributors/design-proposals/release/versioning.md
// for a detailed discussion of this field)
//
// TODO: This field is still called "gitVersion" for legacy
// reasons. For prerelease versions, the build metadata on the
// semantic version is a git hash, but the version itself is no
// longer the direct output of "git describe", but a slight
// translation to be semver compliant.
// NOTE: The $Format strings are replaced during 'git archive' thanks to the
// companion .gitattributes file containing 'export-subst' in this same
// directory. See also https://git-scm.com/docs/gitattributes
gitVersion string = "v0.0.0-master+1e5ef943eb7"
gitCommit string = "1e5ef943eb76627a6d3b6de8cd1ef6537f393a71" // sha1 from git, output of $(git rev-parse HEAD)
gitTreeState string = "" // state of git tree, either "clean" or "dirty"
buildDate string = "1970-01-01T00:00:00Z" // build date in ISO8601 format, output of $(date -u +'%Y-%m-%dT%H:%M:%SZ')
)
Kubernetes version
I'm actually not sure that these values are ever correct, except in some special kubernetes-client?
go.mod
Deleting the .gitattributes file would leave the template values, which should be OK. |
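The comment in base.go above describes these fields as fallback data that is overridden "via go ldflags". A minimal sketch of that mechanism follows. It is not the Kubernetes build: the `main` package and the `versionString` helper are hypothetical, and only the variable names are taken from the excerpt.

```go
package main

import "fmt"

// Fallback values, used when nothing is injected at link time.
// The names mirror the base.go excerpt; the package layout is hypothetical.
var (
	gitVersion = "v0.0.0-master+$Format:%H$"
	gitCommit  = "$Format:%H$"
	buildDate  = "1970-01-01T00:00:00Z"
)

// versionString reports whatever values ended up in the binary.
func versionString() string {
	return fmt.Sprintf("version=%s commit=%s built=%s", gitVersion, gitCommit, buildDate)
}

func main() {
	// The linker's -X flag can overwrite package-level string variables, e.g.:
	//   go build -ldflags "-X main.gitCommit=$(git rev-parse HEAD)"
	// Without it, the fallback (template) values above are printed.
	fmt.Println(versionString())
}
```

Built without any `-ldflags`, the program prints the raw template values, which is what a plain git checkout of the vendored file contains.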
Having git commits and build dates in source code and in binary releases is mostly useless, except for causing confusion. The commit is not really needed, when versions are tagged. As seen by having a commit from the wrong git repository ? And the build date makes it hard to do reproducible builds. It is also frequently wrong, making go binaries live in the 70's. To be useful, there would need to make some kind of |
IIUC, this problem is because of
ERROR: v1.5.8.tar.gz has wrong sha256 hash:
ERROR: expected: a41ab8d39393c9456941b477c33bb1b221a29b635f1c9a99523aab2f5e74f790
ERROR: got : 0890f7b0ee8e20a279a617c60686874b3c7a99e064adb2b38d884499b5284c43
ERROR: Incomplete download, or man-in-the-middle (MITM) attack
I wonder where does sha256 hash |
When upstream doesn't publish the checksums of a tarball, they are normally computed locally at the time of import. The same applies if upstream uses a different checksum algorithm, e.g. if you want sha512 but it only has sha256. But ultimately, it's even signed. Note that it is not the checksum of the source code; that would be contained in the git commit itself (via the tree etc.). It is the checksum after first doing dist transformations and then applying compression (maybe another timestamp). Debian uses "pristine-tar" for this.
https://git-scm.com/docs/gitattributes#_export_subst
https://github.com/containerd/containerd/blob/v1.5.8/vendor/k8s.io/client-go/pkg/version/base.go#L59
// NOTE: The $Format strings are replaced during 'git archive' thanks to the
// companion .gitattributes file containing 'export-subst' in this same
// directory. See also https://git-scm.com/docs/gitattributes
gitVersion string = "v0.0.0-master+$Format:%h$"
gitCommit string = "$Format:%H$" // sha1 from git, output of $(git rev-parse HEAD)
gitTreeState string = "" // state of git tree, either "clean" or "dirty"
This will potentially change the output every time that GitHub does a "git archive" for you. The alternative would be to generate and attach a static tarball, which is not really practical (and wasteful). The killer here is using the "short" hash. |
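To make the effect concrete, the sketch below imitates the substitution for the two placeholders in the excerpt and shows that a one-character difference in the abbreviated hash yields a different digest. `substitute` and `digest` are hypothetical helpers written for this example and are a simplified stand-in, not git's implementation; the commit hash is the one quoted in this thread.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// substitute imitates export-subst for two placeholders only:
// $Format:%H$ becomes the full hash, and $Format:%h$ becomes its first
// abbrevLen characters.
func substitute(src, fullHash string, abbrevLen int) string {
	out := strings.ReplaceAll(src, "$Format:%H$", fullHash)
	return strings.ReplaceAll(out, "$Format:%h$", fullHash[:abbrevLen])
}

// digest returns the hex-encoded SHA-256 of s.
func digest(s string) string {
	sum := sha256.Sum256([]byte(s))
	return hex.EncodeToString(sum[:])
}

func main() {
	const src = "gitVersion string = \"v0.0.0-master+$Format:%h$\"\n" +
		"gitCommit string = \"$Format:%H$\"\n"
	const commit = "1e5ef943eb76627a6d3b6de8cd1ef6537f393a71"

	// Same commit, two abbreviation lengths: the file content differs,
	// so any checksum computed over the archive differs as well.
	for _, n := range []int{10, 11} {
		fmt.Printf("abbrev=%d sha256=%s\n", n, digest(substitute(src, commit, n)))
	}
}
```

Only the `%h` placeholder is sensitive to the abbreviation length; the line that uses `%H` comes out identical in both runs.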
The workarounds only last for "so long", until the number of significant digits in the commit changes again:
They also flip back and forth, depending on which server the GitHub workload ends up running on, etc., which lessens the confidence in having checksums in the first place.
Minikube sorta made it worse by using the wrong file name (forgot the
And by not stating clearly that it was "computed locally", like our OS upstream so carefully did (and we ignored).
So a much better checksum file looks like:
(it wasn't used because of the older version, 1.4.4 and not 1.5.8) |
@afbjorklund Thanks for explaining it. I agree this is an issue that should be fixed. I did the following steps:
I want to know more about how |
It was calculated the same way, just some time ago (the contents vary over time). This is because the length of the git hash varies, due to random factors on GitHub. It might be
The "long" hash remains at:
Ps, for
|
I'm a bit concerned about making changes to, or "manually" excluding files in, the vendor directories, as doing so would complicate validating the vendored files and checksums. Given that this is an issue with a dependency that we vendor, perhaps it should be reported to the k8s maintainers to see if an alternative solution is possible; note that Go 1.18 now also supports using version control metadata in builds (golang/go#37475, golang/go#35667), so I'm wondering if those could be used there as alternatives. Another option would be to propose
/cc @dims (for k8s) |
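For reference, the Go 1.18 version control metadata can be read back at run time through `runtime/debug`. The sketch below assumes Go 1.18 or newer; `vcsSettings` is a hypothetical helper, not something k8s.io/client-go provides.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// vcsSettings returns the vcs.* build settings that the go command records
// when a binary is built inside a version-controlled checkout.
// The map is empty when no such metadata was embedded.
func vcsSettings() map[string]string {
	out := map[string]string{}
	info, ok := debug.ReadBuildInfo()
	if !ok {
		return out
	}
	for _, s := range info.Settings {
		switch s.Key {
		case "vcs.revision", "vcs.time", "vcs.modified":
			out[s.Key] = s.Value
		}
	}
	return out
}

func main() {
	vcs := vcsSettings()
	if len(vcs) == 0 {
		fmt.Println("no VCS metadata embedded")
		return
	}
	fmt.Printf("revision=%s time=%s modified=%s\n",
		vcs["vcs.revision"], vcs["vcs.time"], vcs["vcs.modified"])
}
```

Note that this metadata describes the repository the main module was built from, so a vendored package would still see the revision of the vendoring project rather than its own.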
Requiring the latest/greatest/bleeding-edge version of Go is a problem in itself, but one that we have to live with. Removing all git attributes is probably overkill; this was specifically about
Not using the short version at all was another option. Or maybe leaving the default commit as |
Yes, that would either require a transition period until older Go versions reach EOL
Right; but the issue is that we should avoid getting on a slippery slope where we "randomly" modify vendored code, as we may lose the benefit of being able to validate that vendored code matches what's expected. So if a generic solution is possible, or if the problem could be fixed at the source (in the k8s.io/client-go project), that would be my preference. |
FYI, Kubernetes is also using this https://github.com/kubernetes/component-base/blob/master/version/base.go to fetch the version for components. |
Looks like they changed to use the non-abbreviated commit recently, which may resolve the problem as well; kubernetes/component-base@130dc3a |
Looks like that change was made in kubernetes/kubernetes#99377, which also updates |
Oh, ignore me; I was looking at the wrong commit; that PR was from February last year and client-go now also has that fix, so I guess the problem should be resolved on main / v1.6.x of containerd? |
Code in k/client-go is synced from k/k/staging. |
As long as you can live with the commit coming from the wrong repository, at least it won't change over time, i.e. in containerd it will show the containerd commit, in cri-o it will show the cri-o commit, and so on. It is somewhat useless to have the commit SHA if you don't have the repository URL. And if you use git instead of tarballs, you get a different value (the raw template instead).
Only in containerd: .git
diff -ur containerd/vendor/k8s.io/client-go/pkg/version/base.go containerd-1.5.8/vendor/k8s.io/client-go/pkg/version/base.go
--- containerd/vendor/k8s.io/client-go/pkg/version/base.go 2020-10-07 19:58:39.000000000 +0200
+++ containerd-1.5.8/vendor/k8s.io/client-go/pkg/version/base.go 2021-11-17 21:04:57.000000000 +0100
@@ -55,8 +55,8 @@
// NOTE: The $Format strings are replaced during 'git archive' thanks to the
// companion .gitattributes file containing 'export-subst' in this same
// directory. See also https://git-scm.com/docs/gitattributes
- gitVersion string = "v0.0.0-master+$Format:%h$"
- gitCommit string = "$Format:%H$" // sha1 from git, output of $(git rev-parse HEAD)
+ gitVersion string = "v0.0.0-master+1e5ef943eb7"
+ gitCommit string = "1e5ef943eb76627a6d3b6de8cd1ef6537f393a71" // sha1 from git, output of $(git rev-parse HEAD)
gitTreeState string = "" // state of git tree, either "clean" or "dirty"
buildDate string = "1970-01-01T00:00:00Z" // build date in ISO8601 format, output of $(date -u +'%Y-%m-%dT%H:%M:%SZ')
|
Yes, it's unclear to me what the intent of that "feature" is; even more so given that k/k itself isn't using it, so it's a bit odd. I'd prefer them to set a
Perhaps @dims has some insight into what the expected use of it is, or if it's just an oversight (and should be removed?) |
ugh! @thaJeztah thanks for tagging me. Can we please open an issue in kubernetes/kubernetes? I'll help track this down. |
let me read the full back scroll / history |
Yes, "master" / "main" is ok(ish) as it doesn't use truncated commits, but even in that case the commit will be that of the project that vendors the code, not the commit of k/k or k8s.io/client-go (so it won't contain useful information), so perhaps the whole "automatic" commit info should be removed (I may be missing context on why it's there, of course). |
Confirmed,
Only in containerd: .git
diff -ur containerd/vendor/k8s.io/client-go/pkg/version/base.go containerd-1.6.0-beta.5/vendor/k8s.io/client-go/pkg/version/base.go
--- containerd/vendor/k8s.io/client-go/pkg/version/base.go 2022-01-11 15:45:38.595533590 +0100
+++ containerd-1.6.0-beta.5/vendor/k8s.io/client-go/pkg/version/base.go 2022-01-06 18:16:54.000000000 +0100
@@ -55,8 +55,8 @@
// NOTE: The $Format strings are replaced during 'git archive' thanks to the
// companion .gitattributes file containing 'export-subst' in this same
// directory. See also https://git-scm.com/docs/gitattributes
- gitVersion string = "v0.0.0-master+$Format:%H$"
- gitCommit string = "$Format:%H$" // sha1 from git, output of $(git rev-parse HEAD)
+ gitVersion string = "v0.0.0-master+857b35de6c6d40962d24dc1e561e8446e9f3197f"
+ gitCommit string = "857b35de6c6d40962d24dc1e561e8446e9f3197f" // sha1 from git, output of $(git rev-parse HEAD)
gitTreeState string = "" // state of git tree, either "clean" or "dirty"
buildDate string = "1970-01-01T00:00:00Z" // build date in ISO8601 format, output of $(date -u +'%Y-%m-%dT%H:%M:%SZ')
diff -ur containerd/vendor/k8s.io/component-base/version/base.go containerd-1.6.0-beta.5/vendor/k8s.io/component-base/version/base.go
--- containerd/vendor/k8s.io/component-base/version/base.go 2022-01-11 15:45:38.599533626 +0100
+++ containerd-1.6.0-beta.5/vendor/k8s.io/component-base/version/base.go 2022-01-06 18:16:54.000000000 +0100
@@ -55,8 +55,8 @@
// NOTE: The $Format strings are replaced during 'git archive' thanks to the
// companion .gitattributes file containing 'export-subst' in this same
// directory. See also https://git-scm.com/docs/gitattributes
- gitVersion = "v0.0.0-master+$Format:%H$"
- gitCommit = "$Format:%H$" // sha1 from git, output of $(git rev-parse HEAD)
+ gitVersion = "v0.0.0-master+857b35de6c6d40962d24dc1e561e8446e9f3197f"
+ gitCommit = "857b35de6c6d40962d24dc1e561e8446e9f3197f" // sha1 from git, output of $(git rev-parse HEAD)
gitTreeState = "" // state of git tree, either "clean" or "dirty"
buildDate = "1970-01-01T00:00:00Z" // build date in ISO8601 format, output of $(date -u +'%Y-%m-%dT%H:%M:%SZ')
And as stated above, containerd 1.4.x doesn't vendor k8s.io, so it doesn't have the issue either. |
@afbjorklund thanks for confirming |
@thaJeztah @afbjorklund @jonyhy96 thanks for helping nail that down! I've created kubernetes/publishing-bot#285 in Kubernetes to stop including |
Thanks @nikhita ! |
The old files haven't flipped yet, so they are still holding up. It's a bit random how many months pass before it happens.
I think it varies between OSes, so it could depend on which backend server ends up generating the archive, etc. Or "moon phase".
containerd-1.5.8/vendor/k8s.io/client-go/pkg/version/base.go: gitVersion string = "v0.0.0-master+1e5ef943eb7" |
After Docker upgraded containerd to 1.5.10, this started happening again.
diff -ur containerd-1.5.10.orig/vendor/k8s.io/client-go/pkg/version/base.go containerd-1.5.10/vendor/k8s.io/client-go/pkg/version/base.go
--- containerd-1.5.10.orig/vendor/k8s.io/client-go/pkg/version/base.go 2022-03-02 19:35:48.000000000 +0100
+++ containerd-1.5.10/vendor/k8s.io/client-go/pkg/version/base.go 2022-03-02 19:35:48.000000000 +0100
@@ -55,7 +55,7 @@
// NOTE: The $Format strings are replaced during 'git archive' thanks to the
// companion .gitattributes file containing 'export-subst' in this same
// directory. See also https://git-scm.com/docs/gitattributes
- gitVersion string = "v0.0.0-master+2a1d4dbdb2a"
+ gitVersion string = "v0.0.0-master+2a1d4dbdb2"
gitCommit string = "2a1d4dbdb2a1030dc5b01e96fb110a9d9f150ecc" // sha1 from git, output of $(git rev-parse HEAD)
gitTreeState string = "" // state of git tree, either "clean" or "dirty"
Note that the historic releases mentioned above (1.5.8, 1.5.9) also broke...
So only "fixed" for containerd 1.6. |
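The diff above differs only in the abbreviation length, not in the commit itself. A small sketch of how one might confirm that for a pair of mismatching archives follows; `sameCommit` is a hypothetical helper, and the values are the ones quoted in the diff.

```go
package main

import (
	"fmt"
	"strings"
)

// sameCommit reports whether both abbreviations are non-empty prefixes of the
// full hash, i.e. whether they refer to the same commit at different lengths.
func sameCommit(full, a, b string) bool {
	return a != "" && b != "" &&
		strings.HasPrefix(full, a) && strings.HasPrefix(full, b)
}

func main() {
	const prefix = "v0.0.0-master+"
	const full = "2a1d4dbdb2a1030dc5b01e96fb110a9d9f150ecc"

	before := strings.TrimPrefix("v0.0.0-master+2a1d4dbdb2a", prefix)
	after := strings.TrimPrefix("v0.0.0-master+2a1d4dbdb2", prefix)

	// Prints true: the content changed, but the referenced commit did not.
	fmt.Println(sameCommit(full, before, after))
}
```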
@afbjorklund this will be fixed for 1.24 kubernetes. (Fix in publishing-bot kubernetes/kubernetes#108970). |
Not sure when it will be backported to containerd 1.5 though, or when docker-containerd will be upgraded to 1.6
k8s.io/client-go 1a6e9022ba699e71f82c056acd7fe532bcd442c5 |
Happened again, for v1.5.11 (as in docker 20.10.14)
- gitVersion string = "v0.0.0-master+3df54a8523"
+ gitVersion string = "v0.0.0-master+3df54a85234" |
Still happening (it flipped back again) |
@afbjorklund the fix for Kubernetes is in v1.24, so it'll reflect in client-go in |
Closing? #6905 has upgraded our Kubernetes dependencies to v0.24.x. |
Resolving. The |
containerd 1.6.4 doesn't have the issue, and was bumped in docker 20.10.15 |
Description
Due to including git attributes, the archive gets different checksums:
vendor/k8s.io/client-go/pkg/version/.gitattributes
This is because the number of "significant digits" in the git rev varies.
vendor/k8s.io/client-go/pkg/version/base.go
Steps to reproduce the issue
Describe the results you received and expected
What version of containerd are you using?
v1.5.8
Any other relevant information
No response
Show configuration if it is related to CRI plugin.
No response