Proof of concept: nightly dependency treadmill #15910

edsantiago · 2022-09-22T19:27:59Z

As discussed in f2f: this is the cleanest, simplest mechanism
I can think of to auto-test the Big Three dependencies: simply
run go mod edit immediately after git checkout, then run the
entire CI test suite.

If this approach works, we can set up a new CIRRUS_CRON=treadmill
job. I'm expecting it not to work, because Murphy, but want to
see what the unexpected gotchas are.

This differs significantly from the buildah treadmill, in that
buildah is almost impossible to re-vendor without manual intervention.
(In practice, so are these, but let's dream of a world in which
this will run and pass every night). (I want a pony too).

Signed-off-by: Ed Santiago santiago@redhat.com

None

openshift-ci · 2022-09-22T19:28:05Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: edsantiago

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [edsantiago]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

edsantiago · 2022-09-22T20:55:21Z

This is kind of weird: it compiles, and almost all the int tests have passed! (one flake, the FD one which I still haven't filed an issue for). I'm actually kind of worried that maybe I'm NOP'ing, and this is not testing an actual revendor. Is there a way to look at test logs and see which containers/foo versions got picked up in the built binary?

Anyhow, gotta go. Thoughts welcome on this approach.

rhatdan · 2022-09-23T11:58:25Z

I would figure the validate tests would fail on this with unvendored content?

edsantiago · 2022-09-23T12:00:05Z

That's what the git add vendor is for.

rhatdan · 2022-09-23T12:44:33Z

We need to commit a breaking change to make sure this fails. :^)

edsantiago · 2022-09-24T18:58:51Z

Re-pushed on 40e8bcb, passed again. (With two flakes, both of them nasty ones, but I'll follow up on Monday).

edsantiago · 2022-09-26T15:57:57Z

@containers/podman-maintainers hi! I re-pushed during scrum, and, lo, it's failing:

[+0008s]   >$ ./bin/podman pull -q quay.io/libpod/alpine_labels:latest
[+0008s]   >Error: writing blob: adding layer with blob "sha256:59bf1c3509f33515622619af21ed55bbe26d24913cedbca106468a5fb37a50c3": processing tar file(time="2022-09-26T10:55:31-05:00" level=error msg="stat /home/some9968dude/.config/containers/storage.conf: no such file or directory"

also

[+0045s] #  ERROR: for nginx  Cannot create container for service nginx: no such image: docker.io/library/alpine: image not known

Vendor diffs here.

I restarted each one, failed the same way. Does not seem to be a flake.

edsantiago · 2022-09-26T15:58:53Z

Another failure, in int tests now:

         $ podman [options] pull quay.io/libpod/cirros:latest
         time="2022-09-26T10:56:57-05:00" level=error msg="stat containers/storage.conf: no such file or directory"

edsantiago · 2022-09-26T16:40:41Z

culprit (for at least one of the failures, haven't checked the others) seems to be containers/storage#1357

rhatdan · 2022-09-26T17:51:29Z

Do the tests have XDG_CONFIG_HOME set and no storage.conf defined?

edsantiago · 2022-09-26T18:32:01Z

I'm guessing you don't need an answer, because you submitted containers/storage#1363 11 minutes after this question, but: using view-source on one result shows $XDG_RUNTIME_DIR set but no $XDG_CONFIG_ANYTHING.

rhatdan · 2022-09-26T18:59:03Z

Weird that XDG_CONFIG_HOME is now showing it is set. Will see if updated PR to storage fixes the problem.
I guess your nightly treadmill is already paying off.

edsantiago · 2022-09-26T19:01:50Z

And it's not even nightly yet!

P.S. does this help?

podman/libpod/runtime.go

Lines 123 to 163 in aaeabb0

    
           // SetXdgDirs ensures the XDG_RUNTIME_DIR env and XDG_CONFIG_HOME variables are set. 
        
           // containers/image uses XDG_RUNTIME_DIR to locate the auth file, XDG_CONFIG_HOME is 
        
           // use for the containers.conf configuration file. 
        
           func SetXdgDirs() error { 
        
           	if !rootless.IsRootless() { 
        
           		return nil 
        
           	} 
        
           	// Set up XDG_RUNTIME_DIR 
        
           	runtimeDir := os.Getenv("XDG_RUNTIME_DIR") 
        
           	if runtimeDir == "" { 
        
           		var err error 
        
           		runtimeDir, err = util.GetRuntimeDir() 
        
           		if err != nil { 
        
           			return err 
        
           		} 
        
           	} 
        
           	if err := os.Setenv("XDG_RUNTIME_DIR", runtimeDir); err != nil { 
        
           		return fmt.Errorf("cannot set XDG_RUNTIME_DIR: %w", err) 
        
           	} 
        
           	if rootless.IsRootless() && os.Getenv("DBUS_SESSION_BUS_ADDRESS") == "" { 
        
           		sessionAddr := filepath.Join(runtimeDir, "bus") 
        
           		if _, err := os.Stat(sessionAddr); err == nil { 
        
           			os.Setenv("DBUS_SESSION_BUS_ADDRESS", fmt.Sprintf("unix:path=%s", sessionAddr)) 
        
           		} 
        
           	} 
        
           	// Set up XDG_CONFIG_HOME 
        
           	if cfgHomeDir := os.Getenv("XDG_CONFIG_HOME"); cfgHomeDir == "" { 
        
           		cfgHomeDir, err := util.GetRootlessConfigHomeDir() 
        
           		if err != nil { 
        
           			return err 
        
           		} 
        
           		if err := os.Setenv("XDG_CONFIG_HOME", cfgHomeDir); err != nil { 
        
           			return fmt.Errorf("cannot set XDG_CONFIG_HOME: %w", err) 
        
           		} 
        
           	} 
        
           	return nil 
        
           }

rhatdan · 2022-09-26T19:25:53Z

So it is self inflicted.

edsantiago · 2022-09-26T19:35:36Z

ISTR a very very good reason for adding that. tig blame dates it to #3466, August 2019.

nalind · 2022-09-26T20:03:05Z

Do the tests have XDG_CONFIG_HOME set and no storage.conf defined?

time="2022-09-26T10:56:57-05:00" level=error msg="stat containers/storage.conf: no such file or directory"
suggests that the path name passed to stat() was just "containers/storage.conf", as though XDG_CONFIG_HOME (and the path that was combined with "containers" and "storage.conf") was an empty value.

edsantiago · 2022-09-27T20:07:31Z

This is ready for review, and to merge as soon as CI passes (which it will, because the $CIRRUS_CRON conditional will not trigger). I have NOT added a Cirrus cron job, because I don't seem to have enough Cirrus privs.

Luap99 · 2022-09-27T20:29:29Z

.cirrus.yml

+          # Big Three dependencies, and run full CI test suite. Notification
+          # email will go out to monitor-list upon failure.
+          if [[ "$CIRRUS_CRON" = "treadmill" ]]; then
+              for pkg in common image/v5 storage; do


Why is buildah excluded? This will fail if common makes a breaking change that also affects buildah.
IMO we need to vendor everything at once, excluding buildah will cause even more troubles.

buildah is impossible to vendor automatically. Seriously, absolutely impossible. That's the whole drama behind #13808.

However... does this address your concern?

Not really, this thing here will just start to fail we break something in c/common. Until common is declared stable this job will fail regularly.

Is there any reason to not just vendor everything in your treadmill PR vs this automated job?

Is there any reason to not just vendor everything in your treadmill PR vs this automated job?

My sanity. Buildah treadmill is already a pretty big time suck (some days). I'm reluctant to spend more time on this sort of integration work.

rhatdan · 2022-09-29T14:49:01Z

LGTM

edsantiago · 2022-09-29T14:53:00Z

There's one problem with this, that I haven't been able to figure out: immediately after a vendor-everything-into-podman PR, this nightly cron job is going to fail, because I (deliberately) did not add --allow-empty to the git commit. I'm now reevaluating that decision:

As it is, the nightly cron job will fail, and will force me to look at it and see "oh, no problem, it's just because nothing has changed".
With --allow-empty, the check will pass, and will run the rest of CI even though it's a waste of cycles.

I was trying to save the planet. Now I'm thinking it's more important to save human time. Thoughts?

vrothberg · 2022-09-29T14:57:07Z

We're doomed, one way or another. So I think you should have a good time and not be bothered by false positives.

rhatdan · 2022-09-29T15:06:54Z

Agreed. Humans should be primary.

edsantiago · 2022-09-29T18:20:42Z

So done. Rebased & repushed with --allow-empty.

vrothberg

/lgtm

As discussed in f2f: this is the cleanest, simplest mechanism I can think of to auto-test the Big Three dependencies: simply run go mod edit immediately after git checkout, then run the entire CI test suite. This differs significantly from the buildah treadmill, in that buildah is almost impossible to re-vendor without manual intervention. (In practice, so are these, but let's dream of a world in which this will run and pass every night). (I want a pony too). Signed-off-by: Ed Santiago <santiago@redhat.com>

edsantiago · 2022-10-04T15:30:06Z

This has been working well in buildah. Apart from, sigh, flakes.

baude · 2022-10-06T02:23:57Z

/lgtm

openshift-ci bot added the release-note-none label Sep 22, 2022

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 22, 2022

edsantiago force-pushed the cron_treadmill branch 4 times, most recently from c4d1b2b to da331be Compare September 22, 2022 19:53

edsantiago force-pushed the cron_treadmill branch from da331be to cf8ffd8 Compare September 24, 2022 11:44

edsantiago force-pushed the cron_treadmill branch from cf8ffd8 to 28b45eb Compare September 26, 2022 15:34

edsantiago mentioned this pull request Sep 27, 2022

Use system defaults if storage.conf does not exist in XDG_CONFIG_HOME containers/storage#1363

Merged

edsantiago force-pushed the cron_treadmill branch 2 times, most recently from 33a2137 to 9ece902 Compare September 27, 2022 19:31

Luap99 reviewed Sep 27, 2022

View reviewed changes

edsantiago force-pushed the cron_treadmill branch from 9ece902 to c370fce Compare September 29, 2022 15:31

vrothberg reviewed Sep 30, 2022

View reviewed changes

openshift-ci bot assigned vrothberg Sep 30, 2022

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 30, 2022

edsantiago force-pushed the cron_treadmill branch from c370fce to 1e71d12 Compare September 30, 2022 11:30

openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Sep 30, 2022

openshift-ci bot assigned baude Oct 6, 2022

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 6, 2022

openshift-merge-robot merged commit 94cfe7b into containers:main Oct 6, 2022

edsantiago deleted the cron_treadmill branch October 6, 2022 11:36

edsantiago mentioned this pull request Oct 6, 2022

sdnotify play kube policies: podman container wait, hangs #16076

Closed

github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 20, 2023

github-actions bot locked as resolved and limited conversation to collaborators Sep 20, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proof of concept: nightly dependency treadmill #15910

Proof of concept: nightly dependency treadmill #15910

edsantiago commented Sep 22, 2022

openshift-ci bot commented Sep 22, 2022

edsantiago commented Sep 22, 2022

rhatdan commented Sep 23, 2022

edsantiago commented Sep 23, 2022

rhatdan commented Sep 23, 2022

edsantiago commented Sep 24, 2022

edsantiago commented Sep 26, 2022

edsantiago commented Sep 26, 2022

edsantiago commented Sep 26, 2022

rhatdan commented Sep 26, 2022

edsantiago commented Sep 26, 2022

rhatdan commented Sep 26, 2022

edsantiago commented Sep 26, 2022

rhatdan commented Sep 26, 2022

edsantiago commented Sep 26, 2022

nalind commented Sep 26, 2022

edsantiago commented Sep 27, 2022

Luap99 Sep 27, 2022

edsantiago Sep 27, 2022

Luap99 Sep 27, 2022

edsantiago Sep 27, 2022

rhatdan commented Sep 29, 2022

edsantiago commented Sep 29, 2022

vrothberg commented Sep 29, 2022

rhatdan commented Sep 29, 2022

edsantiago commented Sep 29, 2022

vrothberg left a comment

edsantiago commented Oct 4, 2022

baude commented Oct 6, 2022

Proof of concept: nightly dependency treadmill #15910

Proof of concept: nightly dependency treadmill #15910

Conversation

edsantiago commented Sep 22, 2022

openshift-ci bot commented Sep 22, 2022

edsantiago commented Sep 22, 2022

rhatdan commented Sep 23, 2022

edsantiago commented Sep 23, 2022

rhatdan commented Sep 23, 2022

edsantiago commented Sep 24, 2022

edsantiago commented Sep 26, 2022

edsantiago commented Sep 26, 2022

edsantiago commented Sep 26, 2022

rhatdan commented Sep 26, 2022

edsantiago commented Sep 26, 2022

rhatdan commented Sep 26, 2022

edsantiago commented Sep 26, 2022

rhatdan commented Sep 26, 2022

edsantiago commented Sep 26, 2022

nalind commented Sep 26, 2022

edsantiago commented Sep 27, 2022

Luap99 Sep 27, 2022

Choose a reason for hiding this comment

edsantiago Sep 27, 2022

Choose a reason for hiding this comment

Luap99 Sep 27, 2022

Choose a reason for hiding this comment

edsantiago Sep 27, 2022

Choose a reason for hiding this comment

rhatdan commented Sep 29, 2022

edsantiago commented Sep 29, 2022

vrothberg commented Sep 29, 2022

rhatdan commented Sep 29, 2022

edsantiago commented Sep 29, 2022

vrothberg left a comment

Choose a reason for hiding this comment

edsantiago commented Oct 4, 2022

baude commented Oct 6, 2022