Dynamic upstreamStages #1967

wmiller112 · 2024-05-07T03:04:17Z

Checklist

I've searched the issue queue to verify this is not a duplicate feature request.
I've pasted the output of kargo version, if applicable.
I've pasted logs, if applicable.

Proposed Feature

Ability to more dynamically specify upstream stage subscriptions.

Motivation

I'm trying to add a stage that subscribes to multiple upstream stages (a stage without a promotionMechanism). The upstream stages use the argoCDAppUpdate promotionMechanism, and the associated ArgoCD Applications are generated by an ApplicationSet. These ApplicationSets use the git file/directory generators, but I think this would apply to any ApplicationSet generator, as the resulting Application names are not known ahead of time. The stage associated with each Application can easily be generated via ApplicationSet, but the consolidating stage doesn't appear to have any way to dynamically select upstreamStages.

Suggested Implementation

I think there are a lot of options here. Maybe the easiest would be something like a regex or support for wildcards in either upstreamStages or a new field like upstreamStageRegex. Since the stages can be applied via ApplicationSet as well, the naming could be configured to be consistent. Another option could be some kind of upstreamStageSelectionTag, plus the ability to tag stages.
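To make the two options concrete, here is a minimal Python sketch (not Kargo code) of how a wildcard or regex field and a tag-based selector might each resolve to a list of upstream stage names. Everything in it is an assumption for illustration: the `Stage` dataclass, its `labels` field, and both helper functions are hypothetical and do not exist in Kargo.

```python
import fnmatch
import re
from dataclasses import dataclass, field


@dataclass
class Stage:
    """Hypothetical stand-in for a Kargo Stage: just a name and labels."""
    name: str
    labels: dict = field(default_factory=dict)


def select_by_pattern(stages, pattern, use_regex=False):
    """Return sorted names of stages whose name matches a wildcard or regex."""
    if use_regex:
        compiled = re.compile(pattern)
        return sorted(s.name for s in stages if compiled.fullmatch(s.name))
    return sorted(s.name for s in stages if fnmatch.fnmatchcase(s.name, pattern))


def select_by_labels(stages, selector):
    """Return sorted names of stages carrying every key/value in selector."""
    return sorted(
        s.name
        for s in stages
        if all(s.labels.get(key) == value for key, value in selector.items())
    )


# Illustrative data only; these stage names and labels are made up.
stages = [
    Stage("cluster-east-1", {"tier": "cluster"}),
    Stage("cluster-west-1", {"tier": "cluster"}),
    Stage("uat", {"tier": "shared"}),
]

print(select_by_pattern(stages, "cluster-*"))
print(select_by_pattern(stages, r"cluster-(east|west)-\d+", use_regex=True))
print(select_by_labels(stages, {"tier": "cluster"}))
```

With consistent naming, all three calls select the same two cluster stages. The tag-based variant does not depend on a naming convention at all.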


krancour · 2024-05-07T04:18:28Z

Keep in mind that Stages and Applications are different things. Stages without Applications are a thing. Stages with multiple Applications are a thing. Is it possible to reframe this question of more dynamic linkage between Stages without involving Applications?

I foresee a number of possible difficulties, but want to start by simply getting a better handle on the request.

wmiller112 · 2024-05-07T18:19:37Z

Definitely understand the distinction, and that a stage can have zero, one, or many Applications, and that the project is working towards making ArgoCD optional. One example that comes to mind would be stages based on a service deployed across different clusters, where the clusters and stages are generated by terraform/crossplane/etc and can be added/removed at any time. If one wanted a stage that depended on any of those cluster-based stages succeeding (or potentially all of them eventually) before promoting to the downstream stage, they'd need some way to generate that list.
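The "any" versus "all" distinction can be sketched as a small gate function. This is a hypothetical illustration in Python, not how Kargo evaluates upstream stages; the function name, the `policy` parameter, and the shape of `results` are all assumptions.

```python
def upstream_gate_open(results, policy="any"):
    """Decide whether a downstream stage may proceed.

    results: dict mapping upstream stage name -> True if it succeeded.
    policy:  "any" (one success is enough) or "all" (every stage must succeed).
    """
    if not results:
        # With no upstream stages known, stay closed. Note that all([]) is
        # True in Python, so this guard matters for the "all" policy.
        return False
    outcomes = list(results.values())
    if policy == "any":
        return any(outcomes)
    if policy == "all":
        return all(outcomes)
    raise ValueError(f"unknown policy: {policy!r}")


# Illustrative data only; these stage names are made up.
results = {"cluster-east-1": True, "cluster-west-1": False}
print(upstream_gate_open(results, policy="any"))
print(upstream_gate_open(results, policy="all"))
```

Either policy still needs the current set of upstream stage names as input, which is the list this request is about generating.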

In the case of ArgoCD applications, as you pointed out, a single stage can reference multiple argocdApps to refresh/sync. This presents a similar issue though as the Application names must be known ahead of time.

krancour · 2024-05-13T23:43:05Z

@wmiller112 a lot of that makes sense to me. This is sort of an epic, so let's break it into smaller parts.

> where the clusters and stages are generated by terraform/crossplane/etc and can be added/removed at any time

The notion of a "StageSet" is one we've been entertaining for quite some time. See #339. My current take on it is that we probably need to achieve a greater level of maturity before implementing it. I think this is probably a > 1.0.0 feature. (Note that Argo CD introduced ApplicationSet as part of the standard install only in 2.3.)

Whenever we do get around to that, I expect that we'd take a page from ApplicationSet's playbook and use pluggable generators as a way of dynamically creating/removing Stages owned/managed by a given StageSet.

> This presents a similar issue though as the Application names must be known ahead of time.

I would expect various generators to have a way either for you to specify Argo CD App names or else derive them from each Stage's name.

> If one wanted a stage that depended on any of those cluster based stages succeeding (or potentially all of them eventually) before promoting to the downstream stage, they'd need some way to generate that list.

I see you already stumbled on #1168 😄

> but the consolidating stage doesn't appear to have any way to dynamically select upstreamStages

Right. I suppose that's something to follow up on after StageSets are working. Even without this feature, StageSets would be useful because they would enable a less repetitious way of defining many Stages which often have only small and predictable differences from one to the next. But the whole notion of dynamically adding and removing Stages is certainly more powerful if we also have the ability to dynamically add/remove subscriptions to those Stages.

> Maybe the easiest would be something like a regex or support for wildcards in either the upstreamStages or new field like upstreamStageRegex.

Don't want to get too mired in implementation details just yet since there are obviously a lot of other things that have to fall into place before we have to worry about this, but regexes or wildcards are possibly not as simple as they appear on the surface because there isn't an efficient way to query Kubernetes for resources with names matching a pattern.
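For context on that last point: Kubernetes list requests accept label selectors and field selectors, and neither matches name patterns, so a pattern on names means listing everything and filtering on the client. The Python sketch below simulates that difference with an in-memory stand-in. `FakeStageStore` and both helper functions are hypothetical and call no real Kubernetes or Kargo API.

```python
import re


class FakeStageStore:
    """Hypothetical in-memory stand-in for an API server holding Stages."""

    def __init__(self, stages):
        self._stages = stages  # list of (name, labels) tuples

    def list(self, label_selector=None):
        """Mimic a list call: optional equality-based label filtering only."""
        if label_selector is None:
            return list(self._stages)
        return [
            (name, labels)
            for name, labels in self._stages
            if all(labels.get(k) == v for k, v in label_selector.items())
        ]


def names_matching(store, pattern):
    """Name patterns: fetch every Stage, then filter on the client."""
    everything = store.list()
    regex = re.compile(pattern)
    matched = [name for name, _ in everything if regex.fullmatch(name)]
    return matched, len(everything)


def names_labeled(store, selector):
    """Labels: the store does the filtering, so only matches come back."""
    subset = store.list(label_selector=selector)
    return [name for name, _ in subset], len(subset)


# Illustrative data only; these stage names and labels are made up.
store = FakeStageStore([
    ("cluster-a", {"tier": "cluster"}),
    ("cluster-b", {"tier": "cluster"}),
    ("smoke", {}),
    ("perf", {}),
    ("uat", {}),
])

print(names_matching(store, r"cluster-.*"))
print(names_labeled(store, {"tier": "cluster"}))
```

Both calls find the same two Stages, but the pattern-based one has to retrieve all five objects first, while the label-based one retrieves only the two that match.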

I do have one follow-up question for you, which has to do more with the specifics of your use case. The notion of dynamically adding and removing Stages strikes me as treating Stages more like "cattle" than "pets." While promoting cattle over pets is often a laudable goal, I am trying to better understand what the purpose of that would be where Stages are concerned. Note that Kargo deals with "Stages" and not "Environments," and this was a deliberate choice so that the nodes in your pipeline are aligned with each (little "a") application instance's purpose in life (e.g. for smoke tests, performance tests, UAT tests, production, etc.) as opposed to its location (e.g. a particular cluster, zone, region, etc.). Given this, it feels natural to me to say things like "move on to UAT after passing smoke tests and performance tests," but the benefit of something like "move on to UAT after success in this growing and shrinking pool of upstream Stages" isn't quite as obvious. I'd be curious to know more about how you'd be leveraging that capability.

wmiller112 · 2024-05-20T18:09:38Z

Awesome to see the concept of StageSet already having been discussed. I was considering suggesting something similar, but definitely understand that it's something that will require more maturity, so figured it was far out of scope for the time being.

We run a number of applications, each with the same Docker image and a different entrypoint. All apps are deployed to several clusters in a given environment. These are all deployed via ApplicationSets:

- A top-level ApplicationSet with a git generator looks at a git repo where infra automation (TF in this case) stores cluster details, allowing an Application per cluster.
- The next level looks at a git repo where we define kustomize directories with patches for each application, creating an Application for each.

This allows both clusters and applications to be created or removed at any time, and every cluster in the env to run all required apps.

Up to this point, we've used the traditional mess of bash to track AnalysisRun results and overall Application health before allowing all services across all clusters in an env to promote to full. We do not promote from staging to prod, but rather deploy in parallel to staging and prod, unless a dev wants to use staging to test something specifically, out of band, before deploying to prod.

So our pipelines are currently very basic, with a single Stage whose promotionMechanism.ArgoCDAppUpdates lists all Applications in the env. I've managed to generate this list dynamically using ArgoCD plugins for now: specifically, the ArgoCD Lovely plugin with a preprocessor that generates a new kustomize patch based on those cluster files and application folders in git. That patch is then applied to the Stage.
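As a rough illustration of the idea (not the actual preprocessor, which isn't shown here), the list is essentially a cross product of clusters and applications. In this Python sketch the `{app}-{cluster}` naming template, the function names, and the `appName` key are all assumptions made for the example.

```python
from itertools import product


def app_names(clusters, apps, template="{app}-{cluster}"):
    """Build one Application name per (cluster, app) pair.

    The naming template is hypothetical; a real setup would use whatever
    convention its ApplicationSets produce.
    """
    return [
        template.format(app=app, cluster=cluster)
        for cluster, app in product(sorted(clusters), sorted(apps))
    ]


def app_update_entries(clusters, apps):
    """Shape the names as a list of entries suitable for a patch.

    The "appName" key is an assumption about the entry schema.
    """
    return [{"appName": name} for name in app_names(clusters, apps)]


# Illustrative data only; in practice these would be read from the
# cluster files and application folders in git.
clusters = ["west", "east"]
apps = ["worker", "api"]

for entry in app_update_entries(clusters, apps):
    print(entry)
```

Sorting both inputs keeps the generated list stable between runs, so adding or removing a cluster produces a small, predictable diff in the patch.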

I don't believe our current topology is particularly suited for the intended use of a 'Stage'. I've also run into issues since we really only care about the state of git, but want to subscribe to an image repo to update git (this issue). That said, we do intend to move in the direction of the Stage concept within the envs as we start to break apart production, whether regionally, by compliance requirements, or otherwise, so it will make sense to move in the long term. In the short term, the use case is replacing bash scripts and tracking when all Applications across each env are successfully rolled out, as well as providing a GUI for devs to roll back from.

wmiller112 added the kind/feature-request label May 7, 2024

github-actions bot added needs/priority needs/area labels May 7, 2024

krancour added kind/discussion and removed kind/feature-request labels May 13, 2024

krancour self-assigned this May 13, 2024
