Moving forward with v5 of EKS Blueprints #1421

Closed
fcarta29 opened this issue Feb 8, 2023 · 33 comments
Labels
enhancement New feature or request refactor

@fcarta29
Contributor

fcarta29 commented Feb 8, 2023

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

This issue is being created to provide community notice and to help track the work and cutover to v5 of EKS Blueprints. More details of the changes can be found here:
Direction for v5 of EKS Blueprints
Motivation for v5 of EKS Blueprints

@fcarta29 fcarta29 added this to the v5.0 milestone Feb 8, 2023
@fcarta29 fcarta29 added enhancement New feature or request refactor labels Feb 8, 2023
@fcarta29 fcarta29 added this to Prioritized in Old EKS Blueprints (Archived) Feb 8, 2023
@fcarta29 fcarta29 moved this from Prioritized to Working on it in Old EKS Blueprints (Archived) Feb 8, 2023
@Jumziey

Jumziey commented Feb 14, 2023

Do you have any ambitions when it comes to migrating from v4 to v5? It seems it will get complicated given the goals 🤔

That said, great initiative! We've been bothered by the same issues you describe and it seems like the correct way forward.

Edit: Did not get any direct answer but we can see an UPGRADE-5.0.md committed into the repository now so it would seem it's something they are thinking about quite a lot.

@kcoleman731 kcoleman731 pinned this issue Feb 14, 2023
@mk2134226

Any idea how long it will take for v5 to be production ready, and how long you will maintain, patch, and update v4?

@fcarta29
Contributor Author

fcarta29 commented Feb 16, 2023

FAQ for v5

When will v5 be GA and ready for production?
EKS Blueprints are community driven examples of how to build on AWS EKS. Only AWS services like AWS EKS have formal release dates, are officially supported, and are certified as production ready.

How long will v4 be maintained and updated?
Once v5 reaches development/refactoring readiness and testing is complete, notice will be given to the community via this issue and in the top-level README.md regarding when the crossover from v4 to v5 will be merged. At that point v4 will be tagged/branched and left for those who wish to remain on v4 and for historical context. No further development or updates will occur for v4, and all future changes will occur only on the main v5 version.

What examples are being moved and where are they going?

@Hokwang
Contributor

Hokwang commented Feb 20, 2023

@askulkarni2
Contributor

@Hokwang for the moment we will keep using the existing IRSA module in the new addons repo.

@ManuelMueller1st

This should’ve been communicated more clearly. I just have set up a new cluster using your control plane module.

@bryantbiggs
Contributor

bryantbiggs commented Feb 28, 2023

This should’ve been communicated more clearly. I just have set up a new cluster using your control plane module.

@ManuelMueller1st Do you have suggestions on what that might have looked like - how we could have communicated the changes better?

@Hokwang
Contributor

Hokwang commented Mar 7, 2023

@bryantbiggs do you have a target date for v5? A month or quarter?

@ManuelMueller1st

This should’ve been communicated more clearly. I just have set up a new cluster using your control plane module.

@ManuelMueller1st Do you have suggestions on what that might have looked like - how we could have communicated the changes better?

A bullet point of major changes to the project on the first README page. Potential Breaking Changes could mean anything. I realized that the cluster module was discontinued after reading the v5 article.

@fcarta29
Contributor Author

fcarta29 commented Jun 8, 2023

dmonagha

Thanks!

Updated link to Motivation for v5 of EKS Blueprints

@fcarta29 fcarta29 closed this as completed Jun 8, 2023
@fcarta29 fcarta29 reopened this Jun 8, 2023
@atheiman

@bryantbiggs Do you know if the recommended migration path will require cluster rebuilds? My company is currently using v4 to build out a new managed K8s platform. We aren't running production workloads yet, so if there isn't a way to migrate to v5 without downtime we would prefer to bite the bullet and do rebuilds now by calling the terraform-aws-eks modules directly.

No - you should be able to modify the Terraform state to move from v4 to v5 and maintain the current control plane without a recreation

@bryantbiggs I think a markdown doc should be created walking through the migration of a basic v4 cluster with a couple of addons to v5. This is a very popular tool for deploying EKS, and a lot of customers will need this information. I suspect these Terraform state modifications will be quite complex.
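
One way to check the "without a recreation" claim after any state changes is to look at the summary line of `terraform plan` before applying. The sketch below is only an illustration, not part of the official migration guide: the helper name `plan_destroy_count` is hypothetical, and it assumes the usual `Plan: N to add, N to change, N to destroy.` summary format. A replacement shows up as an add plus a destroy, so a non-zero destroy count is a signal to stop and review.

```shell
# plan_destroy_count (hypothetical helper): reads `terraform plan -no-color`
# output on stdin and prints how many resources the plan would destroy.
# Prints 0 when no "Plan: ..." summary line is present (e.g. "No changes.").
plan_destroy_count() {
  local n
  n="$(sed -n 's/.*Plan:.* \([0-9][0-9]*\) to destroy\..*/\1/p' | tail -n 1)"
  printf '%s\n' "${n:-0}"
}

# Example use (assumes terraform is installed and initialised):
#   terraform plan -no-color | plan_destroy_count
```

A count of 0 does not prove the plan is safe; it only rules out destroys and replacements, so the full plan output still needs to be read.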

@bryantbiggs
Contributor

https://aws-ia.github.io/terraform-aws-eks-blueprints/main/v4-to-v5/cluster/

@atheiman

excellent! thanks!

@bryantbiggs
Contributor

The team is still working on the guidance for migrating addons and teams which is why those doc pages are currently empty

@taliesins

Starting assumption: most infrastructure people want to abstract the underlying cloud from resources running inside the cluster. Security people want the scope of Kubernetes permissions to be a subset of cloud permissions, and want to reduce access to privileged accounts at run time. This means they want to orchestrate the creation of AWS resources outside of the cluster and not make use of something like ACK (AWS Controllers for Kubernetes). So, this generally leads to creating resources in a pipeline using something like Terraform that uses an elevated user at install time.

With these assumptions, I don't think that the current motivation clearly outlines the following:

  • An add-on's value is to create the prerequisite AWS resources required by a Helm chart using Terraform, and then to pass references to those resources and other AWS-specific configuration further down the chain
  • Add-ons should probably aggregate most of their configuration and ideally batch it into 2 or 3 waves using (app-of-apps) umbrella Helm charts that create native Kubernetes CRD resources such as an ArgoCD application. I think the mistake is that most people currently focus on add-ons as defining which Helm chart to use and how it is deployed. Ideally we would take this information out of the add-on concept. The only drawback I can think of is not knowing which Helm chart configuration schema (values.yaml file) was used when forwarding the configuration downstream (app-of-apps -> wrapper helm chart -> helm chart). Individual interactions with Kubernetes via Terraform resource providers should be avoided as much as possible! Writing your own kube-wait-for-resource / kube-wait-for-resource-to-be-gone Terraform provider doesn't work either. Even inside Kubernetes, multiple waves might be required because of buggy or badly written resources that downstream resources depend on, e.g. CSI drivers or the order in which CRDs are deleted.
  • Blueprints should just be an example of how to organise a group of add-ons, not a generic, feature-togglable list of all add-ons. At the moment the motivation conveys this too subtly for most users to pick up, and we will see other users spawn GitHub projects that offer this same aggregated Terraform module

add-ons

@frank-bee

frank-bee commented Jun 27, 2023

https://aws-ia.github.io/terraform-aws-eks-blueprints/main/v4-to-v5/cluster/

@bryantbiggs Is this the migration guide?

I started migrating the addons from the v4 repo to the new one. (The EKS cluster is already set up the new way, using the eks module.)
Is there a way to migrate the addons one by one via state migration?
I have to keep the old blueprints module because, e.g., Argo CD won't be supported in v5.

One add-on (aws_load_balancer_controller) I have already migrated by first deleting it from the old module and then, in a second TF apply, adding it in the new module:

// deprecated
module "eks_blueprints_kubernetes_addons" {
  source = "github.com/aws-ia/terraform-aws-eks-blueprints//modules/kubernetes-addons?ref=v4.32.1"
...

  enable_argocd = var.argocd_enabled
  //argo config
  ...
 
  // some addons I want to migrate to v5 repo , e.g.
  enable_amazon_eks_aws_ebs_csi_driver = true
  enable_aws_for_fluentbit                 = var.enable_aws_for_fluentbit
  aws_for_fluentbit_cw_log_group_retention = var.aws_for_fluentbit_cw_log_group_retention
  enable_external_dns            = true
  external_dns_route53_zone_arns = var.dns_extra_zones
  external_dns_helm_config = {
    values = [jsonencode(yamldecode(<<-EOT
      txtOwnerId: ${local.name}
      zoneIdFilters: ${local.zoneIdFilters}
      policy: 'sync'
      aws:
        zoneType: 'public'
        zonesCacheDuration: '1h'
      logLevel: 'debug'
      EOT
    ))]
  }
...
  
}

//new addons
module "eks_blueprints_addons" {
  source  = "aws-ia/eks-blueprints-addons/aws"
  version = "~> 1.0"
...

  enable_aws_load_balancer_controller      = true
  aws_load_balancer_controller = {
    create_namespace = true
    namespace        = "lb-controller"
    values = [jsonencode(yamldecode(<<-EOT
      clusterName: ${local.name}
      EOT
    ))]
  }
}

@nmindz

nmindz commented Jul 5, 2023

This should’ve been communicated more clearly. I just have set up a new cluster using your control plane module.

@ManuelMueller1st Do you have suggestions on what that might have looked like - how we could have communicated the changes better?

At least a NOTICE at the top of the README with a link to the issue/etc regarding the new v5.0 should have been added. A nice "things to consider before you go ahead using things as they are".

Adding a notice to the release tag was good, but not enough as one may simply follow the README instructions and use the module without ever visiting the releases page.

I also happened to have just implemented a new EKS model repository that depended on this module.

@bryantbiggs
Contributor

@nmindz we had a notice on the main README for several months

It was only recently removed when we unveiled the changes for v5

@nmindz

nmindz commented Jul 5, 2023

I also began working with that module back in February, but I didn't realize the README had changed since then. Just today I was trying to figure out the options for the AWS Load Balancer plugin, since it was not deployed by default in our internal template.

I've been getting mixed search results (such as the missing aws_load_balancer_controller_helm_config option in the ~> 1.0 version, which I assume is v5), including guides/references that still point to that old option.

When I visited the v4 branch just now I realized the notice was there, but has recently been removed from the main branch, which was the one I checked before commenting, so first and foremost please excuse my lack of attention when replying today.

I know the addons/plugins guides are still WIP, but are there any pointers/common naming/config schemes already defined for v5 or are they still subject to change? (see snippets below)

I got some indications for the external_secrets deployment from another issue if I recall correctly, and made it so I could run it on Fargate:

  enable_external_secrets = true
  external_secrets = {
    namespace = "external-secrets",
    values = [yamlencode({
      "webhook" : { "port" = "9443" },
      "tolerations" : [{ "key" : "eks.amazonaws.com/compute-type", "operator" : "Equal", "value" : "fargate", "effect" : "NoSchedule" }]
    })]
  }

And re-reading @frank-bee's comment I also realized he had a similar config for AWS Load Balancer:

  enable_aws_load_balancer_controller = true
  aws_load_balancer_controller = {
    create_namespace = true
    namespace        = "lb-controller"
    values = [jsonencode(yamldecode(<<-EOT
      clusterName: ${local.name}
      EOT
    ))]
  }

Also, I'm not sure I understood this correctly, but according to @fcarta29's FAQ comment, only examples are being moved away from this repository. Is that correct?
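
The snippets in this thread suggest a naming pattern: v4 arguments such as `external_dns_helm_config` and `aws_load_balancer_controller_helm_config` versus bare names such as `aws_load_balancer_controller` and `external_secrets` in the `~> 1.0` addons module. As a rough aid for finding the v4-style arguments in an existing configuration, a small filter like the one below could be used. The function name `list_helm_config_keys` is hypothetical, and the right-hand name it prints is only a candidate to check against the new module's documentation, since not every addon necessarily follows this pattern.

```shell
# list_helm_config_keys (hypothetical helper): reads Terraform source on stdin
# and, for every `<addon>_helm_config = ...` argument, prints the argument name
# and the bare `<addon>` name as a candidate for the new module.
list_helm_config_keys() {
  sed -n 's/^[[:space:]]*\([a-z0-9_]*\)_helm_config[[:space:]]*=.*/\1_helm_config -> \1/p'
}

# Example use:
#   cat *.tf | list_helm_config_keys
```

The output is a checklist for manual review, not a rename script; the structure of each argument's value may also differ between the two modules.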

@bryantbiggs
Contributor

in v5, the Terraform modules are removed from this repository and only examples/blueprints will remain. You can see the new home of addons here https://github.com/aws-ia/terraform-aws-eks-blueprints-addons which has its own documentation https://aws-ia.github.io/terraform-aws-eks-blueprints-addons/main/

The new addons are 1.x and stable for use today

@techdragon

techdragon commented Jul 26, 2023

It would be good to get some examples of migration scripts. While the full paths of the resources and data in the Terraform state will obviously differ a lot between users' individual deployments, there will be significant similarity in the paths inside the modules that we need to migrate from v4 to v5 with the new external addons/modules.

So ideally we should get documentation that outlines what we need to run to migrate the Terraform state. Otherwise everyone will have to work out individually how to perform the dozens of commands like the ones below, and a lot of duplicate work could be saved if this were documented as part of the migration to v5.

terraform state mv '"${v4_module_name}"/${example_eks_blueprints_v4_component}/example_resource_one' '"${v5_module_name}"/${example_eks_blueprints_v5_separated_component}/example_resource_one'
terraform state mv '"${v4_module_name}"/${example_eks_blueprints_v4_component}/example_resource_two' '"${v5_module_name}"/${example_eks_blueprints_v5_separated_component}/example_resource_two'
terraform state mv '"${v4_module_name}"/${example_eks_blueprints_v4_component}/example_resource_three' '"${v5_module_name}"/${example_eks_blueprints_v5_separated_component}/example_resource_three'
terraform state mv '"${v4_module_name}"/${example_eks_blueprints_v4_component}/example_submodule/example_subresource_one' '"${v5_module_name}"/${example_eks_blueprints_v5_separated_component}/example_submodule_with_a_new_name/example_subresource_one'

The v4 and v5 structures are known and it should be possible to provide some level of assistance to the many users who are currently stuck trying to build a huge list of terraform state mv commands and implement migration plans for the many subcomponents.
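
As a sketch of how such a list could be generated rather than typed by hand: real Terraform resource addresses are dot-separated (for example `module.a.module.b.aws_iam_role.this`), so where a component keeps the same internal layout and only its parent module path changes, the commands can be derived from `terraform state list`. The helper name `gen_state_mv` and the module paths in the usage comment are hypothetical, not taken from the v4 or v5 modules. The function only prints commands for review and runs nothing; renamed submodules or resources would still need individual mappings.

```shell
# gen_state_mv OLD_PREFIX NEW_PREFIX (hypothetical helper)
# Reads resource addresses on stdin (one per line, e.g. from
# `terraform state list`) and prints a `terraform state mv` command for each
# address that sits under OLD_PREFIX, with the prefix swapped for NEW_PREFIX.
# Assumes addresses contain no single quotes.
gen_state_mv() {
  local old_prefix="$1" new_prefix="$2" addr
  while IFS= read -r addr; do
    case "$addr" in
      "$old_prefix".*)
        printf "terraform state mv '%s' '%s'\n" \
          "$addr" "${new_prefix}${addr#"$old_prefix"}"
        ;;
    esac
  done
}

# Example use with placeholder module paths:
#   terraform state pull > state-backup.tfstate   # keep a backup first
#   terraform state list | gen_state_mv 'module.old_parent.module.component' 'module.new_parent.module.component'
```

Reviewing the printed commands, running them, and then checking that `terraform plan` shows no unexpected changes keeps each step reversible from the backup.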

Edit: (In case someone just says to replace the module and use the documentation to re-apply our own settings to each addon in the new way...)
As nice as it is to have Terraform-managed, idempotent infrastructure that we can easily tear down and replace, it shouldn't be considered acceptable to tear down a whole production Kubernetes cluster just to clean up this v4 to v5 process, when the individual components we used v4 to install may have changed only by a patch version or not at all. Tearing down a cluster can involve a lot of annoying downtime that has to be scheduled and planned for, and it is generally a lot of non-productive work that can be avoided if we have documentation like the kind I mentioned.

@bryantbiggs
Contributor

the changes for v5 are now complete - please see our docs section on v4 to v5 for details on the motivation, context, and migration paths https://aws-ia.github.io/terraform-aws-eks-blueprints/main/v4-to-v5/motivation/

@vishwa-trulioo

@bryantbiggs Does this mean v5 work is 100% complete and ready to use?

@bryantbiggs
Contributor

yes, the v5 approach has been available for some time now - we left this open for questions/concerns while updating docs/messaging/etc.

Projects
Status: Done