Backup and restore the Terraform state #310

Open
loheagn opened this issue May 19, 2022 · 6 comments · Fixed by #314

@loheagn
Contributor

loheagn commented May 19, 2022

We can build a command-line tool to back up and restore the Configuration Object and the Terraform state. The tool may have two subcommands: backup and restore.

The usage of the backup subcommand might look like this:

$ tfc backup --help
Back up the Configuration Object and, if the configuration uses the default Terraform kubernetes backend, back up the Terraform state too

Usage:
  tfc backup [OPTIONS] CONFIGURATION_NAME

Examples:

  tfc backup -d ./backup_dir configuration-oss-demo


Flags:
  -d, --dir string   Destination directory for saving the backed up files
  -h, --help         help for backup

Global Flags:
      --as string                      Username to impersonate for the operation
      --as-group stringArray           Group to impersonate for the operation, this flag can be repeated to specify multiple groups.
      --cache-dir string               Default cache directory (default "/Users/loheagn/.kube/cache")
      --certificate-authority string   Path to a cert file for the certificate authority
      --client-certificate string      Path to a client certificate file for TLS
      --client-key string              Path to a client key file for TLS
      --cluster string                 The name of the kubeconfig cluster to use
      --context string                 The name of the kubeconfig context to use
      --insecure-skip-tls-verify       If true, the server's certificate will not be checked for validity. This will make your HTTPS connections insecure
      --kubeconfig string              Path to the kubeconfig file to use for CLI requests.
  -n, --namespace string               If present, the namespace scope for this CLI request
      --password string                Password for basic authentication to the API server
      --request-timeout string         The length of time to wait before giving up on a single server request. Non-zero values should contain a corresponding time unit (e.g. 1s, 2m, 3h). A value of zero means don't timeout requests. (default "0")
  -s, --server string                  The address and port of the Kubernetes API server
      --tls-server-name string         Server name to use for server certificate validation. If it is not provided, the hostname used to contact the server is used
      --token string                   Bearer token for authentication to the API server
      --user string                    The name of the kubeconfig user to use
      --username string                Username for basic authentication to the API server

The restore subcommand can be used to restore the Configuration Object and it will also restore the Terraform state if the configuration uses the default Terraform kubernetes backend.

When the restore subcommand runs, it accepts a YAML file or a directory. It reads the configuration definitions from that file or directory and checks whether each configuration uses the default kubernetes backend. If it does, the restore subcommand reads the state.json file and restores the backend secret before restoring the configuration objects.

The usage of the restore subcommand might look like this:

$ tfc restore --help
Restore the Configuration Object and, if the configuration uses the default Terraform kubernetes backend, restore the Terraform state too

Usage:
  tfc restore OPTIONS

Examples:

  tfc restore -f ./configuration.yaml


Flags:
  -f, --from string   A file or a directory from which the command reads the source definitions
  -h, --help          help for restore

Global Flags:
      --as string                      Username to impersonate for the operation
      --as-group stringArray           Group to impersonate for the operation, this flag can be repeated to specify multiple groups.
      --cache-dir string               Default cache directory (default "/Users/loheagn/.kube/cache")
      --certificate-authority string   Path to a cert file for the certificate authority
      --client-certificate string      Path to a client certificate file for TLS
      --client-key string              Path to a client key file for TLS
      --cluster string                 The name of the kubeconfig cluster to use
      --context string                 The name of the kubeconfig context to use
      --insecure-skip-tls-verify       If true, the server's certificate will not be checked for validity. This will make your HTTPS connections insecure
      --kubeconfig string              Path to the kubeconfig file to use for CLI requests.
  -n, --namespace string               If present, the namespace scope for this CLI request
      --password string                Password for basic authentication to the API server
      --request-timeout string         The length of time to wait before giving up on a single server request. Non-zero values should contain a corresponding time unit (e.g. 1s, 2m, 3h). A value of zero means don't timeout requests. (default "0")
  -s, --server string                  The address and port of the Kubernetes API server
      --tls-server-name string         Server name to use for server certificate validation. If it is not provided, the hostname used to contact the server is used
      --token string                   Bearer token for authentication to the API server
      --user string                    The name of the kubeconfig user to use
      --username string                Username for basic authentication to the API server
@loheagn
Contributor Author

loheagn commented May 21, 2022

Let's discuss a typical restore scenario.

Assume that we have a configuration.yaml like the following, and that we use the default Terraform kubernetes backend (the state.json will be stored in the same cluster as the Configuration Object).

apiVersion: terraform.core.oam.dev/v1beta1
kind: Configuration
metadata:
  name: alibaba-oss-bucket-hcl
spec:
  hcl: |
    resource "alicloud_oss_bucket" "bucket-acl" {
      bucket = var.bucket
      acl = var.acl
    }

    output "BUCKET_NAME" {
      value = "${alicloud_oss_bucket.bucket-acl.bucket}.${alicloud_oss_bucket.bucket-acl.extranet_endpoint}"
    }

    variable "bucket" {
      description = "OSS bucket name"
      default = "vela-website"
      type = string
    }

    variable "acl" {
      description = "OSS bucket ACL, supported 'private', 'public-read', 'public-read-write'"
      default = "private"
      type = string
    }

  backend:
    secretSuffix: oss
    inClusterConfig: true

  variable:
    bucket: "vela-website-20211130-1900-51"
    acl: "private"

  writeConnectionSecretToRef:
    name: oss-conn
    namespace: default

After we apply the configuration.yaml, the cloud resource (the OSS bucket named vela-website-20211130-1900-51) will be created, and we can check the status of the configuration:

$ kubectl get configuration.terraform.core.oam.dev
NAME                     STATE       AGE
alibaba-oss-bucket-hcl   Available   13s

Now we can back up the configuration alibaba-oss-bucket-hcl and the Terraform state to the local file system as configuration.back.yaml and state.back.json.

Next, assume that the kubernetes cluster suffers a disaster and is no longer available. We need to restore the configuration to a new kubernetes cluster without recreating the cloud resource (the OSS bucket in this scenario).

First, we should rebuild the Terraform backend. We can read the Terraform state from state.back.json and write the data to a secret (whose name and namespace should be determined from the configuration) in the new kubernetes cluster.

Second, we should restore the configuration object. We can just apply the configuration.back.yaml.

@zzxwill
Collaborator

zzxwill commented May 23, 2022

@loheagn What are the exact procedures for restoring the state?

First, we should rebuild the Terraform backend. We can read the Terraform state from state.back.json and write the data to a secret (whose name and namespace should be determined from the configuration) in the new kubernetes cluster.

@loheagn
Contributor Author

loheagn commented May 23, 2022

@loheagn What are the exact procedures for restoring the state?

First, we should rebuild the Terraform backend. We can read the Terraform state from state.back.json and write the data to a secret (whose name and namespace should be determined from the configuration) in the new kubernetes cluster.

  1. Read the configuration.back.yaml and get the metadata (namespace and name) of the backend secret that should be created.

  2. Read the state.back.json and use its content (the Terraform state) to build the backend secret. The secret data should have a key tfstate whose value is the encoded Terraform state string.

@zzxwill
Collaborator

zzxwill commented May 25, 2022

@loheagn What are the exact procedures for restoring the state?

First, we should rebuild the Terraform backend. We can read the Terraform state from state.back.json and write the data to a secret (whose name and namespace should be determined from the configuration) in the new kubernetes cluster.

  1. Read the configuration.back.yaml and get the metadata (namespace and name) of the backend secret that should be created.
  2. Read the state.back.json and use its content (the Terraform state) to build the backend secret. The secret data should have a key tfstate whose value is the encoded Terraform state string.

Are there any executable commands for steps 1 and 2? And how do you verify that the restore succeeded? Please attach evidence.

@loheagn
Contributor Author

loheagn commented May 25, 2022

Hi @zzxwill, I created a command-line tool to show how to restore the state. You can review the code here.

You can run go run main.go restore -h for help. I will add examples and a README later.

@zzxwill
Collaborator

zzxwill commented May 25, 2022

@loheagn Please also take a look at this requirement.

And how do you verify your restore is successful? Append any evidence for it please.
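One possible piece of evidence, sketched here in Go, is to compare the lineage and serial fields of state.back.json with those of the state read back from the restored backend secret. Both fields are part of the Terraform state file format; the helper names are hypothetical, and the check assumes nothing has changed the state since the restore (the serial advances whenever Terraform writes a new state version).

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// stateHeader holds the two top-level fields used to identify a state version.
type stateHeader struct {
	Lineage string `json:"lineage"`
	Serial  int64  `json:"serial"`
}

// sameState reports whether two state documents share lineage and serial.
func sameState(a, b []byte) (bool, error) {
	var ha, hb stateHeader
	if err := json.Unmarshal(a, &ha); err != nil {
		return false, fmt.Errorf("first state: %w", err)
	}
	if err := json.Unmarshal(b, &hb); err != nil {
		return false, fmt.Errorf("second state: %w", err)
	}
	return ha.Lineage == hb.Lineage && ha.Serial == hb.Serial, nil
}

func main() {
	// Hypothetical usage: go run main.go state.back.json state.restored.json
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: verify <backup-state> <restored-state>")
		return
	}
	backup, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	restored, err := os.ReadFile(os.Args[2])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	ok, err := sameState(backup, restored)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("state matches backup:", ok)
}
```

A matching lineage and serial only shows that the secret holds the backed-up state. Checking that the restored Configuration becomes Available without the bucket being recreated would still be needed as end-to-end evidence.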

zzxwill reopened this May 26, 2022