Support KIND running inside container #30

Open
axsaucedo opened this issue Jan 27, 2021 · 10 comments

@axsaucedo

Currently we have KIND running in our Kubernetes CI, where KIND runs inside a pod/container. This requires the following mounts from the host node:

                volumeMounts:
                  - mountPath: /lib/modules
                    name: modules
                    readOnly: true
                  - mountPath: /sys/fs/cgroup
                    name: cgroup
                  - name: dind-storage
                    mountPath: /var/lib/docker
                securityContext:
                  privileged: true
                imagePullPolicy: Always
              volumes:
                - name: modules
                  hostPath:
                    path: /lib/modules
                    type: Directory
                - name: cgroup
                  hostPath:
                    path: /sys/fs/cgroup
                    type: Directory
                - name: dind-storage
                  emptyDir: {}
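For reference, here is a minimal sketch of the full Pod spec these fragments sit in (the metadata, container name, and image are placeholders, not our actual manifest):

apiVersion: v1
kind: Pod
metadata:
  name: kind-ci            # hypothetical name
spec:
  containers:
    - name: dind           # hypothetical; any image with Docker and KIND installed
      image: repo/container:tag
      imagePullPolicy: Always
      securityContext:
        privileged: true
      volumeMounts:
        - mountPath: /lib/modules
          name: modules
          readOnly: true
        - mountPath: /sys/fs/cgroup
          name: cgroup
        - name: dind-storage
          mountPath: /var/lib/docker
  volumes:
    - name: modules
      hostPath:
        path: /lib/modules
        type: Directory
    - name: cgroup
      hostPath:
        path: /sys/fs/cgroup
        type: Directory
    - name: dind-storage
      emptyDir: {}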

When running a KIND Docker action inside a KIND-enabled container (which works on Kubernetes), such as:

...
    runs-on: ubuntu-18.04
    container: repo/container:tag
...

It seems to work when running it locally using act (https://github.com/nektos/act), but when running it on the GitHub Actions worker I get the error: Error: Kubernetes cluster unreachable: Get https://127.0.0.1:34221/version?timeout=32s: dial tcp 127.0.0.1:34221: connect: connection refused.

The KIND cluster does seem to get created correctly; the issue mainly seems to be that the cluster is not reachable. Is this something you have come across before?

@gopisaba

I have the same issue. It looks like the port for the KinD API server (in your case 34221) is not exposed to the runner container, so it is failing to connect to the cluster.
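One way to confirm this, assuming kind's default control-plane container name kind-control-plane, is to compare the server address in the generated kubeconfig with the port Docker actually publishes:

# API server address kubectl will try to reach (from the generated kubeconfig)
kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'

# Host port Docker maps to the API server's port 6443 inside the node container
docker port kind-control-plane 6443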

@0-sv

0-sv commented Aug 18, 2021

Any new insights @axsaucedo?

@axsaucedo
Author

No, it still doesn't work, unfortunately, at least to my knowledge.

@0-sv

0-sv commented Aug 19, 2021

Thanks for the quick reply.

Then I am considering creating a self-hosted runner, unless I get buy-in from my team to continue debugging. I'll keep you updated.

@axsaucedo
Author

Ok, yeah, it would be great to hear your thoughts. We actually use KIND on our JX cluster where we run CI jobs, and we were looking at migrating. But even if this support were added, we would hit the memory and disk-space limits of the default runners very quickly due to the images required, so at least for now we've decided to stick with our current setup.

@0-sv

0-sv commented Aug 19, 2021

I was able to solve it; the tools provided by GitHub Actions proved to be sufficient.

The issue looks like the nodeport for KinD api server (in your case 34221) is not exposed to the runner container. So it is failing to connect to the cluster.

So yes, you need to configure the user-defined network created by GitHub. E.g.:

kind-config.yaml:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  apiServerAddress: '127.0.0.1'
  apiServerPort: 6443

Github Actions workflow:

export KUBECONFIG=$HOME/.kube/config # if not already set 
export KIND_EXPERIMENTAL_DOCKER_NETWORK=${{ job.container.network }}

kind create cluster \
  --kubeconfig $KUBECONFIG \
  --config=./kind-config.yaml

kubectl config set-cluster kind-kind --server=https://kind-control-plane:6443 

Optionally, in case you connect from inside another container:

docker network connect kind $(cat /etc/hostname)
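For completeness, here is how the pieces above might fit into a single job. This is a sketch, not tested verbatim: the job name, the checkout step, and the repo/container:tag image are placeholders, and the image is assumed to contain the docker CLI, kind, and kubectl. The fixed apiServerPort: 6443 in kind-config.yaml is what makes the set-cluster rewrite predictable.

jobs:
  kind-smoke:
    runs-on: ubuntu-18.04
    container: repo/container:tag   # assumed to ship the docker CLI, kind, and kubectl
    steps:
      - uses: actions/checkout@v2
      - name: Create and reach a KIND cluster
        run: |
          export KUBECONFIG=$HOME/.kube/config
          # Put the KIND nodes on the same user-defined network as this job container
          export KIND_EXPERIMENTAL_DOCKER_NETWORK=${{ job.container.network }}
          kind create cluster --kubeconfig $KUBECONFIG --config=./kind-config.yaml
          # Inside the job container, 127.0.0.1 is the container itself, not the host,
          # so address the API server by the control-plane container's name instead
          kubectl config set-cluster kind-kind --server=https://kind-control-plane:6443
          kubectl get nodes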

@axsaucedo
Author

That's fantastic, thank you for this @svatwork, I will give it a try. This could still be useful for some simpler smoke-check tests whilst leaving the more complex integration tests to the larger CI nodes.

@callum-tait-pbx

This is super helpful information. It would be great if this solution were somewhere a bit more accessible.

@justinmchase

Can you expand upon ${{ job.container.network }}? It seems to be unset for me, and I can't find docs on it.

@justinmchase

I've been struggling with this for a while, lacking an exact recipe, only to find that downgrading to v0.19.0 seems to fix it.

I'm not totally sure, but it seems like in v0.20.0 there is an issue where, if you specify apiServerPort: 6443, it will use that port, but the healthz checker still appears to be checking on what would have been the random port.
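If you want the same workaround, pinning the binary at install time looks like this (a sketch; the URL follows kind's documented release-download pattern):

# Install kind v0.19.0 instead of the latest release
curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.19.0/kind-linux-amd64
chmod +x ./kind
sudo mv ./kind /usr/local/bin/kind
kind version   # should report v0.19.0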
