This repository has been archived by the owner on Feb 10, 2022. It is now read-only.

Kubelet fails to start on bosh-lite #363

Open
gitstn opened this issue Oct 30, 2019 · 2 comments
gitstn commented Oct 30, 2019

What happened:

While deploying Kubo with kubo-deployment/bin/deploy-cfcr-lite, the kubelet failed to start with the following error message:

Task 41 | 11:38:48 | Updating instance master: master/9161da11-aa5c-46fc-8aec-f9dc3f5b4090 (0) (canary) (00:01:29)
Task 41 | 11:40:18 | Updating instance worker: worker/187a97f3-78a7-424c-a73c-765fb64810aa (0) (canary) (00:03:21)
L Error: Action Failed get_task: Task 56d27322-891f-4a11-5037-842239858931 result: 1 of 2 post-start scripts failed. Failed Jobs: kubelet. Successful Jobs: bosh-dns.
Task 41 | 11:43:39 | Error: Action Failed get_task: Task 56d27322-891f-4a11-5037-842239858931 result: 1 of 2 post-start scripts failed. Failed Jobs: kubelet. Successful Jobs: bosh-dns.

Task 41 Started Wed Oct 30 11:38:25 UTC 2019
Task 41 Finished Wed Oct 30 11:43:39 UTC 2019
Task 41 Duration 00:05:14
Task 41 error

Updating deployment:
Expected task '41' to succeed but state is 'error'

Exit code 1


What you expected to happen:

The expectation was that Kubo would be deployed successfully with kubelet running.

How to reproduce it (as minimally and precisely as possible):

  1. Deploy BOSH Lite on VirtualBox
  2. Clone kubo-release
  3. Clone kubo-deployment
  4. From kubo-deployment, run bin/deploy-cfcr-lite

Anything else we need to know?:

From the kubelet log file, the following line seems to be the issue:

F1030 12:05:12.887447 14621 kubelet.go:1407] Failed to start OOM watcher open /dev/kmsg: no such file or directory
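To pick a line like the one above out of a noisy kubelet log, filtering on the fatal severity prefix can help. This is a minimal sketch: `fatal_lines` is a hypothetical helper name, and the log path in the comment is an assumption based on BOSH's usual /var/vcap/sys/log/<job>/ layout rather than something confirmed in this report.

```shell
# Print only fatal-severity lines from a log file.
# Kubelet log lines start with a severity letter followed by MMDD,
# so a fatal line looks like "F1030 12:05:12.887447 ...".
# fatal_lines is a hypothetical helper, not part of BOSH or Kubernetes.
fatal_lines() {
  grep -E '^F[0-9]{4} ' "$1"
}

# Example call on the worker VM (path is an assumption):
#   fatal_lines /var/vcap/sys/log/kubelet/kubelet.stderr.log
```

The function only reads the file, so it is safe to run repeatedly while the post-start script is retrying.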

Environment:

  • Deployment Info (bosh -d <deployment> deployment):

Using environment '192.168.50.6' as client 'admin'

Name  Release(s)                  Stemcell(s)                                         Config(s)          Team(s)
cfcr  bosh-dns/1.15.0             bosh-warden-boshlite-ubuntu-xenial-go_agent/456.30  2 runtime/default  -
      bpm/1.0.4                                                                       1 cloud/default
      cfcr-etcd/1.11.1
      docker/35.3.4
      kubo/0.41.0+dev.1572435471

1 deployments

Succeeded

  • Environment Info (bosh -e <environment> environment):

Using environment '192.168.50.6' as client 'admin'

Name               bosh-lite
UUID               3d3b97df-196c-4d82-bd86-903b6061a6b3
Version            270.7.0 (00000000)
Director Stemcell  ubuntu-xenial/456.40
CPI                warden_cpi
Features           compiled_package_cache: disabled
                   config_server: enabled
                   local_dns: enabled
                   power_dns: disabled
                   snapshots: disabled
User               admin

Succeeded

  • Kubernetes version (kubectl version):

Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.5", GitCommit:"20c265fef0741dd71a66480e35bd69f18351daea", GitTreeState:"clean", BuildDate:"2019-10-15T19:16:51Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
The connection to the server localhost:8080 was refused - did you specify the right host or port?

  • Cloud provider (e.g. aws, gcp, vsphere):
    VirtualBox
@generalinterest

In my testing, the kubelet would not start, with this failure in kubelet.stderr.log:

kubelet.go:1407] Failed to start OOM watcher open /dev/kmsg: no such file or directory

It looks like the bosh-stemcell-456.XX-warden-boshlite-ubuntu-xenial-go_agent stemcell is missing this kernel support.

As a test (and I would not propose this as a solution):

sudo touch /dev/kmsg

And the kubelet will start.
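For anyone repeating the test, a guarded version of the same workaround might look like the sketch below. It only creates the file when nothing exists at that path, and it is the same stopgap as above, not a fix. `ensure_kmsg` is a hypothetical helper name; the default path comes from the error message, and creating it under /dev needs root.

```shell
# Create an empty placeholder at the given path only if nothing exists there.
# ensure_kmsg is a hypothetical helper; this is a test workaround, not a fix.
ensure_kmsg() {
  dev="${1:-/dev/kmsg}"   # default taken from the kubelet error message
  if [ -e "$dev" ]; then
    echo "already present: $dev"
  else
    touch "$dev" && echo "created $dev"
  fi
}

# Example call on the worker VM (run as root to write under /dev):
#   ensure_kmsg
```

The existence check keeps the function from touching a real kernel message device on hosts that already provide one.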

ramonskie commented Apr 21, 2020

I had the same issue with stemcell bosh-warden-boshlite-ubuntu-xenial-go_agent 621.59.
When I touched /dev/kmsg, the kubelet started immediately.

I'm wondering whether this should be solved in the stemcell or in the deployment at this point.
