
Failed to enable disk encryption in the storage on EKS anywhere bare metal nodes #14133

Closed
ygao-armada opened this issue Apr 26, 2024 · 1 comment

ygao-armada commented Apr 26, 2024

Is this a bug report or feature request? Bug report

I tried to run the following commands:

  1. git clone --single-branch --branch v1.13.6 https://github.com/rook/rook.git
  2. cd rook/deploy/examples
  3. uncomment the line " # encryptedDevice: "true" ..." in cluster.yaml (see the snippet after this list)
  4. kubectl create -f crds.yaml -f common.yaml -f operator.yaml
  5. kubectl create -f cluster.yaml
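
For reference, the storage section of the example cluster.yaml should end up looking roughly like this after step 3 (a sketch based on the stock v1.13.6 example; only the encryptedDevice line is uncommented):

  storage:
    useAllNodes: true
    useAllDevices: true
    config:
      encryptedDevice: "true" # the default value for this option is "false"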

Both of my attempts failed:

  1. creating a partition on each data disk
  2. leaving each data disk unpartitioned

In the first case, the rook-ceph-osd-prepare-xxx pods stay in Running and log the following:

2024-04-26 09:15:00.537540 D | exec: Running command: lsblk --noheadings --path --list --output NAME /dev/sda
2024-04-26 09:15:00.538671 I | inventory: skipping device "sda" because it has child, considering the child instead.
...
2024-04-26 09:15:00.602857 D | exec: Running command: ceph-volume inventory --format json /dev/sda1
2024-04-26 09:15:00.866116 I | cephosd: device "sda1" is available.
2024-04-26 09:15:00.866129 I | cephosd: partition "sda1" is not picked because encrypted OSD on partition is not allowed
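
Note the last line: Rook will not create an encrypted OSD on a partition, so the storage spec has to select whole disks. A hedged sketch of such a selection (the deviceFilter pattern is illustrative, not taken from my cluster):

  storage:
    useAllNodes: true
    useAllDevices: false
    deviceFilter: "^sd[b-f]$" # match whole disks only, never partitions like sda1
    config:
      encryptedDevice: "true"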

In the second case, the rook-ceph-osd-prepare-xxx pods go into CrashLoopBackOff, with logs like:

2024-04-26 15:41:31.128026 I | cephosd: device "sda" is available.
2024-04-26 15:41:31.128043 I | cephosd: old lsblk can't detect bluestore signature, so try to detect here
...
2024-04-26 15:41:31.393242 I | cephclient: getting or creating ceph auth key "client.bootstrap-osd"
2024-04-26 15:41:31.393254 D | exec: Running command: ceph auth get-or-create-key client.bootstrap-osd mon allow profile bootstrap-osd --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --name=client.admin --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json
2024-04-26 15:41:31.776134 D | cephosd: won't use raw mode since encryption is enabled
2024-04-26 15:41:31.776150 D | exec: Running command: nsenter --mount=/rootfs/proc/1/ns/mnt -- /usr/sbin/lvm --help
2024-04-26 15:41:31.776892 D | cephosd: failed to call nsenter. failed to execute nsenter. output: nsenter: failed to execute /usr/sbin/lvm: No such file or directory: exit status 127
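
This suggests ceph-volume needs the lvm binary on the host (it is invoked through nsenter into the host mount namespace), not inside the container. A minimal check on each node, assuming an apt-based Ubuntu image (commands are illustrative, not from the original setup):

# on each bare metal node
$ which lvm || sudo apt-get install -y lvm2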

Then I tried copying lvm to /usr/sbin/lvm, and got these logs:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in newfunc
    return f(*a, **kw)
  File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 153, in main
    terminal.dispatch(self.mapper, subcommand_args)
  File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
    instance.main()
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/main.py", line 46, in main
    terminal.dispatch(self.mapper, self.argv)
  File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
    instance.main()
  File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
    return func(*a, **kw)
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/batch.py", line 414, in main
    self._execute(plan)
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/batch.py", line 429, in _execute
    p.safe_prepare(argparse.Namespace(**args))
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/prepare.py", line 200, in safe_prepare
    rollback_osd(self.args, self.osd_id)
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/common.py", line 35, in rollback_osd
    Zap(['--destroy', '--osd-id', osd_id]).main()
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/zap.py", line 403, in main
    self.zap_osd()
  File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
    return func(*a, **kw)
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/zap.py", line 301, in zap_osd
    devices = find_associated_devices(self.args.osd_id, self.args.osd_fsid)
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/zap.py", line 88, in find_associated_devices
    '%s' % osd_id or osd_fsid)
RuntimeError: Unable to find any LV for zapping OSD: 0
2024-04-26 16:23:19.076114 C | rookcmd: failed to configure devices: failed to initialize osd: failed ceph-volume: exit status 1

Deviation from expected behavior:
No OSDs are created.

Expected behavior:
OSDs are created successfully.

How to reproduce it (minimal and precise):

Run the commands above on an EKS Anywhere bare metal cluster with Ubuntu 20.04 (to be honest, I suspect this is a general issue, not specific to EKS Anywhere).

File(s) to submit:

  • Cluster CR (custom resource), typically called cluster.yaml: available upon request

Logs to submit:

  • rook-ceph-osd-prepare pod logs: quoted above


Environment:

  • OS (e.g. from /etc/os-release): Ubuntu 20.04
  • Kernel (e.g. uname -a): 5.4.0-177-generic
  • Cloud provider or hardware configuration: Dell PowerEdge R650
  • Rook version (use rook version inside of a Rook Pod): v1.13.6
  • Storage backend version (e.g. for ceph do ceph -v):
  • Kubernetes version (use kubectl version):
  • Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): EKSA
  • Storage backend status (e.g. for Ceph use ceph health in the Rook Ceph toolbox):

ygao-armada commented Apr 27, 2024

After installing lvm2 in the osImage, the OSD pods are created successfully:

$ kubectl -n rook-ceph get pod
...
rook-ceph-osd-0-556d6d75f9-l6pbz                                 2/2     Running     0          9m14s
rook-ceph-osd-1-59c4c76ccc-6wwpv                                 2/2     Running     0          8m45s
rook-ceph-osd-2-54dddf59bf-r69m8                                 2/2     Running     0          7m58s
rook-ceph-osd-3-696d9dd87b-5wh4w                                 2/2     Running     0          7m58s
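
(For reference, "installing lvm2 in the osImage" amounts to adding a provisioning step like the following to the Ubuntu image build; the exact hook depends on your image pipeline, so treat this as a sketch:)

$ sudo apt-get update
$ sudo apt-get install -y lvm2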

On the node, we can see:

# lsblk -f
NAME                                FSTYPE      LABEL UUID                                   FSAVAIL FSUSE% MOUNTPOINT
...
sdd                                 LVM2_member       QSU1CB-Vxkn-jXah-Rufx-MhiB-IWKu-6z8sN0
└─ceph--bad12e0b--fe26--44e9--897a--10cfe0ac0d50-osd--block--5268c673--8ffb--4a19--ac9a--c8a49e96a2e2
  └─4ttf0w-cSmK-XB06-Vgum-kWoE-MtTk-lsr1e9
sde                                 LVM2_member       gMrIC4-EL9a-ON47-UIFz-7uDd-RY8U-2bwbxX
└─ceph--3e3400bb--073b--43a6--9759--0735fb4bf8fd-osd--block--891a74d0--ba9c--4eb3--8a39--8eff473015ec
  └─ebxL39-JSUq-NZYY-NBEC-GuSM-Ai1e-V044mU
sdf                                 LVM2_member       GnUl5A-LI89-qgBB-t2sx-Pj3w-bIxb-UatsxQ
└─ceph--8ee5f47b--695c--4dda--9dab--1e0b16579f62-osd--block--dac48c85--c1f6--494d--b74b--fae0919810eb
  └─K9vETz-8Afu-rvuL-3YZd-1rd0-aQg5-MjeTwS
...
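
The innermost devices in the tree are the dm-crypt layers that Rook sets up for encrypted OSDs. To double-check that encryption is active, something like this should list one crypt target per OSD (a sketch, not captured from my nodes):

$ sudo dmsetup table --target crypt   # lists only device-mapper devices backed by dm-crypt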
