Rook volume.go initializeDevicesLVMMode() incompatible with ceph-volume #8266

Closed
lyind opened this issue Jul 5, 2021 · 5 comments · Fixed by #8267

lyind commented Jul 5, 2021

Is this a bug report or feature request?

  • Bug Report

Deviation from expected behavior:
A bogus error message emitted by pod/rook-ceph-osd-prepare-node-* prevents OSD initialization:

2021-07-05 21:23:49.139840 I | cephosd: configuring new device sdd
2021-07-05 21:23:49.139844 I | cephosd: using /dev/vg-metadata-0/metadata-0-3 as metadataDevice for device /dev/sdd and let ceph-volume lvm batch decide how to create volumes
2021-07-05 21:23:49.139857 D | exec: Running command: stdbuf -oL ceph-volume --log-path /tmp/ceph-log lvm batch --prepare --bluestore --yes --dmcrypt --osds-per-device 1 --crush-device-class hdd --block-db-size 71999422464 /dev/sdd --db-devices /dev/vg-metadata-0/metadata-0-3 --report
2021-07-05 21:23:49.545869 D | exec: 
2021-07-05 21:23:49.545891 D | exec: Total OSDs: 1
2021-07-05 21:23:49.545895 D | exec: 
2021-07-05 21:23:49.545898 D | exec:   Type            Path                                                    LV Size         % of device
2021-07-05 21:23:49.545901 D | exec: ----------------------------------------------------------------------------------------------------
2021-07-05 21:23:49.545904 D | exec:   encryption:     dmcrypt        
2021-07-05 21:23:49.545907 D | exec:   data            /dev/sdd                                                1.55 TB         100.00%
2021-07-05 21:23:49.545910 D | exec:   block_db        vg-metadata-0/metadata-0-3                              67.07 GB        10000.00%
2021-07-05 21:23:49.545915 D | exec: --> passed data devices: 1 physical, 0 LVM
2021-07-05 21:23:49.545920 D | exec: --> relative data size: 1.0
2021-07-05 21:23:49.545922 D | exec: --> passed block_db devices: 0 physical, 1 LVM
2021-07-05 21:23:49.562754 D | exec: Running command: stdbuf -oL ceph-volume --log-path /tmp/ceph-log lvm batch --prepare --bluestore --yes --dmcrypt --osds-per-device 1 --crush-device-class hdd --block-db-size 71999422464 /dev/sdd --db-devices /dev/vg-metadata-0/metadata-0-3 --report --format json
2021-07-05 21:23:50.026549 D | cephosd: ceph-volume reports: [{"data": "/dev/sdd", "data_size": "1.55 TB", "encryption": "dmcrypt", "block_db": "vg-metadata-0/metadata-0-3", "block_db_size": "67.07 GB"}]
failed to configure devices: failed to initialize lvm based osd: wrong db device for /dev/sdd, required: /dev/vg-metadata-0/metadata-0-3, actual: vg-metadata-0/metadata-0-3
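
For clarity, the failure appears to be a literal string comparison: Rook compares the metadataDevice it put on the command line (always prefixed with "/dev/") against the block_db value returned by the ceph-volume JSON report, which names the LV as "<vg>/<lv>" without the prefix. A minimal Go sketch of that kind of check (function and variable names are illustrative, not the actual volume.go code):

    package main

    import "fmt"

    // checkDBDevice mimics a strict equality check between the requested
    // metadata device and the path echoed back by
    // "ceph-volume lvm batch --report --format json".
    func checkDBDevice(requested, reported string) error {
        if requested != reported {
            return fmt.Errorf("wrong db device, required: %s, actual: %s", requested, reported)
        }
        return nil
    }

    func main() {
        // Reproduces the mismatch from the log above: same LV, different notation.
        err := checkDBDevice("/dev/vg-metadata-0/metadata-0-3", "vg-metadata-0/metadata-0-3")
        fmt.Println(err)
    }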

Expected behavior:
The configuration with an LVM metadataDevice is applied and the OSD is initialized.

How to reproduce it (minimal and precise):
Configure a dedicated, pre-assembled LV (logical volume) as a metadataDevice for a whole-device OSD.

Environment:

  • OS: Debian Bullseye
  • Kernel: 5.12.14
  • Cloud provider or hardware configuration: bare-metal
  • Rook version: v1.6.7
  • Storage backend version: Ceph v16.2.4
  • Kubernetes version: 1.21.1
  • Kubernetes cluster type: kubeadm
  • Storage backend status: HEALTH_WARN (no OSDs up)
travisn commented Jul 12, 2021

Rook expects raw devices or partitions, as specified in the prerequisites, rather than an LV.

lyind commented Jul 13, 2021

ceph-volume itself supports passing LVs, and I am now using 4 LVs on a GPT partition of a fast device as metadataDevice. I will try to replicate this setup with 4 plain GPT partitions for the databases.

I did try one GPT partition as --db-device for 4 data devices before, but it didn't work. Rook passes separately specified devices one-by-one to ceph-volume, which prevents ceph-volume from automatically splitting the db-device into slots.

I had to specify the data devices separately in CephCluster to force ceph-volume to accept the metadataDevice for the slower OSD SSDs (non-rotational by default, forced to type rotational).

Being able to force the use of one LV per data device as metadataDevice was a life-saver, as I was able to get my setup running as planned. The only change missing was the linked PR (rook-ceph volume.go always prefixes mdPath with "/dev", so checking whether the path reported by ceph-volume --report matches the original mdPath seems safe).
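
A minimal Go sketch of the kind of normalization described above: since Rook always prefixes the configured mdPath with "/dev", treating "/dev/<vg>/<lv>" and "<vg>/<lv>" as the same device makes the check tolerant of ceph-volume's VG/LV notation. The function and helper names are hypothetical, not the actual change in the linked PR:

    package main

    import (
        "fmt"
        "strings"
    )

    // sameDBDevice treats "/dev/<vg>/<lv>" and "<vg>/<lv>" as the same device,
    // since Rook always prefixes the configured mdPath with "/dev".
    func sameDBDevice(requested, reported string) bool {
        normalize := func(p string) string {
            return "/dev/" + strings.TrimPrefix(p, "/dev/")
        }
        return normalize(requested) == normalize(reported)
    }

    func main() {
        fmt.Println(sameDBDevice("/dev/vg-metadata-0/metadata-0-3", "vg-metadata-0/metadata-0-3")) // true
        fmt.Println(sameDBDevice("/dev/vg-metadata-0/metadata-0-3", "vg-other/metadata-0-0"))      // false
    }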

lyind commented Jul 13, 2021

I tried using a GPT partition as metadataDevice again, and the error message suggests passing an LV instead (see below).

Partition metadataDevice

  storage:
    useAllNodes: true
    useAllDevices: false
    deviceFilter: "^(vdb)"
    location:
    config:
      storeType: bluestore
      osdsPerDevice: "1"
      encryptedDevice: "true"
      metadataDevice:
    devices:
    - name: 'vdb'
      config:
        metadataDevice: "/dev/vdc1"
        deviceClass: "hdd" # forces block.db device (journal on NVMe) to be used
        encryptedDevice: "true"
...
2021-07-13 14:24:15.507263 I | cephosd: discovering hardware
...
2021-07-13 14:24:15.891992 D | exec: Running command: sgdisk --print /dev/vdb
2021-07-13 14:24:15.944913 D | exec: Running command: udevadm info --query=property /dev/vdb
2021-07-13 14:24:16.036371 D | exec: Running command: lsblk --noheadings --pairs /dev/vdb
2021-07-13 14:24:16.195213 D | exec: Running command: lsblk /dev/vdc --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME,KNAME
2021-07-13 14:24:16.200376 D | exec: Running command: sgdisk --print /dev/vdc
2021-07-13 14:24:16.280986 D | exec: Running command: udevadm info --query=property /dev/vdc
2021-07-13 14:24:16.305128 D | exec: Running command: lsblk --noheadings --pairs /dev/vdc
2021-07-13 14:24:16.337918 I | inventory: skipping device "vdc" because it has child, considering the child instead.
2021-07-13 14:24:16.338506 D | exec: Running command: lsblk /dev/vdc1 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME,KNAME
2021-07-13 14:24:16.345283 D | exec: Running command: udevadm info --query=property /dev/vdc1
2021-07-13 14:24:16.435972 D | inventory: discovered disks are:
2021-07-13 14:24:16.437050 D | inventory: &{Name:vda Parent: HasChildren:false DevLinks:/dev/disk/by-uuid/8d8df9c2-afa5-4ba6-8cf2-aff96260d93d /dev/disk/by-path/virtio-pci-0000:00:07.0 /dev/disk/by-path/pci-0000:00:07.0 Size:21474836480 UUID:e9778a97-147c-4e47-82a4-df65d80132fd Serial: Type:disk Rotational:true Readonly:false Partitions:[] Filesystem:swap Vendor: Model: WWN: WWNVendorExtension: Empty:false CephVolumeData: RealPath:/dev/vda KernelName:vda Encrypted:false}
2021-07-13 14:24:16.437476 D | inventory: &{Name:vdb Parent: HasChildren:false DevLinks:/dev/disk/by-path/virtio-pci-0000:00:08.0 /dev/disk/by-path/pci-0000:00:08.0 Size:21474836480 UUID:e6838adb-7f6b-4bdc-bc88-8be7225aae16 Serial: Type:disk Rotational:true Readonly:false Partitions:[] Filesystem: Vendor: Model: WWN: WWNVendorExtension: Empty:false CephVolumeData: RealPath:/dev/vdb KernelName:vdb Encrypted:false}
2021-07-13 14:24:16.437984 D | inventory: &{Name:vdc1 Parent:vdc HasChildren:false DevLinks:/dev/disk/by-path/pci-0000:00:09.0-part1 /dev/disk/by-partuuid/ad021c79-a8b8-4add-8436-b5affb1c4205 /dev/disk/by-path/virtio-pci-0000:00:09.0-part1 /dev/disk/by-partlabel/osd Size:6440353792 UUID: Serial: Type:part Rotational:true Readonly:false Partitions:[] Filesystem: Vendor: Model: WWN: WWNVendorExtension: Empty:false CephVolumeData: RealPath:/dev/vdc1 KernelName:vdc1 Encrypted:false}
2021-07-13 14:24:16.438409 I | cephosd: creating and starting the osds
2021-07-13 14:24:16.439534 D | cephosd: desiredDevices are [{Name:vdb OSDsPerDevice:1 MetadataDevice:/dev/vdc1 DatabaseSizeMB:0 DeviceClass:hdd InitialWeight: IsFilter:false IsDevicePathFilter:false}]
2021-07-13 14:24:16.440145 D | cephosd: context.Devices are:
2021-07-13 14:24:16.442965 D | cephosd: &{Name:vda Parent: HasChildren:false DevLinks:/dev/disk/by-uuid/8d8df9c2-afa5-4ba6-8cf2-aff96260d93d /dev/disk/by-path/virtio-pci-0000:00:07.0 /dev/disk/by-path/pci-0000:00:07.0 Size:21474836480 UUID:e9778a97-147c-4e47-82a4-df65d80132fd Serial: Type:disk Rotational:true Readonly:false Partitions:[] Filesystem:swap Vendor: Model: WWN: WWNVendorExtension: Empty:false CephVolumeData: RealPath:/dev/vda KernelName:vda Encrypted:false}
2021-07-13 14:24:16.443506 D | cephosd: &{Name:vdb Parent: HasChildren:false DevLinks:/dev/disk/by-path/virtio-pci-0000:00:08.0 /dev/disk/by-path/pci-0000:00:08.0 Size:21474836480 UUID:e6838adb-7f6b-4bdc-bc88-8be7225aae16 Serial: Type:disk Rotational:true Readonly:false Partitions:[] Filesystem: Vendor: Model: WWN: WWNVendorExtension: Empty:false CephVolumeData: RealPath:/dev/vdb KernelName:vdb Encrypted:false}
2021-07-13 14:24:16.444161 D | cephosd: &{Name:vdc1 Parent:vdc HasChildren:false DevLinks:/dev/disk/by-path/pci-0000:00:09.0-part1 /dev/disk/by-partuuid/ad021c79-a8b8-4add-8436-b5affb1c4205 /dev/disk/by-path/virtio-pci-0000:00:09.0-part1 /dev/disk/by-partlabel/osd Size:6440353792 UUID: Serial: Type:part Rotational:true Readonly:false Partitions:[] Filesystem: Vendor: Model: WWN: WWNVendorExtension: Empty:false CephVolumeData: RealPath:/dev/vdc1 KernelName:vdc1 Encrypted:false}
2021-07-13 14:24:16.446903 I | cephosd: skipping device "vda" because it contains a filesystem "swap"
2021-07-13 14:24:16.447607 D | exec: Running command: lsblk /dev/vdb --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME,KNAME
2021-07-13 14:24:16.472971 D | exec: Running command: ceph-volume inventory --format json /dev/vdb
2021-07-13 14:24:17.867000 I | cephosd: device "vdb" is available.
2021-07-13 14:24:17.868700 I | cephosd: "vdb" found in the desired devices
2021-07-13 14:24:17.869396 I | cephosd: device "vdb" is selected by the device filter/name "vdb"
2021-07-13 14:24:17.870103 D | exec: Running command: parted --machine --script /dev/vdc1 print
2021-07-13 14:24:17.923273 D | exec: Running command: udevadm info --query=property /dev/vdc1
2021-07-13 14:24:17.972967 D | exec: Running command: lsblk /dev/vdc1 --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME,KNAME
2021-07-13 14:24:17.979883 D | exec: Running command: ceph-volume inventory --format json /dev/vdc1
2021-07-13 14:24:20.032870 I | cephosd: device "vdc1" is available.
2021-07-13 14:24:20.037840 I | cephosd: skipping device "vdc1" that does not match the device filter/list ([{vdb 1 /dev/vdc1 0 hdd  false false}]). <nil>
2021-07-13 14:24:20.057844 I | cephosd: configuring osd devices: {"Entries":{"vdb":{"Data":-1,"Metadata":null,"Config":{"Name":"vdb","OSDsPerDevice":1,"MetadataDevice":"/dev/vdc1","DatabaseSizeMB":0,"DeviceClass":"hdd","InitialWeight":"","IsFilter":false,"IsDevicePathFilter":false},"PersistentDevicePaths":["/dev/disk/by-path/virtio-pci-0000:00:08.0","/dev/disk/by-path/pci-0000:00:08.0"]}}}
2021-07-13 14:24:20.058025 I | cephclient: getting or creating ceph auth key "client.bootstrap-osd"
2021-07-13 14:24:20.058565 D | exec: Running command: ceph auth get-or-create-key client.bootstrap-osd mon allow profile bootstrap-osd --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --name=client.admin --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/446565337
2021-07-13 14:24:21.795987 D | cephosd: will use raw mode since cluster version is at least pacific
2021-07-13 14:24:21.799025 D | cephosd: won't use raw mode since encryption is enabled
2021-07-13 14:24:21.800806 D | exec: Running command: nsenter --mount=/rootfs/proc/1/ns/mnt -- /usr/sbin/lvm --help
2021-07-13 14:24:21.867159 I | cephosd: successfully called nsenter
2021-07-13 14:24:21.867728 I | cephosd: binary "/usr/sbin/lvm" found on the host, proceeding with osd preparation
2021-07-13 14:24:21.870095 I | cephosd: Successfully updated lvm config file "/etc/lvm/lvm.conf"
2021-07-13 14:24:21.870898 I | cephosd: initializing osd disk with lvm mode
2021-07-13 14:24:21.871371 I | cephosd: configuring new device vdb
2021-07-13 14:24:21.871710 I | cephosd: using /dev/vdc1 as metadataDevice for device /dev/vdb and let ceph-volume lvm batch decide how to create volumes
2021-07-13 14:24:21.872193 D | exec: Running command: stdbuf -oL ceph-volume --log-path /tmp/ceph-log lvm batch --prepare --bluestore --yes --dmcrypt --osds-per-device 1 --crush-device-class hdd /dev/vdb --db-devices /dev/vdc1 --report
2021-07-13 14:24:23.201871 D | exec: usage: ceph-volume lvm batch [-h] [--db-devices [DB_DEVICES [DB_DEVICES ...]]]
2021-07-13 14:24:23.202024 D | exec:                              [--wal-devices [WAL_DEVICES [WAL_DEVICES ...]]]
2021-07-13 14:24:23.202064 D | exec:                              [--journal-devices [JOURNAL_DEVICES [JOURNAL_DEVICES ...]]]
2021-07-13 14:24:23.202137 D | exec:                              [--auto] [--no-auto] [--bluestore] [--filestore]
2021-07-13 14:24:23.202173 D | exec:                              [--report] [--yes]
2021-07-13 14:24:23.202236 D | exec:                              [--format {json,json-pretty,pretty}] [--dmcrypt]
2021-07-13 14:24:23.202270 D | exec:                              [--crush-device-class CRUSH_DEVICE_CLASS]
2021-07-13 14:24:23.202317 D | exec:                              [--no-systemd]
2021-07-13 14:24:23.202366 D | exec:                              [--osds-per-device OSDS_PER_DEVICE]
2021-07-13 14:24:23.202407 D | exec:                              [--data-slots DATA_SLOTS]
2021-07-13 14:24:23.202482 D | exec:                              [--block-db-size BLOCK_DB_SIZE]
2021-07-13 14:24:23.202517 D | exec:                              [--block-db-slots BLOCK_DB_SLOTS]
2021-07-13 14:24:23.202579 D | exec:                              [--block-wal-size BLOCK_WAL_SIZE]
2021-07-13 14:24:23.202625 D | exec:                              [--block-wal-slots BLOCK_WAL_SLOTS]
2021-07-13 14:24:23.202691 D | exec:                              [--journal-size JOURNAL_SIZE]
2021-07-13 14:24:23.202726 D | exec:                              [--journal-slots JOURNAL_SLOTS] [--prepare]
2021-07-13 14:24:23.202806 D | exec:                              [--osd-ids [OSD_IDS [OSD_IDS ...]]]
2021-07-13 14:24:23.202859 D | exec:                              [DEVICES [DEVICES ...]]
2021-07-13 14:24:23.202923 D | exec: ceph-volume lvm batch: error: /dev/vdc1 is a partition, please pass LVs or raw block devices
failed to configure devices: failed to initialize lvm based osd: failed ceph-volume report: exit status 2

OHOH!

LV metadataDevice

  storage:
    useAllNodes: true
    useAllDevices: false
    deviceFilter: "^(vdb)"
    location:
    config:
      storeType: bluestore
      osdsPerDevice: "1"
      encryptedDevice: "true"
      metadataDevice:
    devices:
    - name: 'vdb'
      config:
        metadataDevice: "vg-metadata-0/metadata-0-0"
        deviceClass: "hdd" # forces block.db device (journal on NVMe) to be used
        encryptedDevice: "true"
...
2021-07-13 13:54:25.840905 I | cephosd: configuring osd devices: {"Entries":{"vdb":{"Data":-1,"Metadata":null,"Config":{"Name":"vdb","OSDsPerDevice":1,"MetadataDevice":"vg-metadata-0/metadata-0-0","DatabaseSizeMB":0,"DeviceClass":"hdd","InitialWeight":"","IsFilter":false,"IsDevicePathFilter":false},"PersistentDevicePaths":["/dev/disk/by-path/pci-0000:00:08.0","/dev/disk/by-path/virtio-pci-0000:00:08.0"]}}}
...
2021-07-13 13:54:26.568272 D | cephosd: will use raw mode since cluster version is at least pacific
2021-07-13 13:54:26.568477 D | cephosd: won't use raw mode since encryption is enabled
2021-07-13 13:54:26.598251 I | cephosd: binary "/usr/sbin/lvm" found on the host, proceeding with osd preparation
2021-07-13 13:54:26.599700 I | cephosd: Successfully updated lvm config file "/etc/lvm/lvm.conf"
2021-07-13 13:54:26.599756 I | cephosd: initializing osd disk with lvm mode
2021-07-13 13:54:26.602883 I | cephosd: configuring new device vdb
2021-07-13 13:54:26.602978 I | cephosd: using vg-metadata-0/metadata-0-0 as metadataDevice for device /dev/vdb and let ceph-volume lvm batch decide how to create volumes
2021-07-13 13:54:26.603020 D | exec: Running command: stdbuf -oL ceph-volume --log-path /tmp/ceph-log lvm batch --prepare --bluestore --yes --dmcrypt --osds-per-device 1 --crush-device-class hdd /dev/vdb --db-devices /dev/vg-metadata-0/metadata-0-0 --report
2021-07-13 13:54:27.823184 D | exec: 
2021-07-13 13:54:27.823214 D | exec: Total OSDs: 1
2021-07-13 13:54:27.823223 D | exec: 
2021-07-13 13:54:27.823231 D | exec:   Type            Path                                                    LV Size         % of device
2021-07-13 13:54:27.823239 D | exec: ----------------------------------------------------------------------------------------------------
2021-07-13 13:54:27.823248 D | exec:   encryption:     dmcrypt        
2021-07-13 13:54:27.823255 D | exec:   data            /dev/vdb                                                20.00 GB        100.00%
2021-07-13 13:54:27.823262 D | exec:   block_db        vg-metadata-0/metadata-0-0                              6.00 GB         10000.00%
2021-07-13 13:54:27.823412 D | exec: --> passed data devices: 1 physical, 0 LVM
2021-07-13 13:54:27.823443 D | exec: --> relative data size: 1.0
2021-07-13 13:54:27.823450 D | exec: --> passed block_db devices: 0 physical, 1 LVM
2021-07-13 13:54:27.861637 D | exec: Running command: stdbuf -oL ceph-volume --log-path /tmp/ceph-log lvm batch --prepare --bluestore --yes --dmcrypt --osds-per-device 1 --crush-device-class hdd /dev/vdb --db-devices /dev/vg-metadata-0/metadata-0-0 --report --format json
2021-07-13 13:54:28.620922 D | cephosd: ceph-volume reports: [{"data": "/dev/vdb", "data_size": "20.00 GB", "encryption": "dmcrypt", "block_db": "vg-metadata-0/metadata-0-0", "block_db_size": "6.00 GB"}]
2021-07-13 13:54:28.621058 D | exec: Running command: stdbuf -oL ceph-volume --log-path /tmp/ceph-log lvm batch --prepare --bluestore --yes --dmcrypt --osds-per-device 1 --crush-device-class hdd /dev/vdb --db-devices /dev/vg-metadata-0/metadata-0-0
2021-07-13 13:55:07.696668 D | exec: --> passed data devices: 1 physical, 0 LVM
2021-07-13 13:55:07.697997 D | exec: --> relative data size: 1.0
2021-07-13 13:55:07.698580 D | exec: --> passed block_db devices: 0 physical, 1 LVM
...
2021-07-13 13:55:07.722271 D | exec: --> ceph-volume lvm prepare successful for: /dev/vdb
2021-07-13 13:55:07.727961 D | exec: Running command: stdbuf -oL ceph-volume --log-path /tmp/ceph-log lvm list  --format json
2021-07-13 13:55:08.268586 D | cephosd: {
    "0": [
        {
            "devices": [
                "/dev/vdb"
            ],
            "lv_name": "osd-block-8a7f37f3-fddf-41b9-91a8-a3672cf7d187",
            "lv_path": "/dev/ceph-155b6c6c-01ae-4715-b1dc-fa6b21361903/osd-block-8a7f37f3-fddf-41b9-91a8-a3672cf7d187",
            "lv_size": "21470642176",
            "lv_tags": "ceph.block_device=/dev/ceph-155b6c6c-01ae-4715-b1dc-fa6b21361903/osd-block-8a7f37f3-fddf-41b9-91a8-a3672cf7d187,ceph.block_uuid=dZ0DLF-DDta-0fAF-Y1X5-ALj5-0KSG-q12DRC,ceph.cephx_lockbox_secret=AQAVm+1gF/P1LxAA/nD0DzOUTG8aUyQ2bwWlUQ==,ceph.cluster_fsid=d4295205-512f-4fe6-8fb7-16df5bfc542f,ceph.cluster_name=ceph,ceph.crush_device_class=hdd,ceph.db_device=/dev/vg-metadata-0/metadata-0-0,ceph.db_uuid=30v8Zx-3M2C-3Clr-lxVN-M1Ja-bpc0-THI5MQ,ceph.encrypted=1,ceph.osd_fsid=8a7f37f3-fddf-41b9-91a8-a3672cf7d187,ceph.osd_id=0,ceph.osdspec_affinity=,ceph.type=block,ceph.vdo=0",
            "lv_uuid": "dZ0DLF-DDta-0fAF-Y1X5-ALj5-0KSG-q12DRC",
            "name": "osd-block-8a7f37f3-fddf-41b9-91a8-a3672cf7d187",
            "path": "/dev/ceph-155b6c6c-01ae-4715-b1dc-fa6b21361903/osd-block-8a7f37f3-fddf-41b9-91a8-a3672cf7d187",
            "tags": {
                "ceph.block_device": "/dev/ceph-155b6c6c-01ae-4715-b1dc-fa6b21361903/osd-block-8a7f37f3-fddf-41b9-91a8-a3672cf7d187",
                "ceph.block_uuid": "dZ0DLF-DDta-0fAF-Y1X5-ALj5-0KSG-q12DRC",
                "ceph.cephx_lockbox_secret": "AQAVm+1gF/P1LxAA/nD0DzOUTG8aUyQ2bwWlUQ==",
                "ceph.cluster_fsid": "d4295205-512f-4fe6-8fb7-16df5bfc542f",
                "ceph.cluster_name": "ceph",
                "ceph.crush_device_class": "hdd",
                "ceph.db_device": "/dev/vg-metadata-0/metadata-0-0",
                "ceph.db_uuid": "30v8Zx-3M2C-3Clr-lxVN-M1Ja-bpc0-THI5MQ",
                "ceph.encrypted": "1",
                "ceph.osd_fsid": "8a7f37f3-fddf-41b9-91a8-a3672cf7d187",
                "ceph.osd_id": "0",
                "ceph.osdspec_affinity": "",
                "ceph.type": "block",
                "ceph.vdo": "0"
            },
            "type": "block",
            "vg_name": "ceph-155b6c6c-01ae-4715-b1dc-fa6b21361903"
        },
        {
            "devices": [
                "/dev/vdc"
            ],
            "lv_name": "metadata-0-0",
            "lv_path": "/dev/vg-metadata-0/metadata-0-0",
            "lv_size": "6438256640",
            "lv_tags": "ceph.block_device=/dev/ceph-155b6c6c-01ae-4715-b1dc-fa6b21361903/osd-block-8a7f37f3-fddf-41b9-91a8-a3672cf7d187,ceph.block_uuid=dZ0DLF-DDta-0fAF-Y1X5-ALj5-0KSG-q12DRC,ceph.cephx_lockbox_secret=AQAVm+1gF/P1LxAA/nD0DzOUTG8aUyQ2bwWlUQ==,ceph.cluster_fsid=d4295205-512f-4fe6-8fb7-16df5bfc542f,ceph.cluster_name=ceph,ceph.crush_device_class=hdd,ceph.db_device=/dev/vg-metadata-0/metadata-0-0,ceph.db_uuid=30v8Zx-3M2C-3Clr-lxVN-M1Ja-bpc0-THI5MQ,ceph.encrypted=1,ceph.osd_fsid=8a7f37f3-fddf-41b9-91a8-a3672cf7d187,ceph.osd_id=0,ceph.osdspec_affinity=,ceph.type=db,ceph.vdo=0",
            "lv_uuid": "30v8Zx-3M2C-3Clr-lxVN-M1Ja-bpc0-THI5MQ",
            "name": "metadata-0-0",
            "path": "/dev/vg-metadata-0/metadata-0-0",
            "tags": {
                "ceph.block_device": "/dev/ceph-155b6c6c-01ae-4715-b1dc-fa6b21361903/osd-block-8a7f37f3-fddf-41b9-91a8-a3672cf7d187",
                "ceph.block_uuid": "dZ0DLF-DDta-0fAF-Y1X5-ALj5-0KSG-q12DRC",
                "ceph.cephx_lockbox_secret": "AQAVm+1gF/P1LxAA/nD0DzOUTG8aUyQ2bwWlUQ==",
                "ceph.cluster_fsid": "d4295205-512f-4fe6-8fb7-16df5bfc542f",
                "ceph.cluster_name": "ceph",
                "ceph.crush_device_class": "hdd",
                "ceph.db_device": "/dev/vg-metadata-0/metadata-0-0",
                "ceph.db_uuid": "30v8Zx-3M2C-3Clr-lxVN-M1Ja-bpc0-THI5MQ",
                "ceph.encrypted": "1",
                "ceph.osd_fsid": "8a7f37f3-fddf-41b9-91a8-a3672cf7d187",
                "ceph.osd_id": "0",
                "ceph.osdspec_affinity": "",
                "ceph.type": "db",
                "ceph.vdo": "0"
            },
            "type": "db",
            "vg_name": "vg-metadata-0"
        }
    ]
}
2021-07-13 13:55:08.286039 I | cephosd: osdInfo has 2 elements. [{Name:osd-block-8a7f37f3-fddf-41b9-91a8-a3672cf7d187 Path:/dev/ceph-155b6c6c-01ae-4715-b1dc-fa6b21361903/osd-block-8a7f37f3-fddf-41b9-91a8-a3672cf7d187 Tags:{OSDFSID:8a7f37f3-fddf-41b9-91a8-a3672cf7d187 Encrypted:1 ClusterFSID:d4295205-512f-4fe6-8fb7-16df5bfc542f CrushDeviceClass:hdd} Type:block} {Name:metadata-0-0 Path:/dev/vg-metadata-0/metadata-0-0 Tags:{OSDFSID:8a7f37f3-fddf-41b9-91a8-a3672cf7d187 Encrypted:1 ClusterFSID:d4295205-512f-4fe6-8fb7-16df5bfc542f CrushDeviceClass:hdd} Type:db}]
2021-07-13 13:55:08.286077 I | cephosd: 1 ceph-volume lvm osd devices configured on this node
2021-07-13 13:55:08.286125 D | exec: Running command: stdbuf -oL ceph-volume --log-path /tmp/ceph-log raw list --format json
2021-07-13 13:55:08.848602 D | cephosd: {
    "8a7f37f3-fddf-41b9-91a8-a3672cf7d187": {
        "ceph_fsid": "d4295205-512f-4fe6-8fb7-16df5bfc542f",
        "device": "/dev/mapper/dZ0DLF-DDta-0fAF-Y1X5-ALj5-0KSG-q12DRC",
        "osd_id": 0,
        "osd_uuid": "8a7f37f3-fddf-41b9-91a8-a3672cf7d187",
        "type": "bluestore"
    }
}
2021-07-13 13:55:08.850920 D | exec: Running command: lsblk /dev/mapper/dZ0DLF-DDta-0fAF-Y1X5-ALj5-0KSG-q12DRC --bytes --nodeps --pairs --paths --output SIZE,ROTA,RO,TYPE,PKNAME,NAME,KNAME
2021-07-13 13:55:08.855968 D | exec: Running command: sgdisk --print /dev/mapper/dZ0DLF-DDta-0fAF-Y1X5-ALj5-0KSG-q12DRC
2021-07-13 13:55:08.861942 I | cephosd: setting device class "hdd" for device "/dev/mapper/dZ0DLF-DDta-0fAF-Y1X5-ALj5-0KSG-q12DRC"
2021-07-13 13:55:08.862453 I | cephosd: 1 ceph-volume raw osd devices configured on this node
2021-07-13 13:55:08.862785 I | cephosd: devices = [{ID:0 Cluster:ceph UUID:8a7f37f3-fddf-41b9-91a8-a3672cf7d187 DevicePartUUID: DeviceClass:hdd BlockPath:/dev/vg-metadata-0/metadata-0-0 MetadataPath: WalPath: SkipLVRelease:false Location:root=default host=rescue-52-54-98-76-54-33 LVBackedPV:false CVMode:lvm Store:bluestore TopologyAffinity:}]
rook-ceph            csi-rbdplugin-provisioner-869c555b7-6c4m8              6/6     Running     0          9m1s
rook-ceph            csi-rbdplugin-x9rpp                                    3/3     Running     0          9m2s
rook-ceph            rook-ceph-mgr-a-fcf7b8447-5jgm2                        1/1     Running     0          8m43s
rook-ceph            rook-ceph-mon-a-5fbcd6ccf8-2kkl5                       1/1     Running     4          8m57s
rook-ceph            rook-ceph-operator-7498b6fc88-lz86x                    1/1     Running     0          9m17s
rook-ceph            rook-ceph-osd-0-6958fbbc7f-2pvld                       1/1     Running     3          7m47s
rook-ceph            rook-ceph-osd-prepare-rescue-52-54-98-76-54-33-7fb8q   0/1     Completed   0          8m36s
rook-ceph            rook-ceph-rgw-object-store-a-6ccfb5fcd5-spftd          1/1     Running     3          7m4s
rook-ceph            rook-discover-gxmbt                                    1/1     Running     0          9m13s

As is evident, the latter case (with an LV) works, while the former (with a partition) doesn't, at least with Ceph v16.2.4.

This means that, when managing Ceph with rook-ceph, I can only fully utilize my hardware by specifying an LV as metadataDevice.

(These are just VM config/test results, as testing on bare metal takes too much time. The kernel/userland/rook-ceph/Kubernetes/application stack is identical.)

lyind commented Jul 13, 2021

I know current and future cluster builds may not want to use LVM anymore, especially on NVMe-only nodes.

But for the time being, a small fix here (see the linked PR) can allow lower-cost builds to operate.

travisn commented Jul 13, 2021

Thanks for all the detailed background, it makes sense to go with #8267. At least it's a simple fix!
