Resolve device names for attached Cinder volumes #1240

Open · wants to merge 1 commit into master
Conversation

@rionpy commented Jun 19, 2018

As highlighted in #1183, Cinder suggests a device name for a volume, but the hypervisor does not always use it. The issue is reliably reproducible with a force detach:

  1. Mount Cinder volume on instance
  2. Force detach volume from outside instance
  3. Re-mount volume on instance

Doing this leads to Cinder always offering the same device name while the hypervisor picks something else (possibly because the device was lost abruptly).

This patch aims to fix that by resolving the device names: the volume IDs are matched against the virtio-XXXXXXXX-XXXX-XXXX-X symlinks created under /dev/disk/by-id (adapted from ederst@0f57fc7).

This patch aims to resolve the issue with hypervisors not honoring device names suggested by Cinder,
by resolving the device names from the symlinks created under /dev/disk/by-id
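
For illustration, the resolution the patch describes might look roughly like the sketch below. This is not the actual diff: the function name resolveDeviceName and the 20-character truncation match the getDeviceLink hunk further down, but the fallback to Cinder's suggested name is an assumption.

package cinder

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveDeviceName follows the virtio by-id symlink for a volume and
// returns the real device node (e.g. /dev/vdb). If the link is missing
// or unreadable, it falls back to the device name suggested by Cinder
// (an assumed behavior, not taken from the patch).
func resolveDeviceName(volumeID, suggested string) string {
	// udev exposes only a truncated serial, so just a prefix of the
	// volume ID appears in the link name.
	serial := volumeID
	if len(serial) > 20 {
		serial = serial[:20]
	}
	link := fmt.Sprintf("/dev/disk/by-id/virtio-%s", serial)
	if _, err := os.Lstat(link); err != nil {
		return suggested
	}
	if dev, err := filepath.EvalSymlinks(link); err == nil {
		return dev
	}
	return suggested
}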
@CLAassistant commented Jun 19, 2018

CLA assistant check
All committers have signed the CLA.

@codecov-io commented Jun 19, 2018

Codecov Report

Merging #1240 into master will increase coverage by 0.29%.
The diff coverage is n/a.


@@            Coverage Diff             @@
##           master    #1240      +/-   ##
==========================================
+ Coverage   34.25%   34.54%   +0.29%     
==========================================
  Files          36       36              
  Lines        2362     2362              
==========================================
+ Hits          809      816       +7     
+ Misses       1450     1442       -8     
- Partials      103      104       +1
Impacted Files                                           Coverage Δ
libstorage/api/types/types_localdevices.go               84.21% <0%> (-1.76%) ⬇️
...storage/drivers/storage/vfs/storage/vfs_storage.go    42.15% <0%> (+2.61%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 6ed0b78...4015cf8.

@bartmatthaei commented
When can we expect this pull request to come through? We're really waiting for this fix :-)

@codenrhoden (Member) left a comment

My concern here is that this only works when running REX-Ray in an all-in-one configuration, but not in client/server mode.

Additionally, does this only work when using KVM+virtio? I think part of the reason there isn't specific path parsing right now is that OpenStack can be running multiple hypervisors, and potentially different device drivers, meaning there's no guarantee of which specific device path you can rely on (e.g. /dev/disk/by-id/virtio-*). I mean that as a question, as I don't actually know.

If it's true that volumes could show up in multiple places, but KVM+virtio is the most common, we could make this configurable and also make it the default. Thoughts?

	return volume, resolveDeviceName(volumeID, volumeAttach.Device), nil
}

func getDeviceLink(
Member
Resolving devices that are on the node by peeking in /dev cannot be done in this code, as it may not be running on the same node. REX-Ray supports a client/server mode of operation, where a single REX-Ray instance running in server mode is configured with the Cinder API credentials, while all client nodes just talk to that master REX-Ray instance.

Any code that is specific to a single node/instance needs to live in the executor, not here.

The way this should be handled is in the LocalDevices function in the executor. That function returns a map of attached devices, from their actual device path to their volume ID. Right now it returns all attached devices with an empty volume ID. It would need to be modified to extract the volume ID from your virtio path, as used in getDeviceLink below.
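
For illustration, such an executor-side scan might look like the sketch below. This is hypothetical code, not the actual libStorage LocalDevices signature; localVirtioDevices is an invented helper that only shows how the volume-ID serial could be extracted from each virtio-* link and resolved to a device path.

package cinder

import (
	"os"
	"path/filepath"
	"strings"
)

// localVirtioDevices scans /dev/disk/by-id and returns a map from the
// (truncated) volume-ID serial embedded in each virtio-* link name to
// the device it points at, e.g. "<volume-id-prefix>" -> "/dev/vdb".
func localVirtioDevices() (map[string]string, error) {
	const byID = "/dev/disk/by-id"
	entries, err := os.ReadDir(byID)
	if err != nil {
		return nil, err
	}
	devices := map[string]string{}
	for _, e := range entries {
		name := e.Name()
		if !strings.HasPrefix(name, "virtio-") {
			continue
		}
		// Resolve the symlink to the actual device node (/dev/vdX).
		dev, err := filepath.EvalSymlinks(filepath.Join(byID, name))
		if err != nil {
			continue // broken link; skip it
		}
		devices[strings.TrimPrefix(name, "virtio-")] = dev
	}
	return devices, nil
}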

func getDeviceLink(
	volumeID string) string {

	return fmt.Sprintf("/dev/disk/by-id/virtio-%s", volumeID[:20])
Member
The bit about the executor aside, is this universal across OpenStack/Cinder? I think one of the reasons things are the way they are now is that not everything uses virtio, correct? What if this were running on a Xen hypervisor? Would this path be the same?

I recognize things are broken right now, and we should target the most common use case, which is probably KVM with virtio. But if that's not universal, this needs to be configurable behavior.

@bartmatthaei

Wouldn't it be sufficient for now to use /dev/disk/by-id/virtio-%s if it exists (in which case virtio is obviously in use), and fall back to /dev/<devicename> if not?

From what I could find, using Cinder with Xen does not make the volume ID available to the OS, so if Xen + Cinder faces similar issues with device-name ordering after read errors, there is no way to fix it in rexray...


virtio-%s won't work in all cases; for example, in my case the devices are named:

/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_2afa0091-d1c5-46e2-9
/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_426c308d-a355-4cca-a
/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_4dfda350-9cd9-4901-8
/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_52645ecc-2ac0-4534-a
/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_5569888b-5d88-4a57-9
/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_698a3ded-990a-420b-8

(Openstack from OVH)
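
Because those link names truncate the UUID, a lookup would have to match on a volume-ID prefix rather than the full ID. A sketch of what a more tolerant match could look like (hypothetical helper; the scsi-0QEMU_QEMU_HARDDISK_ prefix is taken from the listing above, and other naming schemes may exist):

package cinder

import "strings"

// knownPrefixes lists /dev/disk/by-id naming schemes observed so far;
// the serial following each prefix is a truncated Cinder volume ID.
var knownPrefixes = []string{
	"virtio-",
	"scsi-0QEMU_QEMU_HARDDISK_",
}

// matchesVolume reports whether a by-id link name refers to the given
// volume. The embedded serial is truncated, so compare by prefix.
func matchesVolume(linkName, volumeID string) bool {
	for _, p := range knownPrefixes {
		serial := strings.TrimPrefix(linkName, p)
		if serial != linkName && serial != "" {
			return strings.HasPrefix(volumeID, serial)
		}
	}
	return false
}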


ICYMI, for OVH/Openstack I hacked a variant and made the plugin available here: https://hub.docker.com/r/hervenicol/rexray.


...it appears the hack is not enough; we still have a lot of issues: the wrong volume getting mounted, volumes being created even though they already exist, volumes failing to mount...
So I decided to move to a simpler, Cinder-dedicated plugin: https://github.com/hervenicol/docker-plugin-cinder

It means a lot less code, which is easier to understand, debug, and evolve.
It seems rexray was overkill for my use case, and on top of that too complex for me to maintain or fix by myself.

@@ -330,7 +333,7 @@ func translateVolume(
 	libstorageAttachment := &types.VolumeAttachment{
 		VolumeID: attachment.VolumeID,
 		InstanceID: &types.InstanceID{ID: attachment.ServerID, Driver: cinder.Name},
-		DeviceName: attachment.Device,
+		DeviceName: resolveDeviceName(attachment.VolumeID, attachment.Device),
Member

This instead needs to match the volume ID against those found in the LocalDevices map (which right now is empty) and pull the device name from that. This code may be running on a different node than the one where the volume is attached.

Each driver has to do this a bit differently, but here is an example from EBS and ScaleIO.
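
For illustration, the storage-driver side of that lookup might reduce to something like the sketch below, assuming the client's device map has been plumbed through as a plain map from serial to device path (the real libStorage types and plumbing are not shown; deviceNameFor is an invented helper):

package cinder

import "strings"

// deviceNameFor looks a volume up in the node-reported device map and
// returns its real device path, falling back to the name Cinder
// suggested. Map keys may be truncated serials (see the by-id listings
// above), so compare by prefix rather than exact match.
func deviceNameFor(localDevices map[string]string, volumeID, suggested string) string {
	for serial, dev := range localDevices {
		if serial != "" && strings.HasPrefix(volumeID, serial) {
			return dev
		}
	}
	return suggested
}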

@ederst commented Aug 3, 2018

Hi there @codenrhoden,
So to recap, just so I understand this: the currently proposed change won't work for all deployments of REX-Ray, because components can run on different nodes and, of course, not everyone is using KVM+virtio (this was probably also the reason the earlier attempts were rejected, I guess).

To fix this, the following needs to be done (feel free to correct me if I'm wrong):

  • move the virtio device resolution to the executor (cinder_executor.go -> LocalDevices function), since that runs on the machine where the device will be mounted
  • get the device information in cinder_storage.go from the map generated by LocalDevices
  • make the KVM+virtio handling configurable (or fall back to the normal Cinder device resolution when no virtio device is found, as suggested by @bartmatthaei)

I could try to give this a go, but since my current workload is at an all-time high (at least it feels that way), it will probably take at least a month (needless to say, I'm not the most proficient Go programmer out there).
