osd: update existing OSDs with deviceClass #9259

Merged
merged 1 commit into from Nov 29, 2021

parth-gr
Member

If useAllNodes is set to false on an existing deployment,
the OSDs should be updated with the per-node values and config.
The deviceClass was not being updated on existing OSDs because of a
bug in the check.
The check tested only osdInfo.DeviceClass == "", but it must also
cover osdInfo.DeviceClass == "None".

Updated the code so OSDs can make use of the devices present.

Signed-off-by: parth-gr <paarora@redhat.com>

Description of your changes:

Which issue is resolved by this Pull Request:
Resolves #

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Skip Tests for Docs: Add the flag for skipping the build if this is only a documentation change. See here for the flag.
  • Skip Unrelated Tests: Add a flag to run tests for a specific storage provider. See test options.
  • Reviewed the developer guide on Submitting a Pull Request
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.
  • Pending release notes updated with breaking and/or notable changes, if necessary.
  • Upgrade from previous release is tested and upgrade user guide is updated, if necessary.
  • Code generation (make codegen) has been run to update object specifications, if necessary.

@@ -126,7 +126,7 @@ func (c *updateConfig) updateExistingOSDs(errs *provisionErrors) {
 }

 // backward compatibility for old deployments
-if osdInfo.DeviceClass == "" {
+if osdInfo.DeviceClass == "" || osdInfo.DeviceClass == "None" {
Member

As mentioned in the huddle, I think using ",omitempty" might solve this better than checking for None. Also, I cannot reproduce why we would get None: https://go.dev/play/p/xKyIFkmrIKl
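
For reference, a minimal sketch of what ",omitempty" does in encoding/json. The osdInfo struct here is a hypothetical stand-in, not Rook's real type. The option only affects marshalling (an empty field is dropped from the output); it does not change how a literal "None" string is decoded, which may be why adding it did not help.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// osdInfo is a hypothetical stand-in for illustration, not Rook's real type.
type osdInfo struct {
	ID          int    `json:"id"`
	DeviceClass string `json:"device_class,omitempty"`
}

// encode marshals an osdInfo; with omitempty, an empty DeviceClass is left out.
func encode(o osdInfo) string {
	b, err := json.Marshal(o)
	if err != nil {
		return ""
	}
	return string(b)
}

// decode unmarshals JSON into an osdInfo; omitempty plays no role here.
func decode(s string) (osdInfo, error) {
	var o osdInfo
	err := json.Unmarshal([]byte(s), &o)
	return o, err
}

func main() {
	fmt.Println(encode(osdInfo{ID: 2}))                      // {"id":2}
	fmt.Println(encode(osdInfo{ID: 2, DeviceClass: "None"})) // {"id":2,"device_class":"None"}

	o, _ := decode(`{"id":2,"device_class":"None"}`)
	fmt.Println(o.DeviceClass == "") // false
}
```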

Member Author

I added ",omitempty" in these two places:

DeviceClass string `json:"device-class"`

DeviceClass string `json:"device_class"`

But I still see that osdInfo.DeviceClass is set to None.

Member

Hmm, I'd be curious to trace this back and really understand why the command returns None. That shouldn't be the case.

Member Author

@leseb, I traced it: the None value comes from the environment variable.

if envVar.Name == osdDeviceClassEnvVarName {
	osd.DeviceClass = envVar.Value
}
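
A self-contained sketch of the lookup quoted above, assuming a simplified envVar type in place of the Kubernetes one and the variable name ROOK_OSD_DEVICE_CLASS (taken from later in this thread). The helper deviceClassFromEnv is hypothetical. Whatever string was stored in the OSD deployment's environment is copied verbatim, so a stored "None" comes back as "None" rather than "".

```go
package main

import "fmt"

// envVar is a simplified, hypothetical stand-in for a container env entry.
type envVar struct {
	Name  string
	Value string
}

// Name of the variable as it appears later in this thread.
const osdDeviceClassEnvVarName = "ROOK_OSD_DEVICE_CLASS"

// deviceClassFromEnv returns the value stored under the device class
// variable, or "" if the variable is absent. The value is not normalized.
func deviceClassFromEnv(env []envVar) string {
	for _, e := range env {
		if e.Name == osdDeviceClassEnvVarName {
			return e.Value
		}
	}
	return ""
}

func main() {
	env := []envVar{
		{Name: "EXAMPLE_OTHER_VAR", Value: "2"}, // hypothetical extra entry
		{Name: osdDeviceClassEnvVarName, Value: "None"},
	}
	fmt.Printf("%q\n", deviceClassFromEnv(env)) // "None"
}
```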

Member

Thanks. Weird, it is not supposed to, since it defaults to "".

Member

@leseb leseb Nov 29, 2021

Member

Actually, this is the culprit: https://github.com/rook/rook/blob/master/pkg/daemon/ceph/osd/volume.go#L85

See:

  [block]       /dev/ceph-a8ae9f01-a440-4a0e-8e4d-592d3bce3a9d/osd-block-058f4926-6ee7-4165-9a55-db5a1fd22d2f

      block device              /dev/ceph-a8ae9f01-a440-4a0e-8e4d-592d3bce3a9d/osd-block-058f4926-6ee7-4165-9a55-db5a1fd22d2f
      block uuid                cr4xnI-h1sx-4DT3-ZC2N-0kI9-RtZF-Nkr3q1
      cephx lockbox secret      
      cluster fsid              f20931eb-9336-4234-a5c7-b0b44ab8c07a
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  058f4926-6ee7-4165-9a55-db5a1fd22d2f
      osd id                    2
      osdspec affinity          
      type                      block
      vdo                       0
      devices                   /dev/vdd

With crush device class None.
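
To illustrate how the literal string can end up in the OSD info, here is a rough sketch that pulls the "crush device class" field out of text shaped like the listing above. This is not the code at the linked line; the function name crushDeviceClass and the line-based parsing are assumptions for illustration only.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// crushDeviceClass scans listing text for the "crush device class" row and
// returns whatever follows the label, trimmed. It returns "" when the row
// is missing or has no value.
func crushDeviceClass(listing string) string {
	const label = "crush device class"
	sc := bufio.NewScanner(strings.NewReader(listing))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if strings.HasPrefix(line, label) {
			return strings.TrimSpace(strings.TrimPrefix(line, label))
		}
	}
	return ""
}

func main() {
	listing := `
      cluster name              ceph
      crush device class        None
      encrypted                 0
`
	fmt.Printf("%q\n", crushDeviceClass(listing)) // "None"
}
```

The point of the sketch is that the word None arrives as ordinary text, so a comparison against "" alone never matches it.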

In the end, it's valid to check for None since we don't control this behavior!

Member Author

@parth-gr parth-gr Nov 29, 2021

I checked this while testing, and it doesn't make any difference.

I see that the ROOK_OSD_DEVICE_CLASS environment variable is returned as {ROOK_OSD_DEVICE_CLASS None nil}; this might be the actual problem.

Member Author

Thanks, got it.
This is because of how Ceph returns the ceph-volume output.

Member

Please leave a comment in the code to explain why we need to check against None, and add a link to this tracker: https://tracker.ceph.com/issues/53425
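
One possible shape for the requested comment, written as a small standalone helper rather than the inline condition in the diff. The helper name deviceClassUnset is hypothetical; the two values it checks and the tracker link come from this thread.

```go
package main

import "fmt"

// deviceClassUnset reports whether an OSD has no usable device class.
//
// ceph-volume reports an unset crush device class as the literal string
// "None" rather than an empty string, so both must be treated as unset.
// See https://tracker.ceph.com/issues/53425
func deviceClassUnset(deviceClass string) bool {
	return deviceClass == "" || deviceClass == "None"
}

func main() {
	for _, dc := range []string{"", "None", "hdd"} {
		fmt.Printf("%q unset=%v\n", dc, deviceClassUnset(dc))
	}
}
```

Keeping the comparison in one named place means the explanation and the tracker link live next to the only code that depends on this ceph-volume behavior.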

pkg/daemon/ceph/client/osd.go (outdated, resolved)
pkg/operator/ceph/cluster/osd/osd.go (outdated, resolved)

@leseb leseb merged commit 83f7c2b into rook:master Nov 29, 2021
mergify bot added a commit that referenced this pull request Nov 29, 2021
osd: update existing OSDs with deviceClass (backport #9259)