CI Failure (key symptom) in RpkPluginTest.test_managed_byoc #18334

Open
vbotbuildovich opened this issue May 9, 2024 · 11 comments
Labels: auto-triaged, ci-failure

Comments

@vbotbuildovich
Collaborator

vbotbuildovich commented May 9, 2024

https://buildkite.com/redpanda/vtools/builds/13607

Module: rptest.tests.rpk_plugin_test
Class: RpkPluginTest
Method: test_managed_byoc
test_id:    RpkPluginTest.test_managed_byoc
status:     FAIL
run time:   3.138 seconds

RpkException('command /opt/redpanda/bin/rpk cloud byoc aws apply --redpanda-id test_id -v returned 127, output: ', '10:36:15.587  DEBUG  looking for existing byoc plugin  {"exists": true}\n10:36:15.587  WARN  overriding byoc plugin version check. RPK_CLOUD_SKIP_VERSION_CHECK is enabled\n/usr/local/bin/.rpk.managed-byoc: symbol lookup error: /opt/redpanda/rpk-fips/lib/libc.so.6: undefined symbol: _dl_fatal_printf, version GLIBC_PRIVATE\n', 127, '')
Traceback (most recent call last):
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/runner_client.py", line 184, in _do_run
    data = self.run_test()
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/runner_client.py", line 276, in run_test
    return self.test_context.function(self.test)
  File "/home/ubuntu/redpanda/tests/rptest/services/cluster.py", line 103, in wrapped
    r = f(self, *args, **kwargs)
  File "/home/ubuntu/redpanda/tests/rptest/tests/rpk_plugin_test.py", line 40, in test_managed_byoc
    out = self._rpk.cloud_byoc_aws_apply(redpanda_id=test_id,
  File "/home/ubuntu/redpanda/tests/rptest/clients/rpk.py", line 1548, in cloud_byoc_aws_apply
    out = self._execute(cmd, env=envs)
  File "/home/ubuntu/redpanda/tests/rptest/clients/rpk.py", line 1147, in _execute
    raise RpkException(
rptest.clients.rpk.RpkException: RpkException<command /opt/redpanda/bin/rpk cloud byoc aws apply --redpanda-id test_id -v returned 127, output: ; stderr: 10:36:15.587  DEBUG  looking for existing byoc plugin  {"exists": true}
10:36:15.587  WARN  overriding byoc plugin version check. RPK_CLOUD_SKIP_VERSION_CHECK is enabled
/usr/local/bin/.rpk.managed-byoc: symbol lookup error: /opt/redpanda/rpk-fips/lib/libc.so.6: undefined symbol: _dl_fatal_printf, version GLIBC_PRIVATE
; returncode: 127>

JIRA Link: CORE-2849

vbotbuildovich added the auto-triaged and ci-failure labels May 9, 2024
@piyushredpanda
Contributor

Assuming this is an infra issue. If not, please reassign to Core, @jackietung-redpanda (cc: @michael-redpanda)

@jackietung-redpanda
Contributor

This is a real issue for rpk-fips. When a plugin (such as this mocked-out byoc plugin) is exec'd by rpk-fips, it inherits LD_LIBRARY_PATH=/opt/redpanda/rpk-fips/lib, which makes the plugin also try to use the libc.so.6 that we ship alongside rpk-fips (and that is meant for exclusive use by rpk-fips).

This reveals that, in general, plugins (which could be any old executable provided by the user) probably should not load the shared libs that we ship as part of rpk-fips. If we agree with this premise, we should change the rpk code to explicitly not propagate LD_LIBRARY_PATH when we exec plugins. I think this is the way to go, based on the newbie context I have. Team, please chime in. cc @twmb @michael-redpanda
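To make the proposal concrete, here is a minimal Go sketch of dropping LD_LIBRARY_PATH from the environment handed to a plugin. It is not rpk's actual code: `envWithout` and `pluginCommand` are hypothetical helpers, and it assumes the plugin is started through `os/exec`.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// envWithout returns a copy of env (entries in "KEY=value" form) with every
// entry for key removed. Keys that merely share a prefix are kept.
func envWithout(env []string, key string) []string {
	prefix := key + "="
	out := make([]string, 0, len(env))
	for _, kv := range env {
		if strings.HasPrefix(kv, prefix) {
			continue
		}
		out = append(out, kv)
	}
	return out
}

// pluginCommand is a hypothetical helper (an assumption, not rpk's API): it
// prepares a plugin process that inherits the caller's environment except
// for LD_LIBRARY_PATH, so the plugin resolves its own shared libraries.
func pluginCommand(path string, args ...string) *exec.Cmd {
	cmd := exec.Command(path, args...)
	cmd.Env = envWithout(os.Environ(), "LD_LIBRARY_PATH")
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	return cmd
}

func main() {
	env := []string{
		"PATH=/usr/bin",
		"LD_LIBRARY_PATH=/opt/redpanda/rpk-fips/lib",
		"HOME=/root",
	}
	fmt.Println(envWithout(env, "LD_LIBRARY_PATH"))
}
```

Filtering by the exact `KEY=` prefix keeps unrelated variables intact. Whether other loader variables (for example LD_PRELOAD) should also be dropped is a separate decision that this sketch does not make.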

Alternatively, we could completely shut off plugin capabilities for rpk-fips, though this seems like an excessive restriction for the end user. Such a shutoff would also require a code change in rpk, to give clear user messaging specific to FIPS mode.

Notes:

  • I did not figure out exactly why our shipped libc.so.6 is missing symbols that this specific mock plugin wants, but this is secondary IMO. It will be hard to guarantee the compatibility of our .so files with all possible plugins anyway.
  • FIPS product question: Do customers want to be able to run plugins? To maintain compliance, THOSE plugins have to be independently compliant also.
  • Why is the byoc plugin mock dynamically loading things? Golang is supposed to be always statically linked? I learned that it's a lie :). https://utcc.utoronto.ca/~cks/space/blog/programming/GoWhyNotStaticLinked
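On the last note: one way to check whether a given binary depends on the dynamic loader is to read its ELF dynamic section. The sketch below uses Go's standard `debug/elf` package; `importedLibraries` is a hypothetical helper name, and the tool is only meaningful for ELF (Linux) binaries. An empty result suggests, but does not prove, a statically linked binary.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// importedLibraries returns the shared libraries (DT_NEEDED entries) that an
// ELF binary asks the dynamic loader to resolve, e.g. "libc.so.6".
func importedLibraries(path string) ([]string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	return f.ImportedLibraries()
}

func main() {
	// Inspect the path given as the first argument, or this program itself.
	path, err := os.Executable()
	if len(os.Args) > 1 {
		path, err = os.Args[1], nil
	}
	if err != nil {
		fmt.Println("cannot determine binary path:", err)
		return
	}
	libs, err := importedLibraries(path)
	if err != nil {
		fmt.Println("not a readable ELF binary:", err)
		return
	}
	if len(libs) == 0 {
		fmt.Println(path, "has no DT_NEEDED entries (likely statically linked)")
		return
	}
	for _, lib := range libs {
		fmt.Println(path, "needs", lib)
	}
}
```

Go binaries commonly end up dynamically linked when cgo is involved, for example through the default resolver in the `net` package; building with `CGO_ENABLED=0` usually avoids that. Whether the mock plugin in this test was built that way is not stated in this thread.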

@twmb
Contributor

twmb commented May 16, 2024

Echoing comment from Slack -- I think we can gate plugin stuff on a build tag. This will require a separate root.go file in the rpk source tree; currently the file that installs all commands also has a huge chunk dedicated to plugin discovery.
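A rough sketch of what gating on a build tag could look like, using Go build constraints. The tag name `fips`, the file names, and the `installPlugins` function are all assumptions made for illustration; they are not taken from the rpk source tree.

```go
//go:build !fips

// Hypothetical file plugins_enabled.go: compiled into regular builds only.
//
// A counterpart file, e.g. plugins_disabled.go, would start with
// "//go:build fips" and define the same names with pluginsSupported = false
// and an installPlugins that registers nothing, so FIPS builds contain no
// plugin discovery or exec code at all.
package main

import "fmt"

// pluginsSupported reports whether this build can discover and run plugins.
const pluginsSupported = true

// installPlugins stands in for plugin discovery: it calls register once per
// plugin found. The plugin list here is a placeholder, not a real lookup.
func installPlugins(register func(name string)) {
	for _, name := range []string{"byoc"} {
		register(name)
	}
}

func main() {
	var cmds []string
	installPlugins(func(name string) { cmds = append(cmds, name) })
	fmt.Println("plugins supported:", pluginsSupported, "registered:", cmds)
}
```

With this layout, `go build` produces the regular binary and `go build -tags fips` would select the counterpart file instead. The shared command-installation code only ever calls `installPlugins`, so it does not need to know which variant was compiled in.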

@piyushredpanda
Contributor

Wonder if we have a path forward here to close this out, @jackietung-redpanda ?

@jackietung-redpanda
Contributor

Yes, we do. The plan is to conditionally compile out the plugin path for FIPS builds. However, this work is considered non-blocking for the FIPS projects and will be prioritized accordingly.

In the meantime, I plan to merge #18577 to clean up test reports. I think that will be sufficient to close out this issue.
