How to reproduce it (as minimally and precisely as possible):
Deploy Network Operator on RHEL 8.8 hosts with a valid RHEL subscription.
Anything else we need to know?:
Tried the option to specify a private repo: ofedDriver.repoConfig.name
The network-operator pod log shows this error:
2024-04-10T16:39:52Z ERROR Error while syncing state {"controller": "nicclusterpolicy", "controllerGroup": "mellanox.com", "controllerKind": "NicClusterPolicy", "NicClusterPolicy": {"name":"nic-cluster-policy"}, "namespace": "", "name": "nic-cluster-policy", "reconcileID": "d09bbc74-ce62-4fe4-9ccc-99838b245ed3", "error": "failed to create k8s objects from manifest: failed to get destination directory for custom repo config: distribution not supported", "errorVerbose": "failed to get destination directory for custom repo config: distribution not supported\nfailed to create k8s objects from manifest\ngithub.com/Mellanox/network-operator/pkg/state.(*stateOFED).Sync\n\t/workspace/pkg/state/state_ofed.go:270\ngithub.com/Mellanox/network-operator/pkg/state.(*stateManager).SyncState\n\t/workspace/pkg/state/manager.go:92\ngithub.com/Mellanox/network-operator/controllers.(*NicClusterPolicyReconciler).Reconcile\n\t/workspace/controllers/nicclusterpolicy_controller.go:144\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:235\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598"}
github.com/Mellanox/network-operator/pkg/state.(*stateManager).SyncState
/workspace/pkg/state/manager.go:101
github.com/Mellanox/network-operator/controllers.(*NicClusterPolicyReconciler).Reconcile
/workspace/controllers/nicclusterpolicy_controller.go:144
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:122
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:323
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:274
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:235
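For readers unfamiliar with the two options named in this report, here is a minimal sketch of how they might appear in a Helm values file. The nesting is inferred from the dotted option names (ofedDriver.deploy, ofedDriver.repoConfig.name). The ConfigMap name is hypothetical, since the report does not say which name was used.

```yaml
# Sketch only: values layout inferred from the option names in the report.
ofedDriver:
  deploy: true
  repoConfig:
    # Hypothetical name of a ConfigMap holding the private repo definition;
    # the actual name used in the report is not given.
    name: "custom-repo-config"
```

With repoConfig.name set on RHEL 8.8, the operator reported "distribution not supported", as the log shows.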
Got the OFED driver to install successfully by patching the mofed-rhel8.8-ds daemonset and adding these volumeMounts/volumes entries:
REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_BUGZILLA_PRODUCT_VERSION=8.8
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.8"
What happened:
Deployed Network Operator on RHEL 8.8 hosts with the option ofedDriver.deploy set to true.
The OFED driver pods fail due to error:
What you expected to happen:
The OFED driver install should succeed on RHEL 8.8.
The release notes for Network Operator v23.10.0 state that RHEL 8.8 is supported: https://docs.nvidia.com/networking/display/kubernetes2310/release+notes
Logs:
kubectl -n nvidia-network-operator get -A:
Network Operator version: v23.10.0
Logs of Network Operator controller:
mofed-rhel8.8-ds-h5bcr-success.log
mofed-rhel8.8-ds-rqbfp-crash.log
network-operator-6444bc476f-g22tf.log
Logs of the various Pods in nvidia-network-operator namespace:
Helm Configuration (if applicable): custom-values.yaml
kubectl get node -o yaml:
Environment:
Kubernetes version (kubectl version): v1.27.10
OS (cat /etc/os-release):
Kernel (uname -a): 4.18.0-477.10.1.el8_8.x86_64