Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for --remove-installed-kernel #4322

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

cgwalters
Copy link
Member

Closes: #2542

Today anyone wanting to switch to a different kernel variant such as kernel-rt must manually pass all the current kernel packages to --uninstall (separately). It's not just very unergonomic, it also makes the calling code operating system dependent because as just happened when C9S added a kernel-modules-core package, that suddenly also needs to be specified.

xref https://issues.redhat.com/browse/OCPBUGS-8113

@cgwalters cgwalters force-pushed the remove-kernel branch 2 times, most recently from 9d705f5 to 8355e69 Compare March 2, 2023 15:18
Copy link
Member

@jlebon jlebon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems sane though I wonder if a cleaner UX would be to have a dedicated override kernel-replace command which knows to remove all the base kernel packages. The major advantage of this is that it would be the same CLI API regardless of whether it's changing kernel types or not.

rust/src/treefile.rs Outdated Show resolved Hide resolved
rust/src/treefile.rs Outdated Show resolved Hide resolved
rust/src/treefile.rs Outdated Show resolved Hide resolved
@cgwalters
Copy link
Member Author

Seems sane though I wonder if a cleaner UX would be to have a dedicated override kernel-replace command which knows to remove all the base kernel packages. The major advantage of this is that it would be the same CLI API regardless of whether it's changing kernel types or not.

Can you elaborate? "Same CLI API" as what?

jlebon
jlebon previously approved these changes Mar 2, 2023
Copy link
Member

@jlebon jlebon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate? "Same CLI API" as what?

So this is useful if you're e.g. changing from kernel to kernel-rt, where you can now do rpm-ostree install --remove-installed-kernel kernel-rt*.rpm. But if you're changing kernel versions and not type, you still have to use rpm-ostree override replace kernel*.rpm. So driving software have to know which path to take (by e.g. inspecting the state of the host first). I think for the MCO that's not an issue since it knows already based on the MCs what kernel type the host on. Though humans directly interacting with rpm-ostree still have to understand the difference between the two.

If we had a rpm-ostree override kernel-replace instead, we could have it work regardless of the situation. rpm-ostree can easily know whether it's an override replace or an override remove + install.

But anyway, cool with this too, and it obviously doesn't preclude doing that down the line!

@cgwalters
Copy link
Member Author

@cgwalters cgwalters marked this pull request as draft March 6, 2023 14:14
@cgwalters
Copy link
Member Author

OK this PR is not critical path; I ended up with a workaround in openshift/machine-config-operator#3580 that is probably fine for now.

@cgwalters
Copy link
Member Author

But if you're changing kernel versions and not type, you still have to use rpm-ostree override replace kernel*.rpm.

Isn't this though just a specific instance of us forcing users to explicitly specify whether they're doing an override or an install?

I guess there's an interesting angle here of whether actually we should think of kernel and kernel-rt as "the same package" even though mechanically they're not.

I think if I could redesign things here honestly I feel the "override" versus "install" split we force has not carried its weight. Really of course this is just arguing that we should support dnf install kernel-rt and it just Does The Right Thing and also removes kernel, same as dnf install ./kernel-6.1.2*.rpm would also just DTRT if you have kernel and do an override etc.

This all said I do actually think bigger picture the way we in FCOS-derivatives handle kernel-rt is suboptimal - it's sufficiently important that IMO it argues for introducing an new base image. I did so in cgwalters/bootc-demo-base-images@68afb07

And if we do that, then from the user perspective at least it's always consistently override replace to do hotfixes today, and the use case of "switch from throughput to latency" drops out except for unusual cases.

@jlebon
Copy link
Member

jlebon commented Mar 7, 2023

But if you're changing kernel versions and not type, you still have to use rpm-ostree override replace kernel*.rpm.

Isn't this though just a specific instance of us forcing users to explicitly specify whether they're doing an override or an install?

I guess there's an interesting angle here of whether actually we should think of kernel and kernel-rt as "the same package" even though mechanically they're not.

Right, exactly. And it'd be nice if status showed this better than awkwardly noting it as independent installs and removals.

I think if I could redesign things here honestly I feel the "override" versus "install" split we force has not carried its weight.

Can you expand on this? The split between the two has been a useful way IMO to think about package layering. Though I could imagine having only install and uninstall and map those accordingly while still preserving the status output. We could still do that, but we'd need to work through some details.

This all said I do actually think bigger picture the way we in FCOS-derivatives handle kernel-rt is suboptimal - it's sufficiently important that IMO it argues for introducing an new base image.

👍 overall to the idea. I think that was considered early on when kernel-rt support was added in RHCOS but I don't remember why we went with the override replace approach (probably because it was easier than shipping multiple images).

@cgwalters
Copy link
Member Author

Right, exactly. And it'd be nice if status showed this better than awkwardly noting it as independent installs and removals.

I think we lead towards getting that effect if we merge this; we can recognize the case in status where override_remove_kernel is present and simply omit which specific kernel packages were removed since it's not usually interesting information.

The messier thing here is then we still need to recognize which new packages are kernels...are you thinking we'd get that by having them explicit in the switch-kernel verb and then...serialize those separately?

Mmm...hmm...I think we could expose in status which new packages were also kernel packages, then a client can render it.

Closes: coreos#2542

Today anyone wanting to switch to a different kernel variant
such as `kernel-rt` must manually pass all the *current* kernel
packages to `--uninstall` (separately).  It's not just very unergonomic,
it also makes the calling code *operating system dependent* because
as just happened when C9S added a `kernel-modules-core` package,
that suddenly also needs to be specified.

xref https://issues.redhat.com/browse/OCPBUGS-8113
@cgwalters
Copy link
Member Author

Thought about this one some more, and I think if we agree that the "separate -rt base image" is really the way to go (and I think it probably is) then that argues against trying to special case kernel packages.

Rebased 🏄

@cgwalters cgwalters added the difficulty/medium medium complexity/difficutly issue label May 8, 2023
@jlebon
Copy link
Member

jlebon commented May 9, 2023

Thought about this one some more, and I think if we agree that the "separate -rt base image" is really the way to go (and I think it probably is) then that argues against trying to special case kernel packages.

Hmm, not sure I follow. The proposed switch is most useful for changing kernel types. If we take that case out of the picture then override replace *.rpm is all we need, right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
difficulty/medium medium complexity/difficutly issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add --uninstall-kernel
3 participants