Reduce buffering between watcher and Store #1487

Closed
clux opened this issue May 9, 2024 · 1 comment · Fixed by #1494
Labels
help wanted (Not immediately prioritised, please help!) · runtime (controller runtime related)

Comments

@clux
Member

clux commented May 9, 2024

What problem are you trying to solve?

This is a follow-up to watcher paging/streaming, in an effort to reduce allocations and move the complexity to where it needs to be. The main problem is the allocation for initial lists (or initial streaming lists). These are essentially allocated at least twice internally:

  • first in the watcher's step_trampolined
  • second in the Store's apply_watcher_event
  • possibly a third time: it looks like we clone into another buffer because the same fn does not consume the list with an into_iter

If we can move the allocation into the store and bubble up the event earlier, we avoid double/triple allocating this, and users who write custom stores can avoid waiting or double allocating.

Note that this buffering happens both for list-watch and for streaming lists; a simplified sketch of the pattern follows.
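For illustration only, a minimal sketch of the buffering pattern described above. This is not the actual kube-runtime code: Page, fetch_page, and the toy store below are stand-ins, and the real watcher/Store types are more involved.

```rust
// Simplified sketch of the double-buffering, with made-up helper names.
struct Page<K> {
    items: Vec<K>,
    next: Option<String>,
}

fn relist<K>(mut fetch_page: impl FnMut(Option<String>) -> Page<K>) -> Vec<K> {
    // First allocation: the watcher accumulates every page into one big Vec
    // before it can emit a single "restarted" event downstream.
    let mut buffered = Vec::new();
    let mut token = None;
    loop {
        let page = fetch_page(token);
        buffered.extend(page.items);
        match page.next {
            Some(t) => token = Some(t),
            None => break,
        }
    }
    buffered
}

fn apply_restart<K: Clone>(store: &mut Vec<K>, restarted: &[K]) {
    // Second allocation: the store copies the full list again instead of
    // consuming it, so the whole object list briefly exists twice in memory.
    *store = restarted.to_vec();
}
```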

Describe the solution you'd like

We can lift this caching with a new Page<Vec<K>> or Partial<Vec<K>> watcher::Event variant that can be bubbled up and inserted into the store. Because we now have a ready guard in the store, it should be safe to start inserting into the store immediately (though the guard would have to be altered slightly to fire only after a complete initial list/stream has happened).

This is a small breaking change to the enum, but it is contained to very internal interfaces and can be documented.
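A rough, hypothetical sketch of what that shape could look like. The variant names (Page, PageDone), the key closure, and the toy HashMap store are illustrative only, not the actual kube-runtime API:

```rust
use std::collections::HashMap;

// Hypothetical event shape: partial pages are bubbled up as they arrive,
// and a completion marker replaces buffering the whole list in the watcher.
pub enum Event<K> {
    Applied(K),
    Deleted(K),
    /// One page of the initial list/stream, forwarded immediately.
    Page(Vec<K>),
    /// The initial list/stream is complete.
    PageDone,
}

/// Toy stand-in for the reflector Store, keyed by some object key.
pub struct Store<K> {
    objects: HashMap<String, K>,
    ready: bool,
}

impl<K> Store<K> {
    pub fn apply_watcher_event(&mut self, event: Event<K>, key: impl Fn(&K) -> String) {
        match event {
            // Partial pages are written straight into the store: no second
            // buffer, and custom stores no longer wait for the full relist.
            Event::Page(page) => {
                for obj in page {
                    self.objects.insert(key(&obj), obj);
                }
            }
            // The ready guard fires only once the initial list/stream is done.
            Event::PageDone => self.ready = true,
            Event::Applied(obj) => {
                self.objects.insert(key(&obj), obj);
            }
            Event::Deleted(obj) => {
                self.objects.remove(&key(&obj));
            }
        }
    }
}
```

In a shape like this, the store still decides when to report readiness, but it no longer needs the watcher to hand it the entire list in one allocation.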

Describe alternatives you've considered

  1. flags in watcher::Config to decide whether to bubble up early, but this doesn't avoid the breaking change of introducing a new watcher event for partial data (even if we don't act on it), because the enum is not #[non_exhaustive]
  2. a feature flag in runtime to decide whether watcher::Event has extra variants; this feels pretty hairy for an already complex watcher trampoline, and we eventually want the best performance to be the default rather than hidden behind an opt-in

Documentation, Adoption, Migration Strategy

  • highlight the change in a release; users who match on the low-level watcher::Event will get a new variant to match on (see the sketch below)
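To illustrate the migration impact, a hypothetical downstream handler reusing the Event<K> shape sketched earlier; the handlers are placeholders, and the point is only that exhaustive matches gain new arms:

```rust
// Hypothetical downstream handler, reusing the Event<K> sketch above.
// Exhaustive matches on watcher::Event need new arm(s) after the change;
// ignoring partial pages preserves the old "wait for the full list" behaviour.
fn handle<K: std::fmt::Debug>(event: Event<K>) {
    match event {
        Event::Applied(obj) => println!("apply {obj:?}"),
        Event::Deleted(obj) => println!("delete {obj:?}"),
        // New arms users must add (or cover with a catch-all `_ => {}`):
        Event::Page(page) => println!("received {} partial objects", page.len()),
        Event::PageDone => println!("initial list/stream complete"),
    }
}
```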

Target crate for feature

kube-runtime

@clux added the help wanted and runtime labels on May 9, 2024
@clux
Member Author

clux commented May 9, 2024

It's also been pointed out that this peak allocation might never be returned to the OS with the default allocator. Important bits from a Discord thread:

Another problem is that the default system allocator never returns the memory to the OS after the burst, even if the objects are dropped. Since the initial list fetch happens sporadically, you get higher RSS usage together with the memory spike. Solving the burst will solve this problem, and reflectors and watchers can be started in parallel without worrying about OOM killers.

The allocator does not return the memory to the OS since it treats it as a cache. This is mitigated by using jemalloc with some tuning; however, you still get the memory burst, so our solution was to use jemalloc and start the watchers sequentially. As you can imagine, it's not ideal.
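For reference, the jemalloc workaround mentioned above is commonly applied with the tikv-jemallocator crate; this is a general Rust pattern rather than anything kube-specific, and it only reduces RSS retention, the burst itself still happens:

```rust
// Swap the global allocator for jemalloc (add tikv-jemallocator to Cargo.toml).
// jemalloc's tuning options (decay, background threads) can further encourage
// returning freed pages to the OS.
#[cfg(not(target_env = "msvc"))]
#[global_allocator]
static GLOBAL: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;

fn main() {
    // Start reflectors/watchers as usual; with the default allocator the peak
    // from the initial list may otherwise remain resident as allocator cache.
}
```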
