Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chunked_map in seastar #2133

Open
travisdowns opened this issue Mar 7, 2024 · 5 comments
Open

chunked_map in seastar #2133

travisdowns opened this issue Mar 7, 2024 · 5 comments

Comments

@travisdowns
Copy link
Contributor

travisdowns commented Mar 7, 2024

Seastar cannot in general support large allocations after an application has been running for some time, due to its use of a custom allocator which allocate a fixed virtual memory region which never grows: in general the heap becomes fragmented and allocations some significant multiple of the page size are likely to fail.

Hence the existence of chunked_fifo which is a non-contiguous replacement for std::vector in some cases.

H/e, the same basic problem also arises with maps. In Redpanda, we will solve this problem using a "fragmented map" whose backing arrays are compose of chunks of some maximum size, not unlike chunked_fifo (except that we need random access, so there is a 2-level indexing scheme, not unlike std::deque).

Some of the large allocations can occur in seastar itself, e.g., if there are many metrics (e.g., here: redpanda-data/redpanda#16949).

Given this, we propose that we upstream the fragmented map here instead of keeping in Redpanda. It would almost certainly be an existing 3rd party map implementation, with a compatible license, modified to be chunked.

WDYT?

Perhaps this problem already has a solution in Scylla: we'd be interested to hear about it if so.

@travisdowns
Copy link
Contributor Author

cc @tchaikov any thoughts on this one?

@avikivity
Copy link
Member

We don't have a fragmented unordered map (don't think we ever saw a need - to exceed 128k one would need 16k buckets, no?

In general I don't want Seastar to become a collections library.

@travisdowns
Copy link
Contributor Author

We don't have a fragmented unordered map (don't think we ever saw a need - to exceed 128k one would need 16k buckets, no?

Something like that: but this is possible for example (and in practice) if you have many metrics: 16K isn't an outlandish number of metrics. Another option would be to use something like std::map, which is inherently non-contiguous, for those use cases but performance of that implemetnation is ... not ideal.

@nyh
Copy link
Contributor

nyh commented Apr 4, 2024

In general I don't want Seastar to become a collections library.

Yes, this is the current sentiment. Some historical context: We added chunked_fifo to Seastar almost eight years ago (commit e66ec71) to use it in Seastar, namely the semaphore implementation. As you noticed, chunked_fifo is optimized to be just a queue, not a random-addressable vector, which is exactly what we needed for the semaphore.

Here's a core dump of my thoughts on this issue:

In the ScyllaDB project we did need a chunked vector type (with full random-access support), and added one (appropriately called "chunked_vector") but never tried to put it in Seastar. As you noted, you guys also need a chunked map type. One could argue that every serious Seastar user will likely need one or more of these chunked containers, so it does make sense to collect these chunked container implementations and put them into either Seastar itself or a separate library.

One reason to not want to put those container implementations in Seastar itself is that they are not at all specific to Seastar - they don't use Seastar's futures, or any other Seastar function or header file. E.g., just like Boost have a set of "intrusive containers", it (or a new library) could have a set of "chunked containers".

Another, bigger problem, is how to choose which containers to include. Should we focus on chunked "versions" of existing C++ STL - e.g., a chunked version of std::map, a separate chunked version of std::unordered_map, etc.? Or will we allow people to start adding all sort of alternative container implementations, each with different optimizations? There are many different container implementations in Boost, Google's libraries, and other places... Arguably, Seastar's existing chunked_fifo is already one of these "alternative" implementation - it is not 100% compatible with STL's std::dequeue because it's just a queue and does not implement O(1) random access. But as I said, we added because we needed to use it inside Seastar itself.

Another question to ponder about: Even if we only focus on "chunked versions of existing C++ STL containers", how stringent will we be (or can we be) about the complexity guarantees? For example, std::vector requires O(1) addressing. A chunked implementation cannot really have asymptotic (as N grows to infinity) O(1) performance. It can be log(N)/large_constant but not O(1). Alternatively, the "chunked container" can opt to guarantee <128KB allocations only up to a certain N. This is what Scylla's chunked_vector currently does: vectors up to 2GB have a maximum allocation size of 128KB, but larger vectors may need a contiguous allocation of more than 128KB. Is this fine? Probably, but as soon as we make this an "official" part of Seastar we will need to define some "rules".

In addition to just keeping all these containers in seaparate projects like Scylla and Redpanda, I can think of three options of different levels of integration into Seastar:

  1. We could allow in Seastar chunked containers that mimic existing C++ STL containers, with the same APIs and complexity guarantees (with the above caveats...). E.g., a chunked_vector, chunked_map or chunked_unordered_map.
  2. Allow in a Seastar subdirectory of container implementations, which is only half-way "in Seastar". Perhaps it will have a less API stability guarantees.
  3. Create a separate Github project, of "chunked containers", independent from Seastar and not requiring anything from Seastar. Seastar users can also pick up that library.

@niekbouman
Copy link
Contributor

In general I don't want Seastar to become a collections library.

  1. Create a separate Github project, of "chunked containers", independent from Seastar and not requiring anything from Seastar. Seastar users can also pick up that library.

I would think that a separate GitHub project could make sense, with even a broader scope (not just containers) where third parties (like RedPanda and others) can contribute implementations under a liberal open-source license, suitable for use with Seastar, and benefit from each other. Maybe this Github project is merely a curated list of links to repos that contain the actual code and that are owned/managed by the contributors (similar to https://github.com/dotnwat/awesome-seastar)

For example, we have made some adaptations to the Cap'n Proto library, so that we can read/write Cap'n Proto messages from and to a seastar::input_stream resp. seastar::output_stream. We'd be happy to contribute this, and are currently also writing a chunked_vector-like implementation (for which the same holds).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants