[Proposal] New Building Block: Key/Value Store #7338

Open
WhitWaldo opened this issue Jan 2, 2024 · 7 comments

@WhitWaldo
Contributor

WhitWaldo commented Jan 2, 2024

In what area(s)?

/area runtime

Describe the proposal

But wait, don't we already have this with the state store? Yes, sort of, but think of the existing state store building block as a generic v1 get/set thing and this as a richer and specialized key/value store. It's not striving to be the kitchen sink of store APIs, but should offer features that are more aligned specifically with key/value stores. This proposal rethinks the existing state store to add forward-looking streaming support, key querying and event subscriptions.

The aim is a next-generation API specific to key/value stores, suitable for future iterations of Dapr. Many of the stores already implemented for the state building block would be ideal targets for this implementation as well, though some use cases may be better suited to another specialty store.

What is a key/value store?

A Key/Value Store allows arbitrarily typed values to be stored alongside a given key. While keys can be queried, values cannot be queried and are accessed only by their associated key. The ability to query keys is one of the primary differences between this proposal and the existing state store building block. The goal isn't to support complex or convoluted key queries (especially those that might be better suited to a document or relational store), but to support high-performance reads with occasional writes and updates, storing relatively small values.

Interface

The Key/Value Store components should implement as many of the following as possible using asynchronous methods (note this is quite similar to reliable dictionaries in Service Fabric):

| Name | Description | Notes |
|------|-------------|-------|
| Clear | Removes all keys (and thus all values) from the store | Limited to the scope of the component and app prefix, of course. |
| Contains Key | Returns a boolean value that determines whether the store contains the specified key | - |
| List Keys | Lists all the keys on the state instance (this should also include some mechanism to filter and/or page the keys such as my proposal here) | I'd propose that advanced functionality be supported via optional query parameters to REST API on sidecar |
| Get Count | List the number of key/value pairs in the store | Subject to the same advanced functionality in ListKeys |
| Try Add | Adds the specified key/value to the store, returns true if successfully added or false if key already exists | |
| Set | Adds the specified key/value to the store, updates with the given value if the key already exists | Saves some round tripping with the sidecar and the backing store |
| Try Get | Attempts to get the value for a given key OR the values given some sort of filter | Optional filtering where supported |
| Get or Add | Retrieves the value in the store for the given key. If it doesn't exist, adds the given value to the store. | |
| Try Remove | Tries to remove the value with the specified key from the store | Should support more advanced and optional functionality like deleting by key prefix. |
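
To make the shape of these operations concrete, here is a minimal sketch of the interface as an asynchronous, in-memory Python class. Nothing here is an existing Dapr API: the class name `InMemoryKeyValueStore`, the method names and the `(found, value)` return shape of `try_get` are all assumptions made for illustration, and a real component would talk to a backing store rather than a dict.

```python
import asyncio
from typing import Any, Dict, List, Optional, Tuple


class InMemoryKeyValueStore:
    """Hypothetical stand-in for a key/value component (illustration only)."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    async def clear(self) -> None:
        # Clear: remove every key (and thus every value).
        self._data.clear()

    async def contains_key(self, key: str) -> bool:
        # Contains Key: report whether the key is present.
        return key in self._data

    async def list_keys(self, prefix: str = "") -> List[str]:
        # List Keys: optional prefix filter, returned in sorted order.
        return sorted(k for k in self._data if k.startswith(prefix))

    async def get_count(self, prefix: str = "") -> int:
        # Get Count: same filtering as list_keys.
        return len(await self.list_keys(prefix))

    async def try_add(self, key: str, value: Any) -> bool:
        # Try Add: True if added, False if the key already exists.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    async def set(self, key: str, value: Any) -> None:
        # Set: add the pair, or overwrite the value if the key exists.
        self._data[key] = value

    async def try_get(self, key: str) -> Tuple[bool, Optional[Any]]:
        # Try Get: (found, value); value is None when the key is missing.
        if key in self._data:
            return True, self._data[key]
        return False, None

    async def get_or_add(self, key: str, value: Any) -> Any:
        # Get or Add: return the stored value, storing `value` first if absent.
        return self._data.setdefault(key, value)

    async def try_remove(self, key: str) -> bool:
        # Try Remove: True if something was removed.
        if key in self._data:
            del self._data[key]
            return True
        return False


async def demo() -> None:
    store = InMemoryKeyValueStore()
    await store.try_add("user/1", "ada")
    await store.set("user/2", "grace")
    print(await store.list_keys("user/"))
    print(await store.get_or_add("user/3", "edsger"))


if __name__ == "__main__":
    asyncio.run(demo())
```

Returning a `(found, value)` pair from `try_get` is one way to distinguish a missing key from a stored `None`; an SDK in another language might use an out parameter or an optional type instead.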

It'd also be great to be able to subscribe to an event from the Dapr runtime itself (i.e. not from a pub/sub provider per se), and potentially from other applications, to receive a notification whenever the store is changed (per some scope and via the Dapr runtime). The notification would carry the key, the current value and a description of the change (e.g. added, removed, modified). I haven't seen a proposal for this functionality specifically, though.
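
One possible shape for such a change notification, sketched in Python under stated assumptions: the `ChangeEvent` fields mirror the three pieces of information listed above (key, current value, kind of change), while `ObservableStore`, `subscribe` and the prefix-based scope are hypothetical names invented for this example rather than anything Dapr provides today.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum
from typing import Any, Awaitable, Callable, Dict, List, Tuple


class ChangeType(Enum):
    ADDED = "added"
    MODIFIED = "modified"
    REMOVED = "removed"


@dataclass(frozen=True)
class ChangeEvent:
    key: str
    value: Any  # current value; None once the key has been removed
    change: ChangeType


Handler = Callable[[ChangeEvent], Awaitable[None]]


class ObservableStore:
    """Hypothetical store that notifies subscribers about changes."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}
        self._handlers: List[Tuple[str, Handler]] = []

    def subscribe(self, handler: Handler, prefix: str = "") -> None:
        # The key prefix plays the role of the subscription "scope" here.
        self._handlers.append((prefix, handler))

    async def _notify(self, event: ChangeEvent) -> None:
        for prefix, handler in self._handlers:
            if event.key.startswith(prefix):
                await handler(event)

    async def set(self, key: str, value: Any) -> None:
        change = ChangeType.MODIFIED if key in self._data else ChangeType.ADDED
        self._data[key] = value
        await self._notify(ChangeEvent(key, value, change))

    async def try_remove(self, key: str) -> bool:
        if key not in self._data:
            return False
        del self._data[key]
        await self._notify(ChangeEvent(key, None, ChangeType.REMOVED))
        return True


async def demo() -> List[Tuple[str, str]]:
    seen: List[Tuple[str, str]] = []

    async def on_change(event: ChangeEvent) -> None:
        seen.append((event.key, event.change.value))

    store = ObservableStore()
    store.subscribe(on_change, prefix="orders/")
    await store.set("orders/1", "pending")   # added
    await store.set("orders/1", "shipped")   # modified
    await store.set("users/7", "ada")        # outside the subscribed scope
    await store.try_remove("orders/1")       # removed
    return seen


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the handlers run in-process; in the runtime the delivery mechanism (callback endpoint, stream, and so on) would be a separate design decision.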

Not all underlying stores can be expected to support more complex functionality. Some components may only allow listing all keys in their store (and it will then fall to the sidecar to limit the response to those matching the current app ID and scope), whereas others may provide a richer means of filtering or even support (or require) paging of results. The documentation should clearly indicate which components provide which functionality, and where Dapr can enable more advanced (to a point) queries of the keys.
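
For a component that can only list every key, the sidecar-side narrowing described above might look roughly like the following Python sketch. It assumes keys are stored as `<app-id>||<key>`, the default prefixing convention of the existing state store; whether this building block would reuse that convention is an assumption, and `scoped_keys`, `page` and the offset-based continuation token are hypothetical helpers.

```python
from typing import Iterable, List, Optional, Tuple

SEPARATOR = "||"  # assumed separator between app ID and key


def scoped_keys(raw_keys: Iterable[str], app_id: str, key_prefix: str = "") -> List[str]:
    """Keep only keys owned by app_id (optionally under key_prefix), scope stripped."""
    scope = f"{app_id}{SEPARATOR}"
    wanted = scope + key_prefix
    return sorted(k[len(scope):] for k in raw_keys if k.startswith(wanted))


def page(keys: List[str], size: int, token: Optional[str] = None) -> Tuple[List[str], Optional[str]]:
    """Return one page of keys plus a continuation token (None on the last page)."""
    start = int(token) if token else 0
    end = start + size
    next_token = str(end) if end < len(keys) else None
    return keys[start:end], next_token


if __name__ == "__main__":
    raw = ["orders||a/1", "orders||a/2", "orders||b/1", "billing||a/1"]
    keys = scoped_keys(raw, "orders", "a/")
    print(page(keys, size=1))
```

A store with native prefix queries or cursors would push this work down to the backing store instead, which is the distinction the documentation would need to call out per component.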

What should this key/value store support?

Many of the same things supported by the state store today:

  • Transactions (implemented in the SDK, not unlike actors today, but also utilizing the components' providers where supported)
  • TTL
  • Strong/weak consistency
  • Sidecar-based optional and configurable caching in furtherance of high-performance reads
  • Configurable serialization with data contract serialization as a fallback when unspecified
  • gRPC streaming of requests/responses to reduce memory load on the sidecar (e.g. when paging through lists of keys or values), with an HTTP fallback as long as this is supported in the sidecar
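
As a rough illustration of why streaming helps with memory load, the sketch below uses a Python async generator as a stand-in for a server-side stream: the consumer handles one page of keys at a time and never holds the full key list. The names `stream_key_pages` and `count_keys_streaming` are hypothetical, and the in-memory sequence merely simulates what a real gRPC stream would deliver.

```python
import asyncio
from typing import AsyncIterator, List, Sequence, Tuple


async def stream_key_pages(keys: Sequence[str], page_size: int) -> AsyncIterator[List[str]]:
    """Yield keys one page at a time, as a streamed response might."""
    for start in range(0, len(keys), page_size):
        await asyncio.sleep(0)  # give up control, as a network read would
        yield list(keys[start:start + page_size])


async def count_keys_streaming(keys: Sequence[str], page_size: int) -> Tuple[int, int]:
    """Consume the stream; return (total keys seen, largest page held at once)."""
    total = 0
    largest = 0
    async for chunk in stream_key_pages(keys, page_size):
        total += len(chunk)
        largest = max(largest, len(chunk))
    return total, largest


if __name__ == "__main__":
    all_keys = [f"k{i}" for i in range(10)]
    print(asyncio.run(count_keys_streaming(all_keys, page_size=4)))
```

The second element of the result is the point of the exercise: the consumer's working set is bounded by the page size rather than by the total number of keys.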

I'm not proposing this as a stand-in replacement for the current state store, for actors or otherwise. Rather, it should complement it, like the other proposed document and relational data stores. This is intended simply to let users engage with a key/value store from their services through Dapr, using the capabilities one might generally expect from such a store and a more feature-rich, key/value-specific API than the current state store can offer.

Thank you for the consideration!

@gspadotto
Contributor

It would be nice to also be able to get Values based on selection criteria, see my comment here:
dapr/components-contrib#3218 (comment)

I am not an expert at all in key/value repositories, so I do not know what the typical use cases are, but I wonder what's the use of having a repository whose values you cannot retrieve unless you know beforehand which keys are "relevant" (where, by "relevant", I mean "matching some criteria on the values").

@dapr-bot
Collaborator

This issue has been automatically marked as stale because it has not had activity in the last 60 days. It will be closed in the next 7 days unless it is tagged (pinned, good first issue, help wanted or triaged/resolved) or other activity occurs. Thank you for your contributions.

@dapr-bot dapr-bot added the stale Issues and PRs without response label Mar 25, 2024
@WhitWaldo
Contributor Author

I'll circle back on this as I have some free time to write up a more formal proposal for this and my other specialized store proposals.

@WhitWaldo
Contributor Author

WhitWaldo commented Mar 25, 2024

It would be nice to also be able to get Values based on selection criteria, see my comment here: dapr/components-contrib#3218 (comment)

I am not an expert at all in key/value repositories, so I do not know what the typical use cases are, but I wonder what's the use of having a repository whose values you cannot retrieve unless you know beforehand which keys are "relevant" (where, by "relevant", I mean "matching some criteria on the values").

Redis-like APIs tend to support a SCAN operator that lets you query keys based on a given prefix. I think that'd be a worthwhile addition to this API for those key/value stores that support it.

That said, key/value stores tend to be optimized in such a way that querying the values doesn't really fit the use case. There, you might have better luck with a NoSQL database like Cosmos, which does more on-the-fly indexing of the values so they can be queried more readily than they could here.

@dapr-bot dapr-bot removed the stale Issues and PRs without response label Mar 25, 2024
@gspadotto
Contributor

dapr/components-contrib#3218 (comment)

Well... Redis State Store (just to name one) already supports querying on values (via Dapr's Query API), see:
https://docs.dapr.io/reference/components-reference/supported-state-stores/setup-redis/#querying-json-objects-optional

The point is - IMHO - that the proposed Interface should be rich enough to cover all typical key/value store use-cases, but still be implementable from a wide set of providers.

At the moment I am not sure if querying on the values should or should not be part of the proposed interface, even though it would make key/value stores much more useful.

Maybe querying on values should be part of a different, specific API from the one outlined here.

@dapr-bot
Collaborator

This issue has been automatically marked as stale because it has not had activity in the last 60 days. It will be closed in the next 7 days unless it is tagged (pinned, good first issue, help wanted or triaged/resolved) or other activity occurs. Thank you for your contributions.

@dapr-bot dapr-bot added the stale Issues and PRs without response label May 24, 2024
@WhitWaldo
Contributor Author

The Dapr Query API has been effectively discontinued and only works if you explicitly mark the data type as JSON when you populate the underlying data store. This proposal attempts to revive and improve on this explicitly in the key/value store context. Again, while I'm sure some key/value stores support querying values, my experience is that it's far more common that the keys themselves can be queried. In the interest of a broader set of applicable components, I think I'd favor limiting this proposal to key queries and saving any queries against the values to some more specific relational/NoSQL data store that's better positioned to facilitate that.

@dapr-bot dapr-bot removed the stale Issues and PRs without response label May 26, 2024