Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new filters type in the query of the state component #3218

Merged
merged 2 commits into from Dec 6, 2023

Conversation

luigirende
Copy link
Contributor

@luigirende luigirende commented Nov 9, 2023

Description

In this PR have been added the follow filter operation for the query in the state component

  • NEQ -> Not Equal !=
  • GT -> Greater Than >
  • GTE -> Greater Than Equals >=
  • LT -> Less Than <
  • LTE -> Less Than Equals <=

These operations are implemented in the Redis, MongoDB, CosmosDB, Postgresql, CockroachDB state

Issue reference

We strive to have all PR being opened based on an issue, where the problem or feature have been discussed prior to implementation.

Please reference the issue this PR will close: #[issue number]

Checklist

Please make sure you've completed the relevant tasks for this PR, out of the following list:

  • Code compiles correctly
  • Created/updated tests
  • Extended the documentation / Created issue in the https://github.com/dapr/docs/ repo: dapr/docs#[issue number]

@luigirende luigirende requested review from a team as code owners November 9, 2023 19:58
@berndverst
Copy link
Member

@luigirende thanks for the PR.

While this looks good I need to point out that we want to have open issues for discussion before opening PRs. I need to point out that Query API has been Alpha and in maintenance mode (and I personally would deprecate it if possible). There are no plans for Query API to make it to Beta or Stable. The Query API as it exists is a poor design and not easily maintainable. As such, these feature you are adding here will remain unsupported (and not receive patch release support). Please do keep these caveats in mind.

Copy link
Member

@berndverst berndverst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Query API is Alpha and will not become Beta / Stable. That being said, we can accept this addition - however it requires that you add conformance tests that exercise the new query operators.

Please add queries for GT, GTE, LT, LTE to the state store conformance tests query test case. There should be additional query scenarios added. For you to have the appropriate data to run the query scenarios against you will probably need to insert more data in the other test operations.

https://github.com/dapr/components-contrib/blob/main/tests/conformance/state/state.go

Only if the conformance test exercises all new query operands and passes for Postgres, CosmosDB, Mongo and Redis can we accept this PR. Unit tests are not sufficient for this.

Copy link
Member

@berndverst berndverst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, your unit tests are failing. Please run make test locally.

@luigirende
Copy link
Contributor Author

luigirende commented Nov 12, 2023

Hi @berndverst I have added some conformance tests and execute locally; meanwhile I have found an issue, and fixed, on CosmosDB query when you use the sorting (previously in the conformance test there were not query scenarios with sort part). In fact CosmosDB accepts ordering (in the REST API) only with the partitionKey Querying Azure Cosmos DB resources using the REST API

Copy link
Member

@berndverst berndverst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Please revert the CosmosDB changes to partition key use. The existing behavior we have is correct and working as intended.
  • Your PR should only add the new query filter operands. Even if there were an issue - this should be addressed in a separate PR.
  • Please fix the linter errors. Run make lint on your code to get the results locally.

The way Dapr stores data in the Cosmos DB state store it will be stored across multiple partitions. In order for the query API to return all results we need to perform a cross partition query. Cross partition queries must not specify the partition key.

@berndverst
Copy link
Member

berndverst commented Nov 13, 2023

FYI, locally we do not run the Azure (AWS, GCP) specific conformance tests - this is because we do not want to leak any of our paid cloud service credentials. We do not run against these resources on PRs either - only approvers / maintainers can manually trigger the execution against these resources.

@berndverst
Copy link
Member

/ok-to-test

@dapr-bot
Copy link
Collaborator

Complete Build Matrix

The build status is currently not updated here. Please visit the action run below directly.

🔗 Link to Action run

Commit ref: b690520

@dapr-bot
Copy link
Collaborator

dapr-bot commented Nov 13, 2023

Components conformance test

🔗 Link to Action run

Commit ref: b690520

✅ All conformance tests passed

All tests have reported a successful status

@dapr-bot
Copy link
Collaborator

dapr-bot commented Nov 13, 2023

Components certification test

🔗 Link to Action run

Commit ref: b690520

❌ Some certification tests failed

These tests failed:

  • bindings.azure.eventhubs
  • bindings.azure.servicebusqueues
  • pubsub.mqtt3
  • secretstores.hashicorp.vault

@luigirende
Copy link
Contributor Author

Hi @berndverst I am going to revert the Cosmosdb changes and I will be going to create a new PR about it. However I haven't changed the actual behavior, I have made it possible to add the partition key (optional) in the query method when the user use the ordering. I have found out this bug when run in local the conformance test with my own azure account. If the user don't add the partition key there is not no change

@luigirende
Copy link
Contributor Author

Reveted the cosmosdb changes and fix the lint issue

@berndverst
Copy link
Member

Hi @berndverst I am going to revert the Cosmosdb changes and I will be going to create a new PR about it. However I haven't changed the actual behavior, I have made it possible to add the partition key (optional) in the query method when the user use the ordering. I have found out this bug when run in local the conformance test with my own azure account. If the user don't add the partition key there is not no change

Nice. Yes, please do open a separate PR for that then. That seems useful.

Value int `json:"value"`
}{Value: 15}, Status: "ACTIVE"},
contentType: contenttype.JSONContentType,
partitionKey: fmt.Sprintf("partitionKey-%s", key), // for CosmosDB
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still need to revert this also :)

Value int `json:"value"`
}{Value: 12}, Status: "INACTIVE"},
contentType: contenttype.JSONContentType,
partitionKey: fmt.Sprintf("partitionKey-%s", key), // for CosmosDB
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this line for now

`,
// For CosmosDB for using the ORDER structure
metadata: map[string]string{
"partitionKey": fmt.Sprintf("partitionKey-%s", key),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this

Copy link
Member

@berndverst berndverst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the new query operands do not work for CosmosDB when using cross partition queries then we cannot accept this addition to the Query API.

We do not want to add operators to the query API which only work under certain conditions (e.g. limited to a single partition). For those use cases we suggest that users use the CosmosDB SDK directly.

There is also a proposal for a complete different building block that would natively support queries - but this is currently abandoned. dapr/dapr#5146

@luigirende
Copy link
Contributor Author

@berndverst I reverted the change also in the conformance test. However the issue hasn't been detected by conformance test because previously there were not query scenarios with sorting. But the issue is still present with the actual operators only if put the sort block; example I can use the IN operator where I can get multiple state/entity and from these I want to apply an ordering in the query, in this case I will have a error, because at the moment I can't put the partitionKey in the query method and CosmosDB doesn't permit apply order and other keywords for cross partition

@berndverst
Copy link
Member

@luigirende Cosmos DB does not support sorting when it comes to cross-partition queries unfortunately. So you might need to remove sorting for now. It's one of those legacy things which we should never have added because it cannot work for CosmosDB.

@luigirende
Copy link
Contributor Author

luigirende commented Nov 13, 2023

@luigirende Cosmos DB does not support sorting when it comes to cross-partition queries unfortunately. So you might need to remove sorting for now. It's one of those legacy things which we should never have added because it cannot work for CosmosDB.

I Know it, what I would like to add it's the possibility to do query without cross partition, I only need set the partitionKey in the metadata request Query method too. At the moment I can do for the Set and Get but not it's not implemented for the Query. My change doesn't remove or change the actual behavior but add a new scenario (where in this case I can apply the ordering too)

@berndverst
Copy link
Member

@luigirende Cosmos DB does not support sorting when it comes to cross-partition queries unfortunately. So you might need to remove sorting for now. It's one of those legacy things which we should never have added because it cannot work for CosmosDB.

I Know it, what I would like to add it's the possibility to do query without cross partition, I only need set the partitionKey in the metadata request Query method too. At the moment I can do for the Set and Get but not it's not implemented for the Query. My change doesn't remove or change the actual behavior but add a new scenario (where in this case I can apply the ordering too)

But you won't know which partition data was written to - so would this actually be useful? There is a possibility you would be missing results! Unless of course you have a custom property in your document that you defined as the partition key instead.

Here is how Dapr stores things:

type CosmosItem struct {
ID string `json:"id"`
Value interface{} `json:"value"`
IsBinary bool `json:"isBinary"`
PartitionKey string `json:"partitionKey"`
TTL *int `json:"ttl,omitempty"`
Etag string `json:"_etag"`
TS int64 `json:"_ts"`
}

Which means

{
 "id" : "somekey",
 "value": {
   "somecustomfield": "customvalue"
 },
 "isBinary": false,
 "partitionKey": "somekey"
}

By default we set "partitionKey" to the value of the key because we also instruct user to set up their CosmosDB Account with this path as the partition value.

But you can provision your CosmosDB account instead to use a different path of /value/somecustomfield as long as that field will be contained in every document.

When performing a get request you will also need to specify the partition key via metadata on every request.

That's of course not ideal. It works similarly for querying.

@luigirende
Copy link
Contributor Author

luigirende commented Nov 14, 2023

@luigirende Cosmos DB does not support sorting when it comes to cross-partition queries unfortunately. So you might need to remove sorting for now. It's one of those legacy things which we should never have added because it cannot work for CosmosDB.

I Know it, what I would like to add it's the possibility to do query without cross partition, I only need set the partitionKey in the metadata request Query method too. At the moment I can do for the Set and Get but not it's not implemented for the Query. My change doesn't remove or change the actual behavior but add a new scenario (where in this case I can apply the ordering too)

But you won't know which partition data was written to - so would this actually be useful? There is a possibility you would be missing results! Unless of course you have a custom property in your document that you defined as the partition key instead.

Here is how Dapr stores things:

type CosmosItem struct {
ID string `json:"id"`
Value interface{} `json:"value"`
IsBinary bool `json:"isBinary"`
PartitionKey string `json:"partitionKey"`
TTL *int `json:"ttl,omitempty"`
Etag string `json:"_etag"`
TS int64 `json:"_ts"`
}

Which means

{
 "id" : "somekey",
 "value": {
   "somecustomfield": "customvalue"
 },
 "isBinary": false,
 "partitionKey": "somekey"
}

By default we set "partitionKey" to the value of the key because we also instruct user to set up their CosmosDB Account with this path as the partition value.

But you can provision your CosmosDB account instead to use a different path of /value/somecustomfield as long as that field will be contained in every document.

When performing a get request you will also need to specify the partition key via metadata on every request.

That's of course not ideal. It works similarly for querying.

How works CosmosDB state it's clear to me. But at the moment It's not possible to use the partitionKey for the query

func (p crossPartitionQueryPolicy) Do(req *policy.Request) (*http.Response, error) {
raw := req.Raw()
// Check if we're performing a query
// In that case, remove the partitionkey header and enable cross-partition queries
if strings.ToLower(raw.Header.Get("x-ms-documentdb-query")) == "true" {
raw.Header.Add("x-ms-documentdb-query-enablecrosspartition", "true")
raw.Header.Del("x-ms-documentdb-partitionkey")
}
return req.Next()
}

This is correct, because if the user doesn't set the partitionKey in the Set method, Dapr will use as partitionKey the value Key and potentially he could have N partition Key for N documents stored and for using the query it's correct to use the cross partition; but to use query without partitionKey will degrad the performance on CosmosDB. So if the user design fine the data model and use correctly the partitionKey concept at the moment he can't use the query for one partitionKey and CosmosDB will work in cross partition with low performance. My change only add the opportunity to set the partitionKey in the query method too; without changing the actual behavior.

However about the PR this is off topic, we can treat this topic in the new PR, if it's okey for you this PR about the new operations filters if it's possible to merge it

@berndverst
Copy link
Member

@luigirende can you please resolve the conflicts? And then we can probably merge this.

It seems we will also have a need for your follow up PR to enable querying by partition key :) See #3244. That was opened as a bug (even though it's expected behavior). We can implement as a feature request like you described - but let's have a separate PR for that please.

@luigirende luigirende force-pushed the state-update-filters-query branch 2 times, most recently from 525810e to 5ff275a Compare November 22, 2023 13:16
@luigirende
Copy link
Contributor Author

@luigirende can you please resolve the conflicts? And then we can probably merge this.

It seems we will also have a need for your follow up PR to enable querying by partition key :) See #3244. That was opened as a bug (even though it's expected behavior). We can implement as a feature request like you described - but let's have a separate PR for that please.

@berndverst I resolved the conflicts.

About the partitionKey I have already created a PR last week. PR #3227

Copy link
Member

@berndverst berndverst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please revert the docker compose yaml changes? Sometimes it's handy to auto-resolve the platform (where possible) when testing on Apple Silicon for example. You can also set the global variable for the default Docker Platform if desired.

@luigirende
Copy link
Contributor Author

luigirende commented Nov 28, 2023

@berndverst removed the changes on docker compose files

@luigirende luigirende force-pushed the state-update-filters-query branch 2 times, most recently from d5a8d23 to 70d3aa4 Compare December 2, 2023 16:42
…CosmosDB, Postgresql and CockroachDB

Signed-off-by: Luigi Rende <luigirende@gmail.com>
@berndverst berndverst added this pull request to the merge queue Dec 6, 2023
Merged via the queue into dapr:main with commit a7c64f4 Dec 6, 2023
85 checks passed
@berndverst
Copy link
Member

@luigirende we may need to revert this PR, or you need to create a follow up PR.

state.postgresql.azure and also CosmosDb conformance tests are failing now.

You need to make sure that these tests will succeed when there is existing data or multiple tests run in parallel.

@luigirende
Copy link
Contributor Author

@berndverst in this commit fixed the issue luigirende@b1292e7

@ItalyPaleAle ItalyPaleAle added this to the v1.13 milestone Dec 22, 2023
@gspadotto
Copy link
Contributor

gspadotto commented Jan 25, 2024

Query API is Alpha and will not become Beta / Stable.

If I knew from https://docs.dapr.io/reference/api/state_api/#query-state that the "query state" endpoint would not have made it to Stable I would not have invested time and effort in testing/using it.

If that choice is final, I would highlight that fact in the page above and in bullet point 3 of https://docs.dapr.io/reference/components-reference/supported-state-stores/setup-redis/#querying-json-objects-optional

You might say that these risks are implied in alpha releases, but yet I had no way of knowing that the "query state" endpoint has been "deprecated" apart from randomly stumbling on this pull request.

Is there an API in the backlog to "query values by criteria" in State Stores that are implemented by key-value repositories?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants