Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Short-circuit evaluation in openCypher #572

Open
ExtReMLapin opened this issue Apr 24, 2024 · 1 comment
Open

Short-circuit evaluation in openCypher #572

ExtReMLapin opened this issue Apr 24, 2024 · 1 comment

Comments

@ExtReMLapin
Copy link

Hello,
I'm using openCypher in ArcadeDB.

I met a bug that was first in the developer in a hotfix really quickly, but we should have never met this bug, it was isLower running on null fields.

What's weird is the first evaluation condition in the query should never have let isLower run on a node type that didn't have this property.

The same way that running this query something like 10 times (before applying the hotfix) didn't throw an error 10 times out of 10.

This is what makes me think that there was an issue in the order of the execution/validation of the conditions in the query.

Here is the issue on the other repo : ArcadeData/arcadedb#1563

Here was the query :

MATCH (n) WHERE (n:NER OR n:THEME) AND
(
    toLower(n.identity) CONTAINS 'technology' AND
    toLower(n.identity) CONTAINS 'science' AND
    toLower(n.identity) CONTAINS 'notable advancements' AND
    toLower(n.identity) CONTAINS '2020'
)
RETURN n

The first condition n:NER OR n:THEME ensures the query runs ONLY on nodes where the identity property exists.
Yet it managed to error (not every time).

As it doesn't always error it sounds like the conditions are stored in an unordered set and the order is not respected.

@malencv
Copy link

malencv commented May 9, 2024

Hi,

this is not an openCypher question. openCypher publishes a specification of the language. Your problem seems to be with the implementation of that specification, so you should contact your database provider (as I see you have done).

In general, database languages do not specify the order of evaluation of filtering conditions. This allows implementers to create query planners that optimize the order of evaluation based on factors like selectivity of the conditions and/or the presence or not of indexes on the properties used.

This is the case for openCypher (and for SQL too), afaik.

I am a bit surprised, too, that the order changes in consecutive runs, but again, that is something to discuss with your database provider.

Knowing nothing about the database you are using, I would suggest to split the match into two:
MATCH (n) WHERE (n:NER OR n:THEME)
MATCH (n) WHERE
toLower(n.identity) CONTAINS 'technology' AND
toLower(n.identity) CONTAINS 'science' AND
toLower(n.identity) CONTAINS 'notable advancements' AND
toLower(n.identity) CONTAINS '2020'
RETURN n

that should guarantee that the first match conditions are evaluated first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants