CQL2 as an expression language (not only boolean predicates) #723

Open
jerstlouis opened this issue Jul 1, 2022 · 5 comments
Labels
CQL2 Future work support in an additional part of OGC API Features

Comments

@jerstlouis
Member

jerstlouis commented Jul 1, 2022

CQL2 would actually be very useful as a general expression language if we simply removed the requirement that an expression must evaluate to a boolean. Such an expression language is needed in several OGC specifications, including for specifying derived fields (e.g., properties=ndvi:(B5-B4)/(B5+B4) in an extension for OGC API - Coverages and OGC API - DGGS), for specifying parameter values for symbolizers in SymCore, as well as for the OGC API - Features - Part 5: Search/Queries extension.

Also, if I understand the grammar correctly, active (referencing a boolean queryable called active) is currently not a valid CQL expression, even though it does evaluate to a boolean, while active = true is. Even in the context of a predicate/filter, e.g. in a styling rule selector, it would be nice to be able to use the shorter syntax, e.g. [active] { visible = true }. Related to point 2 of #705.

If the requirement that a CQL2 expression used as a filter evaluates to a boolean was in the OGC API - Features - Part 3: Filter extension instead, that would leave CQL2 ready to use as a generic expression language.
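To make the split concrete, here is a minimal sketch of the idea: evaluation itself places no constraint on the result type, and only the filter context checks for a boolean. Everything in it is an assumption for illustration (the names `evaluate` and `evaluate_filter`, and modelling a parsed expression as a plain Python callable); nothing here comes from the CQL2 or Part 3 drafts.

```python
from typing import Any, Callable, Mapping

# Hypothetical model: an already-parsed expression is a callable that maps
# a feature's properties to a value of any type.
Expression = Callable[[Mapping[str, Any]], Any]


def evaluate(expr: Expression, feature: Mapping[str, Any]) -> Any:
    """General evaluation: no constraint on the result type."""
    return expr(feature)


def evaluate_filter(expr: Expression, feature: Mapping[str, Any]) -> bool:
    """Filter context: the boolean requirement is enforced here only."""
    result = evaluate(expr, feature)
    if not isinstance(result, bool):
        raise TypeError(
            f"filter must evaluate to a boolean, got {type(result).__name__}"
        )
    return result


feature = {"active": True, "B4": 0.2, "B5": 0.6}

# A bare boolean queryable used directly as a filter.
active = lambda f: f["active"]
# A derived field: valid as an expression, but not as a filter.
ndvi = lambda f: (f["B5"] - f["B4"]) / (f["B5"] + f["B4"])
```

With this split, `evaluate_filter(active, feature)` is accepted, `evaluate(ndvi, feature)` yields a number, and `evaluate_filter(ndvi, feature)` is rejected with a `TypeError`.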

@pvretano
Contributor

pvretano commented Jul 1, 2022

@jerstlouis regarding CQL expressions having to evaluate to a boolean: I am not opposed to removing that, or at least qualifying it a little further (e.g. when used in a filter expression, a CQL expression SHALL evaluate to true or false), but that is something that will need to be discussed in the SWG. Although I will point out that such a change may complicate the definition of the language a bit and will likely require the specification to be extended (or reorganized) in order to discuss the general evaluation of an expression (as opposed to it simply evaluating to a boolean).
About boolean queryables, that's easy enough to add and I'm not opposed to that either.

@cportele
Member

cportele commented Jul 1, 2022

I think my main concern is that most of the CQL2 language is about boolean predicates (all the logical, comparison, spatial, temporal, array predicates are boolean-valued). Only the functions, arithmetic expressions and the operators like CASEI are not. I.e., the main mechanism for extending the language to result in non-boolean results would be additional operators or functions, but that is probably ok.

But if we open the scope to a general spatio-temporal expression language, it would seem a little odd that the language does not include more operators to, just for example, construct geometries (buffer, union, intersection, etc.) or other non-boolean values.

Could it be an option to simply and explicitly open up the possibility of extending the CQL2 grammar / JSON schema to non-boolean results in an extension (and start working on this)? Such an extension should then also define key operators for the main usage patterns outside of filter expressions. I guess such an extended language should then also have a different name, to be clear about the scope?

@jerstlouis
Member Author

jerstlouis commented Jul 1, 2022

@cportele

If we allow a simple identifier expression (as in my example of active above) or an arbitrary function call, then that identifier (queryable) or that function could evaluate to a string, a geometry, or anything else.

In #705 basically I am trying to highlight that we really would only need a grammar for Basic CQL2, and everything else in the advanced conformance classes could be defined by additional functions that play the role of operators (as in the array operators, temporal operators, spatial operators). Additional such functions could also be defined in an extension (or local implementation function definitions listed at /functions) for constructing geometries (buffer, union, intersection, etc.).

The CQL2 language itself and its grammar are already almost perfect as an expression language if we remove the restrictions to comparisons and specific operators (syntactically I see them as functions) that return boolean, and allow identifiers, or literals, or any function call as the top-level grammar rule.

@pvretano

Although I will point out that such a change may complicate the definition of the language a bit and will likely require the specification to be extended (or reorganized) in order to discuss the general evaluation of an expression (as opposed to it simply evaluating to a boolean). About boolean queryables, that's easy enough to add and I'm not opposed to that either.

Yes it would require a bit of re-organization, but I believe that it could actually greatly simplify the grammar, that is what I tried to illustrate in #705.

We would not try to catch type mismatches at the grammar level as the BNF currently does (because it cannot always be done anyways with queryables and functions returning values, and because the specification already has a permission that says the server can either type cast or throw an error), but use a simple set of rules like:

Tokens (lexing):

  • identifier
    • starts with: ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
    • continued with any of the above, or: "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]
    • unless within double-quotes ", where everything is allowed (and " itself can be escaped with a \)
  • string: starts and ends with a ', two ' is one escaped '
  • number: [0-9]+[.][E[0-9]+] | [[0-9]*].[0-9]+[E[0-9]+]
  • boolean: false, true
  • null
  • arithmetic operators: +, -, *, /, ^
  • comparison operators: =, <>, <, >, <=, >=
  • logic operators: not, and, or
  • is
  • advanced comparison operators: in, between, like
  • symbols: (, ), ,
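As a rough illustration of the token list above, here is a small regex-based lexer sketch. It makes several simplifying assumptions that are mine, not part of the proposal: identifiers are restricted to the ASCII subset of the ranges listed, keywords are matched case-insensitively, and the token kind names are arbitrary.

```python
import re

KEYWORDS = {"not", "and", "or", "is", "in", "between", "like"}

# Alternatives are tried in order; the named group that matched gives the kind.
TOKEN_RE = re.compile(r"""
    (?P<ws>\s+)
  | (?P<number>(?:\d+(?:\.\d*)?|\.\d+)(?:E\d+)?)
  | (?P<string>'(?:[^']|'')*')
  | (?P<quoted>"(?:[^"\\]|\\.)*")
  | (?P<ident>[:A-Za-z_][:A-Za-z_0-9.\-]*)
  | (?P<op><>|<=|>=|[-+*/^=<>])
  | (?P<symbol>[(),])
""", re.VERBOSE)


def tokenize(text):
    """Split text into (kind, value) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        pos = m.end()
        kind, value = m.lastgroup, m.group()
        if kind == "ws":
            continue
        if kind == "ident":
            lowered = value.lower()
            if lowered in ("true", "false"):
                kind, value = "boolean", lowered
            elif lowered == "null":
                kind, value = "null", lowered
            elif lowered in KEYWORDS:
                kind, value = "keyword", lowered
        elif kind == "string":
            # Strip the delimiters; two single quotes stand for one.
            value = value[1:-1].replace("''", "'")
        elif kind == "quoted":
            # A double-quoted identifier: anything goes, backslash escapes.
            kind = "ident"
            value = re.sub(r"\\(.)", r"\1", value[1:-1])
        tokens.append((kind, value))
    return tokens
```

One thing this sketch makes visible: because "-" and "." are identifier continuation characters, a longest-match lexer like this one reads `B5-B4` as a single identifier, so the subtraction needs whitespace (`B5 - B4`) to be tokenized as three tokens.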

Operators precedence:

  1. ^
  2. *, /
  3. +, -
  4. in, between, like
  5. <, >, <=, >=
  6. =, <>, is
  7. and
  8. or
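The precedence list above maps naturally onto a precedence-climbing parser. The following is only a sketch for binary operators: it works on a pre-split list of strings, leaves out unary operators, parentheses and the ternary between, and assumes that ^ is right-associative while everything else is left-associative (the list does not state associativity).

```python
# Binding power derived from the list above: a higher number binds more tightly.
PRECEDENCE = {
    "^": 8,
    "*": 7, "/": 7,
    "+": 6, "-": 6,
    "in": 5, "like": 5,
    "<": 4, ">": 4, "<=": 4, ">=": 4,
    "=": 3, "<>": 3, "is": 3,
    "and": 2,
    "or": 1,
}
RIGHT_ASSOCIATIVE = {"^"}  # assumption: not stated in the list


def parse(tokens):
    """Parse a flat list of operand/operator strings into nested tuples."""
    pos = 0

    def parse_expression(min_prec):
        nonlocal pos
        left = tokens[pos]  # operands are kept as plain strings
        pos += 1
        # Keep absorbing operators that bind at least as tightly as min_prec.
        while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_prec:
            op = tokens[pos]
            pos += 1
            prec = PRECEDENCE[op]
            # Left-associative operators require a strictly tighter right side.
            next_min = prec if op in RIGHT_ASSOCIATIVE else prec + 1
            right = parse_expression(next_min)
            left = (op, left, right)
        return left

    tree = parse_expression(1)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree
```

For example, `parse("a + b * c ^ 2 > 10 and active".split())` groups the power first, then the multiplication, the addition, the comparison, and finally the and.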

Expression Grammar Rules (parsing):

  • Expression: ExpIdentifier | ExpBoolean | ExpNull | ExpString | ExpNumber | ExpArray | ExpTuple | ExpSub | ExpOperation | ExpAdvComp | ExpFunctionCall
  • ExpIdentifier: identifier
  • ExpBoolean: false | true
  • ExpNull: null
  • ExpString: string
  • ExpNumber: number
  • ExpTuple: ExpNumber ExpNumber (used for WKT coordinates pairs)
  • ExpList: Expression [, Expression]*
  • ExpSub: ( ExpOperation | ExpAdvComp )
  • ExpArray: ( [ExpList] )
  • ExpOperation: [- | not] Expression | Expression (arithmetic/logic/relational) operator Expression | Expression is [not] ExpNull
  • ExpAdvComp: Expression [not] between Expression and Expression | Expression [not] like Expression | Expression [not] in Expression
  • ExpFunctionCall: Expression ( [ExpList] )

I believe these are the only rules that are needed.
There may be a few ambiguities to address here regarding array literals, sub-expressions, tuples, and function call arguments...

Everything else in all CQL2 conformance classes (including WKT geometry) can be defined by using ExpFunctionCall as the extension point for spatial/temporal/array operators and spatial/temporal literals (assuming we decide to change array literals to ( 1, 2, 3 ) rather than [ 1, 2, 3 ] as discussed in #718).
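A sketch of what using ExpFunctionCall as the extension point could look like on the evaluation side: spatial literals and spatial operators are just entries in a function registry, so adding a conformance class means registering functions rather than extending the grammar. The registry, the tuple-based AST shape and the toy geometry model below are all assumptions for illustration; the S_INTERSECTS here only handles a point against a bounding box.

```python
from typing import Any, Callable, Dict

FUNCTIONS: Dict[str, Callable[..., Any]] = {}


def register(name):
    """Hypothetical decorator adding a function to the registry."""
    def decorator(fn):
        FUNCTIONS[name.lower()] = fn
        return fn
    return decorator


@register("POINT")
def point(coords):
    # WKT-style literal POINT(3 4): a single ExpTuple argument.
    x, y = coords
    return ("Point", float(x), float(y))


@register("BBOX")
def bbox(minx, miny, maxx, maxy):
    return ("BBox", float(minx), float(miny), float(maxx), float(maxy))


@register("S_INTERSECTS")
def s_intersects(pt, box):
    # Toy predicate: point-in-box only, not a general intersection test.
    _, x, y = pt
    _, minx, miny, maxx, maxy = box
    return minx <= x <= maxx and miny <= y <= maxy


def evaluate(node, feature):
    """Evaluate a nested-tuple AST against a feature's properties."""
    kind = node[0]
    if kind == "literal":
        return node[1]
    if kind == "tuple":
        return tuple(node[1:])
    if kind == "identifier":
        return feature[node[1]]
    if kind == "call":
        name, args = node[1], node[2:]
        fn = FUNCTIONS.get(name.lower())
        if fn is None:
            raise NameError(f"unknown function {name!r}")
        return fn(*(evaluate(arg, feature) for arg in args))
    raise ValueError(f"unknown node kind {kind!r}")


# S_INTERSECTS(geom, BBOX(0, 0, 10, 10)) as a nested-tuple AST
expr = ("call", "S_INTERSECTS",
        ("identifier", "geom"),
        ("call", "BBOX", ("literal", 0), ("literal", 0),
         ("literal", 10), ("literal", 10)))
feature = {"geom": evaluate(("call", "POINT", ("tuple", 3, 4)), {})}
```

Here `evaluate(expr, feature)` goes through exactly the same code path for the literal constructor (POINT, BBOX) and the spatial operator (S_INTERSECTS); a function that is not registered fails with a `NameError` at evaluation time rather than at the grammar level.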

@aaime
Contributor

aaime commented Jul 2, 2022

For reference, GeoServer has been using "Extended CQL" as an expression language for a while, in a few places:

  • Alternative styling languages, like GeoCSS or YSLD
  • Vector data attribute customization (creating a new attribute as an expression based on other, existing attributes)

The language itself has not been extended to make it a more complete expression language; instead, we used functional extensibility to address all the needs we have faced so far (e.g., geometry construction).

cportele self-assigned this Jul 4, 2022
@cportele
Member

cportele commented Jul 4, 2022

Meeting 2022-07-04: Making CQL2 a general expression language is future work. For this version, add a statement that CQL2 contains many language elements of a general expression language and that we will investigate evolving the language into an expression language in a future version (with boolean expressions as a conformance class). The language should be backwards compatible with v1.0.


4 participants