Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clarify type casting in CESQL spec #1281

Merged

Conversation

Cali0707
Copy link
Contributor

@Cali0707 Cali0707 commented May 9, 2024

Follow up on discussion in cloudevents/sdk-go#1046 (comment)

Proposed Changes

  • Clarify which type casts are required to be supported
  • Clarify how boolean -> string type casting works
  • Clarify how type casting is performed in the LIKE operator

Signed-off-by: Calum Murray <cmurray@redhat.com>
@Cali0707
Copy link
Contributor Author

Cali0707 commented May 9, 2024

cc @pierDipi @duglin

cesql/spec.md Outdated

1. `Integer -> String`
1. `String -> Integer`
1. `String -> Boolean`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where are ones like this defined?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I had assumed that they were defined elsewhere and I was just seeking to clarify which ones had to be supported. But, looking more closely now I don't see those defined anywhere...

Let me add those

Signed-off-by: Calum Murray <cmurray@redhat.com>
@Cali0707
Copy link
Contributor Author

cc @duglin @jskeet

Let me know if this makes sense given your feedback last week!

cesql/spec.md Outdated Show resolved Hide resolved
cesql/spec.md Outdated Show resolved Hide resolved
cesql/spec.md Outdated
@@ -353,6 +357,25 @@ left operand of the OR operation evalues to `true`, the right operand MUST NOT b

#### 3.7. Type casting

A CESQL engine MUST support the following type casts:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This talks about "returning" an error - elsewhere, the spec talks about "raising" an error but also returning a value. I suspect we need to be consistent here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I see your point. Personally, I think "returning" makes more sense because it's not like an exception that is raised can be caught somewhere - it is just a second return value you get on every evaluation (which may be null/nil). WDYT? If you think "returning" makes more sense, I'll open a PR to change that everywhere, otherwise I'll switch these to be "raising" an error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I wasn't clear - I'm fine with the "raising" terminology, but the problem is that these casts don't say what the non-error part of the return is when there's a problem. If I cast "BOGUS" to Boolean, an error will be raised... but what's the value of the expression?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see your point - let me add some clarification here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some clarification, could you recheck when you have time @jskeet ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, that definitely works.

cesql/spec.md Outdated Show resolved Hide resolved
Signed-off-by: Calum Murray <cmurray@redhat.com>
@Cali0707 Cali0707 requested a review from jskeet May 20, 2024 15:15
Signed-off-by: Calum Murray <cmurray@redhat.com>
@duglin
Copy link
Collaborator

duglin commented May 22, 2024

Error looks real: cesql/spec.md: line 365: 'may' MUST be capitalized ('MAY')

Signed-off-by: Calum Murray <cmurray@redhat.com>
@Cali0707 Cali0707 requested a review from duglin May 22, 2024 14:39
cesql/spec.md Outdated
@@ -278,6 +278,10 @@ For example, the pattern `_b*` will accept values `ab`, `abc`, `abcd1` but won't
Both `%` and `_` can be escaped with `\`, in order to be matched literally. For example, the pattern `abc\%` will match
`abc%` but won't match `abcd`.

In cases where the left operand is not a `String`, it MUST be cast to a `String` before the comparison is made.
The pattern of the `LIKE` operator (that is, the right operand of the operator) MUST be a valid string predicate,
otherwise the parse MUST return a parse error.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be clear... subject LIKE TRUE isn't valid because TRUE isn't a valid string, however, we talk about how there's implicit casting all over the place... so someone may wonder why TRUE isn't implicitly converted into "true". Am I correct in this thinking? Should we be explicit and say that casting isn't allowed in this case? Or perhaps should we allow it for consistency? @jskeet thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would definitely be good to be clear. Even without the "LIKE" aspect, I can't tell offhand whether "TRUE" = true is casting the LHS to Boolean, or the RHS to String. I suspect it's the former, based on bullet 2 in the list in 3.7.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@duglin I think the main problem is that currently the LIKE expression isn't defined with an expression as the right operand. It is defined as:

expression NOT? LIKE stringLiteral

So, any value on the right operand which is not a string literal MUST be a parse error currently. We can change this, but I'm not sure if it makes sense. The whole point of a LIKE expression is to compare against a pattern, not a specific value. If someone were to compare against another string value wouldn't it just makes sense to use = or something like that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type casting is whenever an operator is defined for values, but the LIKE operator is currently only defined for a literal, not a value. I'm not 100% sure why this was the initial decision, but that's how it currently works

Copy link
Collaborator

@duglin duglin May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's ok to leave it as "must be a string literal w/o casting", but let's be extra clear in the text that implicit type casting isn't allowed in this one spot. I agree defining it as stringLiteral sort of implies it, but saying that things should be implicitly case in all other spaces (even when the spec says the arg is a string) could be unclear to people.

cesql/spec.md Outdated

| Definition | Semantics |
| -------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `Integer -> String` | Returns the string representation of the integer value in base 10. If the value is less than 0, the '-' character is prepended to the result. |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be very anal... w/o leading zeros?

cesql/spec.md Outdated
| `Integer -> String` | Returns the string representation of the integer value in base 10. If the value is less than 0, the '-' character is prepended to the result. |
| `String -> Integer` | Returns the result of interpreting the string as a 32 bit base 10 integer. The string MAY begin with a leading sign '+' or '-'. If the result will overflow or the string is not a valid integer an error is returned along with a value of `0`. |
| `String -> Boolean` | Returns `true` or `false` if the lower case representation of the string is exactly "true" or "false, respectively. Otherwise returns an error along with a value of `false` |
| `Boolean -> String` | Returns `"true"` if the boolean is `true`, and `"false"` if the boolean is `false`. |
Copy link
Collaborator

@duglin duglin May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have a 3x3 table (or something) that shows all combinations in one spot. Then it's easy to see what's missing, like bool->int, int->bool

Signed-off-by: Calum Murray <cmurray@redhat.com>
Signed-off-by: Calum Murray <cmurray@redhat.com>
Signed-off-by: Calum Murray <cmurray@redhat.com>
@Cali0707 Cali0707 requested a review from duglin May 27, 2024 18:39
@Cali0707
Copy link
Contributor Author

@duglin how does this look now?

cesql/spec.md Outdated
| ------- | -------- | ------ | -------- |
| Integer | N/A | MUST | MUST NOT |
| String | MUST | N/A | MUST |
| Boolean | MUST NOT | MUST | N/A |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a valid choice, but I'm curious why you didn't choose to map 1 with true and 0 with false ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tbh, I just based this off the current implementation in the SDKs...

The 1 mapping to true and 0 to false sounds reasonable though, especially because then the zero value for integers would be cast to false.

Which do you think is better?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm partial to allowing it, then everything can be implicitly cast to anything else - and I think you can then remove an error type.

@jskeet any concerns?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with it - personally I do like being clear about expressions in terms of types, but that's mostly in C# etc. I haven't done significant amounts of SQL in years, so I don't know how it would "feel". If others are happy, that's fine by me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@duglin if we do this:

map 1 with true and 0 with false

then how would we handle an integer value like 200? Would that be an error? Or to avoid errors would it be better to say that when casting from integer to boolean, any non-zero integer becomes true, and 0 becomes false.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup- I like that. Int->bool is zero vs non-zero. But bool->int is 0 vs 1

@duglin
Copy link
Collaborator

duglin commented May 28, 2024

Just one question, but either way that one lands.... LGTM -thanks!

Signed-off-by: Calum Murray <cmurray@redhat.com>
@duglin
Copy link
Collaborator

duglin commented May 30, 2024

Approved on the 5/30 call

@duglin duglin merged commit 66c067b into cloudevents:main May 30, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants