Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expression Builder API feedback #494

Closed
ethanresnick opened this issue May 14, 2023 · 23 comments · Fixed by #565
Closed

Expression Builder API feedback #494

ethanresnick opened this issue May 14, 2023 · 23 comments · Fixed by #565
Labels
api Related to library's API discussion Let's talk about design, goals and other stuff related to Kysely's development

Comments

@ethanresnick
Copy link
Contributor

ethanresnick commented May 14, 2023

Thanks again for all the great work you're doing here! I just started using the new(ish) ExpressionBuilder API, and wanted to give a bit of feedback, which I hope will be helpful.

In general, I think the new design makes a lot of sense, and the new names apply better in the various contexts where expressions can be used than the old ones did. My only suggestion has to do with the abbreviated method names cmpr and bxp. Two things struck me about these:

  1. Using abbreviations felt a bit out of place with the rest of the kysely API, which generally seems favors explicitness and readability over maximum concision (e.g., selectFrom rather than from, or something even shorter). To me, the three letters saved by using cmpr as opposed to compare doesn't seem worth it, given that cmpr is not an especially familiar abbreviation.

  2. However, if the expression builder API is going to use abbreviations — and I can see a case for that in this API — then I think it might be worth adding abbreviated methods for specific, common binary comparisons, like eq, neq, gt, gte, etc. Having methods like that is, to me, both more intuitive (because eq is a much more common/familiar abbreviation than cmpr) and more concise (since the operator can be omitted).

So, I guess the above suggest three possible APIs:

  1.  qb.where(({ compare, and }) => and([
       compare('first_name', '=', 'Jennifer'),
       compare('last_name', '!=', 'Tremblay'),
       compare('updated_at', '>', lastMonth)
     ])
  2.  qb.where(({ cmpr, and }) => and([
       cmpr('first_name', '=', 'Jennifer'),
       cmpr('last_name', '!=', 'Tremblay'),
       cmpr('updated_at', '>', lastMonth)
     ])
  3.  qb.where(({ eq, neq, gt, and }) => and([
       eq('first_name', 'Jennifer'),
       neq('last_name', 'Tremblay'),
       gt('updated_at', lastMonth)
     ])

Personally, I think I prefer (1), for maximum clarity without too much verbosity. But, if maximal concision is the goal, then (3) makes more sense to me than two.

@igalklebanov
Copy link
Member

igalklebanov commented May 14, 2023

Hey 👋

Thank you for taking the time to write this.

We went with such abbreviations because:

  1. we didn't want these names to collide with SQL keywords, known and unknown. e.g. compare sounds like something some dialect might have, either full-match, or as part of a built-in function name.
  2. we wanted the function names to "get out of the way" when reading/writing the code. The most important part of a cmpr(...) is its arguments, which should read like SQL to align with WYSIWYG. We don't need a fully qualified compare to realize there's a comparison going on, the arguments speak for themselves.

We thought about eq, neq, etc. and decided to not introduce these:

  1. They're not aligned with WYSIWYG.
  2. More code to maintain for ",'=',".length less characters you type/"copilot"?
  3. They're redundant. Too many options to choose from that do the same thing.

@igalklebanov igalklebanov added discussion Let's talk about design, goals and other stuff related to Kysely's development api Related to library's API labels May 14, 2023
@ethanresnick
Copy link
Contributor Author

ethanresnick commented May 14, 2023

Thanks @igalklebanov! I think I agree with not adding eq, neq, etc... I'm curious if you all considered something like this?

 qb.where(({ $, and }) => and([
   $('first_name', '=', 'Jennifer'),
   $('last_name', '!=', 'Tremblay'),
   $('updated_at', '>', lastMonth)
 ]))

db.updateTable('person')
  .set(({ $ }) => ({  
     age: $('age', '+', 1)
  }))

I'm not necessarily arguing for that, but it would seem to be the logical extension of the current design.

I.e., some of the downsides with cmpr/bxp are that they are not names that a user would know to search for (e.g., in the API docs) or that they'd necessarily recognize as being the function that they're looking for (e.g., in an autocomplete dropdown). $ has both of those downsides, but then seems to do better at "getting out of the way" and letting the arguments speak for themselves.

In the above example, $ would probably be a shorthand for bxp, as bxp seems like the more general one, which people could use almost all the time (unless they run into TS performance/type depth issues).

@koskimas
Copy link
Member

koskimas commented May 15, 2023

If we renamed cmpr to compare, what would bxp be renamed to? binaryExpression? That's way too long and would dominate the expressions. I think we should either abbreviate all or none of them.

We need separate binary expression and a comparison expression. If we only had bxp, you couldn't use an unknown operator without specifying the output type explictly.

cmpr produces a boolean expression by default
bxp produces an expression that has the same type as the operands by default

$ is already reserved for stuff that's not related to SQL ($call, $if, etc).

Another reason for the abbreviated names was that people won't find the methods by searching no matter what we name them. They should be discovered from examples. The API documentation is filled with examples (which people don't seem to find). The new site will have those examples in a more accessible/searchable place.

@koskimas
Copy link
Member

koskimas commented May 16, 2023

We could replace bxp and cmpr with a single function if we accept that using custom operators needs an explicit output type. We support most of the operators the built-in dialects have, but third party dialects might suffer.

I kind of like the idea of replacing both with $ but unfortunately that's not an option as stated in the previous comment. What else could we use?

@ethanresnick
Copy link
Contributor Author

ethanresnick commented May 17, 2023

What else could we use?

Hmm…I think the obvious candidate would be _, since that’s the only other legal non-alphabetic character to start an identifier with, but that feels pretty overloaded to me in backend JS land (with lodash and the like) and also a bit hard to type constantly. So maybe better would be a very short alphabetic string that gestures at ‘expression’…maybe just e?

qb.where(({ e, and }) => and([
   e('first_name', '=', 'Jennifer'),
   e('last_name', '!=', 'Tremblay'),
   e('updated_at', '>', lastMonth)
 ]))

db.updateTable('person')
  .set(({ e }) => ({  
     age: e('age', '+', 1)
  }))

Or possibly xp, which sorta sounds like expression?

qb.where(({ xp, and }) => and([
   xp('first_name', '=', 'Jennifer'),
   xp('last_name', '!=', 'Tremblay'),
   xp('updated_at', '>', lastMonth)
 ]))

db.updateTable('person')
  .set(({ xp }) => ({  
     age: xp('age', '+', 1)
  }))

Could also do just x, which also kinda sounds like expression and has the secondary meaning of ‘a generic placeholder name’, which kinda applies here.

qb.where(({ x, and }) => and([
   x('first_name', '=', 'Jennifer'),
   x('last_name', '!=', 'Tremblay'),
   x('updated_at', '>', lastMonth)
 ]))

db.updateTable('person')
  .set(({ x }) => ({  
     age: x('age', '+', 1)
  }))

@ethanresnick
Copy link
Contributor Author

ethanresnick commented May 17, 2023

Another direction would be is, which I actually quite like…makes sense in the boolean context especially, but works decently in other contexts too:

qb.where(({ is, and }) => and([
   is('first_name', '=', 'Jennifer'),
   is('last_name', '!=', 'Tremblay'),
   is('updated_at', '>', lastMonth)
 ]))

db.updateTable('person')
  .set(({ is }) => ({  
     age: is('age', '+', 1)
  }))

I think that’s my favorite by far: it’s actually a full word, while still being shorter than the current names. If we don’t like is in the non-boolean contexts, we could also make is the replacement for cmpr and think of something different for bxp

@ethanresnick
Copy link
Contributor Author

ethanresnick commented May 17, 2023

I suspect an objection to is will be that it could be confused with the SQL operator, and I agree that’s an issue.

So here’s one more option: get, which is also a full and appropriate word, and is short enough that, with a two argument overload, it might be able to replace unary and all the shorthand methods (not, exists, neg) as well.

qb.where(({ get, and }) => and([
   get('first_name', '=', 'Jennifer'),
   get('last_name', '!=', 'Tremblay'),
   get('updated_at', '>', lastMonth),
   get('not', 'is_deleted')
 ]))

db.updateTable('person')
  .set(({ get }) => ({  
     age: get('age', '+', 1),
     backwards: get('-', 'age')
  }))

Of course, we can mix and match between all these options (eg, unifying cmpr, bxp, and unary, but calling that unified method e or exp or whatever).

I’m also not sure about the compiler performance hit of unifying unary with the other methods via a two-argument overload, but it does seem nice from an API POV.

@igalklebanov
Copy link
Member

Left field idea:

eb.b('ref', '+', 5) // `b` for binary.
eb.b01('ref', '=', 'moshe') // cmpr replacement. `b` for binary, `01` for SqlBool hint :)

@ethanresnick
Copy link
Contributor Author

Hmm, I think I'd rather try to unify bxp and unary, in which case leaning into "binary" in the name doesn't make sense. But, if that unification doesn't work for some reason, then b seems like a decent option. I'm not a fan of the 01, though: I think kysely should infer the return type from the operator (at least for the built-in operators, which seem like they cover 95%+ of the usage).

@igalklebanov
Copy link
Member

igalklebanov commented May 17, 2023

Another one:

eb('ref', '+', 5)
eb('ref', '=', 'moshe')

@ethanresnick
Copy link
Contributor Author

ethanresnick commented May 17, 2023

Love that!

Although it does mean that every other property (and, or, fn) can't be destructured.... assuming eb is the whole builder object, and not a function called eb on the builder. But maybe we could pass the builder twice?

@igalklebanov
Copy link
Member

igalklebanov commented May 17, 2023

@ethanresnick

Love that!

Although it does mean that every other property (and, or, fn) can't be destructured.... assuming eb is the whole builder object,
and not a function called eb on the builder. But maybe we could pass the builder twice?

Yeah that could work.

interface EB {
  (): {}
  eb: Omit<EB, 'eb'>
  fn: () => {}
  ...
}

({ eb, fn }) => ...

@ethanresnick
Copy link
Contributor Author

ethanresnick commented May 18, 2023

I'm loving this!

But why not:

interface EB {
  (): {}
  eb: EB // no omit here
  fn: () => {}
  ...
}

To me, it makes sense that the eb property should just be a reference to the full EB object (since that's what it's named after, after all).

Combining that with unifying bxp/unary/cmpr and we'd have:

qb.where(({ eb, and }) => and([
   eb('first_name', '=', 'Jennifer'),
   eb('last_name', '!=', 'Tremblay'),
   eb('updated_at', '>', lastMonth),
   eb('not', 'is_deleted')
 ]))

db.updateTable('person')
  .set(eb => ({  
     age: eb('age', '+', 1),
     backwards: eb('-', 'age')
  }))

@koskimas
Copy link
Member

koskimas commented May 18, 2023

This seems awesome! But let's really think this through. Deprecating the functions we just added to replace another set of deprecated functions is a bit ugly 😅 Let's make sure this one sticks or we end up changing the API again.

@ethanresnick
Copy link
Contributor Author

ethanresnick commented May 18, 2023

This seems awesome! But let's really think this through. Deprecating the functions we just added to replace another set of deprecated functions is a bit ugly 😅 Let's make sure this one sticks of we end up changing the API again.

Totally agree: if we make changes, we should be reasonably confident we're making all the changes we'd want to make, and doing more than just bikeshedding on function names. I've been thinking a bit more deeply about this today, and will have a proposal soon.

In the meantime, I'm curious: how are custom (dialect-specific) binary operators supposed to work today with bxp/cmpr? I see that the type of OP is allowed to extend Expression<unknown>, but a custom operator name (just the name, with no operands) isn't really be an Expression, right? I think it should just be a string?

Also, it seems like the return type of bxp in the custom operator case would always be ExtractTypeFromReferenceExpression<DB, TB, RE>, where RE is the expression on the lhs. But that doesn't seem quite right either? I.e., why are we assuming that the operator will output the same type as the first operand?

By my read, if we were gonna merge bxp and cmpr (just taking these two as a base case to start with), I think the type would use three overloads, and be something like:

// This is the exact type that `cmpr` has today, except it constrains OP to _only_ known
// comparison operators, rather than `ComparisonOperatorExpression`.
function bxp<
  O extends SqlBool = SqlBool,
  RE extends ReferenceExpression<DB, TB> = ReferenceExpression<DB, TB>
>(
    lhs: RE,
    op: ComparisonOperator,
    rhs: OperandValueExpressionOrList<DB, TB, RE>
  ): ExpressionWrapper<O>;

// This is the exact type that `bxp` has today, except that it also replaces 
// BinaryOperatorExpression with just BinaryOperator, and simplifies the return type
// because ComparisonOperator is now handled above.
function bxp<
  RE extends ReferenceExpression<DB, TB>,
>(
  lhs: RE,
  op: BinaryOperator,
  rhs: OperandValueExpression<DB, TB, RE>
): ExpressionWrapper<ExtractTypeFromReferenceExpression<DB, TB, RE>>

// Finally, we handle the custom operator case as just string, and let the caller
// customize the return type with a new RES parameter.
function bxp<
  RE extends ReferenceExpression<DB, TB>,
  RES extends unknown = ExtractTypeFromReferenceExpression<DB, TB, RE>
>(
  lhs: RE,
  op: string,
  rhs: OperandValueExpression<DB, TB, RE>
): ExpressionWrapper<RES>

Does that seem right to both of you? I just wanna check that I'm not misunderstanding something fundamental before I go down a longer rabit hole of thinking more about how we might unify some of these functions.

Btw, I’ve never made overloads with different type parameters before — didn’t even realize that was legal — so I’m not sure how well TS will resolve different call sites against overloads like this, or how the compiler performance compares with unifying all of them through some fancy conditional types. But, again, I just wanted to start by checking my conceptual understanding of how custom operators should work, before we get into these details

@igalklebanov
Copy link
Member

igalklebanov commented May 18, 2023

@ethanresnick have you tried this in a playground? I suspect it might break autocompletion which is a big no-no.

@ethanresnick
Copy link
Contributor Author

ethanresnick commented May 18, 2023

@igalklebanov I haven't, but I will. Again, though, I'm less interested in "should this be overloads or conditional types" and more interested in: am I understanding correctly that the current setup for custom operators — namely, making the operator be an Expression rather than a string, and forcing the output type to match the output type of the LHS — doesn't really make sense?

It also seems a little weird to be that bxp and cmpr seem to accept custom operators, but unary doesn't? I’m also not sure if there’s any use case for custom operators with more than two operands (probably not, as most of those will just be functions?), but that doesn’t seem supported either.

@ethanresnick
Copy link
Contributor Author

ethanresnick commented May 19, 2023

Quick update: I did test the above set of overloads in the playground, and autocomplete works! However, if I also try to add overloads for what is currently the unary function, things get weird: the autocomplete dropdown shows all the valid options for each argument (ie, the available column names, operator names, etc), but it also shows some suggestions that are inapplicable (but that would be valid for other overloads), thanks to this bug: microsoft/TypeScript#26892

That’s too bad, because I was hoping this eb function could unify unary, binary, and nary operators (including and and or) to give:

qb.where(eb => eb('and', [
  eb('first_name', '=', 'Jennifer'),
  eb('last_name', '!=', 'Tremblay'),
  eb('updated_at', '>', lastMonth),
  eb('not', 'is_deleted')
]))

But it doesn’t seem like that’s currently possible while still having great autocomplete.

The rule would’ve been:

  • eb with three arguments is a binary expression, with the operator infixed
  • eb with two arguments, where the second is an expression, is a unary operator expression (with the operator in the first argument)
  • eb with two arguments, where the second is an array, is an nary operator expression (with the operator in the first argument)

@koskimas
Copy link
Member

Let's not try to combine and, or or unary into the new function. They'd break autocompletion no matter how they are implemented.

@ethanresnick
Copy link
Contributor Author

Ok. In that case, I think my only outstanding questions are the ones I posed above about custom operators: how are they passed, and are they supported only for binary operators?

@koskimas
Copy link
Member

koskimas commented Jul 6, 2023

@ethanresnick I'm working on these changes here #565

Thanks again for the discussion here! I think the next versions is a big improvement.

@koskimas
Copy link
Member

koskimas commented Jul 6, 2023

Ok. In that case, I think my only outstanding questions are the ones I posed above about custom operators: how are they passed, and are they supported only for binary operators?

We already support most operators. I think it's fine to require

sql`custom_op`

in the rare cases its needed. We can also add operators as we go.

@ethanresnick
Copy link
Contributor Author

Wow, thanks for working on this @koskimas! It's awesome to see it landed 😁 🚀

As far as custom operators go, I don't use any custom operators myself, so I don't have a strong opinion, but using the sql tag does honestly seem a bit weird to me, simply because the sql tag returns an expression, and an operator is not an expression. So, I'd be inclined to accept the custom operator as a string, which doesn't need to break autocomplete. Then again, I can see some benefits to the template tag approach:

  • it avoids the (very remote) possibility of a built-in kysely operator conflicting with a custom operator that has the same name but different argument or return types

  • it means that, if the user makes a typo on one of the built-in operator names, TS will catch it (though, with autocomplete, such typos should be fairly rare).

So, I guess I could go either way. I do think though, that the inferred return type for expressions with custom operators seems a bit off, as I mentioned in the comment linked above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api Related to library's API discussion Let's talk about design, goals and other stuff related to Kysely's development
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants