Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QueryResultTypeWalker: resolve if int/float/bool is fetched as native type or stringified #506

Draft
wants to merge 9 commits into
base: 1.3.x
Choose a base branch
from

Conversation

janedbal
Copy link
Contributor

@janedbal janedbal commented Dec 13, 2023

Currently, inferring numeric types (mainly results of AVG, SUM etc) is imperfect as it is currently implemented to serve all configurations resulting in union types like int|numeric-string. This is painful to work with in real codebase with some specific configuration where you know it is always int.

To match setup of modern PHP versions (8.1+) using most used driver (PDO), I added option not to stringify such expressions.

Rewritten to autodetct driver & PHP version & PDO setup to properly infer types.

@ondrejmirtes
Copy link
Member

To match setup of modern PHP versions (8.1.25+) using most used driver (PDO), I added option not to stringify such expressions.

What about auto-detecting this from PhpVersion, and also from objectManagerLoader if one is provided?

@janedbal janedbal changed the title QueryResultTypeWalker: add option not to stringify expressions QueryResultTypeWalker: resolve if int/float/bool is fetched as native type or stringified Dec 18, 2023
@@ -812,43 +844,39 @@ public function walkSelectExpression($selectExpression)
$resultAlias = $selectExpression->fieldIdentificationVariable ?? $this->scalarResultCounter++;
$type = $this->unmarshalType($expr->dispatch($this));

if (class_exists(TypedExpression::class) && $expr instanceof TypedExpression) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was dead block of code as class_exists(TypedExpression::class) is always false.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -1097,8 +1127,12 @@ public function walkLiteral($literal)
break;

case AST\Literal::BOOLEAN:
$value = strtolower($literal->value) === 'true' ? 1 : 0;
$type = new ConstantIntegerType($value);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was just wrong for postgres, that returns real booleans, but we were infering 1|0|'1'|'0'

private function shouldStringifyExpressions(Type $type): TrinaryLogic
{
$driver = $this->em->getConnection()->getDriver();
$nativeConnection = $this->getNativeConnection();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly, this causes connection initialization.

@janedbal janedbal marked this pull request as draft December 21, 2023 08:28
@janedbal
Copy link
Contributor Author

Note: this is still WIP and I still plan to finalize this, but implementing all the built-in SQL functions is non-trivial when considering all edge-cases. To be able to implement it properly, I created test repository to verify all differences: https://github.com/janedbal/php-database-drivers-fetch-test/

@VincentLanglet
Copy link
Contributor

Hi @janedbal, I need this PR to move on mine #520.

What is missing here (maybe writing tests ?) ; do you need help ?

@ondrejmirtes
Copy link
Member

@VincentLanglet Jan already drowned dozens of hours to get this right, it's a very complex topic as you can see in here https://github.com/janedbal/php-database-drivers-fetch-test/. Don't rush this :)

@VincentLanglet
Copy link
Contributor

@VincentLanglet Jan already drowned dozens of hours to get this right, it's a very complex topic as you can see in here janedbal/php-database-drivers-fetch-test. Don't rush this :)

Sorry if my message was heard like this, wasn't my intention. I thought one of the issue with this big PR was to write tests for all the build-in method, and that maybe I could find and offer time to this.

Anyway, it will take the time it need to take ! :)

@janedbal
Copy link
Contributor Author

The green CI here is misleading, there is still ton of work awaiting (walkFunction is wrong, DoctrineTypeDescriptor::getDatabaseInternalType has no access to driver so it cannot return proper type, tests need to be MUCH better). I just didnt push my local WIP (maybe I should) :)

@janedbal
Copy link
Contributor Author

@VincentLanglet Pushed the WIP so that you can check the direction I'm leaning towards.

* - bigint: int int int int
* - bool: int int bool bool
*/
public function getDatabaseInternalType(Driver $driver): Type;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change will be a BC break since the interface has @api annotation.

Not sure which strategy should be followed to avoid this...
Should another interface be introduced ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, another interface somehow.

@VincentLanglet
Copy link
Contributor

@VincentLanglet Pushed the WIP so that you can check the direction I'm leaning towards.

Thanks, can I help you on this ?
If I understand correctly the goal is to uncomment the tests and make them pass or is there something else ?
Do you need more test to be written ?

I notice there was conflict with the 1.3.x branch, do you have time to solve them or do you want me to try ? :)

@janedbal
Copy link
Contributor Author

@VincentLanglet I've some stuff in backlog to do first before returning to this. But the outline what needs to be done:

  • (1) Decide how to deal with untested drivers. For example how BOOL column is fetched. Do we want to:
    • Produce best guess?
      • Currently only postgre uses bool, any other driver is just 1|0 (or '1'|'0' when stringified). So what do we produce for "other driver"? Sidenote: I think bool is currently completely broken for postgre.
    • Return mixed?
    • Return benevolent-union of some reasonable value? What is reasonable value, those other drivers ever used?
  • (2) Carefully finalize the implementation (mainly all the aggregate functions, which are tricky)
    • There are plenty of edge-cases to decide.
      • For example SQRT(negative) leads either to NULL or a failure depending on a driver. But I assume we dont want to add NULL to the result just because of this. Or do we?
  • (3) Test every supported driver
  • (4) Solve the BC break introduced

And maybe more stuff I already forgot. I think you can easily help with (3) as such test can be added in standalone MR and possibly merged beforehand. The rest is just digging in code.

@VincentLanglet
Copy link
Contributor

@VincentLanglet I've some stuff in backlog to do first before returning to this. But the outline what needs to be done:

  • (1) Decide how to deal with untested drivers. For example how BOOL column is fetched. Do we want to:

    • Produce best guess?

      • Currently only postgre uses bool, any other driver is just 1|0 (or '1'|'0' when stringified). So what do we produce for "other driver"? Sidenote: I think bool is currently completely broken for postgre.
    • Return mixed?

    • Return benevolent-union of some reasonable value? What is reasonable value, those other drivers ever used?

I would take the little step by little step approach: untested drivers will returns the same type than before.
This way, when this PR is merged:

  • tested drivers users will have a better type
  • untested drivers won't have anything changed

And if a untested-drivers user complains about the type, he/we just have test this drivers and update the code to support it.

  • (2) Carefully finalize the implementation (mainly all the aggregate functions, which are tricky)

    • There are plenty of edge-cases to decide.

      • For example SQRT(negative) leads either to NULL or a failure depending on a driver. But I assume we dont want to add NULL to the result just because of this. Or do we?

What about starting with only the existing implemented aggregate functions ?

public function walkAggregateExpression($aggExpression): string

It could be easier to first, make a PR with those aggregate functions and then adding other aggregate one by one ?

The options I see are

  • Changing the signature to
public function getDatabaseInternalType(/** Driver $driver */): Type;

and passing the driver. In next major the param will be required. This is often done by Symfony.

  • Creating an interface with a different method
interface DriverAwareDoctrineTypeDescriptor {
     public function getDatabaseInternalTypeFromDriver(Driver $driver): Type;
}

and implementing the interface ; then the check

$typeDescriptor instanceof DriverAwareDoctrineTypeDescriptor
    ?  $typeDescriptor->getDatabaseInternalTypeFromDriver($driver);
    :  $typeDescriptor->getDatabaseInternalType();

could be made.

I think the decision is more on @ondrejmirtes about how he wants to proceed in order to evolve such method signature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants