Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extra keys cause InvalidArgument #8744

Closed
derrabus opened this issue Nov 23, 2022 · 14 comments
Closed

Extra keys cause InvalidArgument #8744

derrabus opened this issue Nov 23, 2022 · 14 comments

Comments

@derrabus
Copy link

Found while testing 5.0.0-rc1

https://psalm.dev/r/f7ad4bc37e

This function expects an array with a message key. I'm passing an array that has that key among others. Yet, I'm getting an InvalidArgument error. That feels like a false positive.

@psalm-github-bot
Copy link

I found these snippets:

https://psalm.dev/r/f7ad4bc37e
<?php

/**
 * @param array{message: string}|null $error
 */
function outputMessage(?array $error): void
{
    if (!$error) {
        return;
    }
    
    echo $error['message'];
}

outputMessage(error_get_last());
Psalm output (using commit 8f39de9):

ERROR: InvalidArgument - 15:15 - Argument 1 of outputMessage expects array{message: string}|null, but array{file: string, line: int, message: string, type: int}|null provided

@danog
Copy link
Collaborator

danog commented Nov 23, 2022

Starting from Psalm v5, arrays will be sealed by default to avoid unexpected bugs, remove false positives and false negatives and improve overall type inference with more and better assertions.
See https://psalm.dev/articles/psalm-5 for a friendly overview of the changes! :)

@danog danog closed this as completed Nov 23, 2022
@derrabus
Copy link
Author

Thank you for the quick response and the pointer!

Okay wow… This will break a lot of well annotated projects out there.

I fully understand the problem that you want to solve, but I'd appreciate it if we could think about a friendlier way forward.

Could we at least get a more specific error that a generic InvalidArgument? That would allow me to ignore the error on existing codebases and gives me time to review affected doc blocks.

@derrabus
Copy link
Author

Another idea could be to introduce a new sealed-array{} construct and leave the array{} semantics as they are in v4 and as they are understood by other tooling like PHPStorm and PHPStan. Has there been a discussion on this change that I can join?

@weirdan
Copy link
Collaborator

weirdan commented Nov 23, 2022

See #8701

@danog
Copy link
Collaborator

danog commented Nov 23, 2022

The majority of projects making use of shape annotations can easily use type aliases to re-use shapes passed around codebases.

The specific example posted in this issue is one of the rare cases where ... can be used, though I really don't recommend using it because ... disables some useful type assertion logic, leasing to potential false positives and false negatives.

Always use sealed array shapes for userland code, and I may come up with a way to reference to array shapes of native functions in a future PR (maybe even by directly providing a set of native type aliases for commonly used shapes in the PHP STL, though this is really a bit of an edge case IMO :)

@muglug
Copy link
Collaborator

muglug commented Nov 24, 2022

The Psalm 5 announcement post will discuss this in detail, and I hope issues with unsealed shapes being passed where sealed shapes are expected will be minor.

As for having a specific issue for this, it’s an interesting idea, but the better option is to have an error message that explicitly shows which fields are missing in error messages (vs leaving it up to the user). That will hopefully be added soon!

@derrabus
Copy link
Author

At least on the codebases where I tested Psalm 5 on, this change is a major source of "new" errors. And in most cases, the affected code parts look like my example: A function declares an array shape as input parameter with only the keys it actually reads from. It does not care much about extra keys and it's not confused by them either. No join() calls, not foreach'ing over those structures, only key lookups. The sealed structures provide a level of strictness that I hardly need.

And this is the dilemma I find myself in: This change results in many false-positive errors in codebases that have documented their array shapes thoroughly already. Authors are force to revisit all the doc blocks they've written over the last months/years with the non-sealed behavior in mind. This causes a lot of busy work that might even block an update.

Improving the error message would be good as well because currently those are unreadable on big array shapes. Then again, my problem is not that I don't understand the errors. It's that I have so many of them on a perfectly fine codebase. Right now, I would have to suppress all of them in order to upgrade to Psalm 5. And InvalidArgument covers some other really important cases that I don't want to have suppressed unnoticed.

So again, either of those option would be a big help for me:

  • A dedicated type for sealed array shapes, effectively making the new level of strictness opt-in instead of opt-out.
  • A dedicated issue type for extra keys, allowing me to suppress this class of errors at first so I can gradually migrate to sealed array shapes.

@danog
Copy link
Collaborator

danog commented Nov 24, 2022

This causes a lot of busy work that might even block an update.

That's kind of the point of a breaking change in a major release, authors will have to revisit their code to account for the new behavior :)

Anyway, the fact that your codebase has a bunch of shape-related InvalidArgument errors is symptom of the fact that you're not reusing shapes properly via type aliases (and/or DTOs are not being used nearly enough).

Today @ work I'll start migrating our large codebase to sealed-by-default shapes: I already know I'll have some ~500 issues, but I already know that the majority of them can be easily fixed by defining some ~10-20 type aliases, and by using DTOs instead of arrays with Valinor.

It'll probably take a day or two of work, but the benefits especially from the reduced false positives in assertions and the exclusion of an entire class of foreach, join and shape->generic array bugs are much larger compared to the relatively small migration effort.

@AndrolGenhald
Copy link
Collaborator

AndrolGenhald commented Nov 24, 2022

Perhaps we could make conversion from sealed arrays to eg array{a: int, b: int, ...mixed} an opt in change with Psalter (specifically only for arrays where it would fix an issue)? This release has already been pushed back so much that it's probably not a good idea to keep pushing it further, but maybe for 5.1.

@orklah
Copy link
Collaborator

orklah commented Nov 24, 2022

Thanks a lot for your input @derrabus ! We're aware that this change will imply work in codebases.
I believe the projects your work on are probably some of the biggest users of array shapes while also the more accurately typed there are. This also mean those projects will ultimately benefit the most from this. I have myself made a few PR on doctrine repos in the past to describe array shapes and I'm convinced there were issues in there that went unnoticed.

Do you have an idea about how many issue this can represent for known repos? One thing to note too is that a lot of errors will probably be fixed by changing a single signature so it may appear more impressive than it really is

I like the idea of a Psalter fix too, but it would be a shame if everyone just added ...mixed without taking time to realize the issue at hand so it would have to be very well explained

@derrabus
Copy link
Author

This causes a lot of busy work that might even block an update.

That's kind of the point of a breaking change in a major release, authors will have to revisit their code to account for the new behavior :)

No, I am sorry, but it is not. It is okay of course to introduce breaking changes, but not for the sake of breaking things or keeping people busy. I do accept breaking changes, but what I'm trying to discuss here is a way to make the way forward a less painful one.

Anyway, the fact that your codebase has a bunch of shape-related InvalidArgument errors is symptom of the fact that you're not reusing shapes properly via type aliases (and/or DTOs are not being used nearly enough).

Also this: I did not come here for a lecture. I am aware that the codebases I'm working on are not the pinnacle of modern software design. Most of those codebases are legacy code that use associative arrays way more than they should. And those arrays change their shape all the time which is why working with type aliases does not get me far.

Rest assured that I am constantly addressing this kind of issues in my projects. And I am very happy that the developers I'm working with slowly adopt static analysis and document their array shapes. The possibility to document array shapes enabled them to use static analysis on code that seemed inaccessible to automated quality tools. This is really wonderful!

But I am certain that the very same developers will be most frustrated when upgrading to Psalm 5 if the change is shipped as I experience it in the RC. Imagine developers thoroughly documenting input parameters to legacy functions whenever they needed to understand and change them, with exactly the syntax that Psalm supports. And suddenly they upgrade to a new version and that very same tool tells them that all their documented types are broken because they should've used a syntax that the old version didn't even understand to fix a problem that the old versions wasn't even remotely aware of.

I repeat: I understand the reasoning behind sealed array shapes and I do believe using them will improve the quality of errors reported by Psalm. But please give me a painless way to progressively adopt the change. I have made two proposals on how this could be achieved.


Thanks a lot for your input @derrabus !

Thank you for taking the time! 🙂

We're aware that this change will imply work in codebases. I believe the projects your work on are probably some of the biggest users of array shapes while also the more accurately typed there are.

You are probably referring to the Doctrine repos, but there's also proprietary code I work on for my clients (because working on Doctrine won't pay my bills unfortunately 😉). This is mostly legacy stuff that I did not write myself and I mainly use static analysis there to regain control over old code with poor test coverage. And the developers that maintain those codebases are fairly new to static analysis.

I can certainly explain to them why sealing array shapes is probably a good idea. Think about mistyped array keys when adding values to an array, that have gone unnoticed previously. But as I said, give them a way to iteratively adopt the concept.

This also mean those projects will ultimately benefit the most from this.

They certainly will and I hope I have made myself clear that I don't question the benefits of the sealed array shape feature.

Do you have an idea about how many issue this can represent for known repos? One thing to note too is that a lot of errors will probably be fixed by changing a single signature so it may appear more impressive than it really is

I have no idea how to measure that, I'm afraid.

I like the idea of a Psalter fix too, but it would be a shame if everyone just added ...mixed without taking time to realize the issue at hand so it would have to be very well explained

An automated fixer would certainly help as well. Could be an alternative to the two options I've mentioned earlier.

@VincentLanglet
Copy link
Contributor

Perhaps we could make conversion from sealed arrays to eg array{a: int, b: int, ...mixed} an opt in change with Psalter (specifically only for arrays where it would fix an issue)? This release has already been pushed back so much that it's probably not a good idea to keep pushing it further, but maybe for 5.1.

Be aware that currently PHPStan (and certainly phpstorm) doesn't support ... syntax so far.
phpstan/phpstan#8438

An option to keep considering

array{a: int, b: int}

as unsealed would have help for the transition v4-v5 and the compatibility with other tool.

@VincentLanglet
Copy link
Contributor

That's kind of the point of a breaking change in a major release, authors will have to revisit their code to account for the new behavior :)

There is are big differences between BC-break in code, and BC-break in phpdoc.

The Phpdoc is shared between multiple tools: Psalm, PHPStan, Phpstorm, ...
If for one tools array{} means sealed array and for another it means unsealed array, there will be conflicts and false positive.
The issue already exists between Psalm 4 users and Psalm 5 users.

If, as a psalm 5 user, I use a library which still annotate the code with Psalm 4, I'll get multiple false positive.
If I provide a PR to change the Phpdoc, I'll break the Phpdoc for Psalm 4 users (and PHPStan users).
It's basically like asking to add psalm version in the require or conflict section of every libraries.

Having at least an option to control the meaning of array{foo: string} would have been helpful

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants