New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement literal analysis for all pure built-in string and array functions #5723
Conversation
Heh, sometimes I forget that I'm not using phabel)))) |
This will likely require some limits to be put in place, to make sure performance doesn't drop too much. |
For safety reason, we should make sure we only execute functions that are declare pure in psalm. It should be a test to make sure we don't add impure functions |
Of course, I'll add limitations for the pad and range functions. If you were talking about the recursive static analysis, of course limitations should also be put in place to avoid excessive depth and width of the recursion tree (the first could actually be emitted as an issue on its own).
This is already in place, along with a manual whitelist of allowed functions. |
I'm not sure how I feel about this. Static analysis should mean static analysis. While Psalm does a few calculations inline ( More importantly, I'm not convinced that much code would benefit from this — it's exceptionally rare that Do you have an example of real-world code that justifies this sort of enhancement? |
To be honest, I too thought of this when writing this module. About native language operators:
That may actually be broken, see #5722 :)
The main reason for this PR is the upcoming recursive static analysis module :) |
Not sure it's related, but sometimes Psalm's tests doesn't display the most precise syntax. It is possible that |
Yes — you can use |
You've mentioned this a couple of times — what real-world bugs will this detect that Psalm cannot? |
A simple example, the one that inspired me to write this PR, is localization: in our project, localization is cached to memcached, but originally it's stored in a simple set of PHP files, that are parsed on runtime if a cache miss occurs. |
Sorry, would you mind providing example code for that? |
A basic example: <?php
function i18n(string $key): string
{
static $cache = [];
static $phpCache = null;
if (array_key_exists($key, $cache)) {
return $cache[$key];
}
$phpCache ??= include 'en_EN.php';
return $phpCache[$key];
}
echo i18n('init').PHP_EOL;
echo i18n('welcome').PHP_EOL; en_EN.php <?php
return ['init' => 'Hello!']; The real code is a tad bit more complex (even better suited for semantic analysis), but the core logic is the same. |
I'm going to close this, I'm afraid — this is not something that I think would be useful to a wide number of Psalm users. Instead I'd encourage you to consider writing your own plugin instead. |
@danog it would be interesting to check the result of the "check against real project" in the CI to see if the PR would have detected interesting issues in the wild |
Just ran them, and as expected, nothing relevant was found, at least with the default levels. |
This PR allows inferring the exact return type of a whitelisted subset of pure built-in functions, by actually executing them during static analysis.
This all started when I implemented proper return types for literal values in the ExplodeReturnTypeProvider: after implementing it manually, I realized that a similar but more generic approach could be used for all pure functions, with due precautions: this is the implementation so far.
Basically, when a call to a whitelisted built-in pure function occurs, all and only the literal types of the parameters provided to the function are extracted, and permutated to generate all combinations of possible exact return types.
If any of the parameter unions contains 0 literals, execution does not occur.
Execution is also skipped if we're running on a version of PHP that does not provide the required function (to make this reproducible across all versions, I suggest adding the symfony polyfills as runtime dependency).
The real underlying reason for this MR is an upcoming implementation of recursive static analysis in psalm, which should allow even more insight into complex systems, by similarly recursively replacing parameter types of functions with the exact literal value passed in a specific call, generating full call stacks and analyzing possible paradoxes/other issues of each single call stack.
Of course, this would also require inferring the exact literal return type of built-in functions, which is what this MR provides.
Generics and conditional types provide a similar, yet much less powerful solution: with recursive static analysis, finding bugs will be even easier, and will not require annotating every single function with generics and rewriting the behavior as a generic conditional.