Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for references and improve UnusedVariable checks (fixes #7254). #7339

Merged
merged 9 commits into from Jan 26, 2022
3 changes: 3 additions & 0 deletions UPGRADING.md
Expand Up @@ -152,6 +152,8 @@
- [BC] Atomic::__toString() is now final
- [BC] Atomic::__toString() now returns a more detailed version of the type (it calls getId() under the hood)
- [BC] Atomic::getId() has now a first param $exact. Calling the method with false will return a less detailed version of the type in some cases (similarly to what __toString used to return)
- [BC] To remove a variable from the context, Context::remove(). Calling
`unset($context->vars_in_scope[$var_id])` can cause problems when using references.

## Removed
- [BC] Property `Psalm\Codebase::$php_major_version` was removed, use
Expand Down Expand Up @@ -194,3 +196,4 @@
- [BC] Method `Psalm\Issue\CodeIssue::getMessage()` was removed
- [BC] Method `Psalm\DocComment::parse()` was removed
- [BC] Class `Psalm\Type\Atomic\THtmlEscapedString` has been removed
- [BC] Property `Psalm\Context::$vars_from_global` has been renamed to `$referenced_globals`
1 change: 1 addition & 0 deletions config.xsd
Expand Up @@ -413,6 +413,7 @@
<xs:element name="RedundantIdentityWithTrue" type="IssueHandlerType" minOccurs="0" />
<xs:element name="RedundantPropertyInitializationCheck" type="IssueHandlerType" minOccurs="0" />
<xs:element name="ReferenceConstraintViolation" type="IssueHandlerType" minOccurs="0" />
<xs:element name="ReferenceReusedFromConfusingScope" type="IssueHandlerType" minOccurs="0" />
<xs:element name="ReservedWord" type="IssueHandlerType" minOccurs="0" />
<xs:element name="StringIncrement" type="IssueHandlerType" minOccurs="0" />
<xs:element name="TaintedCallable" type="IssueHandlerType" minOccurs="0" />
Expand Down
1 change: 1 addition & 0 deletions docs/running_psalm/issues.md
Expand Up @@ -216,6 +216,7 @@
- [RedundantIdentityWithTrue](issues/RedundantIdentityWithTrue.md)
- [RedundantPropertyInitializationCheck](issues/RedundantPropertyInitializationCheck.md)
- [ReferenceConstraintViolation](issues/ReferenceConstraintViolation.md)
- [ReferenceReusedFromConfusingScope](issues/ReferenceReusedFromConfusingScope.md)
- [ReservedWord](issues/ReservedWord.md)
- [StringIncrement](issues/StringIncrement.md)
- [TaintedCallable](issues/TaintedCallable.md)
Expand Down
39 changes: 39 additions & 0 deletions docs/running_psalm/issues/ReferenceReusedFromConfusingScope.md
@@ -0,0 +1,39 @@
# ReferenceReusedFromConfusingScope

Emitted when a reference assigned in a potentially confusing scope is reused later.
Since PHP doesn't scope loops and ifs the same way most other languages do, a common issue is the reuse of a variable
declared in such a scope. Usually it doesn't matter, because a reassignment to an already-defined variable will just
reassign it, but if the already-defined variable is a reference it will change the value of the referred to variable.

```php
<?php

$arr = [1, 2, 3];
foreach ($arr as &$i) {
++$i;
}

// ...more code, after which the usage of $i as a reference has been forgotten by the programmer

for ($i = 0; $i < 10; ++$i) {
echo $i;
}
// $arr is now [2, 3, 10] instead of the expected [2, 3, 4]
```

To fix the issue, unset the reference after using it in such a scope:

```php
<?php

$arr = [1, 2, 3];
foreach ($arr as &$i) {
++$i;
}
unset($i);

for ($i = 0; $i < 10; ++$i) {
echo $i;
}
// $arr is correctly [2, 3, 4]
```
4 changes: 1 addition & 3 deletions src/Psalm/Codebase.php
Expand Up @@ -1966,9 +1966,7 @@ public function addTaintSource(

$this->taint_flow_graph->addSource($source);

$expr_type->parent_nodes = [
$source->id => $source,
];
$expr_type->parent_nodes[$source->id] = $source;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't really that closely related to everything else in this PR, but I'm pretty sure it's a correct change. I'm not familiar enough with the taint feature to know how to test it, but I worry that without this change some of the taint stuff might mess up the parent_nodes for a reference.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's fix that when we have a reported bug then. I see we have multiple instances of both ways of declaring parent_nodes and I'm not sure when to choose which

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was leaning that way originally, but it looks like this only exists for plugins to use, and based on the name it's supposed to add a source, not remove all other sources as well.
I think the only time parent_nodes should be set to an array rather than setting an offset is when the previous nodes are known to be wrong, such as when assigning a variable.

$foo = 'a';
$foo = 'b'; // Should no longer have a parent node for `$foo = 'a'`

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unclear on what can cause a variable to have multiple parents though

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit unsure about that myself. I utilized multiple parents for references, but outside of that I don't know what use it has, other than it probably has something to do with taints.

}

/**
Expand Down
132 changes: 118 additions & 14 deletions src/Psalm/Context.php
Expand Up @@ -13,9 +13,12 @@
use Psalm\Type\Atomic\DependentType;
use Psalm\Type\Atomic\TArray;
use Psalm\Type\Union;
use RuntimeException;

use function array_keys;
use function array_search;
use function array_shift;
use function assert;
use function count;
use function in_array;
use function is_int;
Expand All @@ -40,6 +43,49 @@ class Context
*/
public $vars_possibly_in_scope = [];

/**
* Keeps track of how many times a var_in_scope has been referenced. May not be set for all $vars_in_scope.
*
* @var array<string, int<0, max>>
*/
public $referenced_counts = [];

/**
* Maps references to referenced variables for the current scope.
* With `$b = &$a`, this will contain `['$b' => '$a']`.
*
* All keys and values in this array are guaranteed to be set in $vars_in_scope.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fairly confident I caught all the places a var_in_scope might be unset, but if this isn't true for some reason I consider that a bug to be fixed.

*
* To check if a variable was passed or returned by reference, or
* references an object property or array item, see Union::$by_ref.
*
* @var array<string, string>
*/
public $references_in_scope = [];

/**
* Set of references to variables in another scope. These references will be marked as used if they are assigned to.
*
* @var array<string, true>
*/
public $references_to_external_scope = [];

/**
* A set of globals that are referenced somewhere.
*
* @var array<string, true>
*/
public $referenced_globals = [];

/**
* A set of references that might still be in scope from a scope likely to cause confusion. This applies
* to references set inside a loop or if statement, since it's easy to forget about PHP's weird scope
* rules, and assinging to a reference will change the referenced variable rather than shadowing it.
*
* @var array<string, CodeLocation>
*/
public $references_possibly_from_confusing_scope = [];

/**
* Whether or not we're inside the conditional of an if/where etc.
*
Expand Down Expand Up @@ -362,11 +408,6 @@ class Context
*/
public $has_returned = false;

/**
* @var array<string, bool>
*/
public $vars_from_global = [];

/**
* @var array<string, true>
*/
Expand Down Expand Up @@ -449,6 +490,28 @@ public function update(
}
}

/**
* Updates the list of possible references from a confusing scope,
* such as a reference created in an if that might later be reused.
*/
public function updateReferencesPossiblyFromConfusingScope(
Context $confusing_scope_context,
StatementsAnalyzer $statements_analyzer
): void {
$references = $confusing_scope_context->references_in_scope
+ $confusing_scope_context->references_to_external_scope;
foreach ($references as $reference_id => $_) {
if (!isset($this->references_in_scope[$reference_id])
&& !isset($this->references_to_external_scope[$reference_id])
&& $reference_location = $statements_analyzer->getFirstAppearance($reference_id)
) {
$this->references_possibly_from_confusing_scope[$reference_id] = $reference_location;
}
}
$this->references_possibly_from_confusing_scope +=
$confusing_scope_context->references_possibly_from_confusing_scope;
}

/**
* @param array<string, Union> $new_vars_in_scope
*
Expand Down Expand Up @@ -496,19 +559,60 @@ public static function getNewOrUpdatedVarIds(Context $original_context, Context
return $redefined_var_ids;
}

public function remove(string $remove_var_id): void
public function remove(string $remove_var_id, bool $removeDescendents = true): void
{
unset(
$this->referenced_var_ids[$remove_var_id],
$this->vars_possibly_in_scope[$remove_var_id]
);

if (isset($this->vars_in_scope[$remove_var_id])) {
$existing_type = $this->vars_in_scope[$remove_var_id];
unset($this->vars_in_scope[$remove_var_id]);

$this->removeDescendents($remove_var_id, $existing_type);
if ($removeDescendents) {
$this->removeDescendents($remove_var_id, $existing_type);
}
}
$this->removePossibleReference($remove_var_id);
unset($this->vars_possibly_in_scope[$remove_var_id]);
}

/**
* Remove a variable from the context which might be a reference to another variable, or
* referenced by another variable. Leaves the variable as possibly-in-scope, unlike remove().
*/
public function removePossibleReference(string $remove_var_id): void
{
if (isset($this->referenced_counts[$remove_var_id]) && $this->referenced_counts[$remove_var_id] > 0) {
// If a referenced variable goes out of scope, we need to update the references.
// All of the references to this variable are still references to the same value,
// so we pick the first one and make the rest of the references point to it.
$references = [];
foreach ($this->references_in_scope as $reference => $referenced) {
if ($referenced === $remove_var_id) {
$references[] = $reference;
unset($this->references_in_scope[$reference]);
}
}
assert(!empty($references));
$first_reference = array_shift($references);
if (!empty($references)) {
$this->referenced_counts[$first_reference] = count($references);
foreach ($references as $reference) {
$this->references_in_scope[$reference] = $first_reference;
}
}
}
if (isset($this->references_in_scope[$remove_var_id])) {
$reference_count = &$this->referenced_counts[$this->references_in_scope[$remove_var_id]];
if ($reference_count < 1) {
throw new RuntimeException("Incorrect referenced count found");
}
--$reference_count;
}
unset(
$this->vars_in_scope[$remove_var_id],
$this->referenced_var_ids[$remove_var_id],
$this->referenced_counts[$remove_var_id],
$this->references_in_scope[$remove_var_id],
$this->references_to_external_scope[$remove_var_id],
);
}

/**
Expand Down Expand Up @@ -645,7 +749,7 @@ public function removeDescendents(

foreach ($this->vars_in_scope as $var_id => $type) {
if (preg_match('/' . preg_quote($remove_var_id, '/') . '[\]\[\-]/', $var_id)) {
unset($this->vars_in_scope[$var_id]);
$this->remove($var_id, false);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why this wasn't unsetting $this->vars_possibly_in_scope before, but it works fine this way. I'm guessing it was just overlooked? Or maybe $this->vars_possibly_in_scope is guaranteed to not contain descendents? (side note, that should actually be descendant, descendent is an adjective with a subtly different meaning).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, vars_possibly_in_scope is just an array of bools. It's nothing without vars_in_scope so yeah, probably overlooked and never caused issue

}

foreach ($type->getAtomicTypes() as $atomic_type) {
Expand Down Expand Up @@ -676,7 +780,7 @@ public function removeMutableObjectVars(bool $methods_only = false): void
}

foreach ($vars_to_remove as $var_id) {
unset($this->vars_in_scope[$var_id], $this->vars_possibly_in_scope[$var_id]);
$this->remove($var_id, false);
}

$clauses_to_keep = [];
Expand Down
1 change: 1 addition & 0 deletions src/Psalm/Internal/Analyzer/ClosureAnalyzer.php
Expand Up @@ -159,6 +159,7 @@ public static function analyzeExpression(

if ($use->byRef) {
$use_context->vars_in_scope[$use_var_id]->by_ref = true;
$use_context->references_to_external_scope[$use_var_id] = true;
}

$use_context->vars_possibly_in_scope[$use_var_id] = true;
Expand Down
37 changes: 9 additions & 28 deletions src/Psalm/Internal/Analyzer/FunctionLikeAnalyzer.php
Expand Up @@ -771,7 +771,7 @@ public function analyze(

foreach ($context->vars_in_scope as $var => $_) {
if (strpos($var, '$this->') !== 0 && $var !== '$this') {
unset($context->vars_in_scope[$var]);
$context->removePossibleReference($var);
}
}

Expand Down Expand Up @@ -854,10 +854,6 @@ private function checkParamReferences(
throw new UnexpectedValueException('$position should not be false here');
}

if ($storage->params[$position]->by_ref) {
continue;
}

if ($storage->params[$position]->promoted_property) {
continue;
}
Expand Down Expand Up @@ -984,6 +980,7 @@ private function processParams(
$project_analyzer = $statements_analyzer->getProjectAnalyzer();

foreach ($params as $offset => $function_param) {
$function_param_id = '$' . $function_param->name;
$signature_type = $function_param->signature_type;
$signature_type_location = $function_param->signature_type_location;

Expand Down Expand Up @@ -1087,7 +1084,7 @@ private function processParams(
&& !$function_param->type->isBool())
) {
$param_assignment = DataFlowNode::getForAssignment(
'$' . $function_param->name,
$function_param_id,
$function_param->location
);

Expand All @@ -1105,16 +1102,6 @@ private function processParams(
$statements_analyzer->data_flow_graph->addPath($type_source, $param_assignment, 'param');
}

if ($function_param->by_ref
&& $codebase->find_unused_variables
) {
$statements_analyzer->data_flow_graph->addPath(
$param_assignment,
new DataFlowNode('variable-use', 'variable use', null),
'variable-use'
);
}

if ($storage->variadic) {
$this->param_nodes += [$param_assignment->id => $param_assignment];
}
Expand All @@ -1123,18 +1110,19 @@ private function processParams(
}
}

$context->vars_in_scope['$' . $function_param->name] = $var_type;
$context->vars_possibly_in_scope['$' . $function_param->name] = true;
$context->vars_in_scope[$function_param_id] = $var_type;
$context->vars_possibly_in_scope[$function_param_id] = true;

if ($function_param->by_ref) {
$context->vars_in_scope['$' . $function_param->name]->by_ref = true;
$context->vars_in_scope[$function_param_id]->by_ref = true;
$context->references_to_external_scope[$function_param_id] = true;
}

$parser_param = $this->function->getParams()[$offset] ?? null;

if ($function_param->location) {
$statements_analyzer->registerVariable(
'$' . $function_param->name,
$function_param_id,
$function_param->location,
null
);
Expand Down Expand Up @@ -1170,7 +1158,7 @@ private function processParams(

IssueBuffer::maybeAdd(
new MismatchingDocblockParamType(
'Parameter $' . $function_param->name . ' has wrong type \'' . $param_type .
'Parameter ' . $function_param_id . ' has wrong type \'' . $param_type .
'\', should be \'' . $signature_type . '\'',
$function_param->type_location
),
Expand Down Expand Up @@ -1271,13 +1259,6 @@ private function processParams(
}
}

if ($function_param->by_ref) {
// register by ref params as having been used, to avoid false positives
// @todo change the assignment analysis *just* for byref params
// so that we don't have to do this
$context->hasVariable('$' . $function_param->name);
}

foreach ($function_param->attributes as $attribute) {
AttributeAnalyzer::analyze(
$this,
Expand Down
Expand Up @@ -129,6 +129,7 @@ function (Clause $c) use ($mixed_var_ids): bool {
$negated_while_types,
[],
$inner_loop_context->vars_in_scope,
$inner_loop_context->references_in_scope,
$changed_var_ids,
[],
$statements_analyzer,
Expand Down