Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve logic of variable lookup (LookupMixin._filter_stmts) #1111

Merged
merged 3 commits into from Aug 2, 2021

Conversation

david-yz-liu
Copy link
Contributor

@david-yz-liu david-yz-liu commented Jul 22, 2021

Steps

  • For new features or bug fixes, add a ChangeLog entry describing what your PR does.
  • Write a good description on what the PR does.

Description

My interest in astroid's lookup algorithm was piqued so I found some outstanding issues to investigate. This PR contains three logical changes, but I decided to submit just one PR because my fixes are all in the same function, LookupMixin._filter_stmts, which is the main helper method for determining which assignment/delete statements are "relevant" for a given variable.

To help review, I split up the PR into three different commits, one for each issue. If you'd like, I can also make three separate PRs from them, no big deal. I also added a bunch of test cases for each one---including some test cases for existing correct behaviour, as I wasn't sure whether these cases for lookup were being tested indirectly elsewhere in the test suite. Happy to delete some tests I added if they're redundant.

  1. False positive undefined-loop-variable with pylint 2.5.3 pylint#3711. The issue here was that an assignment statement to a variable in one branch of an if statement overrode statements associated with a use of that variable in another branch:

    >>> import astroid
    >>> ast = astroid.parse("""
    ...     x = 10
    ...     if x > 10:
    ...         x = 100
    ...     elif x > 0:
    ...         x = 200
    ... """)
    >>> node = [n for n in ast.nodes_of_class(astroid.Name) if n.name == 'x'][1]  # the x in x > 0
    >>> node.lookup('x')  # This should return the `x` from `x = 10`
    (<Module.builtins l.0 at 0x1f3b2e1fdf0>, ())

    Fixed it by explicitly ignoring statements that "are_exclusive" with the variable in _filter_stmts. Note that there had previously been a check for this, I just moved the check up a bit.

  2. Scope lookup problem when using closures #180. The issue here was that AssignName nodes representing function parameters didn't have a "_parent_stmts" value consistent with the other nodes in the function's body: doing node.statement().parent returns the parent of the FunctionDef, which won't result in a "match" at this check in _filter_stmts:

    https://github.com/PyCQA/astroid/blob/22e0cdcfcdff3ea80142ba6b90758b42a85ed8e0/astroid/node_classes.py#L1178-L1179

    This results in the following error:

    >>> ast = astroid.parse("""
    ...     def f(x):
    ...         x = 100
    ...         if True:
    ...             print(x)
    ... """)
    >>> name = [n for n in ast.nodes_of_class(astroid.Name) if n.name == 'x'][0]
    >>> name.lookup('x')  # should only be the x = 100
    (<FunctionDef.f2 l.1 at 0x1f3b39b4dc0>, [<AssignName.x l.2 at 0x1f3b39b45b0>, <AssignName.x l.3 at 0x1f3b39b4b80>])

    One subtlety: because of this check,

    https://github.com/PyCQA/astroid/blob/22e0cdcfcdff3ea80142ba6b90758b42a85ed8e0/astroid/node_classes.py#L1222-L1225

    the bug only occurs when the variable being used does not have the same parent as the assignment statement overwriting the function parameter, which is why I included the strange if True: above. (Comment: I wonder if this check was actually an attempt to fix this same problem?)

    The fix I added was to have a special case for Arguments nodes when updating _stmt_parents. Not a big fan of having special cases like this, but not sure if there's a cleaner way.

  3. (Edit: the change is inspired by, but does not fix, the issue) pylint doesn't understand the scope of except clause variables pylint#626. The issue here is pretty straightforward, ExceptHandlers have their own local scope for the exception variable, as of Python 3. From the Python spec:

    When an exception has been assigned using as target, it is cleared at the end of the except clause.

    Here's an example:

    >>> ast = astroid.parse("""
    ...     try:
    ...         1 / 0
    ...     except ZeroDivisionError as e:
    ...         pass
    ...     print(e)
    ... """)
    >>> name = [n for n in ast.nodes_of_class(astroid.Name) if n.name == 'e'][0]
    >>> name.lookup('e')  # Should be empty
    (<Module l.0 at 0x1f3b39bbc70>, [<AssignName.e l.4 at 0x1f3b39bbd00>])

    My fix here was a bit clunky, adding a special case to reset the accumulated assignments in _filter_stmts if the variable being was was inside the ExceptHandler that bound that variable, and otherwise to skip that AssignName. I think it would be possible (preferred?) to make ExceptHandler its own scope (and subclass LocalsDictNodeNG), but this was a bigger change and so didn't want to make that change without discussion.

Type of Changes

Type
🐛 Bug fix

Related Issue

Closes pylint-dev/pylint#3711
Closes #180
Closes Ref pylint-dev/pylint#626. The issue still occurs in pylint, see #1111 (comment).

@Pierre-Sassoulas Pierre-Sassoulas added Bug 🪳 topic-control-flow Control flow understanding labels Jul 29, 2021
@Pierre-Sassoulas Pierre-Sassoulas added this to the 2.6.6 milestone Jul 29, 2021
@Pierre-Sassoulas Pierre-Sassoulas self-requested a review July 30, 2021 19:26
@Pierre-Sassoulas Pierre-Sassoulas added the pylint-tested PRs that don't cause major regressions with pylint label Jul 30, 2021
Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello, thank you for the very detailed merge request. I tried to include that in pylint and this does fix 3711 (which was broken before see the tests suite in pylint-dev/pylint#4777. ), but regarding pylint's 626 I don't think it does the result stay the same. Or maybe I did not understand something ?

I get the following:

____________________________________________________________________________________ test_functional[unused_variable] _____________________________________________________________________________________

self = <pylint.testutils.lint_module_test.LintModuleTest object at 0x7fe67f24d640>

    def runTest(self):
>       self._runTest()
E       AssertionError: Wrong results for file "unused_variable":
E       
E       Expected in testdata:
E        166: unused-variable
E        168: undefined-variable

@david-yz-liu
Copy link
Contributor Author

david-yz-liu commented Jul 30, 2021

No @Pierre-Sassoulas that's totally my bad. I got so excited about the change that I must have messed up testing with pylint.

After double-checking just now: The change in the third commit does fix an astroid error in how exceptions are handled, and in particular the test test_except_var_after_block_single [1] does currently fail on master. I incorrectly thought that fixing the bug here would also fix the Pylint error, but that's not the case---the latter requires additional work in the VariablesChecker I think, or perhaps making ExceptHandler its own scope.

Anyway, I see two different options:

  1. I can remove the third commit and just keep the first two commits, closing the first two issues.
  2. I can keep the third commit and just remove references to the Pylint issue 626, since that's still outstanding.

In all cases, I can rebase and force push, or make new commits on top of this branch, or create a separate branch/PR. Not sure what workflow you'd prefer.

Let me know what you'd like me to do, and sorry again for the inconvenience!

[1] Note: I just made a small commit beba399 clarifying the test that fails on master; there are a few others as well. (Edit: squashed the commit with a force push, but the tests reflect this comment.)

@Pierre-Sassoulas
Copy link
Member

Thank you for the explanation. I think we can keep the third commit, and only remove the "Closes 626" from the changelog. Could you provide the use case the third commit fixes so I can add it in #4777, please ? :)

@david-yz-liu
Copy link
Contributor Author

Updated to remove the reference to issue 626. Here's an example of an improvement to pylint:

def main(lst):
    try:
        raise ValueError
    except ValueError as e:
        pass

    for e in lst:
        pass

    # e will be undefined if lst is empty
    print(e)

main([])  # Raises UnboundLocalError

Currently pylint doesn't report any errors with the above code, but after this change, it correctly reports

$ python -m pylint test.py
************* Module test
test.py:11:10: W0631: Using possibly undefined loop variable 'e' (undefined-loop-variable)

@Pierre-Sassoulas
Copy link
Member

Thank for the example ! pylint did not raise anything before (Old result available on the pylint MR, I updated the tests suite with your example) and now rightly raise an undefined-loop-variable on print(e). I wonder if it should be undefined-variable instead ? If that's an issue, do you think it's an issue with astroid or should we change a check in pylint ?

@Pierre-Sassoulas Pierre-Sassoulas modified the milestones: 2.6.6, 2.7.0 Aug 1, 2021
@Pierre-Sassoulas
Copy link
Member

Regarding the conflict I had during merge, I think a change was introduced independentely and I took your version. Do not hesitate to rebase and fix the conflict yourself if I did that wrong :)

@david-yz-liu
Copy link
Contributor Author

@Pierre-Sassoulas thank you!

Regarding the error, I think undefined-loop-variable is appropriate---the same error is reported by pylint if you remove the try block altogether, i.e. on

def main(lst):
    for e in lst:
        pass

    # e will be undefined if lst is empty
    print(e)  # also undefined-loop-variable

main([])  # Raises UnboundLocalError

My understanding is that's the whole point of this message, so I don't think it's a problem.

Regarding the merge conflict, there was a conflict because #1097 changed the code right above my change in _filter_stmts. I resolved the conflict just by keeping both sets of changes, so you can see the diff now doesn't affect #1097's change at all. I force-pushed so the branch is back to the original 3-commit structure. I think it's good to merge :)

@Pierre-Sassoulas
Copy link
Member

Make sense, thank you for clearing that up :) ! And thank you for the huge work on this, this is going to be the main feature of 2.6.6 !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug 🪳 pylint-tested PRs that don't cause major regressions with pylint topic-control-flow Control flow understanding
Projects
None yet
Development

Successfully merging this pull request may close these issues.

False positive undefined-loop-variable with pylint 2.5.3 Scope lookup problem when using closures
2 participants