-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Possible false positive with use-sequence-for-iteration
on in-place sets with dynamic content?
#5788
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I think you need a set but you're not making the check that hello is in your iterable explicit. Doing so would make your code faster. I took the liberty of removing the print from your example so the benchmark is not majorly measuring the time it takes to print to terminal: Here's the original def strings_and_hello(may_contain_hello):
"""
Prints a set of strings without duplicates, while also ensuring 'hello' always gets
printed
"""
r = ""
for string in {*may_contain_hello, "hello"}:
r += string
return r
iterations = 10000000
for i in range(iterations):
strings_and_hello(["hello", "goodbye"]) # prints both hello and goodbye
strings_and_hello(["goodbye"]) # prints both hello and goodbye
If you can replace your iterable by set in b.py: def strings_and_hello(may_contain_hello):
"""
Prints a set of strings without duplicates, while also ensuring 'hello' always gets
printed
"""
r = ""
if "hello" not in may_contain_hello:
r += "hello"
for string in may_contain_hello:
r += string
return r
iterations = 10000000
for i in range(iterations):
strings_and_hello({"hello", "goodbye"}) # prints both hello and goodbye
strings_and_hello({"goodbye"}) # prints both hello and goodbye
Now that isn't really a fair comparison. Maybe you can't change the input. So using the following file c.py: def strings_and_hello(may_contain_hello):
"""
Prints a set of strings without duplicates, while also ensuring 'hello' always gets
printed
"""
r = ""
may_contain_hello = set(may_contain_hello)
if "hello" not in may_contain_hello:
r += "hello"
for string in may_contain_hello:
r += string
return r
iterations = 10000000
for i in range(iterations):
strings_and_hello(["hello", "goodbye"]) # prints both hello and goodbye
strings_and_hello(["goodbye"]) # prints both hello and goodbye We still have a small perf increase compared to a.py:
Which is why I think the warning is not a false positive. |
I think it is a false positive. It looks like that case was explicitly contemplated: But on the other hand, there was discussion that this warning was only intended for in-place sets. This isn't really an "in place" set anymore if it's deduplicating an input. I read the discussion on #4776, and I think this check was mostly about the nondeterministic ordering you get from iterating a set. Not about performance, really. But for performance, take a look at the most atomic case:
User gets that and says, "okay pylint I trust you, let me cast to list after de-duping": >>> timeit('[x for x in var]', setup='var={1, 2, 3}') # pretend var is dynamic
0.3215374090068508
>>> timeit('[x for x in list(var)]', setup='var={1, 2, 3}')
0.46824162200209685 Note that if the user does ⬆️ , then they will fall afoul of the checker proposed in #4355. |
Unfortunately this won't account for a use case like: def deduplicate_two_lists(input1, input2):
for el in {*input1, *input2}:
do_something() |
Great point @jacobtylerwalls, as we're raising a message on something that is not an in-place set/list the change required are a refactor (like the one I did above), not simply casting to set/list (as evidenced by your benchmark, it just add more thing to do before iterating). I think we need to stick to in-place set/list or make it clearer that a simple cast is not the solution. |
@Pierre-Sassoulas is the task for this issue just updating the documentation or actually updating |
I think we need to also change the behavior to target only inlined set. |
Perhaps by ignoring sets that have an unpacking with |
seems reasonable that the existing functional test |
Yeah I came to that same realization in #5788 (comment). 👍🏻 |
Bug description
I'm getting a warning for
use-sequence-for-iteration
in my code, but I believe I have a valid reason to use it that should possibly be considered as an exception for the rule.In my code I iterate over in-place sets with some in-place content, and some dynamic content to ensure that I iterate over the in-place content if it doesn't already exist in the dynamic content, while also ensuring I never iterate over the same thing twice.
Example:
Do let me know if you think this is a bad practice!
I can't find much discussion about this warning online (which makes sense considering it was added recently) so I would love to engage and leave a paper trail for the next user who gets this warning for a similar usage.
Command used
Pylint output
Expected behavior
No warning for the line
Pylint version
The text was updated successfully, but these errors were encountered: