New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve iterparentnodeids
to consume /
parts until the first ::
#8577
Conversation
…th a / in its name by checking if there is :: before the selected / or any :: after. Also added a test case for this.
How do I go about fixing these errors? |
The python 3.10 are unrelated and we ignore them as others fix it this weekend |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, im under the impression we have a follow-up coming
Thanks!
testing/test_nodes.py
Outdated
@@ -23,6 +23,8 @@ | |||
("a/b/c::D/d::e", ["", "a", "a/b", "a/b/c::D", "a/b/c::D/d", "a/b/c::D/d::e"]), | |||
# : alone is not a separator. | |||
("a/b::D:e:f::g", ["", "a", "a/b", "a/b::D:e:f", "a/b::D:e:f::g"]), | |||
# / not considered if a part of a test name | |||
("a/b::c/d::e[/test]", ["", "a", "a/b::c", "a/b::c/d", "a/b::c/d::e[/test]"]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Im under the impression a/b
is missing as element
But thats for a follow-up as the same detail applies to other nodes as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, but I believe that is apart of the functionality of the iterparentnodeids
function because on line 65 of src/_pytest/nodes.py
@bluetech put a comment that says
Line 65 in 1fc4506
Note that :: parts are only considered at the last / component. |
So that any
::
between /
would be considered as a part of the path.Unless you mean this may need to be reconsidered, then I apologize for this comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i mean this may need a reconsideration much later, your pr is fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh okay. I understand. For this to be merged do I just wait for @bluetech to review it? I am new to the process, sorry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh okay. I understand. For this to be merged do I just wait for @bluetech to review it? I am new to the process, sorry.
Right -- I will try to review it today or tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this @parthdpa!
In the issue you described the solution as:
I return the path only if there is :: after or there are no :: before.
I'm having a bit of trouble wrapping my head around the new behavior and relating it to the code. It might be good but I think maybe refactoring the code or adding some explanation to the docstring would help.
To maybe help with that, let me try to reconstruct how I got to this code in the first place. I was probably thinking of a grammar like the following, using the longest-match/"eager" rule to resolve ambiguities:
SEP_PART = [^/]* # any text without /
COLON_PART = [^::]* # any text without :: (not proper regex syntax...)
NODEID = "" (SEP_PART "/")* (COLON_PART "::")* COLON_PART
which translates to something like this:
def iterparentnodeids(nodeid: str) -> Iterator[str]:
pos = 0
# The root Session node - always present.
yield ""
# Eagerly consume SEP parts.
while True:
at = nodeid.find(SEP, pos)
if at == -1:
break
if at > 0:
yield nodeid[:at]
pos = at + len(SEP)
# Eagerly consume :: parts.
while True:
at = nodeid.find("::", pos)
if at == -1:
break
if at > 0:
yield nodeid[:at]
pos = at + len("::")
# The node ID itself.
if nodeid:
yield nodeid
Then for some reason the code tries to be too clever by half and merges the two loops, but that seems useless now, the above code is probably better and easier to understand. So I'd start with that.
My thought was to change the rule such that we eagerly consume /
parts until the first ::
.
Taking the a/b/c::D/d::e
test as an example, that would change the interpretation from
["", "a", "a/b", "a/b/c::D", "a/b/c::D/d", "a/b/c::D/d::e"]
to
["", "a", "a/b", "a/b/c", "a/b/c::D/d", "a/b/c::D/d::e"]
i.e. the path parts cannot contain ::
.
But maybe your idea is better, with a bit more explaining so I can understand it :)
Thank you for the reply @bluetech
This seems better. I was assuming that I should keep the same behavior so nothing breaks. Clarifying QuestionsI think what needed to be clarified is the possibilities of paths when running pytest (but these questions are because of my little experience of using pytest). Is it ever possible for a path to only contain just My CodeWhat lead to my implementation was keeping all existing tests passing.
After finding a location i.e.
or
but not like this
as I am now following an assumption that anything after the last
so as to consume I think what would lead to this new implementation would be answers of "no" to my clarifying questions above. Sorry about how the code looks, I do not know how to make the code colors and nice looking like you do above. Can you let me know how? |
To create a "code block" you do
To add syntax highlighting, you need to specify the programming language. In pytest's case that's
You can see here for more details.
Right - the issue is not about a bug in the code, it's about the implemented behavior, so if we change the behavior, it's expected that some tests would require changes.
I think some of these are not possible in practice, but we get enough oddities and allow enough flexibility that I think that in general we should assume as little as possible. Basically, the function should handle any arbitrary string as a node ID.
I think this would have the desired outcome, but the code becomes a bit complicated and I'd need to spend some time with it. For me it would be easier if you try to start with the code in #8577 (review), which implements the existing behavior in the |
def iterparentnodeids(nodeid: str) -> Iterator[str]:
pos = 0
# The root Session node - always present.
yield ""
# Eagerly consume SEP parts.
while True:
at = nodeid.find(SEP, pos)
if at == -1:
break
if at > 0:
# CHANGED HERE
# Only consume / until the first ::
first_colons = nodeid.find("::", pos, at)
# break loop and consider :: when the first :: is encountered
if first_colons != -1:
break
# Otherwise yield the consumed / part
yield nodeid[:at]
pos = at + len(SEP)
# Eagerly consume :: parts.
while True:
at = nodeid.find("::", pos)
if at == -1:
break
if at > 0:
yield nodeid[:at]
pos = at + len("::")
# The node ID itself.
if nodeid:
yield nodeid I have implemented the behavior to the code above in the first while loop and added comments where I changed the functionality to consume |
@parthdpa that looks good! I think you can simplify a bit this way:
I think this will have the same outcome but a bit simpler/more efficient. |
def iterparentnodeids(nodeid: str) -> Iterator[str]:
pos = 0
first_colons = nodeid.find("::")
# The root Session node - always present.
yield ""
# Eagerly consume SEP parts until first colons.
while True:
at = nodeid.find(SEP, pos, first_colons)
if at == -1:
break
if at > 0:
yield nodeid[:at]
pos = at + len(SEP)
# Eagerly consume :: parts.
while True:
at = nodeid.find("::", pos)
if at == -1:
break
if at > 0:
yield nodeid[:at]
pos = at + len("::")
# The node ID itself.
if nodeid:
yield nodeid So like this? A question I have about this is if if first_colons == -1:
at = nodeid.find(SEP, pos) or something of the like? |
Right, if we leave it as first_colons: Optional[int] = nodeid.find("::")
if first_colons == -1:
first_colons = None I added the The value |
def iterparentnodeids(nodeid: str) -> Iterator[str]:
pos = 0
first_colons: Optional[int] = nodeid.find("::")
if first_colons == -1:
first_colons = None
# The root Session node - always present.
yield ""
# Eagerly consume SEP parts until first colons.
while True:
at = nodeid.find(SEP, pos, first_colons)
if at == -1:
break
if at > 0:
yield nodeid[:at]
pos = at + len(SEP)
# Eagerly consume :: parts.
while True:
at = nodeid.find("::", pos)
if at == -1:
break
if at > 0:
yield nodeid[:at]
pos = at + len("::")
# The node ID itself.
if nodeid:
yield nodeid Like this? The |
Yes, that looks good to me!
Yes, let's update the PR to this and update the tests accordingly (you can also add other test cases if you can think of them or want to test something specifically). Then we'll see how this approach actually works out :) |
…consider ::. Tests were changed to reflect this.
@bluetech I have made the changes and pushed here |
iterparentnodeids
to consume /
parts until the first ::
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I think this is good to go! I only left a suggestion on the changelog.
In the future we should definitely explore @RonnyPfannschmidt suggestion to use something structured and information-preserving for the node IDs, so that we can fix all such issues, but for now hopefully the tradeoff we're making here is the better one.
@RonnyPfannschmidt this is somewhat different than when you first reviewed, can you PTAL again? |
Im currently down with a cold, if its good for you let's go for it Long term we want a more nuanced datastructure to help us with the ambiguous details |
OK. @parthdpa I'll take the liberty of applying my suggestion, will merge after CI is green. |
Fix for issue #8509 and improve
iterparentnodeids
to consume/
parts until the first::