Improve `iterparentnodeids` to consume `/` parts until the first `::` #8577

parthdpa · 2021-04-23T04:46:15Z

Fix for issue #8509 and improve iterparentnodeidsto consume / parts until the first ::

…th a / in its name by checking if there is :: before the selected / or any :: after. Also added a test case for this.

parthdpa · 2021-04-24T16:53:53Z

How do I go about fixing these errors?

RonnyPfannschmidt · 2021-04-24T17:07:58Z

The python 3.10 are unrelated and we ignore them as others fix it this weekend

RonnyPfannschmidt

Looks good, im under the impression we have a follow-up coming

Thanks!

RonnyPfannschmidt · 2021-04-24T17:10:57Z

testing/test_nodes.py

@@ -23,6 +23,8 @@
        ("a/b/c::D/d::e", ["", "a", "a/b", "a/b/c::D", "a/b/c::D/d", "a/b/c::D/d::e"]),
        # : alone is not a separator.
        ("a/b::D:e:f::g", ["", "a", "a/b", "a/b::D:e:f", "a/b::D:e:f::g"]),
+        # / not considered if a part of a test name
+        ("a/b::c/d::e[/test]", ["", "a", "a/b::c", "a/b::c/d", "a/b::c/d::e[/test]"]),


Im under the impression a/b is missing as element

But thats for a follow-up as the same detail applies to other nodes as well

I see, but I believe that is apart of the functionality of the iterparentnodeids function because on line 65 of src/_pytest/nodes.py @bluetech put a comment that says

pytest/src/_pytest/nodes.py

Line 65 in 1fc4506

Note that :: parts are only considered at the last / component.

So that any :: between / would be considered as a part of the path.
Unless you mean this may need to be reconsidered, then I apologize for this comment.

i mean this may need a reconsideration much later, your pr is fine

Oh okay. I understand. For this to be merged do I just wait for @bluetech to review it? I am new to the process, sorry.

Oh okay. I understand. For this to be merged do I just wait for @bluetech to review it? I am new to the process, sorry.

Right -- I will try to review it today or tomorrow.

bluetech

Thanks for working on this @parthdpa!

In the issue you described the solution as:

I return the path only if there is :: after or there are no :: before.

I'm having a bit of trouble wrapping my head around the new behavior and relating it to the code. It might be good but I think maybe refactoring the code or adding some explanation to the docstring would help.

To maybe help with that, let me try to reconstruct how I got to this code in the first place. I was probably thinking of a grammar like the following, using the longest-match/"eager" rule to resolve ambiguities:

SEP_PART = [^/]*   # any text without /
COLON_PART = [^::]*  # any text without :: (not proper regex syntax...)
NODEID = "" (SEP_PART "/")* (COLON_PART "::")* COLON_PART

which translates to something like this:

def iterparentnodeids(nodeid: str) -> Iterator[str]:
    pos = 0
    # The root Session node - always present.
    yield ""
    # Eagerly consume SEP parts.
    while True:
        at = nodeid.find(SEP, pos)
        if at == -1:
            break
        if at > 0:
            yield nodeid[:at]
        pos = at + len(SEP)
    # Eagerly consume :: parts.
    while True:
        at = nodeid.find("::", pos)
        if at == -1:
            break
        if at > 0:
            yield nodeid[:at]
        pos = at + len("::")
    # The node ID itself.
    if nodeid:
        yield nodeid

Then for some reason the code tries to be too clever by half and merges the two loops, but that seems useless now, the above code is probably better and easier to understand. So I'd start with that.

My thought was to change the rule such that we eagerly consume / parts until the first ::.
Taking the a/b/c::D/d::e test as an example, that would change the interpretation from

["", "a", "a/b", "a/b/c::D", "a/b/c::D/d", "a/b/c::D/d::e"]

to

["", "a", "a/b", "a/b/c", "a/b/c::D/d", "a/b/c::D/d::e"]

i.e. the path parts cannot contain ::.

But maybe your idea is better, with a bit more explaining so I can understand it :)

parthdpa · 2021-04-25T17:43:06Z

Thank you for the reply @bluetech
Some explanation of my thought process when writing my code, I was trying to keep the same functionality ie. still pass all the test cases that were already present as is, so that nothing drastically changes and only accounts for any / in test names. However, now that you mention

My thought was to change the rule such that we eagerly consume / parts until the first ::.
Taking the a/b/c::D/d::e test as an example, that would change the interpretation from
["", "a", "a/b", "a/b/c::D", "a/b/c::D/d", "a/b/c::D/d::e"]
to
["", "a", "a/b", "a/b/c", "a/b/c::D/d", "a/b/c::D/d::e"]

This seems better. I was assuming that I should keep the same behavior so nothing breaks.

Clarifying Questions

I think what needed to be clarified is the possibilities of paths when running pytest (but these questions are because of my little experience of using pytest). Is it ever possible for a path to only contain just /? i.e. a/b or can a path only ever contain / followed by :: (a/b::c). Also, is it possible for a path to contain :: followed by another / like the above example? i.e. a/b::c/d::D. This does not make sense to me as a tests files function cannot lead to another file with a function like this when pytest gathers tests to run. If these are not possible, then I think some of the test cases become irrelevant as they are not even possible, which would then lead to the implementation you mention (quoted above).

My Code

What lead to my implementation was keeping all existing tests passing.

                 if sep == SEP:
                    colons_after_at = nodeid.find("::", at)
                    colons_before_at = nodeid.find("::", pos, at)
                    if colons_after_at > -1 or colons_before_at == -1:
                        yield nodeid[:at]
                    else:
                        sep = "::"
                        continue
                else:
                    yield nodeid[:at]

After finding a location i.e. at != -1, I check to see if it is a /. If so I follow these steps to fix issues with / in a test name.
I check to see if there is :: before and after the at location. I assumed that a path's / can only exist in a path if / is followed by :: or if there are no :: before the located /.
So a / can exist like this

a/b

or

a/b/c::C/d

but not like this

a/b::c/d

as I am now following an assumption that anything after the last :: is part of the test's name.
But to change my implementation to the idea you mention I can simply change my code to this

                 if sep == SEP:
                    colons_before_at = nodeid.find("::", pos, at)
                    if colons_before_at == -1:
                        yield nodeid[:at]
                    else:
                        sep = "::"
                        continue
                else:
                    yield nodeid[:at]

so as to consume / until the first ::.

I think what would lead to this new implementation would be answers of "no" to my clarifying questions above.

Sorry about how the code looks, I do not know how to make the code colors and nice looking like you do above. Can you let me know how?

bluetech · 2021-04-26T08:02:08Z

Sorry about how the code looks, I do not know how to make the code colors and nice looking like you do above. Can you let me know how?

To create a "code block" you do

```
def my_code_here(): pass
```

To add syntax highlighting, you need to specify the programming language. In pytest's case that's python or py for short. It's specified in the first line:

```py
def my_code_here(): pass
```

You can see here for more details.

This seems better. I was assuming that I should keep the same behavior so nothing breaks.

Right - the issue is not about a bug in the code, it's about the implemented behavior, so if we change the behavior, it's expected that some tests would require changes.

If these are not possible, then I think some of the test cases become irrelevant as they are not even possible, which would then lead to the implementation you mention (quoted above).

I think some of these are not possible in practice, but we get enough oddities and allow enough flexibility that I think that in general we should assume as little as possible. Basically, the function should handle any arbitrary string as a node ID.

So to change it to what you are thinking @bluetech this is what I think will do that. So that until the first ::, / is consumed.

I think this would have the desired outcome, but the code becomes a bit complicated and I'd need to spend some time with it.

For me it would be easier if you try to start with the code in #8577 (review), which implements the existing behavior in the main branch in a slightly different way, and modify that to the new behavior, just because I think it would be easier to evaluate.

parthdpa · 2021-04-26T16:40:12Z

def iterparentnodeids(nodeid: str) -> Iterator[str]:
    pos = 0
    # The root Session node - always present.
    yield ""
    # Eagerly consume SEP parts.
    while True:
        at = nodeid.find(SEP, pos)
        if at == -1:
            break
        if at > 0:
            # CHANGED HERE
            # Only consume / until the first ::
            first_colons = nodeid.find("::", pos, at)
            # break loop and consider :: when the first :: is encountered
            if first_colons != -1:
                        break
           # Otherwise yield the consumed / part
           yield nodeid[:at]
        pos = at + len(SEP)
    # Eagerly consume :: parts.
    while True:
        at = nodeid.find("::", pos)
        if at == -1:
            break
        if at > 0:
            yield nodeid[:at]
        pos = at + len("::")
    # The node ID itself.
    if nodeid:
        yield nodeid

I have implemented the behavior to the code above in the first while loop and added comments where I changed the functionality to consume / until the first ::.

bluetech · 2021-04-27T13:02:27Z

@parthdpa that looks good!

I think you can simplify a bit this way:

At the beginning of the function, do first_colons = nodeid.find("::")
Change at = nodeid.find(SEP, pos) to at = nodeid.find(SEP, pos, first_colons)

I think this will have the same outcome but a bit simpler/more efficient.

parthdpa · 2021-04-27T16:43:31Z

def iterparentnodeids(nodeid: str) -> Iterator[str]:
    pos = 0
    first_colons = nodeid.find("::")
    # The root Session node - always present.
    yield ""
    # Eagerly consume SEP parts until first colons.
    while True:
        at = nodeid.find(SEP, pos, first_colons)
        if at == -1:
            break
        if at > 0:
           yield nodeid[:at]
        pos = at + len(SEP)
    # Eagerly consume :: parts.
    while True:
        at = nodeid.find("::", pos)
        if at == -1:
            break
        if at > 0:
            yield nodeid[:at]
        pos = at + len("::")
    # The node ID itself.
    if nodeid:
        yield nodeid

So like this? A question I have about this is if first_colons is -1, will this be a problem? Since the .find will not look at the last character in the nodeid. Should a check be added like

    if first_colons == -1:
        at = nodeid.find(SEP, pos)

or something of the like?

bluetech · 2021-04-27T16:56:13Z

Right, if we leave it as -1 then it would exclude the last character of nodeid. I think it would be a little nicer to do the check only once outside the loop, so something like this should work:

first_colons: Optional[int] = nodeid.find("::")
if first_colons == -1:
    first_colons = None

I added the Optional[int] type annotation, otherwise first_colons would get type int and the first_colons = None would get a mypy error.

The value None in str.find(end=...) should work for representing "search until the end of the string", as it does in slice notation.

parthdpa · 2021-04-27T17:04:09Z

def iterparentnodeids(nodeid: str) -> Iterator[str]:
    pos = 0
    first_colons: Optional[int] = nodeid.find("::")
    if first_colons == -1:
        first_colons = None
    # The root Session node - always present.
    yield ""
    # Eagerly consume SEP parts until first colons.
    while True:
        at = nodeid.find(SEP, pos, first_colons)
        if at == -1:
            break
        if at > 0:
           yield nodeid[:at]
        pos = at + len(SEP)
    # Eagerly consume :: parts.
    while True:
        at = nodeid.find("::", pos)
        if at == -1:
            break
        if at > 0:
            yield nodeid[:at]
        pos = at + len("::")
    # The node ID itself.
    if nodeid:
        yield nodeid

Like this? The None does work in str.find for searching until the end of the string. And should I update the file to this and fix the test cases accordingly or is this still something we should look over?

bluetech · 2021-04-27T17:14:52Z

Like this?

Yes, that looks good to me!

And should I update the file to this and fix the test cases accordingly or is this still something we should look over?

Yes, let's update the PR to this and update the tests accordingly (you can also add other test cases if you can think of them or want to test something specifically). Then we'll see how this approach actually works out :)

…consider ::. Tests were changed to reflect this.

parthdpa · 2021-04-27T19:51:10Z

@bluetech I have made the changes and pushed here

bluetech

So I think this is good to go! I only left a suggestion on the changelog.

In the future we should definitely explore @RonnyPfannschmidt suggestion to use something structured and information-preserving for the node IDs, so that we can fix all such issues, but for now hopefully the tradeoff we're making here is the better one.

changelog/8509.improvement.rst

bluetech · 2021-04-29T13:05:03Z

@RonnyPfannschmidt this is somewhat different than when you first reviewed, can you PTAL again?

RonnyPfannschmidt · 2021-04-29T13:34:59Z

Im currently down with a cold, if its good for you let's go for it

Long term we want a more nuanced datastructure to help us with the ambiguous details

bluetech · 2021-04-29T13:40:42Z

OK. @parthdpa I'll take the liberty of applying my suggestion, will merge after CI is green.

Fix issue where TestCase.setUpClass is not called for test methods wi…

e9b2b6c

…th a / in its name by checking if there is :: before the selected / or any :: after. Also added a test case for this.

parthdpa changed the title ~~Fix TestCase.setUpClass not being called for test names with / in them for issue #8509~~ Fix TestCase.setUpClass not being called for test names with / in them for versions later than 6.2.0 Apr 23, 2021

removed print statement that was added

1fc4506

nicoddemus requested a review from bluetech April 24, 2021 14:07

RonnyPfannschmidt approved these changes Apr 24, 2021

View reviewed changes

bluetech reviewed Apr 25, 2021

View reviewed changes

Change iterparentnodeids to consume / parts until the first ::. Then …

3526522

…consider ::. Tests were changed to reflect this.

parthdpa changed the title ~~Fix TestCase.setUpClass not being called for test names with / in them for versions later than 6.2.0~~ Improve iterparentnodeids to consume / parts until the first :: Apr 27, 2021

bluetech approved these changes Apr 29, 2021

View reviewed changes

changelog/8509.improvement.rst Outdated Show resolved Hide resolved

Update changelog/8509.improvement.rst

d4303cf

bluetech merged commit 992c403 into pytest-dev:main Apr 29, 2021

delatrie mentioned this pull request Jan 22, 2023

"Set up" and "Tear down" blocks are not displayed if slash used in test parameters allure-framework/allure-python#290

Closed

1 task

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve `iterparentnodeids` to consume `/` parts until the first `::` #8577

Improve `iterparentnodeids` to consume `/` parts until the first `::` #8577

parthdpa commented Apr 23, 2021 •

edited

parthdpa commented Apr 24, 2021

RonnyPfannschmidt commented Apr 24, 2021

RonnyPfannschmidt left a comment

RonnyPfannschmidt Apr 24, 2021

parthdpa Apr 24, 2021 •

edited

RonnyPfannschmidt Apr 24, 2021

parthdpa Apr 24, 2021

bluetech Apr 25, 2021

bluetech left a comment

parthdpa commented Apr 25, 2021 •

edited

bluetech commented Apr 26, 2021

parthdpa commented Apr 26, 2021 •

edited

bluetech commented Apr 27, 2021

parthdpa commented Apr 27, 2021

bluetech commented Apr 27, 2021

parthdpa commented Apr 27, 2021 •

edited

bluetech commented Apr 27, 2021

parthdpa commented Apr 27, 2021

bluetech left a comment

bluetech commented Apr 29, 2021

RonnyPfannschmidt commented Apr 29, 2021

bluetech commented Apr 29, 2021

Improve iterparentnodeids to consume / parts until the first :: #8577

Improve iterparentnodeids to consume / parts until the first :: #8577

Conversation

parthdpa commented Apr 23, 2021 • edited

parthdpa commented Apr 24, 2021

RonnyPfannschmidt commented Apr 24, 2021

RonnyPfannschmidt left a comment

Choose a reason for hiding this comment

RonnyPfannschmidt Apr 24, 2021

Choose a reason for hiding this comment

parthdpa Apr 24, 2021 • edited

Choose a reason for hiding this comment

RonnyPfannschmidt Apr 24, 2021

Choose a reason for hiding this comment

parthdpa Apr 24, 2021

Choose a reason for hiding this comment

bluetech Apr 25, 2021

Choose a reason for hiding this comment

bluetech left a comment

Choose a reason for hiding this comment

parthdpa commented Apr 25, 2021 • edited

Clarifying Questions

My Code

bluetech commented Apr 26, 2021

parthdpa commented Apr 26, 2021 • edited

bluetech commented Apr 27, 2021

parthdpa commented Apr 27, 2021

bluetech commented Apr 27, 2021

parthdpa commented Apr 27, 2021 • edited

bluetech commented Apr 27, 2021

parthdpa commented Apr 27, 2021

bluetech left a comment

Choose a reason for hiding this comment

bluetech commented Apr 29, 2021

RonnyPfannschmidt commented Apr 29, 2021

bluetech commented Apr 29, 2021

Improve `iterparentnodeids` to consume `/` parts until the first `::` #8577

Improve `iterparentnodeids` to consume `/` parts until the first `::` #8577

parthdpa commented Apr 23, 2021 •

edited

parthdpa Apr 24, 2021 •

edited

parthdpa commented Apr 25, 2021 •

edited

parthdpa commented Apr 26, 2021 •

edited

parthdpa commented Apr 27, 2021 •

edited