Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pytest 8.0 sorting tests with multiple parameterization is broken #12008

Open
ShurikMen opened this issue Feb 19, 2024 · 13 comments · May be fixed by #12082
Open

pytest 8.0 sorting tests with multiple parameterization is broken #12008

ShurikMen opened this issue Feb 19, 2024 · 13 comments · May be fixed by #12082
Labels
topic: fixtures anything involving fixtures directly or indirectly topic: parametrize related to @pytest.mark.parametrize type: bug problem that needs to be addressed type: regression indicates a problem that was introduced in a release which was working previously

Comments

@ShurikMen
Copy link
Contributor

Сontinue #11976

The solution from #11976 did not solve the problem with sorting tests with multiple parameters

Tests

import pytest


@pytest.mark.parametrize('proto', ['serial', 'telnet', 'ssh'], scope='class')
@pytest.mark.parametrize('unit', [1, 2, 3], scope='class')
class TestA:
    def test_one(self, proto, unit):
        pass

    def test_two(self, proto, unit):
        pass

Collecting items with pytest 7.4.4

11:07 $ pytest --co
=========================================================================== test session starts ===========================================================================
platform linux -- Python 3.11.6, pytest-7.4.4, pluggy-1.4.0 -- /home/lolik/.cache/pypoetry/virtualenvs/pytest_strain-FQAfhJsh-py3.11/bin/python
cachedir: .pytest_cache
metadata: {'Python': '3.11.6', 'Platform': 'Linux-6.7.4-arch1-1-x86_64-with-glibc2.39', 'Packages': {'pytest': '7.4.4', 'pluggy': '1.4.0'}, 'Plugins': {'metadata': '3.1.0'}, 'GIT_BRANCH': 'master'}
rootdir: /home/lolik/Projects/straing/pytest
configfile: pytest.ini
plugins: metadata-3.1.0
collected 18 items                                                                                                                                                        

<Package tests>
  <Module test_asd.py>
    <Class TestA>
      <Function test_one[1-serial]>
      <Function test_two[1-serial]>
      <Function test_one[1-telnet]>
      <Function test_two[1-telnet]>
      <Function test_one[1-ssh]>
      <Function test_two[1-ssh]>
      <Function test_one[2-serial]>
      <Function test_two[2-serial]>
      <Function test_one[2-telnet]>
      <Function test_two[2-telnet]>
      <Function test_one[2-ssh]>
      <Function test_two[2-ssh]>
      <Function test_one[3-serial]>
      <Function test_two[3-serial]>
      <Function test_one[3-telnet]>
      <Function test_two[3-telnet]>
      <Function test_one[3-ssh]>
      <Function test_two[3-ssh]>

======================================================================= 18 tests collected in 0.01s =======================================================================

Collecting items with pytest 8.0.1

11:08 $ pytest --co
=========================================================================== test session starts ===========================================================================
platform linux -- Python 3.11.6, pytest-8.0.1, pluggy-1.4.0 -- /home/lolik/.cache/pypoetry/virtualenvs/pytest_strain-FQAfhJsh-py3.11/bin/python
cachedir: .pytest_cache
metadata: {'Python': '3.11.6', 'Platform': 'Linux-6.7.4-arch1-1-x86_64-with-glibc2.39', 'Packages': {'pytest': '8.0.1', 'pluggy': '1.4.0'}, 'Plugins': {'metadata': '3.1.0'}, 'GIT_BRANCH': 'master'}
rootdir: /home/lolik/Projects/straing/pytest
configfile: pytest.ini
plugins: metadata-3.1.0
collected 18 items                                                                                                                                                        

<Dir pytest>
  <Package tests>
    <Module test_asd.py>
      <Class TestA>
        <Function test_one[1-serial]>
        <Function test_two[1-serial]>
        <Function test_one[2-serial]>
        <Function test_two[2-serial]>
        <Function test_one[2-telnet]>
        <Function test_two[2-telnet]>
        <Function test_one[1-telnet]>
        <Function test_two[1-telnet]>
        <Function test_one[3-telnet]>
        <Function test_two[3-telnet]>
        <Function test_one[3-serial]>
        <Function test_two[3-serial]>
        <Function test_one[3-ssh]>
        <Function test_two[3-ssh]>
        <Function test_one[2-ssh]>
        <Function test_two[2-ssh]>
        <Function test_one[1-ssh]>
        <Function test_two[1-ssh]>

======================================================================= 18 tests collected in 0.01s =======================================================================

@bluetech
Copy link
Member

Thanks, this does look wrong at first glance, I will look into it. Bisected to 09b7873 (PR #11220).

@bluetech bluetech added type: bug problem that needs to be addressed topic: parametrize related to @pytest.mark.parametrize type: regression indicates a problem that was introduced in a release which was working previously topic: fixtures anything involving fixtures directly or indirectly labels Feb 19, 2024
@bluetech
Copy link
Member

Minimized example:

@pytest.mark.parametrize('proto', ['a', 'b'], scope='class')
@pytest.mark.parametrize('unit', [1, 2], scope='class')
class Test:
    def test(self, proto, unit):
        pass

The items are reordered by reorder_items() in fixture.py (surely the most inscrutable function in all of pytest), whose aim is to minimize fixture setups and teardowns. In this respect it achieves its goal:

Setup plan in pytest 7, has 6 setups/teardowns:

      SETUP    C proto['a']
      SETUP    C unit[1]
        x.py::Test::test[1-a] (fixtures used: proto, unit)
      TEARDOWN C proto['a']
      SETUP    C proto['b']
        x.py::Test::test[1-b] (fixtures used: proto, unit)
      TEARDOWN C proto['b']
      SETUP    C proto['a']
      TEARDOWN C unit[1]
      SETUP    C unit[2]
        x.py::Test::test[2-a] (fixtures used: proto, unit)
      TEARDOWN C proto['a']
      SETUP    C proto['b']
        x.py::Test::test[2-b] (fixtures used: proto, unit)
      TEARDOWN C unit[2]
      TEARDOWN C proto['b']

in pytest 8, has 5 setups/teardowns:

      SETUP    C proto['a']
      SETUP    C unit[1]
        x.py::Test::test[1-a] (fixtures used: proto, unit)
      TEARDOWN C unit[1]
      SETUP    C unit[2]
        x.py::Test::test[2-a] (fixtures used: proto, unit)
      TEARDOWN C proto['a']
      SETUP    C proto['b']
        x.py::Test::test[2-b] (fixtures used: proto, unit)
      TEARDOWN C unit[2]
      SETUP    C unit[1]
        x.py::Test::test[1-b] (fixtures used: proto, unit)
      TEARDOWN C unit[1]
      TEARDOWN C proto['b']

The way @pytest.mark.parametrize works behind the scenes is that it basically desugars to this:

import pytest

@pytest.fixture(params=[1, 2], scope='class')
def unit(request):
    return request.param

@pytest.fixture(params=['a', 'b'], scope='class')
def proto(request):
    return request.param

class Test:
    def test(self, unit, proto):
        pass

In this framing, it is clearer why we want to minimize the setups/teardowns -- the fixtures can do real work, not just return the param. Both pytest 7 and 8 reorder this version to have 5 setups/teardowns.

I can't say why before 09b7873 the reordering didn't happen, and if this was intentional or accidental. Will look into it more.

@RonnyPfannschmidt
Copy link
Member

@bluetech i believe the change by @sadra-barikbin fixed a bug in pytest wrt scope ordering

it dates back to #519 and the fix corrects the test scope ordering based on values and fixture values and scopes

for further validation we might want to add a variant of the example for #519 that mixes class scope in (in which case the old order is actually correct

but the basic gist to my current understanding is that pytest no longer has ordering differences between sugared and de-sugared parameterize as now parameterize is expressed in pseudo fixtures all the way

@bluetech
Copy link
Member

Technical details of the change:

Before 09b7873, Metafunc will generate CallSpec's with separate params (indirect parametrizations, i.e. through fixtures) and funcargs (direct parametrizations), and the desugaring of direct to indirect (i.e.g funcargs to params) happened as a separate step after pytest_generate_functions: https://github.com/pytest-dev/pytest/blob/7.4.4/src/_pytest/fixtures.py#L156.

After 09b7873, Metafunc handles the desugaring itself, and there is no more funcargs.

All of this happens before reorder_items anyway, so why does it affect the item ordering? The difference is the param_index that the desugaring assigns to the (desguared) direct params. Before the code was this:

https://github.com/pytest-dev/pytest/blob/7.4.4/src/_pytest/fixtures.py#L168-L181

The important bit is callspec.indices[argname] = len(arg2params_list) - this basically assigns a fresh param index across all callspecs. This results in the following arg keys in reorder_items:

('proto', 0, <Function test[1-a]>)
('proto', 1, <Function test[1-b]>)
('proto', 2, <Function test[2-a]>)
('proto', 3, <Function test[2-b]>)
('unit',  0, <Function test[1-a]>)
('unit',  1, <Function test[1-b]>)
('unit',  2, <Function test[2-a]>)
('unit',  3, <Function test[2-b]>)

After, there is no special handling of the param indexes of direct params, they are handled same as parametrized fixtures. This results in the following arg keys:

('proto', 0, <Function test[1-a]>)
('proto', 0, <Function test[2-a]>)
('proto', 1, <Function test[1-b]>)
('proto', 1, <Function test[2-b]>)
('unit',  0, <Function test[1-a]>)
('unit',  0, <Function test[1-b]>)
('unit',  1, <Function test[2-a]>)
('unit',  1, <Function test[2-b]>)

Rephrasing the above in a way that may be clearer:

Before, indexes for direct params were assigned in a sequential manner per argname after all parametrizations are exploded, so we have a table of argname, item (= callspec) and we assign the param index:

argname item param index
proto test[1-a] 0
proto test[1-b] 1
proto test[2-a] 2
proto test[2-b] 3
unit test[1-a] 0
unit test[1-b] 1
unit test[2-a] 2
unit test[2-b] 3

After, the param indexes are assigned as they would for the desugaring I gave above:

import pytest

# param indexes:        0  1
@pytest.fixture(params=[1, 2], scope='class')
def unit(request):
    return request.param

# param indexes:         0    1
@pytest.fixture(params=['a', 'b'], scope='class')
def proto(request):
    return request.param

@bluetech
Copy link
Member

@ShurikMen I wonder, is the example you gave realistic or do you use indirect params maybe? I'm mainly curious why you're setting scope='class' on your parametrizes when the parameters are simple values. With function scope the ordering is as you expect.

@ShurikMen
Copy link
Contributor Author

@ShurikMen I wonder, is the example you gave realistic or do you use indirect params maybe? I'm mainly curious why you're setting scope='class' on your parametrizes when the parameters are simple values.

This is one of the simplest examples that are actually used in my projects. There are many examples with an even larger set of parameters including "mixed" scopes (class/function).

With function scope the ordering is as you expect.

Not quite like that. Changing the order causes unwanted fixture calls. They are very expensive in terms of execution time.
It is important for me that tests with the same set of parameters are called sequentially.

With scope class (same optimal):

import pytest


@pytest.fixture(autouse=True, scope='class')
def some_fix(unit, proto):
    yield


@pytest.mark.parametrize('proto', ['serial', 'telnet'], scope='class')
@pytest.mark.parametrize('unit', [1, 2], scope='class')
class TestA:
    def test_one(self, unit, proto):
        pass

    def test_two(self, unit, proto):
        pass
test_asd.py::TestA::test_one[1-serial] 
      SETUP    C unit[1]
      SETUP    C proto['serial']
      SETUP    C some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[1-serial] (fixtures used: proto, some_fix, unit)
test_asd.py::TestA::test_two[1-serial] 
        test_asd.py::TestA::test_two[1-serial] (fixtures used: proto, some_fix, unit)
test_asd.py::TestA::test_one[1-telnet] 
      TEARDOWN C some_fix
      TEARDOWN C proto['serial']
      SETUP    C proto['telnet']
      SETUP    C some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[1-telnet] (fixtures used: proto, some_fix, unit)
test_asd.py::TestA::test_two[1-telnet] 
        test_asd.py::TestA::test_two[1-telnet] (fixtures used: proto, some_fix, unit)
test_asd.py::TestA::test_one[2-serial] 
      TEARDOWN C some_fix
      TEARDOWN C unit[1]
      SETUP    C unit[2]
      TEARDOWN C proto['telnet']
      SETUP    C proto['serial']
      SETUP    C some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[2-serial] (fixtures used: proto, some_fix, unit)
test_asd.py::TestA::test_two[2-serial] 
        test_asd.py::TestA::test_two[2-serial] (fixtures used: proto, some_fix, unit)
test_asd.py::TestA::test_one[2-telnet] 
      TEARDOWN C some_fix
      TEARDOWN C proto['serial']
      SETUP    C proto['telnet']
      SETUP    C some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[2-telnet] (fixtures used: proto, some_fix, unit)
test_asd.py::TestA::test_two[2-telnet] 
        test_asd.py::TestA::test_two[2-telnet] (fixtures used: proto, some_fix, unit)
      TEARDOWN C some_fix
      TEARDOWN C proto['telnet']
      TEARDOWN C unit[2]

With scope function

import pytest


@pytest.fixture(autouse=True, scope='function')
def some_fix(unit, proto):
    yield


@pytest.mark.parametrize('proto', ['serial', 'telnet'], scope='function')
@pytest.mark.parametrize('unit', [1, 2], scope='function')
class TestA:
    def test_one(self, unit, proto):
        pass

    def test_two(self, unit, proto):
        pass
test_asd.py::TestA::test_one[1-serial] 
        SETUP    F unit[1]
        SETUP    F proto['serial']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[1-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['serial']
        TEARDOWN F unit[1]
test_asd.py::TestA::test_one[1-telnet] 
        SETUP    F unit[1]
        SETUP    F proto['telnet']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[1-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['telnet']
        TEARDOWN F unit[1]
test_asd.py::TestA::test_one[2-serial] 
        SETUP    F unit[2]
        SETUP    F proto['serial']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[2-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['serial']
        TEARDOWN F unit[2]
test_asd.py::TestA::test_one[2-telnet] 
        SETUP    F unit[2]
        SETUP    F proto['telnet']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[2-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['telnet']
        TEARDOWN F unit[2]
test_asd.py::TestA::test_two[1-serial] 
        SETUP    F unit[1]
        SETUP    F proto['serial']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[1-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['serial']
        TEARDOWN F unit[1]
test_asd.py::TestA::test_two[1-telnet] 
        SETUP    F unit[1]
        SETUP    F proto['telnet']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[1-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['telnet']
        TEARDOWN F unit[1]
test_asd.py::TestA::test_two[2-serial] 
        SETUP    F unit[2]
        SETUP    F proto['serial']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[2-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['serial']
        TEARDOWN F unit[2]
test_asd.py::TestA::test_two[2-telnet] 
        SETUP    F unit[2]
        SETUP    F proto['telnet']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[2-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['telnet']
        TEARDOWN F unit[2]

Class params with function fixture

import pytest


@pytest.fixture(autouse=True, scope='function')
def some_fix(unit, proto):
    yield


@pytest.mark.parametrize('proto', ['serial', 'telnet'], scope='class')
@pytest.mark.parametrize('unit', [1, 2], scope='class')
class TestA:
    def test_one(self, unit, proto):
        pass

    def test_two(self, unit, proto):
        pass
test_asd.py::TestA::test_one[1-serial] 
      SETUP    C unit[1]
      SETUP    C proto['serial']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[1-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
test_asd.py::TestA::test_two[1-serial] 
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[1-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
test_asd.py::TestA::test_one[1-telnet] 
      TEARDOWN C proto['serial']
      SETUP    C proto['telnet']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[1-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
test_asd.py::TestA::test_two[1-telnet] 
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[1-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
test_asd.py::TestA::test_one[2-serial] 
      TEARDOWN C unit[1]
      SETUP    C unit[2]
      TEARDOWN C proto['telnet']
      SETUP    C proto['serial']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[2-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
test_asd.py::TestA::test_two[2-serial] 
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[2-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
test_asd.py::TestA::test_one[2-telnet] 
      TEARDOWN C proto['serial']
      SETUP    C proto['telnet']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[2-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
test_asd.py::TestA::test_two[2-telnet] 
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[2-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
      TEARDOWN C proto['telnet']
      TEARDOWN C unit[2]

Here I experimented a lot with various combinations of scope parameters and scope fixtures, including parameterization in fixtures (pytest.fixture(param=...)) to obtain the most optimal ways to perform fixtures and tests.

@RonnyPfannschmidt
Copy link
Member

Currently when parameterize is taken for consideration, compound dependent fixtures are not

Id recommend having a single parameterset that creates the correct compound parameters to a single fixture so it will no longer be considered as independent parameters

@bluetech bluetech changed the title pytest 8.0.1 sorting tests with multiple parameterization is broken pytest 8.0 sorting tests with multiple parameterization is broken Feb 22, 2024
@ShurikMen ShurikMen linked a pull request Mar 6, 2024 that will close this issue
@ShurikMen
Copy link
Contributor Author

@bluetech take a look at #12082

@sadra-barikbin
Copy link
Contributor

sadra-barikbin commented Mar 22, 2024

Here are the three approaches:
pytest-reorder

@sadra-barikbin
Copy link
Contributor

sadra-barikbin commented Mar 23, 2024

Seems we have an appearance-efficiency trade-off, with @ShurikMen's suggestion and v7.4.4 approach yielding better appearance but having lower efficiency in setup-teardowns in comparison with v8.0.0.

The three approaches could also be compared in terms of robustness to parametrization varieties which @ShurikMen 's suggestion for example performs better than v7.4.4 and solves the first example of #11257 but fails on the example below which v8.0.0 solves, unless the user be cautious and swap the order of parametrizations on test1.

# Here `test1["b",0]` and `test2[3]` would wrongfully have a common fixturekey in @ShurikMen 's method.
@pytest.mark.parametrize("arg2", [0, 1, 2], scope='module')
@pytest.mark.parametrize("arg1", ["a", "b"], scope='module')
def test1(arg1, arg2):
    pass

@pytest.mark.parametrize("arg2", [0, 1, 2, 3], scope='module')
def test2(arg2):
    pass

Considering robustness alone, @ShurikMen method's gain over v7.4.4 seems remarkable but I'm not sure about that of v8.0.0 over @ShurikMen 's.

@sadra-barikbin
Copy link
Contributor

sadra-barikbin commented Mar 24, 2024

Aside by the comments above,I'm for the notion of bug for current reordering in v8.0.0 as it has not been introduced by the intention to become more efficient which we discussed here about. It either has been an unnoticed side-effect of #11220 or something in my large initial PR, responsible for the first example of #11257 (which @ShurikMen 's method now solves as well). Sorry for inconvenience, mates!

@ShurikMen
Copy link
Contributor Author

I have updated the pr. Unified indexing of parameters added through a marker and through fixtures. I think it's the right thing to do. Along the way, I corrected the tests related to parameterization through fixtures.

@ShurikMen
Copy link
Contributor Author

@bluetech what about solving the problem on this issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: fixtures anything involving fixtures directly or indirectly topic: parametrize related to @pytest.mark.parametrize type: bug problem that needs to be addressed type: regression indicates a problem that was introduced in a release which was working previously
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants
@RonnyPfannschmidt @bluetech @sadra-barikbin @ShurikMen and others