Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support to display field names in namedtuple diffs. #7553

Merged
merged 2 commits into from
Oct 31, 2020

Conversation

tirkarthi
Copy link
Contributor

Here is a quick checklist that should be present in PRs.

  • Include documentation when adding new features.
  • Include new tests or update existing tests when applicable.
  • Allow maintainers to push and squash when merging my commits. Please uncheck this if you prefer to squash the commits yourself.

If this change fixes an issue, please:

Unless your change is trivial or a small documentation fix (e.g., a typo or reword of a small section) please:

  • Create a new changelog file in the changelog folder, with a name like <ISSUE NUMBER>.<TYPE>.rst. See changelog/README.rst for details.

  • Add yourself to AUTHORS in alphabetical order.

@The-Compiler
Copy link
Member

I think we should fall back to the remaining tuple elements

What do you mean? If _fields somehow doesn't match up with the tuple length? I'd rather just have an additional len(obj._fields) == len(obj) check or so then (handling TypeError if len isn't available).

@@ -423,6 +431,8 @@ def _compare_eq_cls(
fields_to_check = [
field.name for field in all_fields if getattr(field, ATTRS_EQ_FIELD)
]
elif isnamedtuple(left):
fields_to_check = left._fields
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved from: #7527 (comment)

FWIW, I like the "._fields" approach but I think it should fallback to regular tuple comparison if the attributes are not exhausted.

Also I think we should just change the display of fields using _fields, and continue looking up items in the tuple by index/iteration

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I like the "._fields" approach but I think it should fallback to regular tuple comparison if the attributes are not exhausted.

Is this about checking the length of _fields and the object itself in the other comment?

Also I think we should just change the display of fields using _fields, and continue looking up items in the tuple by index/iteration

Sorry, I don't get this. Is this about _fields being potentially used for some other purpose in custom subclass of tuple and to rely on iterating the namedtuple by index?

@@ -423,6 +431,8 @@ def _compare_eq_cls(
fields_to_check = [
field.name for field in all_fields if getattr(field, ATTRS_EQ_FIELD)
]
elif isnamedtuple(left):
fields_to_check = left._fields
Copy link
Member

@graingert graingert Jul 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@The-Compiler

I think we should fall back to the remaining tuple elements

What do you mean? If _fields somehow doesn't match up with the tuple length? I'd rather just have an additional len(obj._fields) == len(obj) check or so then (handling TypeError if len isn't available).

I think it might need len(set(obj._fields)) == len(obj)

or something like this?

fields_and_items = [
    (field if getattr(obj, field) is item else i), item
    for i, (field, item) in enumerate(zip_longest(obj._fields, obj))
]

@@ -981,6 +982,30 @@ class SimpleDataObjectTwo:
assert lines is None


class TestAssert_reprcompare_namedtuple:
def test_namedtuple(self) -> None:
SimpleDataObject = namedtuple("SimpleDataObject", ["field_a", "field_b"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since collections is already imported this could be collections.namedtuple

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

collections.abc is imported. I don't see collections being imported. Are you suggesting to just do import collections and use collections.namedtuple?

Copy link
Member

@graingert graingert Jul 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

import collections.abc is approximately equivalent to

importlib.import_module("collections.abc")
import collections

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL, I mostly prefer it to be explicit. Even on import collections the pre-commit removes it. So I have made the changes as suggested.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the hook removes it because import a and import a.b.c are duplicate imports

@tekumara
Copy link

Thanks so much for working on this @tirkarthi! I'd love to see this feature land.

What's the best way to test this? I pulled your fork and tried it out but I couldn't get the field names to show.

@tirkarthi
Copy link
Contributor Author

@tekumara You're welcome. I can see the changes with the example from the report #7527 . I tried cloning and did python setup.py install. Can you check the relevant installed files to see if the patch is there? The patch is also small that it extends on the work already done for dataclass and attrs.

Copy link
Member

@bluetech bluetech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! LGTM, except some comments on the tests.

@@ -412,9 +418,11 @@ def _compare_eq_cls(
left: Any,
right: Any,
verbose: int,
type_fns: Tuple[Callable[[Any], bool], Callable[[Any], bool]],
type_fns: Tuple[
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to this PR, but this type_fns argument is really odd, it is always the same and can just be dropped & inlined instead.

If you'd like to do this that'd be nice (in a separate commit), but you don't have to :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW it was added in a663f60 as part of #3776 - maybe @alysivji remembers what the intention behind it was?

Copy link
Member

@alysivji alysivji Aug 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Believe I didn't inline the arguments as the code would have been formatted over multiple lines.

for line in lines[2:]:
assert "field_a" not in line

def test_comparing_two_different_namedtuple(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test tests python itself, not pytest. What did you intend to test here?

right = SimpleDataObject(1, "c")

lines = callequal(left, right)
assert lines is not None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be best to just compare all of lines (i.e. assert lines == [...]).

testing/test_assertion.py Show resolved Hide resolved
@@ -981,6 +981,36 @@ class SimpleDataObjectTwo:
assert lines is None


class TestAssert_reprcompare_namedtuple:
def test_namedtuple(self) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add some tests with objects that have _fields, are tuples, but are not NamedTuples?

Also some tests were _fields doesn't match __getattr__ and len(type(o)_fields) != len(o)

@tirkarthi
Copy link
Contributor Author

@graingert Good points that illustrate it can result in false positives. I have come to below definition of isnameduple as below but this will still cause problems in below test case. I feel adding too much cases in isnamedtuple will cause also slowdown the diff output since isnamedtuple. There could be namedtuple like objects as below that might cause surprising outputs. I guess the PR is moving more towards the line where its not feasible to display the output user wants with no standard way to detect namedtuples.

def isnamedtuple(obj: Any) -> bool:
    fields = getattr(obj, "_fields", None)
    return (
        isinstance(obj, tuple) # Check for tuple
        and fields is not None # Check for _fields
        and issequence(fields) # Check that its iterable
        and len(obj) == len(fields) # Check for len of both
        and all(hasattr(obj, field) for field in fields) # Check all attrs are present
    )

2 elements from constructor. 2 attributes as _fields with argument to constructor and _fields being different

def test_comparing_custom_object_fields_set_different_arity() -> None:
    class SimpleDataObjectOne(tuple):
        _fields = ["field_a", "field_b"]
    
    left = SimpleDataObjectOne((1, 2))
    left.field_a = "a"
    left.field_b = "d"
    right = SimpleDataObjectOne((2, 1))
    left.field_a = "a"
    left.field_b = "b"

    assert left == right

Current pytest output

pytest -vvv /tmp/test_namedtuple1.py -k test_comparing_custom_object_fields_set_different_arity
=============================== test session starts ===============================
platform linux -- Python 3.8.0, pytest-5.4.1, py-1.8.1, pluggy-0.13.1 -- /root/py38-venv/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/root/pytest/.hypothesis/examples')
rootdir: /tmp
plugins: django-3.9.0, forked-1.1.3, env-0.6.2, hypothesis-5.20.2, xdist-1.31.0, aiohttp-0.3.0, mock-3.1.1, timeout-1.3.4, cov-2.8.1, pythonpath-0.7.3
collected 5 items / 4 deselected / 1 selected                                     

../../tmp/test_namedtuple1.py::test_comparing_custom_object_fields_set_different_arity FAILED [100%]

==================================== FAILURES =====================================
_____________ test_comparing_custom_object_fields_set_different_arity _____________

    def test_comparing_custom_object_fields_set_different_arity() -> None:
        class SimpleDataObjectOne(tuple):
            _fields = ["field_a", "field_b"]
    
        left = SimpleDataObjectOne((1, 2))
        left.field_a = "a"
        left.field_b = "d"
        right = SimpleDataObjectOne((2, 1))
        left.field_a = "a"
        left.field_b = "b"
    
>       assert left == right
E       assert (1, 2) == (2, 1)
E         At index 0 diff: 1 != 2
E         Full diff:
E         - (2, 1)
E         + (1, 2)

/tmp/test_namedtuple1.py:68: AssertionError
============================= short test summary info =============================
FAILED ../../tmp/test_namedtuple1.py::test_comparing_custom_object_fields_set_different_arity
========================= 1 failed, 4 deselected in 0.14s =========================

With patch though length and attrs are present the util function might get confused with _fields being same and the arguments to constructor to be different.

pytest -vvv /tmp/test_namedtuple1.py -k test_comparing_custom_object_fields_set_different_arity
=============================== test session starts ===============================
platform linux -- Python 3.9.0b5, pytest-6.0.0rc2.dev72+gbecc9f941.d20200819, py-1.9.0, pluggy-0.12.0 -- /root/py39-venv/bin/python3.9
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/root/pytest/.hypothesis/examples')
rootdir: /tmp
plugins: hypothesis-4.43.1, cov-2.8.1
collected 5 items / 4 deselected / 1 selected                                     

../../tmp/test_namedtuple1.py::test_comparing_custom_object_fields_set_different_arity FAILED [100%]

==================================== FAILURES =====================================
_____________ test_comparing_custom_object_fields_set_different_arity _____________

    def test_comparing_custom_object_fields_set_different_arity() -> None:
        class SimpleDataObjectOne(tuple):
            _fields = ["field_a", "field_b"]
    
        left = SimpleDataObjectOne((1, 2))
        left.field_a = "a"
        left.field_b = "d"
        right = SimpleDataObjectOne((2, 1))
        left.field_a = "a"
        left.field_b = "b"
    
>       assert left == right
E       AssertionError: assert (1, 2) == (2, 1)
E         (pytest_assertion plugin: representation of details failed: /root/pytest/src/_pytest/assertion/util.py:448: AttributeError: 'SimpleDataObjectOne' object has no attribute 'field_a'.
E          Probably an object has a faulty __repr__.)

/tmp/test_namedtuple1.py:68: AssertionError
============================= short test summary info =============================
FAILED ../../tmp/test_namedtuple1.py::test_comparing_custom_object_fields_set_different_arity
========================= 1 failed, 4 deselected in 0.29s =========================

@The-Compiler
Copy link
Member

IMHO we shouldn't care (or bikeshed) too much about cases which are very unlikely to happen. At least the Debian Code Search has no results at all for _fields =.

We should make sure pytest doesn't explode entirely when comparing them, but the "representation of details failed" message from above seems quite reasonable to me.

@tirkarthi
Copy link
Contributor Author

I have seen projects using _fields_. There is one project schematics that seems to use _fields from initial grep like https://github.com/schematics/schematics/blob/90dee53fd1d5c29f2c947bec6f5ffe5f74305ab1/tests/test_datastructures.py#L95 . Searching for _fields in around 2000 top PyPI projects do return 236 matches. Some of them are tuple of tuples where they are filtered. I haven't explored all instances but full log at https://gist.github.com/tirkarthi/e71470c370a1910da58adfc1da8d5dad . Let me know if you need any other pattern searches.

rg -t py ' _fields =' | wc
    236     978   11409

@bluetech
Copy link
Member

IMO the current check return isinstance(obj, tuple) and getattr(obj, "_fields", None) is not None is sufficient, but if we want to be more confident we can also check for existence of _field_defaults.

@bluetech
Copy link
Member

@tirkarthi do you think you'll have time to work on this? If not, I can finish it. I think it would be a nice feature for the next release.

@tirkarthi
Copy link
Contributor Author

@bluetech Sorry, I lost track of the discussion and might not be able to continue. Feel free to use the patch as needed. Thanks :)

It doesn't serve any purpose that I am able to discern.
Copy link
Member

@bluetech bluetech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed an updated version.

Note that this is strictly restricted to the case where the namedtuple types match, which avoids most of the pitfalls, while still catching 95% of the cases, in particular the "proto-dataclass" case.

Copy link
Member

@nicoddemus nicoddemus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jumping a bit late into the party, but thanks everyone!

Left just a small suggestion to the CHANGELOG message, other than that LGTM. 👍

changelog/7527.improvement.rst Outdated Show resolved Hide resolved
@bluetech bluetech merged commit 1c18fb8 into pytest-dev:master Oct 31, 2020
@tirkarthi
Copy link
Contributor Author

Thanks @bluetech and everybody for the review :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Show namedtuple field name in assertion error message
8 participants