`BaseModel.__hash__` doesn't match `__eq__` #7785

alexmojaki · 2023-10-09T20:54:55Z

Initial Checks

I confirm that I'm using Pydantic V2

Description

This special handling for generic classes:

Lines 855 to 864 in dbbd776

    
    def __eq__(self, other: Any) -> bool:
        if isinstance(other, BaseModel):
            # When comparing instances of generic types for equality, as long as all field values are equal,
            # only require their generic origin types to be equal, rather than exact type equality.
            # This prevents headaches like MyGeneric(x=1) != MyGeneric[Any](x=1).
            self_type = self.__pydantic_generic_metadata__['origin'] or self.__class__
            other_type = other.__pydantic_generic_metadata__['origin'] or other.__class__
            return (
                self_type == other_type

has no equivalent for __hash__:

pydantic/pydantic/_internal/_model_construction.py

Line 401 in dbbd776

return hash(self.__class__) + hash(tuple(self.__dict__.values()))

This means that a == b doesn't imply hash(a) == hash(b), breaking how dicts and sets work.
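To see the mismatch without depending on Pydantic, here is a minimal sketch built on a stand-in class. `Model`, `OriginHash`, `AInt`, `BInt` and the `origin` class attribute are all hypothetical names invented for this illustration; `origin` plays the role of `__pydantic_generic_metadata__['origin']`. The `OriginHash` mixin is only one possible shape for a fix, not necessarily what Pydantic does.

```python
class Model:
    """Hypothetical stand-in for BaseModel; this is not Pydantic code."""

    origin = None  # assumed: plays the role of __pydantic_generic_metadata__['origin']

    def __init__(self, **fields):
        self.__dict__.update(fields)

    def __eq__(self, other):
        if not isinstance(other, Model):
            return NotImplemented
        # Same idea as the quoted __eq__: compare generic origins, not exact types.
        self_type = self.origin or self.__class__
        other_type = other.origin or other.__class__
        return self_type == other_type and self.__dict__ == other.__dict__

    def __hash__(self):
        # Same shape as the quoted __hash__: the exact class goes into the hash.
        return hash(self.__class__) + hash(tuple(self.__dict__.values()))


class OriginHash:
    """Assumed fix: hash the same type that __eq__ compares."""

    def __hash__(self):
        return hash(self.origin or self.__class__) + hash(tuple(self.__dict__.values()))


class A(Model):
    pass


class AInt(A):  # stands in for A[int]
    origin = A


class B(OriginHash, Model):
    pass


class BInt(B):  # stands in for B[int]
    origin = B


a1, a2 = AInt(a=1), A(a=1)  # equal, but hashed with different classes
b1, b2 = BInt(a=1), B(a=1)  # equal, and hashed with the shared origin

assert a1 == a2
assert b1 == b2 and hash(b1) == hash(b2)
assert b1 in {b2} and b2 in {b1}
```

With the mixin, equal instances hash equally, so set and dict lookups find them. Without it, `a1` and `a2` compare equal while their hashes are built from two different class objects.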

It might also be worth noting that __eq__ looks at __pydantic_private__ and __pydantic_extra__ while __hash__ doesn't. This isn't a contract violation in the same way, since non-equal instances are allowed to have equal hashes, but it makes hash collisions more likely. Hypothetically you could have a large set/dict of model instances where all the public fields are the same (so all the hashes are equal) but the private attributes differ (so the instances are non-equal) and then operations which are usually O(1)ish become O(n). On the other hand, adding more logic to __hash__ would of course reduce performance slightly in the vast majority of cases, so it's not obvious what to do.

EDIT: there's a good reason not to hash private attributes: #7800 (comment)
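The collision scenario can be sketched by counting `__eq__` calls during a set lookup. `Item`, `ItemFull` and the `eq_calls` counter are hypothetical names for this illustration only, and the second class is there to show the trade-off, not to recommend hashing private attributes.

```python
class Item:
    """Hypothetical model: __eq__ sees both fields, __hash__ only the public one."""

    eq_calls = 0  # shared counter, for illustration only

    def __init__(self, public, private):
        self.public = public
        self.private = private

    def __eq__(self, other):
        if not isinstance(other, Item):
            return NotImplemented
        Item.eq_calls += 1
        return (self.public, self.private) == (other.public, other.private)

    def __hash__(self):
        return hash(self.public)


class ItemFull(Item):
    """Variant whose hash also covers the private value."""

    def __hash__(self):
        return hash((self.public, self.private))


def lookup_cost(cls, n=500):
    """Count __eq__ calls for one failed membership test in a set of n items."""
    items = {cls(0, i) for i in range(n)}  # same public value, different private values
    Item.eq_calls = 0
    found = cls(0, -1) in items
    return found, Item.eq_calls


found_public, cost_public = lookup_cost(Item)
found_full, cost_full = lookup_cost(ItemFull)
print(cost_public, cost_full)
```

When only the public field is hashed, every element shares one hash, so the failed lookup has to compare the probe against each element. Hashing both fields spreads the elements out, and the same lookup needs far fewer comparisons.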

Example Code

from typing import TypeVar, Generic

from pydantic import BaseModel

T = TypeVar("T")


class A(BaseModel, Generic[T], frozen=True):
    a: T


a1 = A[int](a=1)
a2 = A(a=1)
assert a1 == a2
assert hash(a1) != hash(a2)
assert a1 not in {a2}
assert a2 not in {a1}

Python, Pydantic & OS Version

             pydantic version: 2.4.2
        pydantic-core version: 2.10.1
          pydantic-core build: profile=release pgo=true
                 install path: /home/alex/work/pydantic/pydantic
               python version: 3.11.5 (main, Sep  9 2023, 21:35:25) [GCC 7.5.0]
                     platform: Linux-5.15.0-86-generic-x86_64-with-glibc2.35
             related packages: typing_extensions-4.7.1 email-validator-2.0.0.post2 pyright-1.1.330.post0 mypy-1.1.1 pydantic-extra-types-2.1.0 pydantic-settings-2.0.3


alexmojaki added the "bug V2" and "pending" labels Oct 9, 2023

alexmojaki mentioned this issue Oct 11, 2023

Only hash model_fields, not whole __dict__ #7786

Merged


sydney-runkle closed this as completed in #7786 Nov 13, 2023
