default_factory in dataclasses behaves strange when set to a callable object instead of a function #8510

Open
frnhr opened this issue Jan 7, 2024 · 3 comments
Labels: bug V2 (Bug related to Pydantic V2)

frnhr commented Jan 7, 2024

Initial Checks

  • I confirm that I'm using Pydantic V2

Description

default_factory behaves unexpectedly with pydantic.dataclasses in one very specific case.

It happens when all of the following are true:

  • Nested dataclasses are used (a parent and a child).
  • A field in the parent and a field in the child dataclass have default_factory set to the same callable object.
  • This callable is a class instance and not a function.
  • The parent and child objects are initialized in a single call, with the child data passed as a dict.

It seems that the callable object (used for default_factory) gets copied (duplicated) at some point before being called, so each copy keeps its own state. This can cause unexpected behaviour, as shown in the tests below.
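To make the suspected mechanism concrete, here is a minimal stdlib-only sketch. It assumes the duplication behaves like `copy.deepcopy`, which is a guess and not something confirmed in this report. No Pydantic code is involved, and `copied` is a hypothetical stand-in for whatever duplicate is created internally.

```python
import copy


class UniqueId:
    """Auto-incrementing counter, same shape as the class in the report."""

    def __init__(self) -> None:
        self.counter = 0

    def __call__(self) -> int:
        self.counter += 1
        return self.counter


unique_id = UniqueId()

# Assumption: something duplicates the factory before calling it.
copied = copy.deepcopy(unique_id)

first = unique_id()  # advances the original counter -> 1
second = copied()    # advances the copy's own counter -> 1, not 2

print(first, second, copied is unique_id)  # 1 1 False
```

Under that assumption, two fields that appear to share one factory would really be drawing from two independent counters, which matches the `1 != 3` failure reported for the parent's id.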

Example Code

import dataclasses

import pydantic


# First, we have a dead-simple auto-increment int class:


class UniqueId:
    """
    >>> unique_id = UniqueId()
    >>> unique_id()
    1
    >>> unique_id()
    2
    >>> another_unique_id = UniqueId()
    >>> another_unique_id()
    1
    >>> another_unique_id()
    2
    >>> unique_id()
    3
    >>> another_unique_id()
    3
    """
    def __init__(self):
        self.counter = 0

    def __call__(self) -> int:
        self.counter += 1
        return self.counter


# Here is the test case which fails:


def test_1():
    unique_id = UniqueId()

    @pydantic.dataclasses.dataclass
    class User:
        name: str
        id_: int = dataclasses.field(default_factory=unique_id)

    @pydantic.dataclasses.dataclass
    class Group:
        name: str
        users: list[User]
        id_: int = dataclasses.field(default_factory=unique_id)

    group = Group(name="Group 1", users=[{"name": "John Smith"}, {"name": "John Doe"}])

    assert group.users[0].id_ == 1
    assert group.users[1].id_ == 2
    assert group.id_ == 3  # Fails: 1 != 3


# Below are some variations of this test.
# All tests below are OK!


def test_2():
    unique_id = UniqueId()

    @pydantic.dataclasses.dataclass
    class User:
        name: str
        id_: int = dataclasses.field(default_factory=unique_id)

    @pydantic.dataclasses.dataclass
    class Group:
        name: str
        users: list[User]
        id_: int = dataclasses.field(default_factory=unique_id)

    group = Group(name="Group 1", users=[User(name="John Smith"), User(name="John Doe")])
    #  ^^^^^ Change in this line: users are now constructed explicitly instead of being passed as dicts ^^^^^

    assert group.users[0].id_ == 1
    assert group.users[1].id_ == 2
    assert group.id_ == 3  # OK!



def test_3():
    unique_id = UniqueId()

    @pydantic.dataclasses.dataclass
    class User:
        name: str
        id_: int = dataclasses.field(default_factory=lambda: unique_id())
        #  ^^^^^ Change in this line, using a lambda ^^^^^

    @pydantic.dataclasses.dataclass
    class Group:
        name: str
        users: list[User]
        id_: int = dataclasses.field(default_factory=unique_id)

    group = Group(name="Group 1", users=[{"name": "John Smith"}, {"name": "John Doe"}])

    assert group.users[0].id_ == 1
    assert group.users[1].id_ == 2
    assert group.id_ == 3  # OK!


def test_4():
    unique_id_instance = UniqueId()

    def unique_id():
        return unique_id_instance()

    # ^^^^^ Changes above, using a named function instead of a class instance ^^^^^

    @pydantic.dataclasses.dataclass
    class User:
        name: str
        id_: int = dataclasses.field(default_factory=unique_id)

    @pydantic.dataclasses.dataclass
    class Group:
        name: str
        users: list[User]
        id_: int = dataclasses.field(default_factory=unique_id)

    group = Group(name="Group 1", users=[{"name": "John Smith"}, {"name": "John Doe"}])

    assert group.users[0].id_ == 1
    assert group.users[1].id_ == 2
    assert group.id_ == 3  # OK!
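
If the duplication is something like `copy.deepcopy` (an assumption; the report does not confirm it), tests 3 and 4 would pass because `deepcopy` returns plain functions and lambdas unchanged instead of duplicating them. The stdlib-only sketch below illustrates this with a hypothetical helper, `as_function`, and a `Counter` class standing in for `UniqueId`; neither name comes from Pydantic.

```python
import copy
from typing import Callable, TypeVar

T = TypeVar("T")


class Counter:
    """Stand-in for the UniqueId class from the tests above."""

    def __init__(self) -> None:
        self.value = 0

    def __call__(self) -> int:
        self.value += 1
        return self.value


def as_function(factory: Callable[[], T]) -> Callable[[], T]:
    """Hypothetical helper: hide a callable object behind a plain function."""

    def call() -> T:
        return factory()

    return call


counter = Counter()
wrapped = as_function(counter)

# deepcopy treats functions as atomic, so the closure (and the counter it
# refers to) is shared, not duplicated.
same_function = copy.deepcopy(wrapped) is wrapped
ids = [wrapped(), copy.deepcopy(wrapped)(), wrapped()]

print(same_function, ids)  # True [1, 2, 3]
```

On affected versions, passing something like `as_function(unique_id)` as `default_factory` is the same workaround as the lambda in test_3, just packaged for reuse.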

Python, Pydantic & OS Version

             pydantic version: 2.5.3
        pydantic-core version: 2.14.6
          pydantic-core build: profile=release pgo=true
                 install path: /Users/fran/.pyenv/versions/3.11.4/envs/someenvname/lib/python3.11/site-packages/pydantic
               python version: 3.11.4 (main, Sep 28 2023, 13:58:48) [Clang 14.0.3 (clang-1403.0.22.14.1)]
                     platform: macOS-14.1-arm64-arm-64bit
             related packages: mypy-1.6.1 typing_extensions-4.8.0
@frnhr frnhr added bug V2 Bug related to Pydantic V2 pending Awaiting a response / confirmation labels Jan 7, 2024
sydney-runkle (Member) commented

@frnhr,

Thanks for reporting this. Definitely a bug! We'll look into fixing this 🐛.

@sydney-runkle sydney-runkle removed the pending Awaiting a response / confirmation label Jan 8, 2024
@sydney-runkle sydney-runkle self-assigned this Jan 8, 2024
sydney-runkle (Member) commented

Such an odd bug. Fixed by #9114.

sydney-runkle (Member) commented

The fix should be released in 2.7 soon. With it, the original reproduction passes:

import dataclasses

import pydantic

class UniqueId:
    """
    >>> unique_id = UniqueId()
    >>> unique_id()
    1
    >>> unique_id()
    2
    >>> another_unique_id = UniqueId()
    >>> another_unique_id()
    1
    >>> another_unique_id()
    2
    >>> unique_id()
    3
    >>> another_unique_id()
    3
    """
    def __init__(self):
        self.counter = 0

    def __call__(self) -> int:
        self.counter += 1
        return self.counter

unique_id = UniqueId()

@pydantic.dataclasses.dataclass
class User:
    name: str
    id_: int = dataclasses.field(default_factory=unique_id)

@pydantic.dataclasses.dataclass
class Group:
    name: str
    users: list[User]
    id_: int = dataclasses.field(default_factory=unique_id)

group = Group(name="Group 1", users=[{"name": "John Smith"}, {"name": "John Doe"}])

assert group.users[0].id_ == 1
assert group.users[1].id_ == 2
assert group.id_ == 3


2 participants