
Field defined with Field() makes pydantic raise a "RecursionError: maximum recursion depth exceeded" #7327

Closed
luminoso opened this issue Sep 4, 2023 · 17 comments
Labels: bug V2 (Bug related to Pydantic V2), unconfirmed (Bug not yet confirmed as valid/applicable)

Comments

@luminoso commented Sep 4, 2023

Initial Checks

  • I confirm that I'm using Pydantic V2

Description

pydantic 2.x leads to a recursion error when running the example code.

(....)

  File "/home/luminoso/PycharmProjects/pydantic2datamodel-bug-report/venv/lib/python3.11/site-packages/pydantic/_internal/_repr.py", line 55, in __repr_str__
    return join_str.join(repr(v) if a is None else f'{a}={v!r}' for a, v in self.__repr_args__())
                                                                            ^^^^^^^^^^^^^^^^^^^^
RecursionError: maximum recursion depth exceeded

To test, try to parse an object:

Model.parse_obj({"Settings": {"name": "foobar"}})

(Although Pydantic v2 deprecates parse_obj(), the problem also happens with model_validate().)

A workaround is to not use Field(), like so:

class Model(BaseModel):
    Settings: Settings

With this definition the problem doesn't happen.
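
A complete sketch of that workaround, for reference (my reconstruction, assuming the same Settings model and input as the example code below):

from __future__ import annotations

from pydantic import BaseModel


class Settings(BaseModel):
    name: str


class Model(BaseModel):
    Settings: Settings  # no Field() assignment here, so the annotation resolves to the class


m = Model.model_validate({"Settings": {"name": "foobar"}})
print(m)  # the nested Settings model is populated as expected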

Another option is to downgrade to pydantic 1.10.12, where the problem does not occur either with or without the Field() definition.

Example Code

from __future__ import annotations

from pydantic.fields import Field
from pydantic.main import BaseModel


class Settings(BaseModel):
    name: str


class Model(BaseModel):
    Settings: Settings = Field(..., title="a nice title")

Python, Pydantic & OS Version

pydantic version: 2.3.0
        pydantic-core version: 2.6.3
          pydantic-core build: profile=release pgo=true
                 install path: /var/home/luminoso/PycharmProjects/pydantic2datamodel-bug-report/venv/lib/python3.11/site-packages/pydantic
               python version: 3.11.4 (main, Jun  7 2023, 00:00:00) [GCC 13.1.1 20230511 (Red Hat 13.1.1-2)]
                     platform: Linux-6.1.50-200.fc38.x86_64-x86_64-with-glibc2.37
     optional deps. installed: ['email-validator', 'typing-extensions']

Selected Assignee: @dmontagu

@luminoso added the bug V2 and unconfirmed labels Sep 4, 2023
@luminoso (Author) commented Sep 4, 2023

(#7309 may be related)

@hramezani (Member)

Thanks @luminoso for reporting this 🙏

Yes, it is a duplicate of #7309.

Here is the reason: #6646 (comment)

@sneakers-the-rat

Just following up since the prior issue was closed - is this an impossible-to-fix bug? Or would it be possible to at least have an informative error message saying you can't have a field with the same name as its type?

It seems odd that it only happens when you have an explicit Field object and not otherwise, but I don't know anything about the implementation that might explain that.

@luminoso (Author) commented Sep 4, 2023

According to #6646 (comment)

This isn't related to Pydantic, it has to do with how Python finds references to the type hints you are using

If the issue is related to Python, then why does the exact same example run with pydantic v1?

@Viicos (Contributor) commented Sep 5, 2023

If the issue is related to python then why the same exact example runs with pydantic v1?

It only works with from __future__ import annotations. What I said about how Python finds references might be slightly incorrect: Pydantic defines a lot of extra utilities to determine which type hints have been used. So this might differ from Pydantic v1, which may have been able to determine the type correctly when type hints are evaluated as strings thanks to the future import.

While I don't think supporting this use case with or without the future import (I don't know if it is even possible without this import) is a good idea, a better error message could be a good improvement.

For this use case, an alias can be used, or the Settings class can be referenced under a different name (not pretty):

from pydantic import BaseModel, Field


class Settings(BaseModel):
    name: str

_Settings = Settings

class Model(BaseModel):
    settings: Settings = Field(..., title="a nice title", alias="Settings")
    # or
    Settings: _Settings = Field(..., title="a nice title")
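
For completeness, a quick usage sketch of the alias variant on its own (my addition; it relies on Pydantic v2 validating by alias by default):

from pydantic import BaseModel, Field


class Settings(BaseModel):
    name: str


class Model(BaseModel):
    settings: Settings = Field(..., title="a nice title", alias="Settings")


m = Model.model_validate({"Settings": {"name": "foobar"}})
print(m.settings.name)  # foobar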

@luminoso (Author) commented Sep 5, 2023

Thank you for the references and for clarifying @Viicos .

If both pydantic v1 and v2 need from __future__ import annotations, but v1 is able to solve the issue thanks to its type inference, doesn't this mean that v2 is still catching up to the v1 implementation?

@sneakers-the-rat

Thanks for the information.

I'm not sure what you mean by this:

While I don't think supporting this use case with or without the future import (I don't know if it is even possible without this import) is a good idea,

since it seems very natural to want to have a field named date with a date type, for example.
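
For instance, something like this (a hypothetical Event model, sketched after the example in #7309) hits the same failure on the Pydantic versions discussed here, if I understand the linked issues correctly:

from __future__ import annotations

from datetime import date

from pydantic import BaseModel, Field


class Event(BaseModel):
    # the field shares its name with its type, and the Field() assignment puts
    # that name into the class namespace before the annotation is resolved
    date: date = Field(description="when the event happened")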

@Viicos (Contributor) commented Sep 6, 2023

I'm not sure what you mean by this:

I said it wasn't a good idea to support this because, generally, the desired behavior here isn't really supported in standard Python. See this example using stdlib dataclasses and VSCode (which uses pyright as a type checker):

[screenshot: VSCode/pyright hovering over the ambiguous annotation in a stdlib dataclass example]
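
Roughly, the example in the screenshot looked like this (my reconstruction, not the exact code shown):

from dataclasses import dataclass


@dataclass
class A:
    x: int


@dataclass
class B:
    # pyright cannot tell whether this annotation refers to the dataclass A
    # above or to the field A being defined on this line
    A: A = None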

As you can see, pyright is confused about the type hint of A: is it the A: defined on the same line, or the first dataclass? (by the way, even annotating this correctly as Optional[A] = None leads to the same result).

But as always, Pydantic can deviate from vanilla Python, e.g. with non-hashable default values. So if there's a way to support this, maybe the Pydantic maintainers would be open to it.
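
As a concrete illustration of that kind of deviation (my own sketch, not from the thread): Pydantic accepts a mutable default where a stdlib dataclass refuses one:

from dataclasses import dataclass, field

from pydantic import BaseModel


class PydanticModel(BaseModel):
    tags: list[str] = []  # allowed by Pydantic


@dataclass
class DataclassModel:
    # a bare `tags: list[str] = []` here would raise ValueError (mutable default
    # not allowed); stdlib dataclasses require default_factory instead
    tags: list[str] = field(default_factory=list)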

@sneakers-the-rat commented Sep 6, 2023

Hmm ya I see the complication.

The (accepted) PEP 649, which is supposed to supersede PEP 563, gives this pseudocode example:

class function:
    # __annotations__ on a function object is already a
    # "data descriptor" in Python, we're just changing
    # what it does
    @property
    def __annotations__(self):
        return self.__annotate__()

# ...

def annotate_foo():
    return {'x': int, 'y': MyType, 'return': float}
def foo(x = 3, y = "abc"):
    ...
foo.__annotate__ = annotate_foo
class MyType:
   ...
foo_y_annotation = foo.__annotations__['y']

This suggests that references to objects are resolved in the enclosing scope, i.e.:

y = str
class MyClass:
    y: y = int
# under PEP 649, MyClass.__annotations__['y'] would resolve to the module-level y, i.e. str

even though y is rebound to int in the class. But the PEP also says:

  • For functions and classes, the globals dictionary will be the module where the object was defined. If the object is itself a module, its globals dictionary will be its own dict.
  • For methods on classes, and for classes, the locals dictionary will be the class dictionary.

where the "and for classes" part in the second bullet introduces some ambiguity, because how can class-level attributes use the class dict for scope before it is bound?

The language reference currently says:

If the right hand side is present, an annotated assignment performs the actual assignment before evaluating annotations (where applicable).

https://docs.python.org/3/reference/simple_stmts.html#annassign

which is consistent with the above comments (i.e. Python just is broken here), but also PEP 649 hasn't been implemented yet AFAIK, so that might change.
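
A minimal standard-library check of that evaluation order (my own sketch; note there is no from __future__ import annotations here):

x = int


class C:
    # the assignment to x happens before the annotation expression is evaluated,
    # so the annotation sees the just-assigned class-local value, not the module-level int
    x: x = "hello"


print(C.__annotations__["x"])  # prints: hello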

In any case, can we agree that it would never make sense for someone to actually intend

my_str = str
class MyClass(BaseModel):
    my_str: my_str = "default string"

to mean

a = MyClass()
type(a.my_str)
# Literal['default string']

and that the intended behavior would always be type(a.my_str) == my_str (i.e. str)?

Diverging from default Python behavior seems warranted if the default Python behavior makes no sense, imo.


edit: mypy seems to correctly identify types in this context

from dataclasses import dataclass

A = str
@dataclass
class B:
    a: A = None
    A: A = None

reveal_type(B)

output of mypy:

 Revealed type is "def (a: builtins.str =, A: builtins.str =) -> test_module.B"

which makes me lean more towards "bug" than expected Python behavior.

@dmontagu (Contributor) commented Sep 11, 2023

I played with this a bit, and it looks like it comes down to the fact that when from __future__ import annotations is present, the namespace of the class is ignored when using typing.get_type_hints, whereas when it is not present, it is not ignored.

I'm not sure of the details here, so maybe it is fixable. I thought I was onto something by removing the localns = dict(vars(base)) line from pydantic._internal._typing_extra.get_cls_type_hints_lenient, which doesn't break any tests except one designed to demonstrate this exact misbehavior. But I realized that removing it actually makes it stop matching the logic for resolving annotations when you don't have from __future__ import annotations:

# from __future__ import annotations  # can uncomment this to get the other outcome below

from typing import get_type_hints


class Settings:
    pass


class Model:
    class Settings:
        pass

    x: Settings


print(get_type_hints(Model))
# with `from __future__ import annotations`: {'x': <class '__main__.Settings'>}
# without `from __future__ import annotations`: {'x': <class '__main__.Model.Settings'>}

(Note there is nothing specific to pydantic above, it's all standard library only.)

So, unless there is a way to determine whether from __future__ import annotations was imported in the module in which the class was defined, I don't see a way to make the change that is undeniably an improvement. (And even if you could determine that, it is arguably still a breaking change to the current behavior that would likely be painful to resolve if you were affected.)

Given all of this, I think the best course of action is to just treat this as an unfortunately unfixable "misbehavior", at least until PEP649 is live (at which point I think the improvements that will bring would merit making the arguably-breaking change to always reflect the semantics of that PEP).

For what it's worth, if you don't have another workaround available, you might also find the approach from tests.test_edge_cases.test_invalid_forward_ref_model useful as a way to work around collisions between field names and type names.

@dmontagu (Contributor) commented Sep 11, 2023

For tracking purposes, I'm going to close this issue for now due to the comment above, but to be clear I am open to continued discussion on this point if anyone has suggestions about how to improve this behavior (without causing breaking changes). PRs welcome, but recognize the barrier to acceptance may be somewhat high if it could affect existing code designed to work around this shortcoming. I think if you can detect whether from __future__ import annotations was used in the affected module then that might be a good start toward matching the relevant behavior, but I would want to confirm with the other maintainers before accepting anything.
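
For what it's worth, one possible heuristic for that detection (my own sketch, not something Pydantic currently does): from __future__ import annotations leaves the name annotations bound to the __future__ feature object in the module namespace, which can be checked:

import __future__
import sys


def module_uses_future_annotations(cls: type) -> bool:
    # best-effort check: this can be fooled if the module later rebinds the name `annotations`
    module = sys.modules.get(cls.__module__)
    return getattr(module, "annotations", None) is __future__.annotations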

I'll also note that this will definitely be revisited once PEP649 is live.

@dmontagu closed this as not planned (won't fix, can't repro, duplicate, stale) Sep 11, 2023
sneakers-the-rat added a commit to sneakers-the-rat/pydantic that referenced this issue Sep 12, 2023
@sneakers-the-rat commented Sep 12, 2023

I don't mean to be a pain but I think the problem is basically this:

The language reference currently says:

If the right hand side is present, an annotated assignment performs the actual assignment before evaluating annotations (where applicable).

for both cases, whether from __future__ import annotations is present or not. I confirmed this by starting pdb at pydantic._internal._typing_extra.get_cls_type_hints_lenient; the contents of the __annotations__ dictionary at that point are the FieldInfo object, rather than what was written as the annotation.

So the solution seems relatively near at hand: we just need to detect when a FieldInfo object is present in the __annotations__ dictionary when constructing the model, and then resolve the reference one evaluation scope out from the class's local vars.

Adding this here resolves the problem, and all the other tests pass both with and without __future__ annotations:

if type(value).__name__ == 'FieldInfo' and type(value).__module__ == 'pydantic.fields':
    # while we still have the name in scope,
    # check if value is FieldInfo without importing. If we have
    # used the same name for the field as its type, python evaluates
    # the type as the assigned value.
    # So, try to replace with value from the outer namespace.
    value = globalns.get(name, value)

elif isinstance(value, str) and value in ann.keys():
    # otherwise, if we have future annotations (which are all strings)
    # check if the string refers to another local variable
    # and if so, replace it with the global value before evaluating
    value = globalns.get(value, value)

value = eval_type_lenient(value, globalns, localns)

The first check just tests whether the __annotations__ value is the Field() object; if so, we get the value from the global namespace instead when present, returning the value unchanged otherwise.

The second handles the __future__ annotations case: if the annotation is a string, and that string is also present in the field names, we try to get the symbol with that name from the globalns.

Here's a patch: main...sneakers-the-rat:pydantic:recursion_error

I didn't know where to put the tests, but these all pass (I don't know how to parameterize at the module level to run the tests with and without from __future__ import annotations, so I just manually commented it out when running the tests locally):

from __future__ import annotations

import pytest

from datetime import date

from pydantic import BaseModel, Field

A = str
def test_recursive_naming():
    """
    Models that have an attribute with the same name as the type annotation
    should not raise a recursion error
    """

    class MyModel(BaseModel):
        a: A = Field(...)
        A: A = Field(...)

    assert MyModel.model_fields['a'].annotation is str
    assert MyModel.model_fields['A'].annotation is str

def test_issue_7327():
    """
    https://github.com/pydantic/pydantic/issues/7327
    """
    class Settings(BaseModel):
        name: str

    class Model(BaseModel):
        Settings: Settings = Field(..., title="a nice title")

    assert Model.model_fields['Settings'].annotation is Settings

def test_issue_7309():
    """
    https://github.com/pydantic/pydantic/issues/7309
    """

    class MyModel(BaseModel):
        date: date = Field()

    assert MyModel.model_fields['date'].annotation is date

edit: also addresses #7309

@dmontagu (Contributor) commented Sep 12, 2023

I understand you are trying to address the issue of a FieldInfo, but that leaves unaddressed any case where you assign an actual default value instead of a FieldInfo. I'm already not super keen to modify this logic to add special handling of FieldInfo, but I especially don't like that if you change from

MyStr = str
class A(BaseModel):
    MyStr: MyStr = FieldInfo('default')

to

MyStr = str
class A(BaseModel):
    MyStr: MyStr = 'default'

you'll get a surprising new misbehavior.

I think I also see another issue with your suggested patch. Consider the following code:

from __future__ import annotations

from typing import get_type_hints


# Comment/uncomment the following
# class Settings:
#     pass


class MyClass:
    class Settings:
        pass

    my_settings: list[Settings]


# this dict changes if the commented code above is uncommented
# but I don't believe the code in your patch would reflect this
print(get_type_hints(MyClass))

I believe with your current code for looking up annotations in the globalns, it will fail to do the lookup because it will be looking for the string 'list[Settings]' instead of 'Settings'. Resolving this would require more care than just doing a string-lookup-based substitution using the globalns dict prior to evaluating the type hint. It might be enough to attempt to evaluate the type hint using just the globalns then again using the global and localns, but things get tricky fast; e.g., imagine if that Settings class was itself generic with multiple parameters that were parametrized with forwardrefs to classes defined differently in the two namespaces. (Maybe handling this would "just work" but based on what I've seen so far it's hard for me to be confident.)

I recognize these are all weird edge cases, but I would rather not open a new can of worms of inconsistent behavior by making a best effort to handle what is already an edge case (trying to use the same name for a field and an annotation in a single type) that already has a workaround (using a type alias for the type that conflicts with the field name), and for which the handling should be significantly simplified once PEP 649 is live.

@sneakers-the-rat

One initial caveat: I just want to be clear that I am not trying to be difficult! The main thing I'm interested in here is that I think it should be possible for fields to have the same name as their annotation - that, to me, seems like a pretty normal thing the "average user" would want to do (my initial example was naming a date field "date"), and the workaround is pretty hard to discover (particularly without an informative error message). So I'm not just being pedantic for the sake of arguing; I think it would be a genuine UX improvement to fix, since the current behavior (of Python) is, as far as I can tell, never what someone would actually intend (but I could be wrong!) ♥️


First, on scope:

Resolving this would require more care than just doing a string-lookup-based substitution using the globalns dict prior to evaluating the type hint.

The problem here is specifically that Python evaluates the assignment before populating the annotations dictionary, so in the scope of the assignment the annotation becomes the assigned value. The idea is to just reverse that ordering - if the annotation has been replaced by the assigned value, go out to the parent scope and find the original name. This is not an attempt to fix delayed annotations generally, just to fix the specific case where a) the field name is the same as the annotation's name, and b) the expression has an assignment.

So yes, these are surely edge cases! That little patch was just a provisional demonstration, and surely there's a more airtight way to do it, but that is what any sort of fix would have to do (pending PEP 649).


This example:

MyStr = str
class A(BaseModel):
    MyStr: MyStr = 'default'

works with future annotations, but has the same misbehavior as the current problem without them: 'default' isn't in the parent namespace, so my modification doesn't do anything.

You're right, though, that my change was too specific. FieldInfo (I think!) should never go in an annotation, so that check is unambiguous, but it's not general enough. What we really want to check is whether the annotation has been replaced by the assigned value, so if I modify my first check like this:

if hasattr(obj, name) and value is getattr(obj, name):
    # If the value in the annotation is the same as is assigned,
    # either as a literal default or a Field() object, then Python
    # has substituted the annotation value with the assignment value.
    # We attempt to get the symbol with the same name from the enclosing
    # parent scope, returning the value unchanged if it's not present
    value = globalns.get(name, value)

then this passes:

def test_issue_7327_comment_1715813268_1():
    """
    https://github.com/pydantic/pydantic/issues/7327#issuecomment-1715813268
    """
    class MyModel(BaseModel):
        A: A = 'default'

    assert MyModel.model_fields['A'].annotation is str
    assert MyModel().A == 'default'

This example:

# Comment/uncomment the following
# class Settings:
#     pass


class MyClass:
    class Settings:
        pass

    my_settings: list[Settings]

shouldn't be affected by my change -

I believe with your current code for looking up annotations in the globalns, it will fail to do the lookup because it will be looking for the string 'list[Settings]' instead of 'Settings'.

The lookup should happen only if the annotation is a string (from future annotations, or delayed literal string annotations) AND the value of that annotation is a key in the annotations dict (i.e. is the name of a model field).

Since the name of the field is not the name of the annotation (list[Settings] is not a valid Python name), and there is no assignment operation that is evaluated before the annotations are stored (the root of the problem), neither of my checks does anything.

You're again right to be concerned about which parent scope the globalns gets, but it looks pretty well contained. It gets constructed here (you probably already know, since you are a maintainer after all, but just for the sake of public completeness lol):

if __pydantic_reset_parent_namespace__:
    cls.__pydantic_parent_namespace__ = build_lenient_weakvaluedict(parent_frame_namespace())
parent_namespace = getattr(cls, '__pydantic_parent_namespace__', None)
if isinstance(parent_namespace, dict):
    parent_namespace = unpack_lenient_weakvaluedict(parent_namespace)

which, if I'm reading it right, by default gets 2 frames above, excluding the module-global ns:

def parent_frame_namespace(*, parent_depth: int = 2) -> dict[str, Any] | None:

So this test passes; the evaluation looks like it's contained to the local containing scopes, rather than grabbing from the top.

class SettingsA:
    pass
    
def test_issue_7327_comment_1715813268_2():
    """
    https://github.com/pydantic/pydantic/issues/7327#issuecomment-1715813268
    """
    # First the original case in the comments
    class MyClass(BaseModel, arbitrary_types_allowed=True):
        class SettingsA:
            pass

        my_settings: list[SettingsA]

    assert MyClass.model_fields['my_settings'].annotation == list[MyClass.SettingsA]

    # then an additional case to check that we aren't getting
    # the global-global namespace, just the parent-global namespace relative
    # to the scope evaluating the annotation
    class MyOtherClass(BaseModel, arbitrary_types_allowed=True):
        SettingsB: SettingsA = Field(default_factory=SettingsA)
        class SettingsA(BaseModel):
            SettingsA: SettingsA = Field(default_factory=SettingsA)

    assert MyOtherClass.SettingsA.model_fields['SettingsA'].annotation is MyOtherClass.SettingsA

imagine if that Settings class was itself generic with multiple parameters that were parametrized with forwardrefs to classes defined differently in the two namespaces.

I am trying now to concoct an example of this but can't quite come up with one - these checks should only be triggered if the annotation value is the same object as the assigned value (without future annotations), or the annotation is a string that is one of the model's keys. So if the Settings class was generic and had a bunch of same-named fields, they would first get replaced by any names in the outer scope, and then it should behave as expected, since any forward refs in the Settings object should have already been evaluated (I am pretty sure that's what eval_type_lenient, called immediately afterwards, does).


I recognize these are all weird edge cases, but I would rather not open a new can of worms of inconsistent behavior by making a best effort to handle what is already an edge case (trying to use the same name for a field and an annotation in a single type) that already has a workaround (using a type alias for the type that conflicts with the field name), and for which the handling should be significantly simplified once PEP 649 is live.

Understood! Handling weird edge cases is good and I wouldn't want to introduce unnecessary complexity. I think that using the same name for a field and an annotation is not exactly an edge case, which is why I am spending the time here (I am working with a bunch of generated models from different schema languages that aren't defined first in Pydantic). I think it's probably very normal to want to name a date field date, or a Path field Path. If this is no good, then I think that to handle the recursion error with an informative error you would need to check for when FieldInfo() is used in the annotation (the recursion error being a symptom of the root cause, the evaluation order of annotations), and at that point I figure why not just fix it :)

I can again confirm that all the rest of the pydantic tests still pass with this change, and the tests included in the patch pass with and without future annotations:

Updated patch here:
main...sneakers-the-rat:pydantic:recursion_error

@chicco785

Any reason why this issue is closed? :(

@sneakers-the-rat commented Nov 28, 2023

The problem seems to keep getting closed for tracking purposes, citing previous issues. I appreciate that dev time is limited and that @Viicos is doing good triaging, but this issue has the most discussion, I think, and should be reopened given how common this is and will be until the above-mentioned PEPs are implemented and merged.

@samuelcolvin's argument here: #6646 (comment)
(edit: sorry, that was @Viicos, didn't mean to spam notifs. Leaving in the previous tag so this comment apologizing for the notification makes sense, and then I can take it out lol)

is technically correct, but the above discussion shows mitigations with tests. PEP 649 is scheduled to be implemented in 3.13, which is expected to be released on October 10th, 2024. I think it's worth special-casing this bug until then, hell, maybe even with a warning, even if just to save the devs triage time from the many times it will be raised again.

At the end of the day, I think it is normal to name a date field date. The language has a weird edge case here, but I think that's got to be pretty far below a WONTFIX threshold, especially given the above discussion, which so far hasn't yielded any irresolvable test failures.

@Viicos (Contributor) commented Nov 28, 2023

I still think this shouldn't be supported, for the reason I've stated here and because there are workarounds (that make it less confusing), but I understand the frustration, especially with common use cases such as date: date. I think I've run into it a couple of times already.

I've played around while working on #8243, and found an example that breaks at "the Python level":

from dataclasses import dataclass


def f():
    @dataclass
    class Thing:
        i: int

    @dataclass
    class Other:
        Thing: Thing  # Raises NameError: name 'Thing' is not defined

f()
