-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Generic model without arguments leads to serialization data loss after V2 #6895
Comments
This might be kind of related to #6882 This is just a different variation of the issue. I understand that in V2 its desirable to not "expose" fields of a derived model-instance when the field-model uses a super class of the derived one. You are free to close the issue. But it would be nice to somehow capture the issue statically (mypy) or at creation time. |
@MarkusSintonen thanks for another clear and thoughtful issue report. So, I'll just point out at least two things you can do to help resolve this (but spoiler alert: I don't think they address your concern fully): Option 1: Use import json
from typing import Generic, TypeVar
from pydantic import BaseModel, SerializeAsAny
dump_json = False
class InnerBase(BaseModel):
a: str
class Inner(InnerBase):
b: str
T = TypeVar("T", bound=SerializeAsAny[InnerBase])
class MyModel(BaseModel, Generic[T]):
val: T
print(MyModel(val=Inner(a="foo", b="bar")).model_dump())
#> {'val': {'a': 'foo', 'b': 'bar'}}
print(MyModel[InnerBase](val=Inner(a="foo", b="bar")).model_dump())
#> {'val': {'a': 'foo'}} Option 2: just explicitly specify the generic parameters when instantiating the model (which I did in the final line above) — this would address the issue in both of your cases above (i.e., using obj: Model[Inner] = ModelInner`). Sadly, I haven't found a great way to get the behavior you really want, which is that: obj: MyModel[Inner] = MyModel(val=Inner(a="foo", b="bar")) isn't actually a lie to mypy that can't be caught (since (Knowing your skill with both python and pydantic, and your comments about wanting to capture the issue statically, I expect you've already considered both of the above approaches and find them wanting. But I figured I'd still include it explicitly for anyone else reading this who might not have considered them.) Regarding improving the situation with type checkers: I think this would be very hard to get handled better by mypy because of the sorts of assumptions that mypy's typing makes about how generic parameters work. But, on the other hand, making pydantic generics behave more like standard typing generics would also be hard because it would be hard to get separate validators etc. producing the same class. (The idea would be to return a custom subclass of I think not impossible, and @adriangb and I investigated it pretty deeply a few months ago before giving up, I think we both still think it's possible with enough elbow grease, though may introduce some (subtle) breaking changes for advanced users of pydantic generics — for example, at least without additional effort and API implementations, instances of the (generic model) class won't know what type parameters were used during their validation (in contrast with the current behavior, where a subclass specific to the specific set of type parameters is created). So at any rate such changes may be a better target for v3. At any rate, definitely open to other suggestions related to how this might get improved. |
I feel its better to fix the issue we have in code where its accidentally using the generics without params. Would always remember to use
I think this is better option so using Pydantic generics "correctly". But also this is prone to suddenly fail as both options require developer to remember to use generics carefully. @dmontagu What do you think about adding atleast deprecation warnings when user tries to call model initializers when it has free generic params? (I feel it shouldnt allow it to get initialized in V2). If the warning even has a dedicated class it would allow asserting in the unit tests that no such warning ever gets raised (assuming full code coverage). FYI I had to temporarily use my fork to verify our large codebase didnt have any lingering issues: https://github.com/pydantic/pydantic/compare/main...MarkusSintonen:pydantic:check-no-generics-without-args?expand=1 This allowed to spot couple of additional risky places which were not captured by tests |
@MarkusSintonen My gut reaction is I don't love the idea of doing a deprecation warning about this because I could imagine a future where we actually change things up and decide this is the intended usage pattern, and it would be unfortunate to un-deprecate it. (And I wish it were possible for it to behave as desired like this, since this is the pattern that standard library generics use. But obviously there are reasons this is probably not a great choice for pydantic, which we've discussed above.) I could be convinced otherwise on this, but I also want to suggest another option — a config setting for what to do if you try to instantiate a generic model with free parameters (with choices Thoughts on that? |
Config parameter would sound fine to me!
I also think it would be pretty sensible to make it error by default. That would make migration to V2 much safer. But could be configurable if someone really wants to use it so. Then in future if there is some kind of smarter inference (probably very hard) then the default could be changed |
@MarkusSintonen another thing we could do that I was hesitant to mention, but will put on the table, is that we could apply the SerializeAsAny behavior automatically when attributes are typevars that are falling back to non-parametrized default values. (Only relevant if that proposed config setting isn't set to My main concern is that this might be a bit less intuitive once you understand how things work, but it would make it safer to migrate from v1, and I suspect in most cases this would probably be the preferred behavior anyway if you are purposely not specifying a parametrization and know that you could. @adriangb not sure if you'd disagree strongly enough to veto that idea though, haha. |
Does it incur then some "hidden" performance penalty btw? |
I think this was fixed with #7606. Happy to reopen if anyone is still running into this problem, but the initial MRE is now passing for me on the latest version of Pydantic 😄. |
Initial Checks
Description
On our codebase we had accidental piece of code where a Pydantic generic model was missing arguments which leads to silent data loss on serialization. See below example how the data loss may happen. Gladly our test suite was able to catch this but opening this issue so you are aware of this (maybe a major) breaking change. Could we consider in V2 that initializing a generic model without arguments gives a validation failure or disallowing it to happen via mypy-plugin? Could also be a model-config parameter to disallow usage of generics with missing args.
Below example passes with V1.
Example Code
Python, Pydantic & OS Version
Selected Assignee: @samuelcolvin
The text was updated successfully, but these errors were encountered: