Serialize unsubstituted type vars as Any #7606

Merged
merged 5 commits into from Sep 25, 2023
Changes from 2 commits
109 changes: 90 additions & 19 deletions docs/concepts/models.md
@@ -783,43 +783,114 @@ print(concrete_model(a=1, b=1))

 If you need to perform isinstance checks against parametrized generics, you can do this by subclassing the parametrized generic class. This looks like `class MyIntModel(MyGenericModel[int]): ...` and `isinstance(my_model, MyIntModel)`.

-If a Pydantic model is used in a `TypeVar` constraint, [`SerializeAsAny`](serialization.md#serializing-with-duck-typing) can be used to
-serialize it using the concrete model instead of the model `TypeVar` is bound to.
+If a Pydantic model is used in a `TypeVar` constraint or bound and the generic type is never parametrized, then Pydantic will use the constraint or bound for validation but treat the value as `Any` in terms of serialization:

```py
-from typing import Generic, TypeVar
+from typing import Generic, Optional, TypeVar

-from pydantic import BaseModel, SerializeAsAny
+from pydantic import BaseModel


-class Model(BaseModel):
-    a: int = 42
+class ErrorDetails(BaseModel):
+    foo: str


-class DataModel(Model):
-    b: int = 2
-    c: int = 3
+ErrorDataT = TypeVar('ErrorDataT', bound=ErrorDetails)


+class Error(BaseModel, Generic[ErrorDataT]):
+    message: str
+    details: Optional[ErrorDataT]


+class MyErrorDetails(ErrorDetails):
+    bar: str


+# serialized as Any
+error = Error(
+    message='We just had an error',
+    details=MyErrorDetails(foo='var', bar='var2'),
+)
+assert error.model_dump() == {
+    'message': 'We just had an error',
+    'details': {
+        'foo': 'var',
+        'bar': 'var2',
+    },
+}

+# serialized using the concrete parametrization
+# note that `'bar': 'var2'` is missing
+error = Error[ErrorDetails](
+    message='We just had an error',
+    details=MyErrorDetails(foo='var', bar='var2'),
+)
+assert error.model_dump() == {
+    'message': 'We just had an error',
+    'details': {
+        'foo': 'var',
+    },
+}
```

+If you use a `default=...` for a `TypeVar` (available in Python >= 3.13 or via `typing-extensions`), the default value will be used for both validation and serialization if the type variable is not parametrized. You can override this behavior using `pydantic.SerializeAsAny`:
Member Author


@dmontagu I think this is perhaps the controversial part of this PR. My reasoning was that a bound and default are different semantically and this distinction led me to implement other behavior. This is also an opportunity to preserve the SerializeAsAny example that was here before. Let me know what you think.

Contributor


I agree with your interpretation of the semantics. (And in particular, this choice.)


-BoundT = TypeVar('BoundT', bound=Model)
```py
+from typing import Generic, Optional

+from typing_extensions import TypeVar

-class GenericModel(BaseModel, Generic[BoundT]):
-    data: BoundT
+from pydantic import BaseModel
+from pydantic.functional_serializers import SerializeAsAny


-class SerializeAsAnyModel(BaseModel, Generic[BoundT]):
-    data: SerializeAsAny[BoundT]
+class ErrorDetails(BaseModel):
+    foo: str


-data_model = DataModel()
+ErrorDataT = TypeVar('ErrorDataT', default=ErrorDetails)

-print(GenericModel(data=data_model).model_dump())
-#> {'data': {'a': 42}}

+class Error(BaseModel, Generic[ErrorDataT]):
+    message: str
+    details: Optional[ErrorDataT]

-print(SerializeAsAnyModel(data=data_model).model_dump())
-#> {'data': {'a': 42, 'b': 2, 'c': 3}}

+class MyErrorDetails(ErrorDetails):
+    bar: str


+# serialized using the default's serializer
+error = Error(
+    message='We just had an error',
+    details=MyErrorDetails(foo='var', bar='var2'),
+)
+assert error.model_dump() == {
+    'message': 'We just had an error',
+    'details': {
+        'foo': 'var',
+    },
+}


+class SerializeAsAnyError(BaseModel, Generic[ErrorDataT]):
+    message: str
+    details: Optional[SerializeAsAny[ErrorDataT]]


+# serialized as Any
+error = SerializeAsAnyError(
+    message='We just had an error',
+    details=MyErrorDetails(foo='var', bar='baz'),
+)
+assert error.model_dump() == {
+    'message': 'We just had an error',
+    'details': {
+        'foo': 'var',
+        'bar': 'baz',
+    },
+}
```

## Dynamic model creation
10 changes: 8 additions & 2 deletions pydantic/_internal/_generate_schema.py
@@ -1450,11 +1450,17 @@ def _unsubstituted_typevar_schema(self, typevar: typing.TypeVar) -> core_schema.
         assert isinstance(typevar, typing.TypeVar)

         if typevar.__bound__:
-            return self.generate_schema(typevar.__bound__)
+            schema = self.generate_schema(typevar.__bound__)
         elif typevar.__constraints__:
-            return self._union_schema(typing.Union[typevar.__constraints__])  # type: ignore
+            schema = self._union_schema(typing.Union[typevar.__constraints__])  # type: ignore
Contributor


Constraints feel closer to default than to a bound to me, I would personally make them function the old way. I’ll leave it up to you though.

Member Author


Okay changed
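
The method under review dispatches on which kind of restriction a `TypeVar` carries. As a rough, standard-library-only sketch of how those three cases can be told apart (the helper name `classify_typevar` is hypothetical and not part of Pydantic, and the sketch only considers type variables created with `typing.TypeVar`):

```python
import typing

# Private marker used when the interpreter has no "no default" sentinel.
_MISSING = object()

# `typing.NoDefault` only exists on Python >= 3.13; on older interpreters
# `typing.TypeVar` has no `__default__` attribute at all.
_NO_DEFAULT = getattr(typing, 'NoDefault', _MISSING)


def classify_typevar(typevar: typing.TypeVar) -> str:
    """Name the restriction a TypeVar carries (hypothetical helper)."""
    if typevar.__bound__ is not None:
        return 'bound'
    if typevar.__constraints__:
        return 'constraints'
    default = getattr(typevar, '__default__', _NO_DEFAULT)
    if default is not _NO_DEFAULT:
        return 'default'
    return 'unrestricted'


BoundT = typing.TypeVar('BoundT', bound=Exception)
ConstrainedT = typing.TypeVar('ConstrainedT', int, str)
PlainT = typing.TypeVar('PlainT')

print(classify_typevar(BoundT))        # bound
print(classify_typevar(ConstrainedT))  # constraints
print(classify_typevar(PlainT))        # unrestricted
```

A bound is an upper limit on what may be substituted, constraints are a closed set of alternatives, and a default is the type assumed when nothing is substituted, which is why the thread weighs whether each should fall back to `Any` for serialization.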

+        elif hasattr(typevar, '__default__'):
+            return self.generate_schema(getattr(typevar, '__default__'))
         else:
             return core_schema.any_schema()
+        schema['serialization'] = core_schema.wrap_serializer_function_ser_schema(
+            lambda x, h: h(x), schema=core_schema.any_schema()
+        )
+        return schema

     def _computed_field_schema(
         self,
9 changes: 5 additions & 4 deletions pydantic/types.py
@@ -989,6 +989,7 @@ class Foo(BaseModel):
=== ":white_check_mark: Do this"
```py
from decimal import Decimal

from typing_extensions import Annotated

from pydantic import BaseModel, Field
@@ -1080,7 +1081,7 @@ def __hash__(self) -> int:
```py
 import uuid

-from pydantic import BaseModel, UUID1
+from pydantic import UUID1, BaseModel

 class Model(BaseModel):
     uuid1: UUID1
@@ -1094,7 +1095,7 @@ class Model(BaseModel):
```py
 import uuid

-from pydantic import BaseModel, UUID3
+from pydantic import UUID3, BaseModel
Member Author


I reverted this change but looks like something in our docstring formatting stuff is sorting in reverse
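
One possible explanation for the ordering seen in these hunks is a type-aware sort of the kind isort-compatible tools apply by default (`order_by_type`), where all-caps names are treated as constants and placed before CamelCase class names. This is an assumption: the thread does not say which tool or setting is responsible. The sketch below only mimics that rule, and the function names `type_aware_key` and `sort_imported_names` are hypothetical:

```python
from typing import List, Tuple


def type_aware_key(name: str) -> Tuple[int, str]:
    """Rank ALL-CAPS names first, then capitalized names, then the rest."""
    if name.isupper() and len(name) > 1:
        rank = 0  # looks like a constant, e.g. UUID1
    elif name[:1].isupper():
        rank = 1  # looks like a class, e.g. BaseModel
    else:
        rank = 2  # functions, variables, modules
    return (rank, name.lower())


def sort_imported_names(names: List[str]) -> List[str]:
    """Sort names the way a type-aware import sorter would (simplified)."""
    return sorted(names, key=type_aware_key)


print(sort_imported_names(['BaseModel', 'UUID1']))  # ['UUID1', 'BaseModel']
print(sort_imported_names(['Field', 'BaseModel']))  # ['BaseModel', 'Field']
```

Under this rule `UUID1` counts as a constant because `'UUID1'.isupper()` is true, so `from pydantic import UUID1, BaseModel` would be the expected order rather than a reversed one.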


 class Model(BaseModel):
     uuid3: UUID3
@@ -1108,7 +1109,7 @@ class Model(BaseModel):
```py
 import uuid

-from pydantic import BaseModel, UUID4
+from pydantic import UUID4, BaseModel

 class Model(BaseModel):
     uuid4: UUID4
@@ -1122,7 +1123,7 @@ class Model(BaseModel):
```py
 import uuid

-from pydantic import BaseModel, UUID5
+from pydantic import UUID5, BaseModel

 class Model(BaseModel):
     uuid5: UUID5
61 changes: 61 additions & 0 deletions tests/test_generics.py
@@ -2622,3 +2622,64 @@ class Model(Generic[T], BaseModel):
     m1 = Model[int](x=1)
     m2 = Model[int](x=1)
     assert len({m1, m2}) == 1


+@pytest.mark.parametrize(
+    'type_var',
+    [
+        TypeVar('ErrorDataT', bound=BaseModel),
+        TypeVar('ErrorDataT', BaseModel, str),
+    ],
+)
+def test_serialize_unsubstituted_typevars_bound_or_constraint(
+    type_var: TypeVar,
+) -> None:
+    class ErrorDetails(BaseModel):
+        foo: str

+    class Error(BaseModel, Generic[type_var]):
+        message: str
+        details: Optional[type_var]

+    class MyErrorDetails(ErrorDetails):
+        bar: str

+    sample_error = Error(
+        message='We just had an error',
+        details=MyErrorDetails(foo='var', bar='baz'),
+    )

+    assert sample_error.model_dump() == {
+        'message': 'We just had an error',
+        'details': {
+            'foo': 'var',
+            'bar': 'baz',
+        },
+    }


+def test_serialize_unsubstituted_typevars_default() -> None:
+    from typing_extensions import TypeVar

+    class ErrorDetails(BaseModel):
+        foo: str

+    DataT = TypeVar('DataT', default=ErrorDetails)
Member


Nice example 👍


+    class Error(BaseModel, Generic[DataT]):
+        message: str
+        details: Optional[DataT]

+    class MyErrorDetails(ErrorDetails):
+        bar: str

+    sample_error = Error(
+        message='We just had an error',
+        details=MyErrorDetails(foo='var', bar='baz'),
+    )
+    assert sample_error.model_dump() == {
+        'message': 'We just had an error',
+        'details': {
+            'foo': 'var',
+        },
+    }