Support for serialize_as_any runtime setting (#8830)
Co-authored-by: David Montague <35119617+dmontagu@users.noreply.github.com>
sydney-runkle and dmontagu committed Mar 25, 2024
1 parent df5be31 commit 33a275a
Showing 6 changed files with 396 additions and 26 deletions.
173 changes: 171 additions & 2 deletions docs/concepts/serialization.md
@@ -431,9 +431,28 @@ print(m.model_dump()) # note: the password field is not included
even if subclasses get passed when instantiating the object. In particular, this can help prevent surprises
when adding sensitive information like secrets as fields of subclasses.

### Serializing with duck-typing
### Serializing with duck-typing 🦆

If you want to preserve the old duck-typing serialization behavior, this can be done using `SerializeAsAny`:
!!! question "What is serialization with duck typing?"

    Duck-typing serialization is the behavior of serializing an object based on the fields present in the object itself,
    rather than the fields present in the schema of the object. This means that when an object is serialized, fields present in
    a subclass, but not in the original schema, will be included in the serialized output.

    This behavior was the default in Pydantic V1, but was changed in V2 to help ensure that you know precisely which
    fields would be included when serializing, even if subclasses get passed when instantiating the object. This helps
    prevent security risks when serializing subclasses with sensitive information, for example.

If you want V1-style duck-typing serialization behavior, you can use a runtime setting or annotate individual types.

* Field / type level: use the `SerializeAsAny` annotation
* Runtime level: use the `serialize_as_any` flag when calling `model_dump()` or `model_dump_json()`

We discuss these options below in more detail:

#### `SerializeAsAny` annotation

If you want duck-typing serialization behavior, this can be done using the `SerializeAsAny` annotation on a type:

```py
from pydantic import BaseModel, SerializeAsAny
```
@@ -468,6 +487,156 @@ annotated as `<SomeType>`, and type-checkers like mypy will treat the attribute
But when serializing, the field will be serialized as though the type hint for the field was `Any`, which is where the
name comes from.
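
The example for this annotation is cut short by the diff view, so here is a minimal sketch of the behavior described above. The model and field names (`User`, `UserLogin`, `OuterModel`, `as_any`, `without`) are illustrative, borrowed from the test removed in this commit and not necessarily those used in the documentation.

```py
from pydantic import BaseModel, SerializeAsAny


class User(BaseModel):
    name: str


class UserLogin(User):
    password: str


class OuterModel(BaseModel):
    # Serialized as though the annotation were `Any`: subclass fields are kept.
    as_any: SerializeAsAny[User]
    # Serialized strictly against the `User` schema.
    without: User


user = UserLogin(name='pydantic', password='password')
dumped = OuterModel(as_any=user, without=user).model_dump()
print(dumped)
```

Only the `as_any` field should include `password` in the dumped dictionary; validation and type checking still treat both fields as `User`.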

#### `serialize_as_any` runtime setting

The `serialize_as_any` runtime setting can be used to serialize model data with or without duck-typed serialization behavior.
`serialize_as_any` can be passed as a keyword argument to the `model_dump()` and `model_dump_json()` methods of `BaseModel`s and `RootModel`s. It can also be passed as a keyword argument to the `dump_python()` and `dump_json()` methods of `TypeAdapter`s.

If `serialize_as_any` is set to `True`, the model will be serialized using duck-typed serialization behavior,
which means that the model will ignore the schema and instead ask the object itself how it should be serialized.
In particular, this means that when model subclasses are serialized, fields present in the subclass but not in
the original schema will be included.

If `serialize_as_any` is set to `False` (which is the default), the model will be serialized using the schema,
which means that fields present in a subclass but not in the original schema will be ignored.

!!! question "Why is this flag useful?"
    Sometimes, you want to make sure that no matter what fields might have been added in subclasses,
    the serialized object will only have the fields listed in the original type definition.
    This can be useful if you add something like a `password: str` field in a subclass that you don't
    want to accidentally include in the serialized output.

For example:

```py
from pydantic import BaseModel


class User(BaseModel):
    name: str


class UserLogin(User):
    password: str


class OuterModel(BaseModel):
    user1: User
    user2: User


user = UserLogin(name='pydantic', password='password')

outer_model = OuterModel(user1=user, user2=user)
print(outer_model.model_dump(serialize_as_any=True))  # (1)!
"""
{
    'user1': {'name': 'pydantic', 'password': 'password'},
    'user2': {'name': 'pydantic', 'password': 'password'},
}
"""

print(outer_model.model_dump(serialize_as_any=False))  # (2)!
#> {'user1': {'name': 'pydantic'}, 'user2': {'name': 'pydantic'}}
```

1. With `serialize_as_any` set to `True`, the result matches that of V1.
2. With `serialize_as_any` set to `False` (the V2 default), fields present on the subclass,
but not the base class, are not included in serialization.
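
The text above notes that the flag is also accepted by `TypeAdapter.dump_python()` and `TypeAdapter.dump_json()`. A minimal sketch of that usage, assuming a Pydantic version that includes this commit (2.7 or later) and reusing the `User` / `UserLogin` models from the example above:

```py
from pydantic import BaseModel, TypeAdapter


class User(BaseModel):
    name: str


class UserLogin(User):
    password: str


# The adapter is built for the base type, but receives a subclass instance.
adapter = TypeAdapter(User)
user = UserLogin(name='pydantic', password='password')

duck_typed = adapter.dump_python(user, serialize_as_any=True)
schema_based = adapter.dump_python(user, serialize_as_any=False)
as_json = adapter.dump_json(user, serialize_as_any=True)  # returns bytes

print(duck_typed)
print(schema_based)
print(as_json)
```

With the flag enabled, the subclass-only `password` field should appear in the output; with the default, only the fields in the `User` schema are dumped.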

This setting also takes effect with nested and recursive patterns. For example:

```py
from typing import List

from pydantic import BaseModel


class User(BaseModel):
    name: str
    friends: List['User']


class UserLogin(User):
    password: str


class OuterModel(BaseModel):
    user: User


user = UserLogin(
    name='samuel',
    password='pydantic-pw',
    friends=[UserLogin(name='sebastian', password='fastapi-pw', friends=[])],
)

print(OuterModel(user=user).model_dump(serialize_as_any=True))  # (1)!
"""
{
    'user': {
        'name': 'samuel',
        'friends': [
            {'name': 'sebastian', 'friends': [], 'password': 'fastapi-pw'}
        ],
        'password': 'pydantic-pw',
    }
}
"""

print(OuterModel(user=user).model_dump(serialize_as_any=False))  # (2)!
"""
{'user': {'name': 'samuel', 'friends': [{'name': 'sebastian', 'friends': []}]}}
"""
```

1. Even nested `User` model instances are dumped with fields unique to `User` subclasses.
2. Even nested `User` model instances are dumped without fields unique to `User` subclasses.

!!! note
    The behavior of the `serialize_as_any` runtime flag is almost the same as the behavior of the `SerializeAsAny` annotation.
    There are a few nuanced differences that we're working to resolve, but for the most part, you can expect the same behavior from both.
    See more about the differences in this [active issue](https://github.com/pydantic/pydantic/issues/9049).

#### Overriding the `serialize_as_any` default (False)

To change the default, define a subclass of `BaseModel` that overrides `model_dump()` and `model_dump_json()` to pass a different `serialize_as_any` value, and then use that subclass (instead of `pydantic.BaseModel`) as the base class for any model that should have this behavior.

For example, you could do the following if you want to use duck-typing serialization by default:

```py
from typing import Any, Dict

from pydantic import BaseModel, SecretStr


class MyBaseModel(BaseModel):
    def model_dump(self, **kwargs) -> Dict[str, Any]:
        return super().model_dump(serialize_as_any=True, **kwargs)

    def model_dump_json(self, **kwargs) -> str:
        return super().model_dump_json(serialize_as_any=True, **kwargs)


class User(MyBaseModel):
    name: str


class UserInfo(User):
    password: SecretStr


class OuterModel(MyBaseModel):
    user: User


u = OuterModel(user=UserInfo(name='John', password='secret_pw'))
print(u.model_dump_json())  # (1)!
#> {"user":{"name":"John","password":"**********"}}
```

1. With the overridden default, `model_dump_json()` uses duck-typing serialization, so the `password` field is included in the output (masked, because it is a `SecretStr`).
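
One caveat of the override above: because it passes `serialize_as_any=True` alongside `**kwargs`, a caller who also passes `serialize_as_any` explicitly would trigger Python's "got multiple values for keyword argument" `TypeError`. A possible variant, sketched here as an assumption and not part of this commit, uses `dict.setdefault` so that the subclass only changes the default. `DuckTypedBaseModel` is a hypothetical name, and a plain `str` password is used to keep the output easy to read.

```py
from typing import Any, Dict

from pydantic import BaseModel


class DuckTypedBaseModel(BaseModel):  # hypothetical base class name
    def model_dump(self, **kwargs: Any) -> Dict[str, Any]:
        # Only fill in the flag when the caller did not pass it explicitly.
        kwargs.setdefault('serialize_as_any', True)
        return super().model_dump(**kwargs)

    def model_dump_json(self, **kwargs: Any) -> str:
        kwargs.setdefault('serialize_as_any', True)
        return super().model_dump_json(**kwargs)


class User(DuckTypedBaseModel):
    name: str


class UserInfo(User):
    password: str


class OuterModel(DuckTypedBaseModel):
    user: User


outer = OuterModel(user=UserInfo(name='John', password='secret_pw'))

default_dump = outer.model_dump()  # duck-typing by default
strict_dump = outer.model_dump(serialize_as_any=False)  # caller opts out
default_json = outer.model_dump_json()

print(default_dump)
print(strict_dump)
print(default_json)
```

This keeps duck-typing serialization as the default for models built on the custom base class while still letting individual calls request schema-based output.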

## `pickle.dumps(model)`

Pydantic models support efficient pickling and unpickling.
6 changes: 6 additions & 0 deletions pydantic/main.py
@@ -298,6 +298,7 @@ def model_dump(
exclude_none: bool = False,
round_trip: bool = False,
warnings: bool = True,
serialize_as_any: bool = False,
) -> dict[str, Any]:
"""Usage docs: https://docs.pydantic.dev/2.7/concepts/serialization/#modelmodel_dump
@@ -315,6 +316,7 @@ def model_dump(
exclude_none: Whether to exclude fields that have a value of `None`.
round_trip: If True, dumped values should be valid as input for non-idempotent types such as Json[T].
warnings: Whether to log warnings when invalid fields are encountered.
serialize_as_any: Whether to serialize fields with duck-typing serialization behavior.
Returns:
A dictionary representation of the model.
@@ -330,6 +332,7 @@ def model_dump(
exclude_none=exclude_none,
round_trip=round_trip,
warnings=warnings,
serialize_as_any=serialize_as_any,
)

def model_dump_json(
@@ -344,6 +347,7 @@ def model_dump_json(
exclude_none: bool = False,
round_trip: bool = False,
warnings: bool = True,
serialize_as_any: bool = False,
) -> str:
"""Usage docs: https://docs.pydantic.dev/2.7/concepts/serialization/#modelmodel_dump_json
@@ -359,6 +363,7 @@ def model_dump_json(
exclude_none: Whether to exclude fields that have a value of `None`.
round_trip: If True, dumped values should be valid as input for non-idempotent types such as Json[T].
warnings: Whether to log warnings when invalid fields are encountered.
serialize_as_any: Whether to serialize fields with duck-typing serialization behavior.
Returns:
A JSON string representation of the model.
@@ -374,6 +379,7 @@ def model_dump_json(
exclude_none=exclude_none,
round_trip=round_trip,
warnings=warnings,
serialize_as_any=serialize_as_any,
).decode()

@classmethod
1 change: 1 addition & 0 deletions pydantic/root_model.py
@@ -130,6 +130,7 @@ def model_dump( # type: ignore
exclude_none: bool = False,
round_trip: bool = False,
warnings: bool = True,
serialize_as_any: bool = False,
) -> Any:
"""This method is included just to get a more accurate return type for type checkers.
It is included in this `if TYPE_CHECKING:` block since no override is actually necessary.
6 changes: 6 additions & 0 deletions pydantic/type_adapter.py
@@ -315,6 +315,7 @@ def dump_python(
exclude_none: bool = False,
round_trip: bool = False,
warnings: bool = True,
serialize_as_any: bool = False,
) -> Any:
"""Dump an instance of the adapted type to a Python object.
@@ -329,6 +330,7 @@ def dump_python(
exclude_none: Whether to exclude fields with None values.
round_trip: Whether to output the serialized data in a way that is compatible with deserialization.
warnings: Whether to display serialization warnings.
serialize_as_any: Whether to serialize fields with duck-typing serialization behavior.
Returns:
The serialized object.
@@ -344,6 +346,7 @@ def dump_python(
exclude_none=exclude_none,
round_trip=round_trip,
warnings=warnings,
serialize_as_any=serialize_as_any,
)

def dump_json(
@@ -360,6 +363,7 @@ def dump_json(
exclude_none: bool = False,
round_trip: bool = False,
warnings: bool = True,
serialize_as_any: bool = False,
) -> bytes:
"""Usage docs: https://docs.pydantic.dev/2.7/concepts/json/#json-serialization
@@ -376,6 +380,7 @@ def dump_json(
exclude_none: Whether to exclude fields with a value of `None`.
round_trip: Whether to serialize and deserialize the instance to ensure round-tripping.
warnings: Whether to emit serialization warnings.
serialize_as_any: Whether to serialize fields with duck-typing serialization behavior.
Returns:
The JSON representation of the given instance as bytes.
@@ -391,6 +396,7 @@ def dump_json(
exclude_none=exclude_none,
round_trip=round_trip,
warnings=warnings,
serialize_as_any=serialize_as_any,
)

def json_schema(
24 changes: 0 additions & 24 deletions tests/test_serialize.py
@@ -17,9 +17,7 @@
Field,
FieldSerializationInfo,
PydanticUserError,
SecretStr,
SerializationInfo,
SerializeAsAny,
SerializerFunctionWrapHandler,
TypeAdapter,
computed_field,
@@ -902,28 +900,6 @@ def ser_model(self) -> Dict[str, Any]:
assert ta.dump_json(Model({'x': 1, 'y': 2.5})) == b'{"x":2,"y":7.5}'


def test_serialize_as_any() -> None:
class User(BaseModel):
name: str

class UserLogin(User):
password: SecretStr

class OuterModel(BaseModel):
maybe_as_any: Optional[SerializeAsAny[User]] = None
as_any: SerializeAsAny[User]
without: User

user = UserLogin(name='pydantic', password='password')

# insert_assert(json.loads(OuterModel(as_any=user, without=user).model_dump_json()))
assert json.loads(OuterModel(maybe_as_any=user, as_any=user, without=user).model_dump_json()) == {
'maybe_as_any': {'name': 'pydantic', 'password': '**********'},
'as_any': {'name': 'pydantic', 'password': '**********'},
'without': {'name': 'pydantic'},
}


@pytest.mark.parametrize('as_annotation', [True, False])
@pytest.mark.parametrize('mode', ['plain', 'wrap'])
def test_forward_ref_for_serializers(as_annotation, mode):
