#8379: allow for alias enforcement during serialization by setting serialize_by_alias on ConfigDict #8997
CodSpeed Performance Report: Merging #8997 will not alter performance.
|
please review |
Thanks for this PR. I do think that we should offer a config setting for this behavior, but I have a few follow-up questions / thoughts. Specifically, they relate to the fact that the
from pydantic import BaseModel, ConfigDict, Field, ValidationError

class Model(BaseModel):
    a: int = Field(..., alias='A')

class OuterModel(BaseModel):
    x: Model = Field(..., alias='X')
    y: int = Field(..., alias='Y')

# we validate using aliases by default (that is, the populate_by_name config setting is False, by default)
# looking forward, we could consider this to be equivalent to validate_by_alias = True, by default
m = OuterModel(X={'A': 123}, Y=456)
print(repr(m))
#> OuterModel(x=Model(a=123), y=456)
# when we serialize, the by_alias flag is False by default, so the field names are used
print(m.model_dump())
#> {'x': {'a': 123}, 'y': 456}
# if we want to use the aliases when serializing, we can use the by_alias flag, which notably applies to nested models
# as seen by the fact that all of the keys are capitalized in the result
print(m.model_dump(by_alias=True))
#> {'X': {'A': 123}, 'Y': 456}

class OuterModel2(BaseModel):
    x: Model = Field(..., alias='X')
    y: int = Field(..., alias='Y')
    model_config = ConfigDict(populate_by_name=True)

# here, we validate using the field names, but we still use the aliases for the nested model,
# because the populate_by_name config setting doesn't apply to nested models
m = OuterModel2(x={'A': 123}, y=456)
print(repr(m))
#> OuterModel2(x=Model(a=123), y=456)
# if we don't use field aliases for inner models, we get a validation error,
# as the populate_by_name config setting doesn't apply to nested models
try:
    m = OuterModel2(x={'a': 123}, y=456)
except ValidationError as e:
    print(e)
    """
    1 validation error for OuterModel2
    x.A
      Field required [type=missing, input_value={'a': 123}, input_type=dict]
        For further information visit https://errors.pydantic.dev/2.7/v/missing
    """
# with the new flag serialize_by_alias, we effectively patch the by_alias flag to be equal to the serialize_by_alias flag
# but this leads to inconsistent behavior with regard to nested models, compared to the populate_by_name (validation equivalent) flag
So, given this context, here are some thoughts:
from pydantic import BaseModel, ConfigDict, Field, ValidationError

class Model(BaseModel):
    a: int = Field(..., alias='A')

class OuterModel(BaseModel):
    x: Model = Field(..., alias='X')
    y: int = Field(..., alias='Y')
    model_config = ConfigDict(serialize_by_alias=True)

m = OuterModel(X={'A': 123}, Y=456)
print(m.model_dump())
#> {'X': {'a': 123}, 'Y': 456}

class OuterModel2(BaseModel):
    x: Model = Field(..., alias='X')
    y: int = Field(..., alias='Y')

m = OuterModel2(X={'A': 123}, Y=456)
print(m.model_dump(by_alias=True))
#> {'X': {'A': 123}, 'Y': 456}
All of that being said, I do think that we need to move towards consistency for these alias focused settings. But I think we need to think more carefully about how we implement this config flag to avoid getting stuck maintaining another inconsistent API that users have issues with. |
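The per-model nature of populate_by_name described in this comment can be worked around today by sharing the config through a base class, since model_config is inherited by subclasses. The following is a minimal sketch, assuming pydantic v2; AliasFriendlyBase, Inner, and Outer are hypothetical names used only for illustration, and this is not part of the PR.

```python
from pydantic import BaseModel, ConfigDict, Field


class AliasFriendlyBase(BaseModel):
    # Hypothetical shared base: every subclass inherits this config,
    # so the setting reaches nested models through inheritance,
    # not because the outer model's config trickles down.
    model_config = ConfigDict(populate_by_name=True)


class Inner(AliasFriendlyBase):
    a: int = Field(..., alias='A')


class Outer(AliasFriendlyBase):
    x: Inner = Field(..., alias='X')
    y: int = Field(..., alias='Y')


# Field names are accepted at both levels...
by_name = Outer(x={'a': 123}, y=456)
# ...and aliases still work as before.
by_alias = Outer(X={'A': 123}, Y=456)

print(by_name.model_dump())
print(by_name.model_dump(by_alias=True))
```

This keeps the fine-grained, per-model behavior intact: a nested model that does not inherit from the shared base would still require aliases.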
I've added a commit documenting the nested behavior, which I agree could be better, but at least it is consistent with the recursive by_alias (thanks to the implementation). I actually think per-model granularity would be better, but that requires getting into the Rust codebase. If I've read your comment correctly, the issue identified was inconsistency with by_alias? (I'm still re-reading it a couple of times on my end.) |
For now, my to-do list is to either identify or create another per-model serialization setting. |
I'll discuss with the team on Monday re the appropriate route forward. I think you've got a great start and are thinking about the right things! |
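A more consolidated summary:
1. Flags for validation / serialization by alias are inconsistent in terms of behavior with nested models
   - populate_by_name config setting doesn't trickle down to nested models
   - by_alias serialization parameter does trickle down to nested models
2. Default setting for validation vs serialization by alias is not consistent (they should both use aliases by default in V3)
   - populate_by_name is False by default (better phrased as validate_by_alias is True)
   - by_alias is False by default (better phrased as serialize_by_alias is False)
I would at least expect config settings to behave the same way in terms of application to nested models, and I would expect the same for parameters to validation / serialization methods. However, the fact that these are different at the moment is a bit confusing, and may get more confusing if we choose to add a serialization config setting and/or a validation by_alias parameter. |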
In the spirit of preserving behavior, I would implement my change to be consistent with populate_by_name. Fine-grained control is a feature.
On Sun, Mar 17, 2024, 2:50 PM Sydney Runkle wrote:
A more consolidated summary:
1. Flags for validation / serialization by alias are inconsistent in terms of behavior with nested models
   - populate_by_name config setting doesn't trickle down to nested models
   - by_alias serialization parameter does trickle down to nested models
2. Default setting for validation vs serialization by alias is not consistent (they should both use aliases by default in V3)
   - populate_by_name is False by default (better phrased as validate_by_alias is True)
   - by_alias is False by default (better phrased as serialize_by_alias is False)
I would at least expect config settings to behave the same way in terms of application to nested models, and I would expect the same for parameters to validation / serialization methods. However, the fact that these are different at the moment is a bit confusing, and may get more confusing if we choose to add a serialization config setting and/or a validation by_alias parameter.
|
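The two propagation behaviors in the summary can be modeled with a small toy serializer. This is a sketch in plain Python, not pydantic's implementation: ToyModel, its aliases mapping, and the dump helper are all hypothetical. It only reproduces the outputs shown earlier in the thread, where a per-model config flag stops at the model that sets it while a call-level by_alias argument reaches nested models.

```python
class ToyModel:
    """Tiny stand-in for a model (hypothetical, not pydantic)."""
    aliases = {}                # field name -> alias
    serialize_by_alias = False  # per-model config flag

    def __init__(self, **values):
        self.values = values


def dump(model, by_alias=None):
    """Serialize to a dict.

    by_alias=None defers to each model's own config flag;
    an explicit True/False is passed down to nested models.
    """
    use_alias = model.serialize_by_alias if by_alias is None else by_alias
    out = {}
    for name, value in model.values.items():
        key = model.aliases.get(name, name) if use_alias else name
        if isinstance(value, ToyModel):
            value = dump(value, by_alias)
        out[key] = value
    return out


class Inner(ToyModel):
    aliases = {'a': 'A'}


class Outer(ToyModel):
    aliases = {'x': 'X', 'y': 'Y'}
    serialize_by_alias = True


m = Outer(x=Inner(a=123), y=456)
per_model = dump(m)                  # config flag: outer keys only
call_level = dump(m, by_alias=True)  # parameter: every level
print(per_model)
print(call_level)
```

The difference comes down to one line: the nested call receives the caller's by_alias argument, never the outer model's config flag.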
I haven't looked at the implementation, but I have read the discussion. We need either:
We've used both "config" and "settings" elsewhere, so I would propose
from typing import TypedDict

class GlobalDefaults(TypedDict):
    serialize_by_default: bool
    # what else do we want here?

global_defaults = GlobalDefaults(serialize_by_default=True) |
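To make the global-defaults idea concrete, here is one way such a value might be consulted. This is a sketch under stated assumptions: the comment does not specify a precedence order, so the order used here (call argument, then model config, then global default) is an assumption, and resolve_by_alias is a hypothetical helper that exists in no library. The serialize_by_default key is copied from the proposal as written.

```python
from typing import Optional, TypedDict


class GlobalDefaults(TypedDict):
    serialize_by_default: bool  # key name taken from the proposal


global_defaults = GlobalDefaults(serialize_by_default=True)


def resolve_by_alias(
    call_arg: Optional[bool],
    model_setting: Optional[bool],
) -> bool:
    """Pick the effective flag (assumed precedence, most specific first)."""
    if call_arg is not None:       # explicit argument at the call site
        return call_arg
    if model_setting is not None:  # per-model config
        return model_setting
    return global_defaults['serialize_by_default']  # process-wide fallback


print(resolve_by_alias(None, None))
print(resolve_by_alias(None, False))
print(resolve_by_alias(False, True))
```

Using None as "not set" at each level is what lets a more specific level override a less specific one in either direction.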
Closing this for now - the global defaults work should be in a different PR, and the current implementation of the |
I still think #8379 specifies an important change to be made, just not in the way suggested in this PR. |
Change Summary
This allows field aliases to be used during serialization; there is a discussion in #8379.
Related issue number
#8379 (it may or may not be considered fully addressed after this PR)
Checklist
Selected Reviewer: @Kludex