V2 - Validating int->str #6045
I really don't think it makes sense as the default in V2 to coerce numbers to strings. Coercing strings to field types is a very special and very common case; it doesn't mean we should support the reverse. See this bit of the V2 plan for more details. The only thing I'd be willing to accept would be a compatibility shim (perhaps set via …) |
But not many people have asked about it; the one person I've spoken to who mentioned this basically said "the new behaviour is right, the only thing that confused me was the documentation being wrong". We'd need more demand for this before adding the shim. |
yeah, well that makes some sense, but my feeling is this will pop up once it gets released to a wider audience. Imagine you used pydantic to parse some YAML:

```python
import yaml

from pydantic import BaseModel

YAML = """
details:
  name: 12345
  description: Hello world
"""

class Details(BaseModel):
    name: str
    description: str

raw_data = yaml.safe_load(YAML)
print(raw_data)
data = Details(**raw_data["details"])
```

I do not see any nice way to handle this except changing ALL … |
@vitalik just to be clear, we'd be able to get it to behave the old way (i.e., converting …). That said, I do still understand the interest in a "global" flag. (But I also understand Samuel's point about wanting to see more concrete demand first.) |
well that's my issue here - as you can see in my example, I'm not initialising the model with arguments (so type checkers do not play here); I'm passing raw yaml data just to parse and validate user input. Basically, I'm inspecting the impact on one of my clients, where data parsing is heavily done with pydantic and migrating to v2 could simply lead to validation errors for data that was fine before. Changing all the code from … |
@adriangb I'm wondering if we could override `__get_pydantic_core_schema__` or one of the other magic methods and have it insert annotations for certain types? I guess maybe a "prepare pydantic annotations" hook for a BaseModel class? Not sure if that would work at the class level instead of the field level, but I'm thinking there might be something here that makes this "configurable" even without the use of config |
I think we can, or should be able to make changes so we can, create a custom type like … We have also discussed making a hook to do some sort of global replacement |
Right I was mostly curious if there might be a way to hook into schema generation for models specifically as a way to handle this. I’ll play with it a bit this afternoon and see if I can find anything that works without code changes today, or at least see how far it is from working. |
So yeah you can do this:

```python
from typing import Annotated

from pydantic import BeforeValidator, TypeAdapter

LaxStr = Annotated[str, BeforeValidator(lambda x: str(x))]

ta = TypeAdapter(LaxStr)
print(ta.validate_python(123))
```

(Sorry about any confusion above about ….) But there's no way to set this globally or recursively for a model. You could write a function that recursively inspects a … |
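The "function that recursively inspects a model" idea mentioned above could be sketched roughly as follows — `laxify` and `LaxStr` are hypothetical names, and this only handles fields annotated exactly as `str` plus directly nested models:

```python
from typing import Annotated

from pydantic import BaseModel, BeforeValidator, create_model

LaxStr = Annotated[str, BeforeValidator(str)]

def laxify(model: type[BaseModel]) -> type[BaseModel]:
    """Rebuild a model so plain `str` fields coerce their input with str()."""
    fields = {}
    for name, info in model.model_fields.items():
        ann = info.annotation
        if ann is str:
            ann = LaxStr
        elif isinstance(ann, type) and issubclass(ann, BaseModel):
            ann = laxify(ann)  # recurse into nested models
        fields[name] = (ann, ... if info.is_required() else info.default)
    return create_model(f"Lax{model.__name__}", **fields)

class Details(BaseModel):
    name: str
    description: str

LaxDetails = laxify(Details)
print(LaxDetails(name=12345, description="Hello world"))  # name='12345' description='Hello world'
```

This doesn't cover unions, containers, or `Annotated` fields, so it is only a starting point, not a drop-in replacement for a real config flag.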
Yeah that’s what I was thinking. (Though I hadn’t fleshed it out as much, that’s helpful.) I think that might be enough even if it didn’t do any nested model recursion (though it might need to recurse on core schemas for the purpose of interacting properly with other validators etc., at which point maybe it’s easier to do it fully recursive) since the hypothetical user could just use that as the base class for all their models. |
Caution:

```python
LaxStr = Annotated[str, BeforeValidator(lambda x: str(x))]
ta = TypeAdapter(LaxStr)
ta.validate_python(None)  # will result in -> "None" (as type str)
```
|
Probably want to use … |
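One way to avoid the `None` → `"None"` pitfall (a sketch, assuming you only want ints coerced) is to guard the validator with an `isinstance` check so other types still fall through to normal string validation:

```python
from typing import Annotated

from pydantic import BeforeValidator, TypeAdapter, ValidationError

# Only coerce ints; anything else (None, lists, ...) still fails str validation.
# Note: bools are ints in Python, so True would also be coerced here.
LaxStr = Annotated[str, BeforeValidator(lambda x: str(x) if isinstance(x, int) else x)]

ta = TypeAdapter(LaxStr)
print(ta.validate_python(123))  # '123'
try:
    ta.validate_python(None)  # no longer silently becomes "None"
except ValidationError as e:
    print("None is rejected:", e.errors()[0]["type"])
```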
I am currently migrating v1 -> v2 and am facing issues with this. It would be great to at least have the option to set this in the model config. The use case I have for coercing int -> str is that we often have IDs that appear as numeric values (e.g. 12839384) and are inferred as such when using pandas, whereas we want to treat them as strings in order to perform string manipulation or join them with other non-numerical IDs. Pydantic v1 allowed me not to worry about this, whereas now I have to coerce all of my IDs separately beforehand. May I just add that I don't really see the benefit of removing this original behavior from v1, and that it will probably cause a lot of headaches when migrating, especially since this is not documented clearly. |
I think this could be another case for a type map we've discussed privately. As far as I remember, one of the ideas was to have a global registry that maps an annotation to another one. From a user's point of view it could look something like this:

```python
type_map_registry: dict[type, type] = {str: Annotated[str, Field(force=True)]}
```

This would force all … |
@mrenner-ML Indeed, this isn't stated explicitly in the migration guide. In general, coercing numbers and more complex types to … In Pydantic V2 it's possible to achieve the desired behavior using Custom Data Types. |
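As an illustration of that approach, a minimal custom type might look like this — `CoercedStr` is a hypothetical name, and the `no_info_before_validator_function` call is one way to run `str()` on the input before normal string validation:

```python
from typing import Any

from pydantic import BaseModel, GetCoreSchemaHandler
from pydantic_core import core_schema

class CoercedStr(str):
    """A str that restores V1-style coercion by stringifying input first."""

    @classmethod
    def __get_pydantic_core_schema__(
        cls, source: Any, handler: GetCoreSchemaHandler
    ) -> core_schema.CoreSchema:
        return core_schema.no_info_before_validator_function(str, core_schema.str_schema())

class Model(BaseModel):
    name: CoercedStr

print(Model(name=12345))  # name='12345'
```

You would still need to annotate each field with `CoercedStr` instead of `str`, which is the per-field cost this thread is discussing.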
yeah, that should work for all of my cases... would also be nice to have this globally, per base model class, and per concrete model |
@vitalik I'm not sure when such a type map could be implemented. However, PRs are always welcome :) |
I can take a look. Could you give some idea of which module/class would be best to add this functionality to? |
I think, … |
with #6535 it would be possible to change annotation behaviour per model or even globally:

```python
BaseModel.model_config['replace_types'] = {str: LaxStr}
```
|
What is the way to do this in 2.3.0 with pattern? Thank you! |
There seems to be another instance of documentation describing the old behavior: https://docs.pydantic.dev/latest/usage/types/standard_types/ (https://github.com/pydantic/pydantic/blob/main/docs/usage/types/standard_types.md?plain=1#L14) |
Solution: Remove bit about coercion of numeric types to str since it is not the case for v2 (pydantic#6045) |
It would be great to have this as a configuration that pydantic loads from the environment/file. The reason is that patching BaseModel this way has the disadvantage that it must be done before importing any module that uses BaseModel, which adds to the mental load and is easy to mess up without noticing. Another possible solution is implementing a model validator in a base class that would replace BaseModel. This also has the disadvantage of having yet another thing to remember. |
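The base-class idea described above could be sketched like this — `LaxBase` is a hypothetical name, and this only coerces top-level values for fields annotated exactly as `str`:

```python
from pydantic import BaseModel, model_validator

class LaxBase(BaseModel):
    @model_validator(mode="before")
    @classmethod
    def _coerce_int_to_str(cls, data):
        # Stringify int input for fields annotated as plain `str`.
        if isinstance(data, dict):
            data = dict(data)  # don't mutate the caller's dict
            for name, info in cls.model_fields.items():
                if info.annotation is str and isinstance(data.get(name), int):
                    data[name] = str(data[name])
        return data

class Details(LaxBase):
    name: str
    description: str

print(Details(name=12345, description="Hello world"))  # name='12345' description='Hello world'
```

As the comment above notes, this still requires remembering to inherit from `LaxBase` everywhere, which is exactly the mental load being objected to.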
@David-OConnor It might cause things to break unexpectedly down the line if code using the model is not prepared to handle an int. For example, … |
I've re-opened this. On balance I think we should add a config flag to re-instate the V1 behavior, since grander changes to allow type replacement are going to take much longer. |
The case I'm referring to is parsing an int or string-of-int into int. So, downstream only sees and handles ints. |
Edit: Updating to 2.3.0 fixed this issue.

I'm unsure if this is the same issue, but I'll post the issue I'm having here also. The data should be …

```python
StrippedStr = Annotated[
    str,
    BeforeValidator(lambda v: v.strip() if isinstance(v, str) else v),
]

AlphaStr = Annotated[
    StrippedStr,
    Field(pattern=r"^[a-zA-Z]+$"),
]

class Data(BaseModel):
    a: AlphaStr  # works
    b: AlphaStr | None  # does not work (see error below)
```
|
@CallumAtCarter I can't reproduce; can you try installing …?

```python
from typing import Annotated

from pydantic import BaseModel, BeforeValidator, Field

StrippedStr = Annotated[
    str,
    BeforeValidator(lambda v: v.strip() if isinstance(v, str) else v),
]

AlphaStr = Annotated[
    StrippedStr,
    Field(pattern=r"^[a-zA-Z]+$"),
]

class Data(BaseModel):
    a: AlphaStr
    b: AlphaStr | None

Data(a="abc", b=None)
Data(a="abc", b=" def ")
Data(a="abc", b="!!!")
"""
b
  String should match pattern '^[a-zA-Z]+$' [type=string_pattern_mismatch, input_value='!!!', input_type=str]
    For further information visit https://errors.pydantic.dev/2.3/v/string_pattern_mismatch
"""
```

But yes, this is unrelated to this issue, so if that doesn't fix it for you please open a new discussion (and feel free to tag me). |
@adriangb I should have updated before posting 😔. Updating to 2.3.0 fixed it, thanks! |
@adriangb I really need this feature but I'm not familiar with Pydantic's release schedule. When can I expect this to land in a release? |
As soon as the release is out 🤷🏻♂️ Usually, this is not long after a fix is ready. At the moment, we are planning to make a release happen this week. There are a bunch of things coming in this release. Samuel outlined some priorities here: #7324. This doesn't mean that everything from that list will make its way into the next release, though. |
Hi team, … |
@durgeksh Hi! It seems like your comment is unrelated to the original issue. I'd recommend asking your question in a new discussion here https://github.com/pydantic/pydantic/discussions |
@lig Thank you. |
Continuation of #5993.

This gives a validation error: `s f d - Input should be a valid string`

This was working in pydantic v1, and was actually very handy when dealing with parsing data from a public API where users send messy data from JS client sides.

So it would be nice to have some feature to be able to turn on str parsing (globally?)
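For reference, the V2 behavior described above can be reproduced with a minimal model (the field name `s` is illustrative):

```python
from pydantic import BaseModel, ValidationError

class StrModel(BaseModel):
    s: str

try:
    StrModel(s=123)  # V1 coerced this to "123"; V2 rejects it
except ValidationError as e:
    print(e.errors()[0]["msg"])  # Input should be a valid string
```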
Affected Components
Update: #6045 (comment)