Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

strip_whitespace after coercion, but before constraint checks on custom data types #6531

Closed
4 of 13 tasks
prryplatypus opened this issue Jul 8, 2023 · 5 comments
Closed
4 of 13 tasks
Assignees
Labels
feature request unconfirmed Bug not yet confirmed as valid/applicable

Comments

@prryplatypus
Copy link

prryplatypus commented Jul 8, 2023

Initial Checks

  • I have searched Google & GitHub for similar requests and couldn't find anything
  • I have read and followed the docs and still think this feature is missing

Description

Note: See #6469 for the full discussion.

In Pydantic V1, it was possible to do something as simple as this, which resulted in the following operations happening in this order:

  1. Type checking/coercion
  2. Whitespace stripping
  3. Length constraint checks
from pydantic import StrictStr

class Username(StrictStr):
    min_length = 3
    max_length = 20
    strip_whitespace = True

However, in Pydantic V2 this case is particularly tricky because the min_length constraint gets applied after coercion but before the Python string is created, and hence before the AfterValidator. So there is really no way to insert custom logic between these two steps.

The closest one can get is by using something like this, but this is a lot of mental gymnastics for something that was trivial in V1.

from typing_extensions import Annotated
from pydantic import BaseModel, Field, BeforeValidator

Username = Annotated[str, BeforeValidator(lambda x: str.strip(str(x))), Field(min_length=3, max_length=20)]

@adriangb suggested the easiest thing to do would probably be to just add back str_strip_whitespace to Field or create some sort of ConStr() marker for annotated (Annotated[str, ConStr(strip_whitespace=True), Field(max_length=10, ...)] or something like that), and that is why I am creating this feature request.

Affected Components

Selected Assignee: @lig

@zakstucke
Copy link
Contributor

Not much simpler at the minute, but this as_bv() util allows you to use existing validatory types such as constr, rather than rolling your own unnecessarily, by wrapping the type in a BeforeValidator.

tp.Annotated[str, Field(min_length=3, max_length=10), as_bv(constr(strip_whitespace=True))]

I'm hoping the equivalent of as_bv() is done implicitly pydantic-side at some point as per #6506 (comment)

import typing as tp
from pydantic import BeforeValidator, TypeAdapter, constr, Field

def as_bv(t: type) -> BeforeValidator:
    """Converts a type into a BeforeValidator"""
    ta = TypeAdapter(t)
    return BeforeValidator(lambda v: ta.validate_python(v))

MyT = tp.Annotated[str, Field(min_length=3, max_length=10), as_bv(constr(strip_whitespace=True))]

res = TypeAdapter(MyT).validate_python("   123456789   ")

print(f"res: '{res}'")
res: '123456789'

@adriangb
Copy link
Member

adriangb commented Jul 9, 2023

This comment relates more to #6506 but I'm keeping it here since this is where I originally posted it.

There are fundamental problems with this approach that will not allow us to include it in Pydantic directly.

Practically speaking, we've worked very hard to make the entirety of validation happen within a single Rust call (with callbacks to Python when necessary, yes, but avoiding these as much as possible for performance reasons). Not only would inserting a call to a TypeAdapter be problematic performance-wise, but it's also going to wreak havoc with exceptions (the exception from that validation call will end up just being a repr(ValueError()) instead of being collected into the top level ValueError) and probably other things I'm not even thinking of that we won't figure out until it's too late. It's just fundamentally an approach that Pydantic is not designed for.

On a perhaps more philosophical note the metadata in Annotated is not intended to be types, and while there are no runtime errors by putting a type in there it does seem just wrong to do.

That is not to say you can't use this utility function if it works for you, I just don't think it's as simple, universal, and cheap of a win as it seems on the surface.

@lig lig closed this as not planned Won't fix, can't repro, duplicate, stale Jul 17, 2023
@prryplatypus
Copy link
Author

@lig why is this "not planned"? It's a step back from Pydantic V1.

@adriangb suggested the easiest thing to do would probably be to just add back str_strip_whitespace to Field or create some sort of ConStr() marker for annotated (Annotated[str, ConStr(strip_whitespace=True), Field(max_length=10, ...)] or something like that), and his last comment in this thread related more to the other issue than to this one, as he said.

@lig
Copy link
Contributor

lig commented Aug 7, 2023

@prryplatypus At the moment solving this is too complicated as it would require some serious work in pydantic-core and it looks tricky to implement without affecting overall Pydantic performance.

This is not a step back but a cost of significant performance improvement of Pydantic v2 comparing to Pydantic v1.

That said, PRs are always welcome. In this particular case it's important that the change wouldn't affect the performance.

@adriangb
Copy link
Member

adriangb commented Aug 7, 2023

#6951 maybe solves this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request unconfirmed Bug not yet confirmed as valid/applicable
Projects
None yet
Development

No branches or pull requests

4 participants