Getting more validation errors than I would expect. #8014

Closed
1 task done
attila3d opened this issue Nov 4, 2023 · 6 comments
Labels
bug V2 Bug related to Pydantic V2 pending Awaiting a response / confirmation


attila3d commented Nov 4, 2023

Initial Checks

  • I confirm that I'm using Pydantic V2

Description

Hi! I just started to use pydantic, and I really love it.

In the following test, I have purposely provided data such that the model validator validate_lock() inside the ParmData class raises a PydanticCustomError(). This error is indeed reported as expected.

However, my issue is that I am also getting various other errors that I would not expect, and so far I could not figure out why. For example:

p5.ParmValueData.float
Input should be a valid number [type=float_type, input_value={'value': [1, {'expressio...'locked': [True, False]}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.4/v/float_type

I simplified the example as much as I could, and here is a small explanation:

The data I am trying to validate is a dictionary (ParmDictData) of so-called parameters (ParmData). Each parameter can contain its value (as ParmValueData), as well as its locked state and whether it is hidden or not.

The data for a dictionary of parameters could look something like this:
{
'p1': { 'value': 5, 'hidden': True, 'locked': True},
'p2': { 'value': 'foo' },
}

However, when the parameter data ONLY contains the 'value' key, for ease of writing/editing the data, I would like to drop the 'value' key and simply write the value of the parameter in the dictionary next to its name:
{
'p1': { 'value': 5, 'hidden': True},
'p2': 'foo',
}

Hence, I thought the best way to validate ParmDictData would be to declare the root type as a union, like this: dict[str, ParmData | ParmValueData] = None

But it seems to me that as soon as validate_lock() raises an error for a ParmData instance, pydantic takes that data and tries to validate it as ParmValueData as well. And since ParmValueData is itself a union type (and all of its members fail), it raises an error for each union member.
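
The behaviour described above can be reproduced in a much smaller form. The following is a minimal sketch, not the original models: `Strict` and `Holder` are hypothetical names, and the validator always fails so that the first union member is rejected for a reason unrelated to the input's shape. Pydantic then also tries the remaining member and reports an error for each one.

```python
from typing import Union

from pydantic import BaseModel, ValidationError, model_validator


class Strict(BaseModel):
    x: int

    @model_validator(mode='after')
    def always_fail(self) -> 'Strict':
        # Stand-in for a custom check such as validate_lock().
        raise ValueError('custom check failed')


class Holder(BaseModel):
    # Plain (untagged) union: every member is tried until one succeeds.
    item: Union[Strict, int]


error_types = []
try:
    Holder(item={'x': 1})
except ValidationError as exc:
    # One entry per union member that rejected the input.
    error_types = [err['type'] for err in exc.errors()]

print(error_types)
```

The dict input matches the shape of `Strict`, but because the custom check fails, the `int` member is attempted too, which adds a second, less helpful error.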

I tried setting the union validation mode to 'left_to_right', but that did not help, and I am not sure why this is happening. If you comment out validate_lock(), the data validates just fine.

Please let me know if I am doing anything wrong, or there is a better way to do this.
Attila

Example Code

from __future__ import annotations

from typing import (cast, TYPE_CHECKING, Any, Callable, Dict, Iterable, List,
                    Optional, Sequence, Tuple, Union)

from pydantic_core import PydanticCustomError
from pydantic import (BaseModel, RootModel, ConfigDict, ValidationError, model_validator, model_serializer)

from enum import Enum

expressiondata = {'expression': '$F4'}
keyframelistdata = [
    {'frame': 24, 'value': 0.1, 'keyexpression': 'linear()'},
    {'frame': 72, 'value': 1, 'keyexpression': 'linear()'}]

# In the following dictionary, I purposefully removed the would be 3rd item for the 'locked' key in
# parameter 'p5'. This way validate_lock() model validator inside the ParmData class will raise an error.
parmdictdata = {'p1': 1,
                'p2': expressiondata,
                'p3': {'value': expressiondata},
                'p4': {'value': 5, 'hidden': True},
                'p5': {'value': [1, expressiondata, False], 'locked': [True, False]},
                'p6': [expressiondata, False, keyframelistdata],
                'p8': keyframelistdata}

class ExpressionLanguage(Enum):
    Hscript = 'hscript'
    Python = 'python'


class KeyframeExpression(Enum):
    Constant = 'constant()'
    Linear = 'linear()'


class BaseData(BaseModel):
    model_config = ConfigDict(extra='allow', use_enum_values=True, validate_assignment=True, from_attributes=True)

    @classmethod
    def fromData(cls, data):
        """
        Creates a new class object instance given an input dictionary data
        """
        try:
            return cls(**data)
        except ValidationError as e:
            print(e)
            return None

    def asData(self):
        """
        Returns the serialized data
        """
        try:
            return self.model_dump(exclude_unset=True)
        except ValidationError as e:
            print(e)
            return None


class RootData(RootModel):

    @classmethod
    def fromData(cls, data):
        """
        Creates a new class object instance given an input dictionary data
        """
        try:
            return cls(data)
        except ValidationError as e:
            print(e)
            return None

    def asData(self):
        """
        Returns the serialized data
        """
        try:
            return self.model_dump(exclude_unset=True)
        except ValidationError as e:
            print(e)
            return None


class ParmValueExpressionData(BaseData):
    expression: str
    language: Optional[ExpressionLanguage] = None


class ParmValueKeyframeData(BaseData):
    frame: float
    value: float
    keyexpression: KeyframeExpression


class ParmValueKeyframesData(RootData):
    root: List[ParmValueKeyframeData]


class ParmValueData(RootData):
    """
    Class to store the value of a parameter.
    The value could be the basic types such as: int, float, bool, str, as well as
    a parameter value could hold data for an expression or for a list of keyframes.

    However a parameter could have multiple components, each component could take
    the form of any of the types mentioned above. For example, a vector3 value
    could be [float, ParmValueExpression, float]
    """
    root: (int | float | bool | str | ParmValueExpressionData | ParmValueKeyframesData |
           list[int | float | bool | str | ParmValueExpressionData | ParmValueKeyframesData])

class ParmData(BaseData):
    """
    This class represents a parameter that includes the Parameter Value Data along with other properties
    related to a parameter.
    """
    value: Optional[ParmValueData] = None
    locked: Optional[bool | list[bool]] = None
    hidden: Optional[bool] = None

    @model_validator(mode='after')
    def validate_lock(self) -> 'ParmData':
        """
        Raise error when the item count under "locked" and "value" do not match.
        """
        if self.locked:

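            locks = self.locked if isinstance(self.locked, list) else [self.locked]
            # self.value is a ParmValueData wrapper, so look at its .root for the component list
            root = self.value.root if self.value is not None else None
            values = root if isinstance(root, list) else [root]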

            if len(values) != len(locks):
                raise PydanticCustomError('locked_match_error',
                                          "Locked key should have the same number of components as Value key",
                                          {'value_count': len(values), 'locked_count': len(locks)}, )
        return self


class ParmDictData(RootData):
    """
    This class represents a dictionary that is a mix of ParmData and ParmValueData.

    """
    root: dict[str, ParmData | ParmValueData] = None

data_obj = ParmDictData.fromData(parmdictdata)

Python, Pydantic & OS Version

pydantic version: 2.4.2
        pydantic-core version: 2.10.1
          pydantic-core build: profile=release pgo=true
                 install path: /home/ati/.local/lib/python3.10/site-packages/pydantic
               python version: 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
                     platform: Linux-6.2.0-36-generic-x86_64-with-glibc2.35
             related packages: typing_extensions-4.8.0

attila3d added the "bug V2" and "pending" labels Nov 4, 2023

@samuelcolvin
Member

I don't see a specific bug being reported here.

Please use discussions to ask questions. A shorter demonstration of your question will increase the chances of an answer.

If you really think there's a bug, please provide a clear description of the bug together with a MRE.

@attila3d
Author

attila3d commented Nov 4, 2023

Hm, I just started with pydantic, so I am not sure whether it is a bug or I am doing something wrong. It seems to me that when a union is given, pydantic is able to pick the right type, but when that type fails for another reason (for example, because my validator raised an error), pydantic goes on to validate the input against all the other types in the union, which then naturally fail as well.

@samuelcolvin
Member

samuelcolvin commented Nov 5, 2023

Ah, now I understand your question.

Yes it does, that's by design.

If you want to avoid that, you should use tagged unions - https://docs.pydantic.dev/latest/api/standard_library_types/#discriminated-unions-aka-tagged-unions
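
For readers unfamiliar with tagged unions, here is a minimal sketch in the style of the pet example from the pydantic docs. The model names and fields are illustrative assumptions, not part of this issue. The `discriminator` field tells pydantic which single member to validate against, so a failure inside that member does not trigger attempts on the others.

```python
from typing import Literal, Union

from pydantic import BaseModel, Field, ValidationError


class Cat(BaseModel):
    pet_type: Literal['cat']
    lives: int


class Dog(BaseModel):
    pet_type: Literal['dog']
    good_boy: bool


class Owner(BaseModel):
    # Only the member whose pet_type matches the input is validated.
    pet: Union[Cat, Dog] = Field(discriminator='pet_type')


errs = []
try:
    # 'lives' is invalid, but the tag selects Cat, so Dog is never tried.
    Owner(pet={'pet_type': 'cat', 'lives': 'many'})
except ValidationError as exc:
    errs = exc.errors()

print(len(errs), errs[0]['type'])
```

Only the error from `Cat` is reported here; no errors about `Dog` are added.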

@attila3d
Author

attila3d commented Nov 5, 2023

Thank you so much for looking into it. I greatly appreciate it.

I have looked at the animal examples under tagged unions before. However, it seems to me that tag selection is based on very specific values in the incoming data. So in the example, 'pet_type': 'cat' is in the data to be parsed.

In my example, however, the data I parse (parmdictdata in the code below) does not include a specific key that could be used as a discriminator.

So I am wondering whether there is any other workaround to avoid validating against all the other types in the union?

I have simplified my demonstration code, so maybe this helps better for testing it out.

Thanks so much for the help again!

from __future__ import annotations

from typing import (cast, TYPE_CHECKING, Any, Callable, Dict, Iterable, List,
                    Optional, Sequence, Tuple, Union)

from pydantic_core import PydanticCustomError
from pydantic import (BaseModel, RootModel, ValidationError, model_validator)

expressiondata = {'expression': '$F4', 'language': 'vex'}

parmdictdata = {
    'p1': 1,
    'p2': ['foo', expressiondata, 2.0],
    'p3': {'value': ['foo', expressiondata, 2.0], 'locked': [True, False, True]}
}


class ParmValueExpressionData(BaseModel):
    expression: str
    language: str


class ParmValueData(RootModel):
    root: (int | float | bool | str | ParmValueExpressionData |
           list[int | float | bool | str | ParmValueExpressionData])


class ParmData(BaseModel):
    value: Optional[ParmValueData] = None
    locked: Optional[bool | list[bool]] = None

    @model_validator(mode='after')
    def validate_locked(self) -> 'ParmData':
        raise PydanticCustomError('locked_match_error',
                                  "Locked key should have the same number of components as Value key",
                                  {'value_count': 1, 'locked_count': 1}, )


class ParmDictData(RootModel):
    root: dict[str, ParmData | ParmValueData]


try:
    parmdictmodel = ParmDictData(**parmdictdata)
    # print(parmdictmodel)
except ValidationError as e:
    print(e)

@samuelcolvin
Member

#7983 is what you need, will be released next week.
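
Assuming #7983 refers to callable discriminators (the `Discriminator` and `Tag` helpers available in pydantic 2.5 and later), the following is a rough sketch of how they might be applied to the models in this thread. The function `pick_parm_kind`, the tag names, and the rule "a dict with a 'value' or 'locked' key is a ParmData" are assumptions made for illustration. The models are also trimmed down, and the validator implements the count check from the first example rather than failing unconditionally.

```python
from typing import Annotated, Any, Optional, Union

from pydantic import (BaseModel, Discriminator, RootModel, Tag,
                      ValidationError, model_validator)
from pydantic_core import PydanticCustomError


class ParmValueData(RootModel):
    root: Union[int, float, bool, str, list[Union[int, float, bool, str]]]


class ParmData(BaseModel):
    value: Optional[ParmValueData] = None
    locked: Optional[Union[bool, list[bool]]] = None

    @model_validator(mode='after')
    def validate_locked(self) -> 'ParmData':
        if self.locked is None or self.value is None:
            return self
        root = self.value.root
        values = root if isinstance(root, list) else [root]
        locks = self.locked if isinstance(self.locked, list) else [self.locked]
        if len(values) != len(locks):
            raise PydanticCustomError(
                'locked_match_error',
                'Locked key should have the same number of components as Value key',
                {'value_count': len(values), 'locked_count': len(locks)})
        return self


def pick_parm_kind(v: Any) -> str:
    # Hypothetical rule: full parameter dicts carry 'value' or 'locked'.
    if isinstance(v, ParmData):
        return 'parm'
    if isinstance(v, dict) and ('value' in v or 'locked' in v):
        return 'parm'
    return 'value'


ParmEntry = Annotated[
    Union[Annotated[ParmData, Tag('parm')],
          Annotated[ParmValueData, Tag('value')]],
    Discriminator(pick_parm_kind),
]


class ParmDictData(RootModel):
    root: dict[str, ParmEntry]


errors = []
try:
    ParmDictData({'p1': 1,
                  'p3': {'value': [1, 2, 3], 'locked': [True, False]}})
except ValidationError as exc:
    errors = exc.errors()

print([err['type'] for err in errors])
```

With the discriminator in place, 'p3' is validated only as ParmData, so the custom error is the only one reported for it.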

@attila3d
Author

attila3d commented Nov 5, 2023

Thanks so much for that!
