Union field is broken: The order of Union arguments is indeterminate #247

Open
dairiki opened this issue Sep 16, 2023 · 0 comments
Labels
bug Something isn't working

Comments

dairiki (Collaborator) commented Sep 16, 2023

Semantically, the order of the arguments of a union type is insignificant. Union[int, str] and Union[str, int] refer to the same type. When we ask a typing.Union instance for its arguments, it is under no obligation to give them to us in the same order as was used to create the instance.

Our Union marshmallow field, however, does attach significance to the order of the arguments. When de/serializing a value, it tries each type in the union in turn, in order, using a field appropriate for that type. E.g. for a (dataclass) field with type Union[int, str], we first try to deserialize as an int, falling back to str only if the input cannot be parsed as an integer.
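To make the order dependence concrete, here is a minimal sketch of a "first match wins" loader. It is not the actual Union field implementation: `load_first_match` is a hypothetical helper, and calling the type itself (`int(value)`, `str(value)`) stands in for the marshmallow field that would be chosen for each type.

```python
from typing import Any, Tuple


def load_first_match(value: Any, candidates: Tuple[type, ...]) -> Any:
    """Try each candidate type in order and return the first successful conversion.

    Hypothetical stand-in for an order-sensitive union field.
    """
    for candidate in candidates:
        try:
            return candidate(value)
        except (TypeError, ValueError):
            # This candidate cannot handle the value; try the next one.
            continue
    raise ValueError(f"{value!r} matches none of {candidates!r}")


# Same input, same set of types, different order, different result.
print(repr(load_first_match("42", (int, str))))
print(repr(load_first_match("42", (str, int))))
```

With `(int, str)` the string is parsed as an integer; with `(str, int)` the `str` candidate accepts it first and the `int` candidate is never consulted.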

So the whole design of our system for handling union fields is flawed.

Most of the time, it happens to work. Python's typing does have a type cache. When a generic alias (e.g. a Union) is instantiated, a previously created instance may be returned, if one has already been created with the same arguments.

The type cache system (currently) does not ignore argument order, a fact which mostly saves us:

>>> from typing import *
>>> Union[int, str] is Union[int, str]
True
>>> Union[int, str] is Union[str, int]
False
>>> Union[int, str].__args__
(<class 'int'>, <class 'str'>)
>>> Union[str, int].__args__
(<class 'str'>, <class 'int'>)

Equality comparison between Unions, however, does ignore the argument order:

>>> Union[int, str] == Union[str, int]
True

When arguments to a union type are themselves unions, this can start to cause trouble.

>>> Union[Union[int, str]] is Union[Union[str, int]]
True
>>> Union[Union[int, str]]
typing.Union[int, str]
>>> Union[Union[str, int]]
typing.Union[int, str]

Because the argument lists to the outer calls to Union are equal, we get the same cached instance back. One of the two results will have the "wrong" argument order.
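The mechanism can be imitated without relying on typing's internal cache. The sketch below assumes only that two unions with the same members compare equal and hash equal regardless of order; `cached_member_order` is a hypothetical function, with `functools.lru_cache` standing in for the type cache.

```python
from functools import lru_cache
from typing import Union, get_args


@lru_cache(maxsize=None)
def cached_member_order(tp):
    """Hypothetical stand-in for a cached generic-alias constructor.

    The cache key is the argument itself, so two arguments that compare
    equal (and hash equal) share a single cache entry.
    """
    return get_args(tp)


first = cached_member_order(Union[str, int])
# Union[int, str] == Union[str, int], so this call is a cache hit and
# returns the tuple that was computed for the first call.
second = cached_member_order(Union[int, str])

print(first)
print(second)
print(first is second)
```

The second call never looks at its own argument's order; it simply gets back whatever was stored for the first, equal, key.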


Here's a simple script that exercises the problem.

from typing import Optional, Union
from marshmallow_dataclass import dataclass

# Comment out the next line and the assert will pass, otherwise it will fail
Optional[Union[str, int]]

@dataclass
class Test:
    x: Optional[Union[int, str]]

assert Test.Schema().load({"x": "42"}) == Test(x=42)

The failure of the test in question (this time) was caused by the addition of an unrelated function annotation in pytest-mypy-plugins 1.11.0.
(The annotation involves Optional[Union[str, int]]. When our test constructs an Optional[Union[int, str]] we get the cached first instance that has the arguments in the other order.)
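A typing-only check along the following lines can show whether a given interpreter hands back the cached order. It is a sketch, not part of marshmallow_dataclass: the names `earlier`, `later`, `requested` and `observed` are illustrative, and which branch runs depends on the Python version's caching behavior.

```python
from typing import Optional, Union, get_args

# Stand-in for the unrelated annotation that some other package evaluates first.
earlier = Optional[Union[str, int]]
# What the dataclass field asks for afterwards.
later = Optional[Union[int, str]]

requested = (int, str, type(None))
observed = get_args(later)

if observed != requested:
    print("argument order came from a previously cached instance:", observed)
else:
    print("argument order preserved:", observed)
```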


Related

This issue surfaced and was discussed in PR #246.

dairiki added the bug label Sep 16, 2023