New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Recursive model with discriminated union has exponential time and space complexity for validation #8709
Comments
Thanks for reporting this. @davidhewitt, could you please take a look at this performance issue when you have a chance? Thank you! |
It looks like there's a potential schema-building bug here. The core schema looks like this:
In particular the interesting bit to me is that we have 3 definitions instead of 4; the Hence the exponential blowup. cc @adriangb as the schema building expert ;) |
The schema simplifies to: {
"type": "definitions",
"schema": {
"type": "tagged-union",
"choices": {
"nested": {
"type": "definition-ref",
"schema_ref": "__main__.NestedState:4587657680"
},
"loop": {
"type": "definition-ref",
"schema_ref": "__main__.LoopState:4587670848"
},
"leaf": {
"type": "definition-ref",
"schema_ref": "__main__.LeafState:5124597648"
}
},
"discriminator": "state_type"
},
"definitions": [
{
"type": "model",
"cls": "__main__.LeafState",
"schema": {
"type": "model-fields",
"fields": {
"state_type": {
"type": "model-field",
"schema": {
"type": "literal",
"expected": [
"leaf"
]
}
}
},
"model_name": "LeafState"
},
"ref": "__main__.LeafState:5124597648"
},
{
"type": "model",
"cls": "__main__.LoopState",
"schema": {
"type": "model-fields",
"fields": {
"state_type": {
"type": "model-field",
"schema": {
"type": "literal",
"expected": [
"loop"
]
}
},
"substate": {
"type": "model-field",
"schema": {
"type": "default",
"schema": {
"type": "union",
"choices": [
{
"type": "definition-ref",
"schema_ref": "__main__.NestedState:4587657680"
},
{
"type": "definition-ref",
"schema_ref": "__main__.LoopState:4587670848"
},
{
"type": "definition-ref",
"schema_ref": "__main__.LeafState:5124597648"
}
]
}
}
}
}
},
"ref": "__main__.LoopState:4587670848"
},
{
"type": "model",
"cls": "__main__.NestedState",
"schema": {
"type": "model-fields",
"fields": {
"state_type": {
"type": "model-field",
"schema": {
"type": "literal",
"expected": [
"nested"
]
}
},
"substate": {
"type": "model-field",
"schema": {
"type": "default",
"schema": {
"type": "union",
"choices": [
{
"type": "definition-ref",
"schema_ref": "__main__.NestedState:4587657680"
},
{
"type": "definition-ref",
"schema_ref": "__main__.LoopState:4587670848"
},
{
"type": "definition-ref",
"schema_ref": "__main__.LeafState:5124597648"
}
]
}
}
}
}
},
"ref": "__main__.NestedState:4587657680"
}
]
} The issue is that the tagged union is not carried along so it's loss after the first union and thus performance degrades. @sydney-runkle is that enough to go off of? I'm guessing we're not doing the |
Afraid I'm reproducing this on main 😢 |
@sydney-runkle unfortunately still reproducible with 2.6.1, as David already noticed. |
@davidhewitt, ah rats, ok 🐀. |
+1 this seems pretty critical 😬, is there any workaround? |
This is a regression because I don't see this issue with
Unfortunately I can't downgrade because my dependencies have updated to >2.5.0 |
Thanks for the +1. I've added to the list of issues we'll attempt to fix for 2.6.2. Hopefully we can get that release out next week. I'll keep you updated as we progress. I don't currently know of a workaround for this performance issue. |
Here's a somewhat clumsy workaround: from __future__ import annotations
from typing import Any, Literal, Annotated
from pydantic import AfterValidator, Field, TypeAdapter, BaseModel
import time
def validate_any_state(value: Any):
return _ADAPTER.validate_python(value)
AnyState = Annotated[Any, AfterValidator(validate_any_state)]
class NestedState(BaseModel):
state_type: Literal["nested"]
substate: AnyState
# If this type is left out, the model behaves normally again
class LoopState(BaseModel):
state_type: Literal["loop"]
substate: AnyState
class LeafState(BaseModel):
state_type: Literal["leaf"]
_ADAPTER = TypeAdapter(Annotated[NestedState | LoopState | LeafState, Field(..., discriminator="state_type")])
def build_nested_state(n):
if n <= 0:
return {"state_type": "leaf"}
else:
return {"state_type": "loop", "substate": {"state_type": "nested", "substate": build_nested_state(n-1)}}
for i in range(13):
start = time.time()
_ADAPTER.validate_python(build_nested_state(i))
print(f"Time for {i}: {time.time() - start}") It works by cutting the recursive model apart a bit crudely with the Disclaimer: I haven't checked if doing this creates other unexpected effects. YMMV. |
I git bisected
|
Thanks for checking that 👍 Given that those commits changed the way a non-tagged union is evaluated, that's not a huge surprise. I suspect that the truth is the schema building bug illustrated above existed before those commits, but the newer smart union implementation (which tries to be more accurate in which type gets produced) exacerbated the existing bug. |
Ah I see. I don't think we necessarily need to include this fix in 2.6.2 then, though the sooner we get a fix the better. I'll add this to the 2.7 milestone. |
If anyone else needs a workaround until this fixed I'm currently using a def get_state_type(v: Any) -> Hashable:
if isinstance(v, dict):
return v.get("state_type")
return getattr(v, "state_type", None)
AnyState = Annotated[
Annotated[NestedState, Tag("nested")]
| Annotated[LoopState, Tag("loop")]
| Annotated[LeafState, Tag("leaf")],
Discriminator(get_state_type),
] |
@sydney-runkle by the way, as I've mentioned previously, this issue can lead to a denial of service attack (and I did find apps which are vulnerable to this issue). Should we publish a security advisory somewhere? |
Facing this issue as well at both the 01 and open-interpreter projects on my M2 Macbook Pro. Pydantic version 2.6.2, pydantic-core version 2.16.3. Things were working fine for me yesterday, but since today morning, my FastAPI server does not start due to this Pydantic schema issue. This is also coming up in my OpenAI dependency, for which I've opened an issue here. I did a memray memory profile to investigate further. Here is the report from open-interpreter, and here is the report from the 01 project. In both, the inflated memory seems to be due to this issue, since both FastAPI and OpenAI use Pydantic. Could you give a timeline for this issue to be fixed? If this is not expected soon, could you give a patch I can apply in the meantime? |
Hi all, Given the significance of this bug, we're working on a fix now, and will plan to do a |
Hey all 😄, After some extended debugging, we found a fix, and will release |
Initial Checks
Description
When validating the recursive model given in the example, Pydantic has exponential time and space complexity, meaning that validation time and memory usage may grow indefinitely and rapidly depending on model depth, even when using a discriminator field.
Given the presence of a discriminator field, this should not be the case.
A model like this can potentially be abused in a denial-of-service attack, if the data passed to the validator can be controlled by the user.
Example Code
Python, Pydantic & OS Version
The text was updated successfully, but these errors were encountered: