Very slow evaluation of recursive types #7149

mciucu · 2023-08-16T18:05:19Z

Initial Checks

I confirm that I'm using Pydantic V2

Description

I've tracked it down to the inner function count_refs in _core_utils.py, that seems to be doing at least quadratic work than what would be required. With ~10 classes that could be recursively included inside one another in a tree I'm getting ~1 minute execution on calling model_rebuild() for all.
The change that I think would be safe and efficient would be change lines 464-467 with:

        if ref_counts[ref] >= 2:
            # If this model is involved in a recursion this should be detected
            # on its second encounter, otherwise we can stop recursing.
            if current_recursion_ref_count[ref] != 0:
                involved_in_recursion[ref] = True
            return s

This should be safe, given that if a model X can contain itself the second time when entering the count_refs function should be directly or indirectly from an inclusion where model X is already in the call stack.
Call time for model_rebuild() is <1s now.
I can make a pull request if that's ok.

Example Code

from __future__ import annotations

import time
from typing import Union, Optional, Literal

from pydantic import BaseModel


class BaseC(BaseModel):
    subexpr: CAny


class C0(BaseC):
    type: Literal["C0"]


class C1(BaseC):
    type: Literal["C1"]


class C2(BaseC):
    type: Literal["C2"]


class C3(BaseC):
    type: Literal["C3"]


class C4(BaseC):
    type: Literal["C4"]


class C5(BaseC):
    type: Literal["C5"]


class C6(BaseC):
    type: Literal["C6"]


class C7(BaseC):
    type: Literal["C7"]
    r: Optional[CExpr]


class C8(BaseC):
    type: Literal["C8"]
    r: Optional[CExpr]


class C9(BaseC):
    type: Literal["C9"]
    r: Optional[CExpr]


class C10(BaseC):
    type: Literal["C10"]
    r: Optional[CExpr]


class CExpr(BaseModel):
    type: Literal["or", "and"]
    subexpr: list[CAny]


CAny = Union[C0, C1, C2, C3, C4, C5, C6, C7, C8, C9, C10]

start_time = time.time()
C0.model_rebuild()
print("Duration:", time.time() - start_time)

Python, Pydantic & OS Version

pydantic version: 2.1.1
        pydantic-core version: 2.4.0
          pydantic-core build: profile=release pgo=true mimalloc=true
                 install path: /blink/api/.venv/lib/python3.11/site-packages/pydantic
               python version: 3.11.4 (main, Jun  9 2023, 07:59:55) [GCC 12.3.0]
                     platform: Linux-6.2.0-27-generic-x86_64-with-glibc2.37
     optional deps. installed: ['typing-extensions']

Selected Assignee: @davidhewitt

The text was updated successfully, but these errors were encountered:

dmontagu · 2023-08-16T22:59:56Z

Great work investigating the issue!

I can confirm that with the proposed change all tests pass, but @adriangb this is your code and I figure you might be quicker to understand the implications?

@mciucu I think it's a good idea to open a PR for this either way, please feel free, very much appreciated.

adriangb · 2023-08-17T05:46:37Z

This generally looks good. I added some pretty in depth tests precisely with the idea that I'd TDD that code at the time and someone would come along and improve it later :). Let's take a careful look at the tests and make sure they cover all of the cases, in particular around those LOC, and add more if needed. If that works I'm very positive for this change.

mciucu · 2023-08-17T08:47:04Z

I've created a pull request with the change.

I think maybe a test based on the structure I had in that example code would help catch any performance regressions in the future, since that snipped takes about 5 minutes to run for me locally (so it seemed like it was O(N!)). I can add here anything if it would be helpful, was just not sure how benchmark tests are implemented in pydantic.

michaelgmiller1 · 2023-08-18T03:57:56Z

This is preventing us from migrating to v2, as our models take more than 10 minutes (!!!!) to rebuild. Would be great to identify a fix, and cut a release so we can bump versions and enjoy the performance of Rust! :)

adriangb · 2023-08-18T12:36:57Z

The pull request was merged, this will come out in the next release.

mciucu added bug V2 Bug related to Pydantic V2 unconfirmed Bug not yet confirmed as valid/applicable labels Aug 16, 2023

pydantic-hooky bot assigned davidhewitt Aug 16, 2023

dmontagu removed the unconfirmed Bug not yet confirmed as valid/applicable label Aug 16, 2023

adriangb closed this as completed Aug 18, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Very slow evaluation of recursive types #7149

Very slow evaluation of recursive types #7149

mciucu commented Aug 16, 2023 •

edited

dmontagu commented Aug 16, 2023 •

edited

adriangb commented Aug 17, 2023

mciucu commented Aug 17, 2023

michaelgmiller1 commented Aug 18, 2023

adriangb commented Aug 18, 2023

Very slow evaluation of recursive types #7149

Very slow evaluation of recursive types #7149

Comments

mciucu commented Aug 16, 2023 • edited

Initial Checks

Description

Example Code

Python, Pydantic & OS Version

dmontagu commented Aug 16, 2023 • edited

adriangb commented Aug 17, 2023

mciucu commented Aug 17, 2023

michaelgmiller1 commented Aug 18, 2023

adriangb commented Aug 18, 2023

mciucu commented Aug 16, 2023 •

edited

dmontagu commented Aug 16, 2023 •

edited