experiment with reducing import time #7423

Closed
wants to merge 1 commit into from


samuelcolvin
Member

@samuelcolvin samuelcolvin commented Sep 13, 2023

ref #7409

Experiment with reducing import time.

Removing dataclasses is pretty hard since:

  1. it's used by annotated-types; I think persuading @adriangb to remove dataclasses from there and use a proxy/hack like we do here is pretty unlikely
  2. it's used in dataclass transform, like @dataclass_transform(field_specifiers=(dataclasses.field, Field)); while this is only required by type checkers, I think it would be quite hard to remove it and keep the dataclass_transform behaviour the same, though maybe it's possible
  3. we have a lot of dataclasses within pydantic; I've got rid of or deferred most of them here, but the remaining ones are hard

From my experience here, the low-hanging fruit for improving import time is:

  • defer everything related to deprecated, using a mixture of local imports and pydantic.__init__.__getattr__
  • defer importing pydantic.types, and add all its exports to pydantic.__init__.__getattr__
  • defer import of pydantic_core.core_schema
  • hard to tell, but perhaps the deferred_dataclasses provides some improvement even though it doesn't yet avoid dataclasses being imported
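
For readers unfamiliar with the module-level `__getattr__` approach mentioned in the list above, here is a minimal sketch of how deferred exports can work. It is not the code from this PR: the `fakepkg` module, the `_LAZY_EXPORTS` table and the stdlib targets are all hypothetical stand-ins chosen so the example runs as a single file.

```python
import importlib
import sys
import types

# Hypothetical export table: public name -> module that really defines it.
# Stdlib modules stand in for heavy submodules such as pydantic.types.
_LAZY_EXPORTS = {
    "Decimal": "decimal",
    "Fraction": "fractions",
}


def _module_getattr(name: str):
    """PEP 562 hook: called only when normal attribute lookup on the module fails."""
    try:
        target = _LAZY_EXPORTS[name]
    except KeyError:
        raise AttributeError(f"module 'fakepkg' has no attribute {name!r}") from None
    value = getattr(importlib.import_module(target), name)
    # Cache the result on the module so the hook is not hit again for this name.
    setattr(fakepkg, name, value)
    return value


# Build a throwaway module so the sketch is self-contained; in a real package,
# `__getattr__` would simply be a function defined in `__init__.py`.
fakepkg = types.ModuleType("fakepkg")
fakepkg.__getattr__ = _module_getattr
sys.modules["fakepkg"] = fakepkg

# The name is resolved on first access, not when `fakepkg` itself is created.
from fakepkg import Fraction

print(Fraction(1, 3))
```

The trade-off is that names resolved this way are invisible to static tooling unless they are also imported under `typing.TYPE_CHECKING` or listed in a stub.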

@samuelcolvin samuelcolvin changed the title experiment with reducing import time, experiment with reducing import timex Sep 13, 2023
@samuelcolvin samuelcolvin changed the title experiment with reducing import timex experiment with reducing import time Sep 13, 2023
@samuelcolvin samuelcolvin mentioned this pull request Sep 13, 2023
13 tasks
@samuelcolvin
Member Author

Worth noting that the big win from not importing dataclasses is not importing inspect, but to get that win you need to avoid importing typing_extensions too. At that point, I'm not really sure it's worth it.
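
One way to check a claim like this is to look at which modules a given import pulls in as a side effect. The helper below is a rough sketch, not something from the PR; `modules_loaded_by` is a hypothetical name, and the result depends on the Python version and on what the interpreter already loads at startup.

```python
import subprocess
import sys


def modules_loaded_by(statement: str) -> set[str]:
    """Run `statement` in a fresh interpreter and return the modules it newly loaded."""
    probe = "\n".join(
        [
            "import sys",
            "before = set(sys.modules)",
            statement,
            "print(' '.join(sorted(set(sys.modules) - before)))",
        ]
    )
    result = subprocess.run(
        [sys.executable, "-c", probe], capture_output=True, text=True, check=True
    )
    return set(result.stdout.split())


if __name__ == "__main__":
    # Compare what each import drags in; no particular result is assumed here.
    for stmt in ("import dataclasses", "import typing"):
        loaded = modules_loaded_by(stmt)
        print(f"{stmt}: {len(loaded)} new modules, inspect loaded: {'inspect' in loaded}")
```

Running the same helper with `"import typing_extensions"` (if installed) would show whether that import also loads `inspect` in a given environment.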

@ofek
Contributor

ofek commented Sep 13, 2023

FYI the current reduction is huge on Windows compared to before for me:

Before:

❯ python -m timeit -n 1 -r 1 "import pydantic"
1 loop, best of 1: 95.7 msec per loop

After:

❯ python -m timeit -n 1 -r 1 "import pydantic"
1 loop, best of 1: 65.9 msec per loop
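
A single `-n 1 -r 1` run can be noisy, so it may be worth repeating the measurement in several fresh interpreters and taking the median. The following is a sketch under that assumption; `cold_import_ms` is a hypothetical helper, not part of pydantic or of this PR.

```python
import statistics
import subprocess
import sys


def cold_import_ms(module: str, runs: int = 5) -> float:
    """Median time, in milliseconds, to import `module` in a fresh interpreter."""
    probe = (
        "import time\n"
        "start = time.perf_counter()\n"
        f"import {module}\n"
        "print((time.perf_counter() - start) * 1000)\n"
    )
    samples = []
    for _ in range(runs):
        # A new process per sample, so nothing is cached in sys.modules.
        out = subprocess.run(
            [sys.executable, "-c", probe], capture_output=True, text=True, check=True
        )
        samples.append(float(out.stdout))
    return statistics.median(samples)


if __name__ == "__main__":
    # Replace "json" with "pydantic" in an environment where it is installed.
    print(f"json: {cold_import_ms('json'):.1f} ms")
```

For a per-module breakdown rather than a single number, `python -X importtime -c "import pydantic"` prints cumulative and self import times to stderr.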

@ofek
Contributor

ofek commented Sep 13, 2023

I don't know how much it would help here, but in case you're unaware, since Python 3.7 there has been a way to dynamically set module attributes: https://peps.python.org/pep-0562/

For example, in Hatch I have core functionality that lets people migrate their build configuration and project metadata automatically and I use that to patch setuptools: https://github.com/pypa/hatch/blob/hatch-v1.7.0/src/hatch/cli/new/migrate.py#L317-L330

@ofek
Contributor

ofek commented Sep 20, 2023

Anything I can do to help or test?

@ofek
Contributor

ofek commented Sep 24, 2023

Actually, I found out that there is a well defined way to do lazy imports based on the PEP I mentioned that has already become a standard within the scientific Python community:

In fact, I'm going to begin introducing that at work soon, based on successful preliminary tests I did today.

@samuelcolvin
Member Author

I think we should do the first two things on my list; anything else is going to take longer. Probably easiest via a new PR.

Of course anyone else is welcome to have a dig and see if they can improve import time.

@samuelcolvin
Member Author

I've made some progress on this in #7590, this should be released in v2.4 today.

If that goes well and we don't receive any issues about problems with module __getattr__ on v2.4, I would propose we put everything in pydantic/__init__.py in __getattr__ at runtime.

@samuelcolvin
Member Author

BTW, @adriangb has done lots of work for v2.4 to improve the time taken to build pydantic models, see #7565 #7536 #7535 #7529 #7528 #7527 #7523.

While this won't affect import time, it'll make the time taken to import modules which include pydantic models much smaller.

@ofek
Contributor

ofek commented Oct 18, 2023

Is there anything I can do to help or test?

@samuelcolvin
Member Author

@ofek see #7947 which should help a lot.

Closing this as it's not going to happen this way.

@samuelcolvin samuelcolvin deleted the import-time-experiment branch November 14, 2023 11:34