Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enum validator improvements #9045

Merged
merged 3 commits into from Mar 19, 2024
Merged

Enum validator improvements #9045

merged 3 commits into from Mar 19, 2024

Conversation

samuelcolvin
Copy link
Member

@samuelcolvin samuelcolvin commented Mar 19, 2024

Change Summary

While looking through how we do enum validation in preparation for moving some of it to Rust, I saw a few things that could be improved. Hopefully this provides a "cleaner" schema, and should improve performance a bit.

Performance gain is small, but not insignificant:

before

In [1]: import enum
   ...: from pydantic import TypeAdapter
   ...: 
   ...: class Foo(enum.IntEnum):
   ...:     x = 1
   ...:     y = 2
   ...: 
   ...: 
   ...: ta = TypeAdapter(Foo)
   ...: 
   ...: %timeit ta.validate_python(2)
   ...: %timeit ta.validate_json('2')
   ...: 
550 ns ± 2.37 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
560 ns ± 2.29 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

After

In [1]: import enum
   ...: from pydantic import TypeAdapter
   ...: 
   ...: class Foo(enum.IntEnum):
   ...:     x = 1
   ...:     y = 2
   ...: 
   ...: 
   ...: ta = TypeAdapter(Foo)
   ...: 
   ...: %timeit ta.validate_python(2)
   ...: %timeit ta.validate_json('2')
   ...: 
508 ns ± 0.84 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
517 ns ± 1.95 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

Checklist

  • The pull request title is a good summary of the changes - it will be used in the changelog
  • Unit tests for the changes exist
  • Tests pass on CI
  • Documentation reflects the changes where applicable
  • My PR is ready to review, please add a comment including the phrase "please review" to assign reviewers

Copy link

cloudflare-pages bot commented Mar 19, 2024

Deploying pydantic-docs with  Cloudflare Pages  Cloudflare Pages

Latest commit: 8bd46c3
Status: ✅  Deploy successful!
Preview URL: https://11ebbd3f.pydantic-docs2.pages.dev
Branch Preview URL: https://enum-validator-improvements.pydantic-docs2.pages.dev

View logs

Copy link

codspeed-hq bot commented Mar 19, 2024

CodSpeed Performance Report

Merging #9045 will not alter performance

Comparing enum-validator-improvements (8bd46c3) with main (09f4456)

Summary

✅ 10 untouched benchmarks

@@ -90,47 +91,32 @@ def to_enum(input_value: Any, /) -> Enum:
try:
return enum_type(input_value)
except ValueError:
# The type: ignore on the next line is to ignore the requirement of LiteralString
raise PydanticCustomError('enum', f'Input should be {expected}', {'expected': expected}) # type: ignore
raise PydanticCustomError('enum', 'Input should be {expected}', {'expected': expected})
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

output is the same, but it's more correct to let pydantic-core take care of replacing the variables in the message template.

python_schema=strict_python_schema,
)
js_updates['type'] = 'integer'
lax_schema = core_schema.no_info_after_validator_function(to_enum, core_schema.int_schema())
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've replaced lots of use of core_schema.chain_schema with core_schema.no_info_after_validator_function which should be more efficient, and is more concise.

[enum_schema, core_schema.no_info_plain_validator_function(lambda x: x.value)]
)
if config.get('use_enum_values', False):
enum_schema = core_schema.no_info_after_validator_function(attrgetter('value'), enum_schema)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

attrgetter is nearly twice as fast as a lambda:

In [1]: import enum

In [2]: class Foo(enum.Enum):
   ...:     x = 'a'
   ...:     y = 'b'
   ...: 

In [3]: with_lambda = lambda thing: thing.x

In [4]: %timeit with_lambda(Foo)
39.1 ns ± 0.161 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [5]: from operator import attrgetter

In [6]: using_attrgetter = attrgetter('x')

In [7]: %timeit using_attrgetter(Foo)
24.5 ns ± 0.473 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be using attrgetter more broadly?

@samuelcolvin samuelcolvin added the relnotes-performance Used for performance improvements. label Mar 19, 2024
Copy link
Member

@sydney-runkle sydney-runkle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! This is much more simple than the existing behavior!

[enum_schema, core_schema.no_info_plain_validator_function(lambda x: x.value)]
)
if config.get('use_enum_values', False):
enum_schema = core_schema.no_info_after_validator_function(attrgetter('value'), enum_schema)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be using attrgetter more broadly?

@sydney-runkle sydney-runkle merged commit b0dfc54 into main Mar 19, 2024
54 of 55 checks passed
@sydney-runkle sydney-runkle deleted the enum-validator-improvements branch March 19, 2024 14:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
relnotes-performance Used for performance improvements.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants