fix sequence like validator with strict True #8977

andresliszt · 2024-03-08T21:28:26Z

Change Summary

This PR aims to solve #8930. Until now, when strict=True is passed in the config dict, only the type of the elements of sequence fields was enforced. With this change, both the type of the sequence and the type of its items are validated strictly. pydantic-core already performs this validation correctly; the problem was that the strict setting was not passed through the schema sent to pydantic-core for sequence-like types. However, the model_validate_json method, from what I understand, needs to be fixed in pydantic-core. An example of the latter is shown below, and there is also a test marked with xfail in this PR.

    import json
    from typing import List

    from pydantic import BaseModel, ConfigDict

    class LaxModel(BaseModel):
        x: List[str]
        model_config = ConfigDict(strict=False)

    >>> LaxModel.model_validate_json(json.dumps({'x': ('a', 'b', 'c')}), strict=True)
    # No error.

The example raises no error, although strict=True in model_validate_json should override strict=False from the config dict. (To be honest, I base this statement on the numerous examples in the tests, where the runtime argument is seen to override the config.)
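A side note on the example, offered as a stdlib-only check rather than a statement about what pydantic should do: json.dumps serializes a tuple as a JSON array, so the string handed to model_validate_json no longer records that the original value was a tuple.

```python
import json

# A tuple and a list produce identical JSON text: JSON only has arrays.
payload_from_tuple = json.dumps({'x': ('a', 'b', 'c')})
payload_from_list = json.dumps({'x': ['a', 'b', 'c']})

print(payload_from_tuple)  # {"x": ["a", "b", "c"]}
print(payload_from_tuple == payload_from_list)  # True

# Parsing the text back always yields a list, never a tuple.
round_tripped = json.loads(payload_from_tuple)
print(type(round_tripped['x']).__name__)  # list
```

Any JSON-mode validation of this payload therefore only ever sees an array for x, which may matter when reading the xfail test.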

Related issue number

#8930

Checklist

The pull request title is a good summary of the changes - it will be used in the changelog
Unit tests for the changes exist
Tests pass on CI
Documentation reflects the changes where applicable
My PR is ready to review, please add a comment including the phrase "please review" to assign reviewers

Selected Reviewer: @dmontagu

codspeed-hq · 2024-03-08T21:31:25Z

CodSpeed Performance Report

Merging #8977 will degrade performances by 19.35%

Comparing andresliszt:fix/strict-iterable-like (cd2d802) with main (e58134b)

Summary

⚡ 5 improvements
❌ 3 (👁 3) regressions
✅ 2 untouched benchmarks

Benchmarks breakdown

	Benchmark	`main`	`andresliszt:fix/strict-iterable-like`	Change
⚡	`test_fastapi_startup_perf`	3.6 s	1.7 s	×2.2
⚡	`test_fastapi_startup_perf`	1,272 ms	193.4 ms	×6.6
⚡	`test_north_star_json_dumps`	179.6 ms	123.1 ms	+45.9%
👁	`test_north_star_json_loads`	98.3 ms	116.9 ms	-15.9%
⚡	`test_north_star_validate_json`	296.6 ms	258.1 ms	+14.91%
⚡	`test_north_star_validate_json_strict`	293.1 ms	257.4 ms	+13.88%
👁	`test_north_star_validate_python`	169.2 ms	204.8 ms	-17.36%
👁	`test_north_star_validate_python_strict`	109.3 ms	135.6 ms	-19.35%

andresliszt · 2024-03-08T21:43:47Z

please review

sydney-runkle · 2024-03-17T16:31:55Z

@andresliszt,

Ugh, looks like this is a reflection of the issues we currently have with sequence type validation / schema building. It's odd that we don't even use the _set_schema() logic for this (something I need to fix when I refactor the sequence validators).

I'd prefer something like this, as a patch fix for now:

diff --git a/pydantic/_internal/_std_types_schema.py b/pydantic/_internal/_std_types_schema.py
index 5c61d8f0..77d5fc37 100644
--- a/pydantic/_internal/_std_types_schema.py
+++ b/pydantic/_internal/_std_types_schema.py
@@ -406,9 +406,21 @@ def sequence_like_prepare_pydantic_annotations(
     item_source_type = args[0]
 
     metadata, remaining_annotations = _known_annotated_metadata.collect_known_metadata(annotations)
-    _known_annotated_metadata.check_metadata(metadata, _known_annotated_metadata.SEQUENCE_CONSTRAINTS, source_type)
+    _known_annotated_metadata.check_metadata(
+        metadata, [*_known_annotated_metadata.SEQUENCE_CONSTRAINTS, *_known_annotated_metadata.STRICT], source_type
+    )
 
-    return (source_type, [SequenceValidator(mapped_origin, item_source_type, **metadata), *remaining_annotations])
+    strict = metadata.pop('strict', _config.get('strict', False))
+
+    return (
+        source_type,
+        [
+            SequenceValidator(
+                mapped_origin=mapped_origin, item_source_type=item_source_type, strict=strict, **metadata
+            ),
+            *remaining_annotations,
+        ],
+    )

Even if we resolve this with this patch (or something similar), I'd still like to keep an eye on this issue as an example of what's wrong with our sequence schema building! Thanks for your help!

Also, re your xfail test, I think it is desired to have model config take precedence over runtime serialization flags, so I think the behavior that you're seeing is expected.

sydney-runkle · 2024-03-17T16:52:29Z

Ah actually, here's a simpler fix:

diff --git a/pydantic/_internal/_std_types_schema.py b/pydantic/_internal/_std_types_schema.py
index 5c61d8f0..90ed7b6d 100644
--- a/pydantic/_internal/_std_types_schema.py
+++ b/pydantic/_internal/_std_types_schema.py
@@ -288,7 +288,7 @@ class SequenceValidator:
     item_source_type: type[Any]
     min_length: int | None = None
     max_length: int | None = None
-    strict: bool = False
+    strict: bool | None = None
 
     def serialize_sequence_via_list(
         self, v: Any, handler: core_schema.SerializerFunctionWrapHandler, info: core_schema.SerializationInfo

sydney-runkle · 2024-03-17T16:54:09Z

Another issue here that I stumbled upon during my testing is that you can't apply strict directly to a Set[int] field (which you should be able to), but that can be addressed separately.

sydney-runkle · 2024-03-21T14:39:55Z

@andresliszt,

Any chance you'll be able to update this in the next few days? We'll be doing a new release soon, and I want to make sure we can get your fix in :).

andresliszt · 2024-03-21T16:14:57Z

@sydney-runkle hello! Sure. Are you requesting the change strict = metadata.pop('strict', _config.get('strict', False)) to avoid passing strict twice (when it is present in both metadata and config)? I ask because my change is
SequenceValidator(mapped_origin, item_source_type, **metadata, strict=_config.get('strict', False)), so if metadata includes a strict key, strict is passed twice.
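The duplicate-keyword concern can be reproduced with plain Python. The sketch below uses a hypothetical stand-in dataclass (FakeSequenceValidator, build_duplicating, and build_with_pop are illustration names, not pydantic code) and assumes metadata and config are ordinary dicts.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class FakeSequenceValidator:
    """Hypothetical stand-in for illustration; not pydantic's SequenceValidator."""

    mapped_origin: type
    item_source_type: type
    min_length: Optional[int] = None
    max_length: Optional[int] = None
    strict: Optional[bool] = None


def build_duplicating(metadata: Dict[str, Any], config: Dict[str, Any]) -> FakeSequenceValidator:
    # strict is passed explicitly next to **metadata, so a 'strict' key
    # inside metadata makes Python raise TypeError at call time.
    return FakeSequenceValidator(list, str, **metadata, strict=config.get('strict', False))


def build_with_pop(metadata: Dict[str, Any], config: Dict[str, Any]) -> FakeSequenceValidator:
    metadata = dict(metadata)  # copy so the caller's dict is left untouched
    # Field-level metadata wins; the config value is only the fallback.
    strict = metadata.pop('strict', config.get('strict', False))
    return FakeSequenceValidator(
        mapped_origin=list, item_source_type=str, strict=strict, **metadata
    )


metadata = {'strict': True, 'min_length': 1}
config = {'strict': False}

try:
    build_duplicating(metadata, config)
except TypeError as exc:
    print('duplicate keyword:', exc)

validator = build_with_pop(metadata, config)
print(validator.strict, validator.min_length)  # True 1
```

Popping the key before unpacking removes the collision and also makes the precedence explicit: a per-field strict setting overrides the config value.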

sydney-runkle · 2024-03-21T16:16:55Z

@andresliszt,

I think you should be able to just implement this one-line diff to fix the issue (but definitely keep your tests)!

diff --git a/pydantic/_internal/_std_types_schema.py b/pydantic/_internal/_std_types_schema.py
index 5c61d8f0..90ed7b6d 100644
--- a/pydantic/_internal/_std_types_schema.py
+++ b/pydantic/_internal/_std_types_schema.py
@@ -288,7 +288,7 @@ class SequenceValidator:
     item_source_type: type[Any]
     min_length: int | None = None
     max_length: int | None = None
-    strict: bool = False
+    strict: bool | None = None
 
     def serialize_sequence_via_list(
         self, v: Any, handler: core_schema.SerializerFunctionWrapHandler, info: core_schema.SerializationInfo

andresliszt · 2024-03-21T16:47:42Z

Nice, I will review it once I finish working! I'm super curious: why does that line do the magic?

sydney-runkle · 2024-03-21T16:53:22Z

@andresliszt,

The issue here is that if you have a model with strict=True and don't specify strict for some given sequence, it was defaulting to False. We don't want to default for specific types like this, because that messes up the behavior we should get by respecting the model's strict specification.

Thus, defaulting to None here fixes that issue, and follows the pattern we see with other settings for the sequence validator, like min_length and max_length :).
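The reasoning above can be modeled with a small resolution rule. This is a minimal sketch under the assumption that an unset (None) field-level value defers to the model-level setting; effective_strict is a hypothetical helper, not a pydantic or pydantic-core function.

```python
from typing import Optional


def effective_strict(field_strict: Optional[bool], config_strict: Optional[bool]) -> bool:
    """Hypothetical rule: an explicit field setting wins, None defers to the model config."""
    if field_strict is not None:
        return field_strict
    return bool(config_strict)


# Old default (strict: bool = False): the field always looks explicitly lax,
# so a model configured with strict=True is silently ignored.
print(effective_strict(False, True))  # False

# New default (strict: bool | None = None): the field has no opinion,
# so the model-level setting is respected.
print(effective_strict(None, True))  # True

# An explicit per-field value still overrides the model config.
print(effective_strict(False, False), effective_strict(True, False))  # False True
```

The same "None means not specified" convention is what min_length and max_length already use on the validator.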

andresliszt · 2024-03-21T21:10:18Z

@sydney-runkle I pushed the one-line change and it's failing; maybe I missed something? I still don't understand how SequenceValidator gets the True value if it's never passed. While debugging my original fix, I noticed that strict=True was never passed to this validator: it was never included in metadata, and the current code was not getting it from config.

sydney-runkle · 2024-03-21T22:53:28Z

@andresliszt,

Hmm, that's odd. I thought I had it working with just the one-liner. I'll pull down the code tomorrow morning and take a closer look. Thanks for your work on this!!

andresliszt · 2024-03-21T23:32:12Z

It is failing on a test I added, so your fix is probably OK! I will review it tomorrow as well.

andresliszt · 2024-03-22T03:21:01Z

It's failing on an existing test.

pydantic/tests/test_types.py

Line 4748 in e58134b

def test_deque_generic_success_strict(cls, value: Any, result):

I think it is because deque and Counter are being validated using list_schema here

pydantic/pydantic/_internal/_std_types_schema.py

Line 329 in e58134b

constrained_schema = core_schema.list_schema(items_schema, **metadata)

So with the fix, passing a deque to the test model fails because a list is expected in strict mode. I think deque and Counter currently cannot be validated strictly. Now I remember that in my original commit I added an override, metadata['strict'] = False, in that part of the code to avoid forcing strict mode for those types. I don't know whether that is the expected behavior; if it fixes the issue, I can add that line again!
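A toy model of this failure, using only the standard library: validate_list and validate_deque are hypothetical helpers, not pydantic or pydantic-core code, and the outer deque type check in strict mode is an assumption made for illustration.

```python
from collections import deque
from typing import Any, List


def validate_list(value: Any, strict: bool) -> List[Any]:
    """Toy list check: strict accepts only list instances, lax accepts any iterable."""
    if isinstance(value, list):
        return value
    if strict:
        raise TypeError('Input should be a valid list')
    return list(value)


def validate_deque(value: Any, strict: bool) -> deque:
    """Toy deque validation that reuses the list check for the items."""
    # Assumption for this sketch: strict mode checks the outer type here.
    if strict and not isinstance(value, deque):
        raise TypeError('Input should be a valid deque')
    # The inner list step is forced to lax, mirroring metadata['strict'] = False,
    # because a deque can never satisfy a strict list check.
    return deque(validate_list(value, strict=False))


# Routing a deque through a strict list check fails, as in the failing test.
try:
    validate_list(deque([1, 2, 3]), strict=True)
except TypeError as exc:
    print('strict list check:', exc)

# With the inner override, a strict deque field still accepts a deque.
print(validate_deque(deque([1, 2, 3]), strict=True))  # deque([1, 2, 3])
```

The point of the override is only that strictness propagated from the model must not be applied to an intermediate list step that exists purely as an implementation detail.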

sydney-runkle · 2024-03-25T22:48:45Z

@andresliszt,

Currently debugging. I can't reproduce the test failure locally, for some reason...

sydney-runkle · 2024-03-25T22:50:21Z

tests/test_main.py

+@pytest.mark.xfail(
+    reason='strict=True in model_validate_json does not overwrite strict=False given in ConfigDict'
+    'See issue: https://github.com/pydantic/pydantic/issues/8930'
+)
+def test_model_validate_list_strict() -> None:
+    # FIXME: This change must be implemented in pydantic-core. The argument strict=True
+    # in model_validate_json method is not overwriting the one set with ConfigDict(strict=False)
+    # for sequence like types. See: https://github.com/pydantic/pydantic/issues/8930
+
+    class LaxModel(BaseModel):
+        x: List[str]
+        model_config = ConfigDict(strict=False)
+
+    assert LaxModel.model_validate_json(json.dumps({'x': ('a', 'b', 'c')}), strict=None) == LaxModel(x=('a', 'b', 'c'))
+    assert LaxModel.model_validate_json(json.dumps({'x': ('a', 'b', 'c')}), strict=False) == LaxModel(x=('a', 'b', 'c'))
+    with pytest.raises(ValidationError) as exc_info:
+        LaxModel.model_validate_json(json.dumps({'x': ('a', 'b', 'c')}), strict=True)
+    assert exc_info.value.errors(include_url=False) == [
+        {'type': 'list_type', 'loc': ('x',), 'msg': 'Input should be a valid list', 'input': ('a', 'b', 'c')}
+    ]
+
+


@davidhewitt, interesting rust bug here!

Going to chat with @davidhewitt about this PR tomorrow and will get back to you :).

AHHHHH I can reproduce now. Dumb error on my end :(

I think this is actually an issue with the deque schema generation, not with the solution we've implemented :).

sydney-runkle · 2024-03-26T01:33:37Z

You're right, we have to add that strict override back for the list, but I don't think we need the other change to the initialization of SequenceValidator. Thanks again for your great work on this!

sydney-runkle

Looking good now, great work!

sydney-runkle · 2024-03-26T01:50:04Z

The acknowledged benchmarks don't truly represent a regression (this is just because the PR was opened before we moved the benchmarks to 3.12).

pydantic-hooky bot added the ready for review label Mar 8, 2024

pydantic-hooky bot assigned dmontagu Mar 8, 2024

sydney-runkle added the relnotes-fix Used for bugfixes. label Mar 12, 2024

andresliszt added 7 commits March 21, 2024 17:14

fix sequence like validator with strict True

7d661db

update comments

8a85bc0

codespell

19c4b98

fix tests

05a5681

update test

f8f9d81

update comments

145720e

update SequenceValidator strict default

b9dfaf2

andresliszt force-pushed the fix/strict-iterable-like branch from ad435e1 to b9dfaf2 Compare March 21, 2024 20:14

andresliszt added 2 commits March 21, 2024 17:16

fmt

aee557d

fmt

fb8ed5d

sydney-runkle reviewed Mar 25, 2024

Update _std_types_schema.py

cd2d802

sydney-runkle approved these changes Mar 26, 2024

sydney-runkle enabled auto-merge (squash) March 26, 2024 01:50

sydney-runkle merged commit af3d335 into pydantic:main Mar 26, 2024
52 checks passed
