Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Order of keys differs in schema output #7580

Open
1 task done
vigneshmanick opened this issue Sep 23, 2023 · 20 comments · Fixed by #7817
Open
1 task done

Order of keys differs in schema output #7580

vigneshmanick opened this issue Sep 23, 2023 · 20 comments · Fixed by #7817
Assignees
Labels
bug V2 Bug related to Pydantic V2
Milestone

Comments

@vigneshmanick
Copy link

Initial Checks

  • I confirm that I'm using Pydantic V2

Description

When a model is specified as a Field and the values are instantiated , the keys are output in sorted order. I would expect that the order of the keys be maintained.

See sample code below. The output of the class Stepping when executed standalone is

print(Stepping(type="const", step=0.15, multi_step=True))
# type='const' step=0.15 multi_step=True

But when the class Stepping is used as a field in another Model and the Field is instantiated then the output of the schema has the values sorted. 'default': {'multi_step': True, 'step': 0.15, 'type': 'const'}}},

print(Movement.model_json_schema())

{'$defs': {'Stepping': {'properties': {'type': {'const': 'const',
     'title': 'Type'},
    'step': {'title': 'Step', 'type': 'number'},
    'multi_step': {'default': False,
     'title': 'Multi Step',
     'type': 'boolean'}},
   'required': ['type', 'step'],
   'title': 'Stepping',
   'type': 'object'}},
 'properties': {'type': {'const': 'movement',
   'default': 'movement',
   'title': 'Type'},
  'step_settings': {'allOf': [{'$ref': '#/$defs/Stepping'}],
   'default': {'multi_step': True, 'step': 0.15, 'type': 'const'}}},
 'title': 'Movement',
 'type': 'object'}

Is there a setting that would allow the retaining of the field order in the output? Or am i doing something wrong?

Thanks in advance.

Example Code

from pydantic import BaseModel, Field
from typing import Literal

class Stepping(BaseModel):
    type: Literal["const"] = Field(
        ...,
    )
    step: float = Field(
        ...,
    )
    multi_step: bool = Field(
        default=False,
    )

print(Stepping(type="const", step=0.15, multi_step=True))

class Movement(BaseModel):
    type: Literal["movement"] = Field(
        default="movement"
    )
    step_settings: Stepping = Field(
        default=Stepping(type="const", step=0.15, multi_step=True)
    )

print(Movement.model_json_schema())

Python, Pydantic & OS Version

pydantic version: 2.3.0
pydantic-core version: 2.6.3
pydantic-core build: profile=release pgo=false
install path: /scratch/myprograms/miniconda3/envs/pydantic-v2/lib/python3.11/site-packages/pydantic
python version: 3.11.5 | packaged by conda-forge | (main, Aug 27 2023, 03:34:09) [GCC 12.3.0]
                     platform: Linux-3.10.0-1160.45.1.el7.x86_64-x86_64-with-glibc2.17
optional deps. installed: ['typing-extensions']
@vigneshmanick vigneshmanick added bug V2 Bug related to Pydantic V2 unconfirmed Bug not yet confirmed as valid/applicable labels Sep 23, 2023
@sydney-runkle sydney-runkle self-assigned this Sep 28, 2023
@vigneshmanick
Copy link
Author

vigneshmanick commented Sep 28, 2023

update , i checked the code and the source seems to be

return _sort_json_schema(json_schema)

it would be nice to have a flag that also allows non sorted schema output.

def generate(self, schema: CoreSchema, mode: JsonSchemaMode = 'validation') -> JsonSchemaValue:

Some reasons:

  • We also generate documentation from the schema and it is better to have the documentation output mirror the input
  • Order of fields in output is same as input
  • In pydantic v1 the order was not sorted

Update:

6043 is where the change to sorting was added. Please consider adding a flag so that the output can be unsorted too.

@sydney-runkle sydney-runkle added question feature request and removed unconfirmed Bug not yet confirmed as valid/applicable bug V2 Bug related to Pydantic V2 labels Sep 29, 2023
@sydney-runkle
Copy link
Member

Hi @vigneshmanick,

Thanks for looking into this! Seems like a reasonable feature request to me. Also related is #7424, enabling sorting for model_dump and friends.

Are you interested in opening a PR to help with this feature request? 😄

@vigneshmanick
Copy link
Author

@sydney-runkle sure, let me give it a try.

@adriangb
Copy link
Member

adriangb commented Oct 9, 2023

Maybe I’m misunderstanding. The goal is to make json schemas determined to stick in the face of re-factoring. We should be sorting anything that does not have a natural order and keep the natural order on the things that do. Model fields have a natural order so they should never be sorted, if they are that’s a bug.

@sydney-runkle
Copy link
Member

@adriangb are you suggesting that the alphabetical sorting of the fields z and a in this test when the sorting logic is implemented is a bug, given that z is defined first in the model definition?

@adriangb
Copy link
Member

adriangb commented Oct 9, 2023

Well that's a new test being added right? The goal of sorting is that schema generation is deterministic, e.g. when you migrate from v2 to v3 your tests don't break because the JSON schema output is different. Or if we refactor pydantic/json_schema.py. Without sorting we might accidentally change the order from e.g. {"type": "string", "enum": ["one", "two", "three"]} to {"enum": ["one", "two", "three"], "type": "string"} or {"type": "string", "enum": ["three", "two", "one"]}. Some of this is up for debate (e.g. should we sort enum values?) but I think the model fields have a natural order (the order they are defined in code) that we try to preserve elsewhere and should try to preserve in the json shcema.

@vigneshmanick
Copy link
Author

yes, the order of model fields is sorted in the output which is the main issue i have with the schema output.

e.g from the issue, 'default': {'multi_step': True, 'step': 0.15, 'type': 'const'}}}, the fields are sorted.

@adriangb
Copy link
Member

adriangb commented Oct 10, 2023

@sydney-runkle seems like a bug around

if parent_key != 'properties':
then. Can you investigate and make a PR to fix? We may need to handle default as a special case...

@sydney-runkle
Copy link
Member

@adriangb @vigneshmanick,

Yep, I'll look into this bug and open a fix. @vigneshmanick, if we don't sort the model fields in the schema, does that eliminate the need for your sorting flag?

@sydney-runkle sydney-runkle added bug V2 Bug related to Pydantic V2 and removed question labels Oct 10, 2023
@vigneshmanick
Copy link
Author

Yes that works for me.

@LostInDarkMath
Copy link

In my case, #7817 does not fix my issue since I'm using json_schema_extra. Would it be possible to re-open #7767 which would fix my issue?

@adriangb
Copy link
Member

I think we should just exclude json_schema_extra as well @sydney-runkle

@LostInDarkMath
Copy link

LostInDarkMath commented Nov 21, 2023

That would be great! :)

/CC @Galaxy102

@sydney-runkle
Copy link
Member

Sure thing. Can take a look at this after the Thanksgiving holiday 👍.

@LostInDarkMath
Copy link

Can you please re-open this to reduce the risk of forgetting this? ;)

@sydney-runkle sydney-runkle reopened this Nov 27, 2023
@sydney-runkle
Copy link
Member

@LostInDarkMath,

Could you provide an example of the schema that you're getting vs the schema that you're expecting when you're using json_schema_extra? Thanks!

@LostInDarkMath
Copy link

LostInDarkMath commented Nov 28, 2023

Here is a small example to reproduce the bug:

from collections import OrderedDict
from typing import Callable

from pydantic import BaseModel, Field, computed_field

MAP = OrderedDict({'2': 'hi', '1': 'hello', '3': 'world'})  # note the ordering
# does not matter if I use a OrderedDict or a normal dict, since dicts are ordered too


class FieldMetaData(BaseModel):
    values: Callable[[], dict[str, str]] | None = Field(default=None, exclude=True)

    @computed_field
    def my_values(self) -> dict[str, str]:
        return self.values() if self.values else None


class Bar(BaseModel):
    v: str = Field(default='42', json_schema_extra={'extra': FieldMetaData(values=lambda: MAP)})


if __name__ == '__main__':
    bar = Bar()
    json_schema = bar.model_json_schema()

    actual = json_schema['properties']['v']['extra']['my_values']
    assert actual == MAP
    assert actual.items() == MAP.items()

    for k, v in actual.items():
        print(k, v)

    assert str(actual.items()) == str(MAP.items())  # FAILS because order is wrong

@sydney-runkle
Copy link
Member

So in some sense this is a different issue, which is sourced from the following call:

json_schema.update(to_jsonable_python(json_schema_extra))

Which appears to sort the data, which isn't what we want.

@sydney-runkle sydney-runkle added this to the v2.7.0 milestone Jan 26, 2024
@mwildehahn
Copy link

I confirmed sorting happens here:

return _sort_json_schema(json_schema)
to_jsonable_python doesn't sort the json_schema_extra.

_sort_json_schema currently avoids sorting some keys based on looking at the parent key, but that doesn't mean anything for schema extras. we could pass extras to that function and then inspect it or it seems easiest jsut to give people an option to opt out of sorting.

@sydney-runkle
Copy link
Member

@mwildehahn,

Nice find - any interest in contributing a fix?

@sydney-runkle sydney-runkle modified the milestones: v2.7.0, v2.8.0 Apr 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug V2 Bug related to Pydantic V2
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants