Python: constrain pydantic to <2.4 #8705

Closed
syun64 wants to merge 1 commit

syun64 (Contributor) commented Oct 3, 2023

Similar to #8647, I think we need to constrain pydantic to <2.4 in the project's overall dependencies.

Given that pydantic has not yanked these faulty versions, we might need to exclude them from the PyIceberg dependencies in order to prevent the dependency resolver from installing pydantic 2.4.0 or 2.4.1.

When pyiceberg 0.5.0 is installed with pip and no version constraints, importing load_catalog fails with the following error:

```
KeyError                                  Traceback (most recent call last)
/tmp/ipykernel_252/2476882581.py in <cell line: 1>()
----> 1 from pyiceberg.catalog import load_catalog
      2 
      3 catalog = load_catalog("lacus")

/lib/python3.10/site-packages/pyiceberg/catalog/__init__.py in <module>
     40 from pyiceberg.partitioning import UNPARTITIONED_PARTITION_SPEC, PartitionSpec
     41 from pyiceberg.schema import Schema
---> 42 from pyiceberg.serializers import ToOutputFile
     43 from pyiceberg.table import (
     44     CommitTableRequest,

/lib/python3.10/site-packages/pyiceberg/serializers.py in <module>
     23 
     24 from pyiceberg.io import InputFile, InputStream, OutputFile
---> 25 from pyiceberg.table.metadata import TableMetadata, TableMetadataUtil
     26 
     27 GZIP = "gzip"

/lib/python3.10/site-packages/pyiceberg/table/__init__.py in <module>
     69     visit,
     70 )
---> 71 from pyiceberg.table.metadata import INITIAL_SEQUENCE_NUMBER, TableMetadata
     72 from pyiceberg.table.snapshots import Snapshot, SnapshotLogEntry
     73 from pyiceberg.table.sorting import SortOrder

/lib/python3.10/site-packages/pyiceberg/table/metadata.py in <module>
    344 
    345 
--> 346 class TableMetadataV2(TableMetadataCommonFields, IcebergBaseModel):
    347     """Represents version 2 of the Table Metadata.
    348 

/lib/python3.10/site-packages/pydantic/_internal/_model_construction.py in __new__(mcs, cls_name, bases, namespace, __pydantic_generic_metadata__, __pydantic_reset_parent_namespace__, **kwargs)
    182             types_namespace = get_cls_types_namespace(cls, parent_namespace)
    183             set_model_fields(cls, bases, config_wrapper, types_namespace)
--> 184             complete_model_class(
    185                 cls,
    186                 cls_name,

/lib/python3.10/site-packages/pydantic/_internal/_model_construction.py in complete_model_class(cls, cls_name, config_wrapper, raise_errors, types_namespace)
    493         return False
    494 
--> 495     schema = apply_discriminators(simplify_schema_references(schema))
    496 
    497     # debug(schema)
    
/lib/python3.10/site-packages/pydantic/_internal/_core_utils.py in simplify_schema_references(schema)
    517         return s
    518 
--> 519     schema = walk_core_schema(schema, count_refs)
    520 
    521     assert all(c == 0 for c in state['current_recursion_ref_count'].values()), 'this is a bug! please report it'

/lib/python3.10/site-packages/pydantic/_internal/_core_utils.py in walk_core_schema(schema, f)
    437         core_schema.CoreSchema: A processed CoreSchema.
    438     """
--> 439     return f(schema, _dispatch)
    440 
    441 

/lib/python3.10/site-packages/pydantic/_internal/_core_utils.py in count_refs(s, recurse)
    513 
    514         state['current_recursion_ref_count'][ref] += 1
--> 515         recurse(state['definitions'][ref], count_refs)
    516         state['current_recursion_ref_count'][ref] -= 1
    517         return s

/lib/python3.10/site-packages/pydantic/_internal/_core_utils.py in walk(self, schema, f)
    213 
    214     def walk(self, schema: core_schema.CoreSchema, f: Walk) -> core_schema.CoreSchema:
--> 215         return f(schema, self._walk)
    216 
    217     def _walk(self, schema: core_schema.CoreSchema, f: Walk) -> core_schema.CoreSchema:

/lib/python3.10/site-packages/pydantic/_internal/_core_utils.py in count_refs(s, recurse)
    501 
    502         if s['type'] != 'definition-ref':
--> 503             return recurse(s, count_refs)
    504         ref = s['schema_ref']
    505         state['ref_counts'][ref] += 1
...
...
...
/lib/python3.10/site-packages/pydantic/_internal/_core_utils.py in walk(self, schema, f)
    213 
    214     def walk(self, schema: core_schema.CoreSchema, f: Walk) -> core_schema.CoreSchema:
--> 215         return f(schema, self._walk)
    216 
    217     def _walk(self, schema: core_schema.CoreSchema, f: Walk) -> core_schema.CoreSchema:

/lib/python3.10/site-packages/pydantic/_internal/_core_utils.py in count_refs(s, recurse)
    513 
    514         state['current_recursion_ref_count'][ref] += 1
--> 515         recurse(state['definitions'][ref], count_refs)
    516         state['current_recursion_ref_count'][ref] -= 1
    517         return s

KeyError: 'pyiceberg.types.NestedField:43624544'
```
syun64 changed the title from "constraint pydantic to <2.4" to "constrain pydantic to <2.4" on Oct 3, 2023
syun64 changed the title from "constrain pydantic to <2.4" to "Python: constrain pydantic to <2.4" on Oct 3, 2023
github-actions bot added the python label on Oct 3, 2023

Fokko (Contributor) commented Oct 3, 2023

Thanks @syun64 for opening this. We also noticed this and reported it to Pydantic. It has been fixed in 2.4.2: pydantic/pydantic#7646 (comment). We could add exclusions for 2.4.0 and 2.4.1. WDYT?
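
For anyone hitting the traceback above, a quick way to check whether the local environment has one of the two releases named in this thread might look like the sketch below. It is only an illustration: the helper names `is_affected` and `installed_pydantic_version` are made up for this example, and the set of affected versions is taken from the comments here (2.4.0 and 2.4.1), not from an official advisory.

```python
from __future__ import annotations

from importlib import metadata

# Releases named in this thread as broken; 2.4.2 is reported to contain the fix.
AFFECTED_PYDANTIC_VERSIONS = frozenset({"2.4.0", "2.4.1"})


def is_affected(version: str) -> bool:
    """Return True if `version` is one of the releases reported as broken."""
    return version.strip() in AFFECTED_PYDANTIC_VERSIONS


def installed_pydantic_version() -> str | None:
    """Return the installed pydantic version, or None if pydantic is absent."""
    try:
        return metadata.version("pydantic")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_pydantic_version()
    if found is None:
        print("pydantic is not installed")
    elif is_affected(found):
        print(f"pydantic {found} is affected; upgrade to 2.4.2 or pin below 2.4")
    else:
        print(f"pydantic {found} is not one of the affected releases")
```

Only the standard library is used, so the check itself does not import pydantic or pyiceberg and cannot trigger the failing import.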

@@ -52,7 +52,7 @@ requests = ">=2.20.0,<3.0.0"
 click = ">=7.1.1,<9.0.0"
 rich = ">=10.11.0,<14.0.0"
 strictyaml = ">=1.7.0,<2.0.0" # CVE-2020-14343 was fixed in 5.4.
-pydantic = ">=2.0,<3.0"
+pydantic = ">=2.0,<2.4" # 2.4 release breaks model construction
I think you can exclude certain versions like this:

Suggested change
-pydantic = ">=2.0,<2.4" # 2.4 release breaks model construction
+pydantic = ">=2.0,<3.0,!=2.4.0,!=2.4.1" # 2.4.0, 2.4.1 has a critical bug
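
To make the effect of the suggested specifier concrete, here is a minimal, stdlib-only sketch that evaluates `>=2.0,<3.0,!=2.4.0,!=2.4.1` against a few candidate versions. It is not how pip or Poetry resolve dependencies (they implement the full PEP 440 rules); the helpers `parse` and `allowed` are hypothetical and handle only plain dotted numeric versions with the `>=`, `<`, and `!=` operators.

```python
from __future__ import annotations


def parse(version: str) -> tuple[int, ...]:
    """Turn '2.4.1' into (2, 4, 1), zero-padded to at least three components."""
    parts = [int(p) for p in version.strip().split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def allowed(version: str, spec: str = ">=2.0,<3.0,!=2.4.0,!=2.4.1") -> bool:
    """Return True if `version` satisfies every comma-separated clause in `spec`."""
    v = parse(version)
    for clause in spec.split(","):
        clause = clause.strip()
        if clause.startswith(">="):
            ok = v >= parse(clause[2:])
        elif clause.startswith("!="):
            ok = v != parse(clause[2:])
        elif clause.startswith("<"):
            ok = v < parse(clause[1:])
        else:
            raise ValueError(f"unsupported clause: {clause}")
        if not ok:
            return False
    return True


candidates = ["2.3.0", "2.4.0", "2.4.1", "2.4.2", "3.0.0"]
accepted = [c for c in candidates if allowed(c)]
print(accepted)  # ['2.3.0', '2.4.2']
```

Compared with the `<2.4` cap, the exclusion form keeps 2.4.2 and later 2.x releases installable while still skipping the two broken ones.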

Fokko (Contributor) commented Oct 3, 2023

@syun64 We just migrated the repository to: https://github.com/apache/iceberg-python

syun64 closed this on Oct 3, 2023

syun64 (Contributor, Author) commented Oct 3, 2023

> Thanks @syun64 for opening this. We also noticed this and reported it to Pydantic. It has been fixed in 2.4.2: pydantic/pydantic#7646 (comment). We could add exclusions for 2.4.0 and 2.4.1. WDYT?

Sure! Just closed this PR and opened another one in the new repository. Thanks @Fokko

syun64 (Contributor, Author) commented Oct 3, 2023

Moved to: apache/iceberg-python#38

syun64 deleted the patch-1 branch on October 3, 2023 19:40