
Unable to cloudpickle Pydantic model classes #6763

Closed · 1 task done
shrekris-anyscale opened this issue Jul 20, 2023 · 31 comments · Fixed by #7876
Labels: bug · V2 (Bug related to Pydantic V2) · help wanted (Pull Request welcome) · unconfirmed (Bug not yet confirmed as valid/applicable)

Comments

@shrekris-anyscale commented Jul 20, 2023

Initial Checks

  • I confirm that I'm using Pydantic V2

Description

cloudpickle cannot serialize Pydantic model classes. It fails with TypeError: cannot pickle 'pydantic_core._pydantic_core.SchemaSerializer' object.

Example Code

# bug.py

"""Cloudpickling Pydantic models raises an exception."""

from pydantic import BaseModel
import ray.cloudpickle as cloudpickle

class SimpleModel(BaseModel):
    val: int

cloudpickle.dumps(SimpleModel)

"""
Output:

% python bug.py

Traceback (most recent call last):
  File "/Users/shrekris/Desktop/scratch/dump4.py", line 9, in <module>
    cloudpickle.dumps(SimpleModel)
  File "/Users/shrekris/Desktop/ray/python/ray/cloudpickle/cloudpickle_fast.py", line 88, in dumps
    cp.dump(obj)
  File "/Users/shrekris/Desktop/ray/python/ray/cloudpickle/cloudpickle_fast.py", line 733, in dump
    return Pickler.dump(self, obj)
TypeError: cannot pickle 'pydantic_core._pydantic_core.SchemaSerializer' object
"""

Python, Pydantic & OS Version

/Users/shrekris/miniforge3/envs/pydantic-fix/lib/python3.9/site-packages/pydantic/_migration.py:275: UserWarning: `pydantic.utils:version_info` has been moved to `pydantic.version:version_info`.
  warnings.warn(f'`{import_path}` has been moved to `{new_location}`.')
             pydantic version: 2.0.3
        pydantic-core version: 2.3.0 release build profile
                 install path: /Users/shrekris/miniforge3/envs/pydantic-fix/lib/python3.9/site-packages/pydantic
               python version: 3.9.16 | packaged by conda-forge | (main, Feb  1 2023, 21:38:11)  [Clang 14.0.6 ]
                     platform: macOS-11.4-arm64-arm-64bit
     optional deps. installed: ['typing-extensions']

Selected Assignee: @lig

shrekris-anyscale added the bug, V2 (Bug related to Pydantic V2), and unconfirmed (Bug not yet confirmed as valid/applicable) labels on Jul 20, 2023
@shrekris-anyscale (Author)

The root cause seems to be that the SchemaSerializer cannot be serialized. I tried overriding the CloudPickler's reducer_override method with a workaround that deletes the SchemaSerializer upon serialization and recreates it upon deserialization, which seems to work. However, I cannot use this workaround in my project since it requires modifying cloudpickle's reducer_override method, which is used across all objects that need to be serialized. Ideally, this workaround could be pushed into the Pydantic classes themselves, so cloudpickling them just works.
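
For context, a minimal sketch of what that kind of reducer_override workaround can look like (an illustration only: _rebuild_placeholder is a hypothetical helper, and the real workaround must recreate the serializer from the model's core schema after loading):

import io

import cloudpickle
from pydantic_core import SchemaSerializer


def _rebuild_placeholder():
    # Hypothetical: stands in for logic that rebuilds the serializer
    # from the model's core schema on the deserializing side.
    return None


class PatchedPickler(cloudpickle.CloudPickler):
    def reducer_override(self, obj):
        if isinstance(obj, SchemaSerializer):
            # Drop the unpicklable Rust object; recreate it after loading.
            return _rebuild_placeholder, ()
        return super().reducer_override(obj)


buf = io.BytesIO()
PatchedPickler(buf).dump(object())  # used like a regular Pickler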

@shrekris-anyscale (Author)

My code example uses a version of cloudpickle that's forked in the Ray repo. I reproduced this issue using the latest cloudpickle directly:

# bug.py

"""Cloudpickling Pydantic models raises an exception."""

from pydantic import BaseModel
import cloudpickle

class SimpleModel(BaseModel):
    val: int

cloudpickle.dumps(SimpleModel)

"""
Output:

% python bug.py

Traceback (most recent call last):
  File "/Users/shrekris/Desktop/scratch/dump4.py", line 18, in <module>
    cloudpickle.dumps(SimpleModel)
  File "/Users/shrekris/miniforge3/envs/pydantic-fix/lib/python3.9/site-packages/cloudpickle/cloudpickle_fast.py", line 73, in dumps
    cp.dump(obj)
  File "/Users/shrekris/miniforge3/envs/pydantic-fix/lib/python3.9/site-packages/cloudpickle/cloudpickle_fast.py", line 632, in dump
    return Pickler.dump(self, obj)
TypeError: cannot pickle 'pydantic_core._pydantic_core.SchemaSerializer' object
"""

@lig (Contributor) commented Jul 21, 2023

@shrekris-anyscale Thank you for the report. Any ideas what could make it work?

lig added the help wanted (Pull Request welcome) label on Jul 21, 2023
@dmontagu (Contributor)

Is there a dunder method that cloudpickle uses? We have custom __getstate__ and __setstate__ to make models pickleable; I'm not sure if there's something we could/should do to make it work with cloudpickle.
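
(For readers unfamiliar with that pattern, here is a generic illustration of __getstate__/__setstate__, not pydantic's actual implementation:)

import pickle


class HoldsUnpicklableHandle:
    def __init__(self):
        self.fh = open(__file__)  # file objects cannot be pickled

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["fh"]  # drop the unpicklable attribute before pickling
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.fh = open(__file__)  # recreate the attribute on load


round_tripped = pickle.loads(pickle.dumps(HoldsUnpicklableHandle()))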

@shrekris-anyscale (Author) commented Jul 21, 2023

One solution is to store SchemaSerializer's schema and config as instance attributes in the serializer. Then, we can define a custom __reduce__ method that serializes the SchemaSerializer by serializing its schema and config. To deserialize, we can initialize a new SchemaSerializer using the schema and config.

Essentially, adding the following methods to SchemaSerializer:

def __init__(self, schema: CoreSchema, config: CoreConfig):
    self.schema = schema
    self.config = config

def __reduce__(self):
    return SchemaSerializer, (self.schema, self.config, )

Here's a gist where I overrode the SchemaSerializer's __init__ and __reduce__ methods as proposed. cloudpickle and pickle successfully serialize the Pydantic model.

@dmontagu (Contributor)

If you’d like to make a PR adding this to pydantic-core, I’m sure we’d accept it; otherwise @davidhewitt maybe you could whip something up?

@samuelcolvin (Member)

I don't think SchemaSerializer needs to be picklable, you should just store the core schema and recreate the serializer.

@shrekris-anyscale (Author)

> I don't think SchemaSerializer needs to be picklable, you should just store the core schema and recreate the serializer.

I agree, but I don't think this plays nicely with cloudpickle. I tried adding __reduce__ methods to the ModelMetaclass and the BaseModel classes, but cloudpickle never called them when dumping the model classes. Instead, it seems to serialize all the attributes of the classes directly, including the SchemaSerializer.

I'm not sure how we'd be able to just store the schema and recreate the serializer if we can't use a custom __reduce__ method in these classes.

@davidhewitt (Contributor)

I did some digging and it looks like cloudpickle switches between "by reference" or "by value" pickling modes according to whether your type is importable.

So in the repro discussed above, the class is __main__.SimpleModel, which is treated as not importable. In this case cloudpickle attempts to recreate a "skeleton" class which functions the same as the provided type. So to support "by value" pickling we need to support naive pickling for all the attributes of the class, as @shrekris-anyscale says. I can't see a way to customise this behaviour; maybe the cloudpickle maintainers know of solutions.

On the other hand, if SimpleModel is moved into a module and imported from there (e.g. from foo import SimpleModel), then cloudpickle will use "by reference" pickling. This already works fine (the pickled data just contains the reference to the import path).

So @shrekris-anyscale a possible workaround may be to move your model definitions out of __main__ files / entry points into modules. Without knowing the full details of your application I don't know if that's actually viable.
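
A quick way to see the two modes side by side (a minimal sketch, independent of pydantic):

import collections

import cloudpickle


def make_local_class():
    class Local:  # not importable: defined inside a function
        pass
    return Local


# "By value": cloudpickle serializes a skeleton of the class itself,
# including all of its attributes.
by_value = cloudpickle.dumps(make_local_class())

# "By reference": an importable class is pickled as its import path only.
by_ref = cloudpickle.dumps(collections.OrderedDict)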

@shrekris-anyscale (Author) commented Jul 25, 2023

> So @shrekris-anyscale a possible workaround may be to move your model definitions out of __main__ files / entry points into modules. Without knowing the full details of your application I don't know if that's actually viable.

Thanks for investigating! We use cloudpickle in part because we want to pickle classes that aren't cleanly importable (e.g. classes defined inside a function), and it's not viable to move the model definitions.

> So to support "by value" pickling we need to support naive pickling for all the attributes of the class, as @shrekris-anyscale says. I can't see a way to customise this behaviour.

When cloudpickle serializes a Pydantic model by value, it calls the model's SchemaSerializer's __reduce__ method. The solution I proposed earlier would make a Pydantic model serializable by value. Would it be possible to implement that solution?

@davidhewitt (Contributor)

The downside of this is that I'd expect we'd need to take a deep copy of the schema and config and store them inside the SchemaSerializer. Currently it just contains Rust state which derives from those (the inverse operation, while possible, is not implemented and would likely be complex).

This would probably also be the case for SchemaValidator.

It seems like a lot of extra CPU and memory baggage that most users will not need. The zero-cost solution would be to implement the "inverse operation" described above (i.e. rebuild schema and config from the Rust state). I couldn't guarantee that's even possible, though.

A proposal which allows pickle support to be opt-in may be a more feasible solution for now.

@shrekris-anyscale (Author) commented Jul 26, 2023

One low-baggage alternative is to delete the serializer upon serialization and reconstruct it whenever it's first called. The SchemaSerializer's __reduce__ function could be:

def __reduce__(self):
    return lambda: None, tuple()

Then whenever the SchemaSerializer is first called, the Pydantic model can initialize it using the schema and config and cache it.

This should only affect users that are serializing the SchemaSerializer, and the only added cost is the initialization upon the first call.
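
A rough pure-Python sketch of that lazy-rebuild idea (LazySerializer and its attributes are illustrative, not pydantic internals):

from pydantic_core import SchemaSerializer, core_schema


class LazySerializer:
    """Drops the Rust serializer when pickled; rebuilds it on first use."""

    def __init__(self, schema):
        self._schema = schema
        self._serializer = None

    def _get(self):
        if self._serializer is None:  # first call, or first call after loading
            self._serializer = SchemaSerializer(self._schema)
        return self._serializer

    def __reduce__(self):
        # Persist only the schema; the Rust object is recreated lazily.
        return LazySerializer, (self._schema,)


lazy = LazySerializer(core_schema.int_schema())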

@shrekris-anyscale (Author)

Another option is to have the SchemaSerializer hold a reference to the Pydantic model that it's a part of. Upon serialization, the serializer could use this reference to obtain the schema and config. That way, the serializer won't contain a deep copy of either one during its lifetime.
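
A rough sketch of that back-reference idea (the Owner class and attribute names here are illustrative, not pydantic internals):

import weakref

from pydantic_core import SchemaSerializer, core_schema


class Owner:
    # Illustrative stand-in for a model class that keeps its core schema.
    schema = core_schema.int_schema()


class BackRefSerializer:
    def __init__(self, owner):
        self._owner = weakref.ref(owner)  # no deep copy of schema/config
        self._inner = SchemaSerializer(owner.schema)

    def __reduce__(self):
        # Look up the schema through the owner only at pickle time.
        return BackRefSerializer, (self._owner(),)


s = BackRefSerializer(Owner)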

@dmontagu (Contributor) commented Jul 26, 2023

> One low-baggage alternative is to delete the serializer upon serialization and reconstruct it whenever it's first called. The SchemaSerializer's __reduce__ function could be:

Does this work with monkey-patching SchemaSerializer? Something like:

from pydantic_core import SchemaSerializer

def __reduce__(self):
    return lambda: None, tuple()

SchemaSerializer.__reduce__ = __reduce__

(maybe you'd need to configure it to bind properly, but I'm not sure if PyO3 will cause other issues)

@shrekris-anyscale (Author)

> Does this work with monkey-patching SchemaSerializer?

I haven't actually run that example myself, so I'm not sure if the syntax is correct. The high-level approach of that __reduce__ method should be possible, though. I don't think monkey-patching alone is sufficient, because the SchemaSerializer also needs to be re-initialized in the BaseModel and ModelMetaclass whenever it's first called.

@jrapin commented Oct 2, 2023

Hello,
FYI I am running into the exact same issue (pickling a locally defined model, which is really convenient for prototyping code), but my error is slightly different from what is mentioned before (with the exact same "SimpleModel" example):

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../../.conda/envs/xxx/lib/python3.10/site-packages/cloudpickle/cloudpickle_fast.py:73: in dumps
    cp.dump(obj)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <cloudpickle.cloudpickle_fast.CloudPickler object at 0x7f1466cad180>
obj = <class 'test_pickling.<locals>.SimpleModel'>

    def dump(self, obj):
        try:
>           return Pickler.dump(self, obj)
E           TypeError: cannot pickle '_PydanticWeakRef' object

../../../.conda/envs/xxx/lib/python3.10/site-packages/cloudpickle/cloudpickle_fast.py:632: TypeError

This is with cloudpickle 2.2.1 and

             pydantic version: 2.3.0
        pydantic-core version: 2.6.3
          pydantic-core build: profile=release pgo=true
                 install path: /private/home/jrapin/.conda/envs/brainai/lib/python3.10/site-packages/pydantic
               python version: 3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:40:32) [GCC 12.3.0]
                     platform: Linux-5.4.0-124-generic-x86_64-with-glibc2.31
     optional deps. installed: ['typing-extensions']

I am also interested in any way that could make this work ;)

@jrapin commented Oct 3, 2023

A bit more on my use case:
I am sending pickled model instances defining computation to a cluster where this computation will run. The issue is that the model is possibly created in a notebook/script, not in a loaded module, so pickling as shown in the documentation won't work, because the object is reloaded in a context where the class is not defined. This is a use case where cloudpickle, on the other hand, is supposed to work well.
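
To make that failure mode concrete (a generic sketch, not the actual application code):

import pickle

import cloudpickle


def make_instance():
    class LocalModel:  # defined in a notebook/script, not a module
        x = 1
    return LocalModel()


obj = make_instance()
try:
    pickle.dumps(obj)  # plain pickle needs an importable class path
except (pickle.PicklingError, AttributeError) as exc:
    print(f"pickle failed: {exc}")

data = cloudpickle.dumps(obj)  # cloudpickle serializes the class by value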

To remove the offending (class) fields I can surround the (cloud)pickling with this context manager which works fine:

# Imports assumed for this snippet (logger is a module-level logging.getLogger):
import contextlib
import logging
import typing as tp

import pydantic

logger = logging.getLogger(__name__)


@contextlib.contextmanager
def picklable_pydantic_model(cls: tp.Type[pydantic.BaseModel]) -> tp.Iterator[None]:
    name = f"{cls.__module__}.{cls.__qualname__}"
    if "<locals>." not in name and not name.startswith("__main__."):
        # the class is defined in a module, nothing to do :)
        yield
        return
    logger.warning(
        f"Hacking {name} fields to enable pickling as it is locally defined, "
        "please move to a module asap for more robustness"
    )
    # Temporarily null out the class attributes that cannot be pickled.
    content: tp.Dict[str, tp.Any] = {
        "__pydantic_parent_namespace__": None,
        "__pydantic_serializer__": None,
    }
    for name in content:
        content[name] = cls.__dict__[name]
        type.__setattr__(cls, name, None)
    try:
        yield
    finally:
        # Restore the original attributes after pickling.
        for name, data in content.items():
            type.__setattr__(cls, name, data)

It's applied to the class, and allows both the class and its instances to be pickled.
But that's only useful because I am not using many pydantic features on the cluster: I lose features such as model_dump in the process, and I have absolutely no clue how many more side effects I am creating :s
Is there an easy way to recreate them?

@edoakes (Contributor) commented Oct 5, 2023

I also ran into the above issues:

  • SchemaSerializer not being cloudpickleable due to being a native type (written in Rust).
  • _PydanticWeakRef not being cloudpickleable due to inheriting from weakref.ref, which has known issues with serialization.

I was able to enable cloudpickling a wide variety of Pydantic model definitions with the following two patches:

  1. Wrap SchemaSerializer to save the Python arguments necessary to reconstruct it:
# Define a wrapper that saves the `schema` and `core_config` to reconstruct the native `SchemaSerializer`.
class CloudpickleableSchemaSerializer:
    def __init__(self, schema, core_config):
        self._schema = schema
        self._core_config = core_config
        self._schema_serializer = SchemaSerializer(self._schema, self._core_config)

    def __reduce__(self):
        return CloudpickleableSchemaSerializer, (self._schema, self._core_config)

    def __getattr__(self, attr: str):
        return getattr(self._schema_serializer, attr)

# Override all usages of `SchemaSerializer` (obviously not needed if we upstream the above wrapper):
pydantic._internal._model_construction.SchemaSerializer = CloudpickleableSchemaSerializer
pydantic._internal._dataclasses.SchemaSerializer = CloudpickleableSchemaSerializer
pydantic.type_adapter.SchemaSerializer = CloudpickleableSchemaSerializer

The __getattr__ bit is somewhat hacky. This is required because SchemaSerializer does not allow Python classes to subclass it. This can be cleaned up by adding the subclass parameter to the PyO3 #[pyclass] macro in pydantic_core. I've tested this as well, with the wrapper looking like:

class CloudpickleableSchemaSerializer(SchemaSerializer):
    def __init__(self, schema, core_config):
        self._schema = schema
        self._core_config = core_config
        # No need for `super().__init__()` because `SchemaSerializer` initialization happens in `__new__`.

    def __reduce__(self):
        return CloudpickleableSchemaSerializer, (self._schema, self._core_config)
  2. Wrap weakref.ref instead of inheriting from it:
class WeakRefWrapper:
    def __init__(self, obj: Any):
        if obj is None:
            self._wr = None
        else:
            self._wr = weakref.ref(obj)

    def __reduce__(self):
        return WeakRefWrapper, (self(),)

    def __call__(self) -> Any:
        if self._wr is None:
            return None
        else:
            return self._wr()

# Override all usages of `_PydanticWeakRef` (obviously not needed if we upstream the above wrapper):
pydantic._internal._model_construction._PydanticWeakRef = WeakRefWrapper

AFAICT there's no downside to this wrapper but it gets around the strange ABC-related pickling error.
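
(A quick sanity check of the wrapper, assuming the WeakRefWrapper definition above:)

import pickle

# Round-trips even though weakref.ref itself is not picklable: the referent
# is pickled by value and re-wrapped on load.
wr = WeakRefWrapper(None)
assert pickle.loads(pickle.dumps(wr))() is None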

@davidhewitt @dmontagu @lig I'm happy to contribute a patch if you think this is a reasonable direction. Let me know what you think. The only downside I can see is that the SchemaSerializer wrapper will hold a reference to the schema and core_config objects (though I imagine these are probably already referenced somewhere in one of the BaseModel or ModelMetaclass members).

@davidhewitt (Contributor)

Thanks @edoakes for investigating. My question would be whether you plan to merge the CloudpickleableSchemaSerializer into pydantic or you just want to make the adjustments so that projects which need this functionality can patch themselves.

If your goal is to merge this fix into pydantic then it's probably better to just make SchemaSerializer always store these members so there's no need for wrapping.

@edoakes (Contributor) commented Oct 6, 2023

@davidhewitt I'd prefer to merge the functionality into pydantic so that cloudpickle "just works" out of the box and folks don't have to worry about patching things.

Storing the members and defining __reduce__ on SchemaSerializer itself would indeed be preferable, I'm just not sure how to accomplish that using PyO3 (not familiar with the framework). I can try to get it working.

So then the plan of action would be:

  • PR 1 (against pydantic_core): Make SchemaSerializer directly cloudpickleable in pydantic_core by storing references to the constructor arguments.
  • PR 2 (against pydantic): Use a WeakRefWrapper similar to the above rather than the weakref directly.

With these two, we should be good to go. Given that these changes would be split across the repos, is there any special versioning story between them? Or does pydantic treat pydantic_core like any other Python package dependency? In PR 2 I'll add a regression test but that will depend on using a version of pydantic_core that includes PR 1.

@edoakes (Contributor) commented Oct 6, 2023

Ah, I see it looks like the pydantic_core version is pinned: https://github.com/pydantic/pydantic/blob/main/pyproject.toml#L64

So I assume then that this will require first patching pydantic_core, waiting for a release, then updating the pinned version in pydantic?

@davidhewitt (Contributor)

Yes exactly, we can release pydantic_core shortly after your PR is merged so there's not a long delay in getting this working on pydantic main.

@edoakes (Contributor) commented Oct 9, 2023

I have opened PRs for each of the above issues.

I will begin testing that these fixes are comprehensive using locally installed copies. Once these PRs have been merged and a new version of pydantic_core has been released, I will add integration/regression tests to the pydantic repo.

@edoakes (Contributor) commented Oct 10, 2023

Both PRs linked above are now merged. I've begun manually testing and they appear to address the issue for all of my use cases. @jrapin if you could test out your workflow with pydantic_core and pydantic installed from main that would be helpful.

@davidhewitt Please let me know when a pydantic_core release can be done in order to add integration testing & catch the next pydantic release.

@jrapin commented Oct 13, 2023

@edoakes sorry for the delay
I was just able to test the updated packages, and I can confirm it worked perfectly for my use case :)
Thanks a lot!

edoakes added a commit to ray-project/ray that referenced this issue Oct 16, 2023
Initial steps towards full `pydantic>=2.0` compatibility.

Included:

- Monkeypatch logic to make new pydantic models serializable. See pydantic/pydantic#6763 for the plan to properly fix this.
- A new CI build running against Pydantic 2.0+.
- Moved all internal usage of Pydantic models to `ray._private.pydantic_compat` to abstract out the `from pydantic` vs `from pydantic.v1` import paths.

Future work:

- Add tests for higher `fastapi` versions w/ Pydantic 2.0.
- Add more comprehensive testing for Pydantic 2.0+ model serialization.
- Add support for our custom-installed JSON encoders in Pydantic 2.0+ (or drop them because they are using internal APIs).
@edoakes (Contributor) commented Oct 24, 2023

Thanks for the help @davidhewitt 🚀 do you know when the next release is scheduled and what its version tag will be?

@davidhewitt (Contributor)

No specific ETA, I would assume 2.5.

@edoakes (Contributor) commented Nov 7, 2023

@davidhewitt any update on when the next pydantic release is going to come out? This is a pretty big pain point for our users and I want to make sure we can unpin the dependency soon.

@fkaleo commented Nov 7, 2023

> @davidhewitt any update on when the next pydantic release is going to come out? This is a pretty big pain point for our users and I want to make sure we can unpin the dependency soon.

Reading #8028, there seem to be a couple of items still pending before 2.5 comes out.

@davidhewitt (Contributor)

I believe we are going to release 2.5 beta today, with a view to a final 2.5 production release next week.

@jrapin commented Nov 27, 2023

@edoakes unfortunately I had been fooled by my unit tests passing thanks to your fix ("local" classes defined within the scope of a unit test can now be pickled, while they could not before the fix), but I still have issues in the IPython console and Jupyter notebooks :( (see #8232). Did everything work fine on your end? Do you have any idea what could be causing this?
