Error with dynamically imported functions. #515

Closed
tobiasraabe opened this issue Sep 25, 2023 · 2 comments

tobiasraabe commented Sep 25, 2023

Hi!

I want to dynamically import functions and then execute them with loky. I first reported this in joblib/loky#406, where it was suggested that cloudpickle might be the better place for this issue.

I hope I have boiled the problem down correctly to the following scripts.

Here is a minimal example: func.py defines the function, and main.py imports it and stores it as a .pkl file.

# Content of func.py.
def func(): pass

# Content of main.py
import sys
import importlib.util
import cloudpickle

from pathlib import Path
from types import ModuleType


def import_path(path: Path) -> ModuleType:
    """Taken from https://docs.python.org/3/library/importlib.html#importing-a-source-file-directly."""
    module_name = path.name

    spec = importlib.util.spec_from_file_location(module_name, str(path))

    if spec is None:
        raise ImportError(f"Can't find module {module_name!r} at location {path}.")

    mod = importlib.util.module_from_spec(spec)

    # Comment the line out to successfully unpickle the function.
    sys.modules[module_name] = mod

    spec.loader.exec_module(mod)
    return mod


if __name__ == "__main__":
    module = import_path(Path("func.py").resolve())
    Path("func.pkl").write_bytes(cloudpickle.dumps(module.func))

If you then try to load the .pkl file from another script, you get the following error.

# Content of deserialize.py
from pathlib import Path
import cloudpickle

def import_path(path): ...  # Same definition as in main.py.

if __name__ == "__main__":
    # Uncomment to make loading work even when main.py keeps the sys.modules line.
    # module = import_path(Path("func.py").resolve())
    cloudpickle.loads(Path("func.pkl").read_bytes())

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'func.py'; 'func' is not a package

Is this error expected?
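
The module name in the traceback, 'func.py', is what path.name returns inside import_path, and the import system reads a dotted name as a submodule of a package. The following is a minimal sketch of that behaviour, not a diagnosis confirmed in this thread. The module name dyn_stem_demo and the temporary directory are hypothetical and chosen only to avoid clashing with real modules.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for func.py; the name is made up for this sketch.
    path = Path(tmp) / "dyn_stem_demo.py"
    path.write_text("def func():\n    pass\n")

    name_with_suffix = path.name  # "dyn_stem_demo.py", like module_name in import_path
    name_without_suffix = path.stem  # "dyn_stem_demo"

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # The dotted name is read as submodule "py" of package "dyn_stem_demo".
        importlib.import_module(name_with_suffix)
    except ModuleNotFoundError as exc:
        error_message = str(exc)
    else:
        error_message = None
    finally:
        # Undo the changes to the import state made for this sketch.
        sys.path.remove(tmp)
        sys.modules.pop(name_without_suffix, None)

print(error_message)
```

The printed message has the same shape as the one in the traceback above. Using path.stem as the module name would avoid the dotted name, although that alone does not make the function loadable in a process that cannot import the module.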

Interestingly, this error does not occur if

  • you comment out the line with sys.modules[module_name] = mod.
  • or you import func.py in deserialize.py.
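
Both observations fit with pickling by reference, where the payload stores only the module name and the function name and loading re-imports the module. A minimal sketch of that mechanism, using the standard-library pickle (which always pickles functions by reference) rather than cloudpickle itself. The module name dyn_ref_demo is hypothetical, and the assumption that cloudpickle takes the same route when the module is found in sys.modules is not confirmed in this thread.

```python
import importlib.util
import pickle
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "dyn_ref_demo"  # hypothetical name, not importable from sys.path

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "func.py"
    source.write_text("def func():\n    pass\n")

    # Same steps as import_path in main.py, with an explicit module name.
    spec = importlib.util.spec_from_file_location(MODULE_NAME, str(source))
    module = importlib.util.module_from_spec(spec)
    sys.modules[MODULE_NAME] = module
    spec.loader.exec_module(module)

# With the module registered, pickle stores only module name and function name.
payload = pickle.dumps(module.func)
stores_reference = MODULE_NAME.encode() in payload

# While the module is still registered, loading resolves the reference.
loaded_while_registered = pickle.loads(payload) is module.func

# Simulate a fresh process in which the module was never imported.
del sys.modules[MODULE_NAME]
try:
    pickle.loads(payload)
except ModuleNotFoundError as exc:
    load_error = str(exc)
else:
    load_error = None
```

In this sketch the load succeeds only while the module is present in sys.modules of the loading process, which mirrors the second bullet above.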

However, sys.modules[module_name] = mod is necessary because dataclasses otherwise raise errors: pytask-dev/pytask#373.

I hope you have more insight into why this error is happening. If you need more information, I am happy to provide it.

Thanks for looking into this issue! 🙏

python: 3.10
cloudpickle: 2.2.1

tobiasraabe (Author) commented

This example contains an error and does not reproduce my problem. I will come back if I manage to replicate it. Thanks for your time 🙏.

tobiasraabe (Author) commented

I found the solution. Because my module is imported dynamically and added to sys.modules, I need to call cloudpickle.register_pickle_by_value on the module. This marks it as a dynamic module, so cloudpickle pickles its functions by value rather than by reference.
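
A minimal sketch of this fix, assuming cloudpickle 2.0 or later is installed. The module name dyn_func_by_value, the temporary func.py, and the return value 42 are hypothetical and only serve the illustration; deleting the sys.modules entry stands in for loading in a separate process.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path

import cloudpickle

MODULE_NAME = "dyn_func_by_value"  # hypothetical name for this sketch

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "func.py"
    source.write_text("def func():\n    return 42\n")

    # Dynamic import that keeps the sys.modules registration.
    spec = importlib.util.spec_from_file_location(MODULE_NAME, str(source))
    module = importlib.util.module_from_spec(spec)
    sys.modules[MODULE_NAME] = module
    spec.loader.exec_module(module)

# Ask cloudpickle to serialize objects from this module by value.
cloudpickle.register_pickle_by_value(module)
payload = cloudpickle.dumps(module.func)

# Simulate a process that has never imported the module.
del sys.modules[MODULE_NAME]

restored = cloudpickle.loads(payload)
result = restored()
print(result)
```

The call to register_pickle_by_value has to happen after the module is in sys.modules and before cloudpickle.dumps; the loading side does not need to import func.py.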
