I have a package called pytask, which behaves almost like pytest in that it dynamically collects tasks (task and not test functions) from modules and executes them.
Then, there is also pytask-parallel (similar to pytest-xdist) that provides multiple backends to parallelize the execution of tasks, and one is loky. When executing tasks in parallel, I get an error with loky.
Here is a minimal example: task_example.py defines the task function, and main.py contains the runner that dynamically imports the function and executes it with loky.
# Content of task_example.py.
def task_example():
    pass

# Content of main.py
from pathlib import Path
import sys
from types import ModuleType
import importlib.util

from loky import get_reusable_executor


def import_path(path: Path) -> ModuleType:
    module_name = path.name
    spec = importlib.util.spec_from_file_location(module_name, str(path))
    if spec is None:
        raise ImportError(f"Can't find module {module_name!r} at location {path}.")
    mod = importlib.util.module_from_spec(spec)
    # Comment the line out to make the submitted task succeed.
    sys.modules[module_name] = mod
    spec.loader.exec_module(mod)
    return mod


if __name__ == "__main__":
    module = import_path(Path("task_example.py"))
    function = module.task_example
    executor = get_reusable_executor(max_workers=2, timeout=2)
    res = executor.submit(function)
    print(res.exception())
    print(res.exception().__cause__)
If you run python main.py, you see the following output.
A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.
"""
Traceback (most recent call last):
File "/home/tobia/mambaforge/envs/pytask-parallel/lib/python3.11/site-packages/loky/process_executor.py", line 426, in _process_worker
call_item = call_queue.get(block=True, timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tobia/mambaforge/envs/pytask-parallel/lib/python3.11/multiprocessing/queues.py", line 122, in get
return _ForkingPickler.loads(res)
^^^^^^^^^^^^^^^^^^^^^^^^^^
ModuleNotFoundError: No module named 'task_example.py'; 'task_example' is not a package
"""
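The wording of the last traceback line can be reproduced without loky, which may help narrow things down. The import system reads the dot in 'task_example.py' as "submodule py of package task_example". The sketch below shows the same message shape using colorsys, a plain standard-library module picked purely for illustration (it is not part of the original example).

```python
import importlib

message = None
try:
    # "colorsys.py" is parsed as submodule "py" of the parent "colorsys".
    # colorsys is a plain module, not a package, so the lookup fails.
    importlib.import_module("colorsys.py")
except ModuleNotFoundError as exc:
    message = str(exc)

print(message)
# No module named 'colorsys.py'; 'colorsys' is not a package
```

This suggests the worker is trying to import a module by the literal name 'task_example.py', which is the value of path.name in main.py.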
Interestingly, this error does not occur if you comment out the line with sys.modules[module_name] = mod.
However, this line is necessary because you otherwise see errors with dataclasses: pytask-dev/pytask#373.
Using another backend like concurrent.futures.ProcessPoolExecutor does not lead to this error.
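A possible explanation, not confirmed here: once the module is registered in sys.modules, the function is pickled by reference, meaning only its module name and function name are stored. A process without that sys.modules entry then has to import 'task_example.py' by name. The sketch below probes this with the standard-library pickle only. It is an approximation: loky serializes with cloudpickle, which (as far as I understand) falls back to pickling by value when the function's module is not in sys.modules. That would match the observation that commenting out the line makes the task succeed. The probe helper and the temporary directory are hypothetical scaffolding, and deleting the sys.modules entry only simulates a fresh worker.

```python
import importlib
import importlib.util
import pickle
import sys
import tempfile
from pathlib import Path


def import_path(path: Path):
    """Import a file the way main.py does, keeping ".py" in the module name."""
    module_name = path.name  # "task_example.py"
    spec = importlib.util.spec_from_file_location(module_name, str(path))
    mod = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = mod
    spec.loader.exec_module(mod)
    return mod


def probe():
    """Return the function's module name, its pickle payload, and the load error."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "task_example.py"
        path.write_text("def task_example():\n    pass\n")
        # Mirror `python main.py`, where the script directory is on sys.path.
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            function = import_path(path).task_example
            # Stored by reference: module name plus function name, no code.
            payload = pickle.dumps(function)
            # Assumption: a fresh worker has no "task_example.py" entry.
            del sys.modules["task_example.py"]
            try:
                pickle.loads(payload)
                error = None
            except ModuleNotFoundError as exc:
                error = exc
        finally:
            # Undo the changes to interpreter-wide state.
            sys.path.remove(tmp)
            sys.modules.pop("task_example.py", None)
            sys.modules.pop("task_example", None)
    return function.__module__, payload, error


module_name, payload, error = probe()
print(module_name)                     # the name the function is pickled under
print(b"task_example.py" in payload)   # the payload refers to that name
print(repr(error))                     # error raised when loading by that name
```

If this reading is right, the ProcessPoolExecutor result might depend on the start method: a forked worker inherits the parent's sys.modules, while a freshly started interpreter does not. That part is a guess and has not been checked against loky's internals.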
I hope you have more insights into why this error is happening. If you need more info, I am happy to give it to you.
Thanks for looking into this issue! 🙏