Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not swallow PicklingError #1359

Merged
merged 11 commits into from
Feb 15, 2023
21 changes: 17 additions & 4 deletions joblib/_store_backends.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
"""Storage providers backends for Memory caching."""

from pickle import PicklingError
import re
import os
import os.path
Expand Down Expand Up @@ -185,12 +186,24 @@ def dump_item(self, path, item, verbose=1):

def write_func(to_write, dest_filename):
with self._open_item(dest_filename, "wb") as f:
numpy_pickle.dump(to_write, f,
compress=self.compress)
try:
numpy_pickle.dump(to_write, f, compress=self.compress)
except PicklingError as e:
warnings.warn(
'Unable to cache to disk: '
'failed to pickle output. '
'In future versions of joblib '
'this will raise an exception.'
'Exception: %s.' % e,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We try to avoid the old formating synthax in the new code base, so please use:

Suggested change
'Exception: %s.' % e,
f'Exception: {e}.'

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, could you please rewrap the string to better use the available <80 char limit of PEP8?

Some people complain than 80 is not enough, others seem reluctant to use more than 60 apparently ;) 80 is a good middle ground.

FutureWarning
)

self._concurrency_safe_write(item, filename, write_func)
except: # noqa: E722
" Race condition in the creation of the directory "
except Exception as e: # noqa: E722
warnings.warn(
'Unable to cache to disk. '
'Possibly a race condition in the creation of the directory. '
'Exception: %s' % e)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we are changing a bit this, I would be in favor of sanitizing this, by putting a try/except for OSError in create_location, and having this raise a deprecation warning saying that any other error will be raised in future version of joblib. WDYT?

Small nitpick on style.

Suggested change
except Exception as e: # noqa: E722
warnings.warn(
'Unable to cache to disk. '
'Possibly a race condition in the creation of the directory. '
'Exception: %s' % e)
except Exception as e: # noqa: E722
warnings.warn(
'Unable to cache to disk. '
'Possibly a race condition in the creation of the directory. '
f'Exception: {e}'
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So following the conversation with @ogrisel , I would not do a deprecation warning but just a warning for the error.
Maybe create a specfic CacheWarning that will be easy to filter out?


def clear_item(self, path):
"""Clear the item at the path, given as a list of strings."""
Expand Down
47 changes: 45 additions & 2 deletions joblib/test/test_store_backends.py
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
import warnings
from pickle import PicklingError
from unittest.mock import MagicMock

try:
# Python 2.7: use the C pickle to speed up
# test_concurrency_safe_write which pickles big python objects
Expand All @@ -10,8 +14,9 @@
from joblib.testing import parametrize, timeout
from joblib.test.common import with_multiprocessing
from joblib.backports import concurrency_safe_rename
from joblib import Parallel, delayed
from joblib._store_backends import concurrency_safe_write
from joblib import Parallel, delayed, numpy_pickle
from joblib._store_backends import concurrency_safe_write, \
FileSystemStoreBackend


def write_func(output, filename):
Expand Down Expand Up @@ -54,3 +59,41 @@ def test_concurrency_safe_write(tmpdir, backend):
if i % 3 != 2 else load_func for i in range(12)]
Parallel(n_jobs=2, backend=backend)(
delayed(func)(obj, filename) for func in funcs)


@with_multiprocessing
def test_warning_on_dump_failure(tmpdir, monkeypatch):
backend = FileSystemStoreBackend()
backend.location = tmpdir.join('test_warning_on_pickling_error').strpath
backend.compress = None

def monkeypatched_pickle_dump(*args, **kwargs):
raise Exception()

warnings_mock = MagicMock()
monkeypatch.setattr(numpy_pickle, "dump", monkeypatched_pickle_dump)
monkeypatch.setattr(warnings, "warn", warnings_mock)

backend.dump_item("testpath", "testitem")

warnings_mock.assert_called_once()


@with_multiprocessing
def test_warning_on_pickling_error(tmpdir, monkeypatch):
# This is separate from test_warning_on_dump_failure because in the
# future we will turn this into an exception.
backend = FileSystemStoreBackend()
backend.location = tmpdir.join('test_warning_on_pickling_error').strpath
backend.compress = None

def monkeypatched_pickle_dump(*args, **kwargs):
raise PicklingError()

warnings_mock = MagicMock()
monkeypatch.setattr(numpy_pickle, "dump", monkeypatched_pickle_dump)
monkeypatch.setattr(warnings, "warn", warnings_mock)

backend.dump_item("testpath", "testitem")

warnings_mock.assert_called_once()