[BUG] Make deep classifier's convert_y_to_keras private #6373

Open
wants to merge 6 commits into base: main
Conversation


@cedricdonie cedricdonie commented May 1, 2024

The method convert_y_to_keras refits the classifiers' onehot and
label encoder based on the input data. It is automatically called
during deep_classifier.fit. If an unwitting user calls it again with
data that has different classes, the behavior of the predict function
will change.

Making the method private prevents accidental incorrect usage and
simplifies the interface.
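For illustration, a minimal sketch of the misuse this change guards against (the classifier, dataset, and exact effect on predict shown here are illustrative assumptions, not part of this PR):

from sktime.classification.deep_learning import CNNClassifier
from sktime.datasets import load_unit_test

X_train, y_train = load_unit_test(split="train")
X_test, _ = load_unit_test(split="test")

clf = CNNClassifier(n_epochs=1)
clf.fit(X_train, y_train)            # fit calls convert_y_to_keras internally

y_pred_before = clf.predict(X_test)

# an unwitting second call with a y containing only a subset of the classes
# refits the label/one-hot encoders, so later predictions may decode differently
clf.convert_y_to_keras(y_train[y_train == y_train[0]])

y_pred_after = clf.predict(X_test)   # behaviour may have silently changed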

Other considerations

  • A deprecation might be sensible rather than making the method private right away.
  • We could consider providing an encode_y function that uses the label encoder, if users want to convert labels the same way (see the sketch after this list).
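A rough sketch of what such a side-effect-free encode_y could look like; the encoder attribute names (label_encoder, onehot_encoder) are assumptions based on the current convert_y_to_keras implementation:

def encode_y(self, y):
    """Encode y with the encoders already fitted during fit, without refitting."""
    y_int = self.label_encoder.transform(y)  # transform only, unlike convert_y_to_keras
    return self.onehot_encoder.transform(y_int.reshape(-1, 1))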

Fixes #5864.

Reference Issues/PRs

What does this implement/fix? Explain your changes.

This fixes a "bug" in the sense that a public method of the deep classifiers has a misleading name and should not be called by unsuspecting users.
The method sounds like it will be side-effect free, but actually changes the state of the estimator substantially.

Does your contribution introduce a new dependency? If yes, which one?

No dependencies added.

What should a reviewer concentrate their feedback on?

Please focus on how this fits into the overall API from a user perspective.
I would be especially interested in whether the core functionality of a classifier (fit, predict, evaluate) remains ergonomic.
Perhaps there were also reasons why convert_y_to_keras was initially public.

Did you add any tests for the change?

No tests were added. As per the discussion in #5864, there is already a test that classifier.predict is deterministic and free from side-effects.

Any other comments?

  • I don't fully understand how users of the classifiers will use the predict methods, as reversing the label/one-hot encoding would be required to get y_pred in a format that matches the training data input. Perhaps we should offer a side-effect-free encode_y method, or simply reverse the label/one-hot encoding inside the predict method. Furthermore, I am not sure that the following works as expected: train with X_train and y_train_multilabel, predict on X_test, score on X_test and y_test_multilabel. I suspect that classifier.predict(X_test) will return one-hot encoded results, while classifier.score(X_test, y_test_multilabel) will receive y in multilabel format.
  • We may want to introduce a deprecation period first, instead of directly making the method private to avoid breaking existing code. To that end, we could have convert_y_to_keras issue a DeprecationWarning and then call _convert_y_to_keras.

PR checklist

For all contributions
  • I've added myself to the list of contributors with any new badges I've earned :-)
    How to: add yourself to the all-contributors file in the sktime root directory (not the CONTRIBUTORS.md). Common badges: code - fixing a bug, or adding code logic. doc - writing or improving documentation or docstrings. bug - reporting or diagnosing a bug (get this plus code if you also fixed the bug in the PR). maintenance - CI, test framework, release.
    See here for full badge reference
  • Optionally, for added estimators: I've added myself to the maintainers tag - do this if you want to become the owner or maintainer of an estimator you added.
    See here for further details on the algorithm maintainer role.
  • The PR title starts with either [ENH], [MNT], [DOC], or [BUG]. [BUG] - bugfix, [MNT] - CI, test framework, [ENH] - adding or improving code, [DOC] - writing or improving documentation or docstrings.
For new estimators
  • I've added the estimator to the API reference - in docs/source/api_reference/taskname.rst, follow the pattern.
  • I've added one or more illustrative usage examples to the docstring, in a pydocstyle compliant Examples section.
  • If the estimator relies on a soft dependency, I've set the python_dependencies tag and ensured
    dependency isolation, see the estimator dependencies guide.

@cedricdonie cedricdonie changed the title Make deep classifier's convert_y_to_keras private [BUG] Make deep classifier's convert_y_to_keras private May 1, 2024
@fkiraly fkiraly added module:classification classification module: time series classification enhancement Adding new functionality labels May 1, 2024
Collaborator

@fkiraly fkiraly left a comment

I don't fully understand how users of the classifiers will use the predict methods, as reversing label/one-hot encoding would be required to get y_pred in a format that matches the training data input.

That needs to be the case indeed, or it would violate the API contract - I am perplexed why the tests did not catch this. Some possible explanations:

  • scenarios covered only the case of two classes. There is not a single scenario that has three (the one that has three was turned off). I am not sure why this is the case; 2 years ago I refactored the tests from an earlier state into the scenarios.
  • some classifiers, including InceptionTime, are on the exception list in tests._config. If an estimator is on this list, no tests from the check_estimator suite are run, i.e., the CI in this PR does not test the InceptionTimeClassifier.

Regarding how to handle: indeed, the one-hot-encoding needs to be inverted. The sklearn one, strangely, does not seem to have the obvious inverse_transform, so we need to do this manually.
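For reference, a minimal sketch of that manual inversion, assuming probs has shape (n_instances, n_classes) and classes_ holds the class labels in encoder order:

import numpy as np

def decode_onehot(probs, classes_):
    """Map one-hot / probability rows back to the original class labels."""
    return np.asarray(classes_)[np.argmax(probs, axis=1)]

# decode_onehot(np.eye(3), ["a", "b", "c"]) -> array(['a', 'b', 'c'], dtype='<U1')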

To that end, we could have convert_y_to_keras issue a DeprecationWarning and then call _convert_y_to_keras.

Agree - probably no one is using it, but better be safe. I would direct it to the private method, and schedule deprecation for 0.31.0 (the earliest release possible given the policy). Target is complete removal.

@fkiraly
Collaborator

fkiraly commented May 2, 2024

Regarding DL classifiers and multiclass: I ran all tests with a scenario of three classes, no DL classifiers are failing.
See #6376

I think this is because the "test all" workflow is missing coverage, see #6315 - this also applies to the "test all" workflow, i.e., no deep learning classifiers are tested when I trigger "test all".

@yarnabrina, how would we test all deep learning estimators? It seems we are missing some serious issues.
How about a "test all DL" workflow?

@yarnabrina
Collaborator

How about a "test all DL" workflow?

We may do that. I also think we should add one to the normal new CI, which will cover DL for pandas v2.

@cedricdonie
Author

cedricdonie commented May 2, 2024

Regarding DL classifiers and multiclass: I ran all tests with a scenario of three classes, no DL classifiers are failing. See #6376

I think this is because the "test all" workflow is missing coverage, see #6315 - this also applies to the "test all" workflow, i.e., no deep learning classifiers are tested when I trigger "test all".

It seems that the tests are correctly passing because InceptionTime does fulfill the API contract. The following is effectively an argmax, returning y_pred as class labels ([class1, class2, class3, class1]) rather than one-hot encoded.

def _predict(self, X, **kwargs):
    probs = self._predict_proba(X, **kwargs)
    rng = check_random_state(self.random_state)
    return np.array(
        [
            self.classes_[int(rng.choice(np.flatnonzero(prob == prob.max())))]
            for prob in probs
        ]
    )

I removed the exclusion of InceptionTime (see patch below) to ensure that a deep classifier is triggered and ran the tests locally via pytest -vvv sktime/classification/tests/test_all_classifiers.py::TestAllClassifiers. I also used git cherry-pick --no-commit efcdcd8bd1208618d1 319578e6729047b 3d75693b2968b6f to get the commits from #6374. All tests pass, even the assertion that y_pred is always one-dimensional.

diff --git a/sktime/classification/deep_learning/base.py b/sktime/classification/deep_learning/base.py
index 4276d0b8..88aa6424 100644
--- a/sktime/classification/deep_learning/base.py
+++ b/sktime/classification/deep_learning/base.py
@@ -72,12 +72,14 @@ class BaseDeepClassifier(BaseClassifier, ABC):
     def _predict(self, X, **kwargs):
         probs = self._predict_proba(X, **kwargs)
         rng = check_random_state(self.random_state)
-        return np.array(
+        ret = np.array(
             [
                 self.classes_[int(rng.choice(np.flatnonzero(prob == prob.max())))]
                 for prob in probs
             ]
         )
+        assert ret.ndim == 1
+        return ret
 
     def _predict_proba(self, X, **kwargs):
         """Find probability estimates for each class for all cases in X.
diff --git a/sktime/classification/tests/test_all_classifiers.py b/sktime/classification/tests/test_all_classifiers.py
index 173edc08..33ef651b 100644
--- a/sktime/classification/tests/test_all_classifiers.py
+++ b/sktime/classification/tests/test_all_classifiers.py
@@ -88,8 +88,8 @@ class TestAllClassifiers(ClassifierFixtureGenerator, QuickTester):
         # check predict
         assert isinstance(y_pred, np.ndarray)
         assert y_pred.shape == (X_new_instances,)
+        assert y_pred.ndim == y_train.ndim
         assert np.all(np.isin(np.unique(y_pred), np.unique(y_train)))
-
         # check predict proba (all classifiers have predict_proba by default)
         y_proba = scenario.run(estimator_instance, method_sequence=["predict_proba"])
         assert isinstance(y_proba, np.ndarray)
diff --git a/sktime/tests/_config.py b/sktime/tests/_config.py
index c2160a9c..63d941a9 100644
--- a/sktime/tests/_config.py
+++ b/sktime/tests/_config.py
@@ -36,7 +36,7 @@ EXCLUDE_ESTIMATORS = [
     "EditDist",
     "CNNClassifier",
     "FCNClassifier",
-    "InceptionTimeClassifier",
+    #"InceptionTimeClassifier",
     "LSTMFCNClassifier",
     "MLPClassifier",
     "MLPRegressor",

Could someone please confirm that this interpretation is correct?

@fkiraly Should we add the assert y_pred.ndim == y_train.ndim test to the classifier tests or is it redundant?

With regard to the next steps, I suggest simply adding the deprecation as discussed and closing the PR.

@fkiraly
Collaborator

fkiraly commented May 2, 2024

Could someone please confirm that this interpretation is correct?

What interpretation - that the tests passing imply that y_pred has dimension 1 (in the single label case)? Yes, I think this is correct - test_classifier_output, line 90.

@fkiraly Should we add the assert y_pred.ndim == y_train.ndim test to the classifier tests or is it redundant?

I am not sure that this would always be true (at least if we want to cover the general contract).

This is because we have added multilabel classification, and for that a 2D numpy or pd.DataFrame y is possible. However, to ensure downwards compatibility, predict always returns a 1D y if the input was (n, 1), i.e., univariate.

It might be true for all current scenarios, but I would be cautious about adding a check that is not robust to adding further, contract-compliant scenarios.
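To illustrate with hypothetical shapes (not an existing test case):

import numpy as np

# a single-column 2D y is a contract-compliant training input ...
y_train = np.array([["a"], ["b"], ["a"]])   # shape (3, 1), ndim == 2
# ... but predict may flatten it back to 1D for downwards compatibility,
# so a `y_pred.ndim == y_train.ndim` check would fail on valid output
y_pred = np.array(["a", "b", "a"])          # shape (3,), ndim == 1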

@fkiraly
Collaborator

fkiraly commented May 2, 2024

With regard to the next steps, I suggest simply adding the deprecation as discussed and closing the PR.

If you mean: merge (i.e., add to the code base; not close = throw away), then, yes, I agree.

@fkiraly
Collaborator

fkiraly commented May 9, 2024

@cedricdonie, is this ready to merge?

fkiraly
fkiraly previously approved these changes May 22, 2024
Collaborator

@fkiraly fkiraly left a comment


I suppose it is? Looks good to me!

@cedricdonie
Author

Hi @fkiraly, the deprecation is still missing. If no deprecation is needed, we can merge this PR.

Sorry for the late reply. I can try to add the deprecation in the next few days.

@fkiraly
Collaborator

fkiraly commented May 23, 2024

Oh, I see - of course.

The next minor release is scheduled for May 26, so if you could add it by then, that would be appreciated.

I would imagine it looks like this

from sktime.utils.warnings import warn

# todo 0.31.0: remove the public convert_y_to_keras
def convert_y_to_keras(self, y):
    warn(
        "warning, convert_y_to_keras of sktime deep learning estimators is deprecated "
        "and will be removed in 0.31.0. "
        "For equivalent behaviour, please use sklearn OneHotEncoder.fit_transform directly.",
        obj=self,
    )
    return self._convert_y_to_keras(y=y)

Deprecate BaseDeepClassifier convert_y_to_keras

`convert_y_to_keras` should not be a public method, as explained in
d1a8029. This change issues a
deprecation warning when the method is called and delegates to the
private method, which is used by all deriving classes. The deprecation
message starts with "WARNING" to match the other deprecation message in
the modified file.
@cedricdonie
Author

Hi @fkiraly, I have added the deprecation warning and all tests are passing on my local machine. Shall we merge the PR?

After testing the warning output, the additional "WARNING" text seems
redundant and is removed. Furthermore, spacing is fixed.
Labels: enhancement (Adding new functionality), module:classification (classification module: time series classification)
Development

Successfully merging this pull request may close these issues.

[BUG] InceptionTime does not support validation data with fewer classes than in the training data
3 participants