[BUG] Make deep classifier's convert_y_to_keras private #6373

Open
wants to merge 6 commits into base: main
Conversation


@cedricdonie cedricdonie commented May 1, 2024

The method convert_y_to_keras refits the classifiers' onehot and
label encoder based on the input data. It is automatically called
during deep_classifier.fit. If an unwitting user calls it again with
data that has different classes, the behavior of the predict function
will change.

Making the method private prevents accidental incorrect usage and
simplifies the interface.
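For illustration, a minimal sketch of the misuse this change guards against (the classifier, dataset, and exact effect on predict shown here are illustrative assumptions, not part of this PR):

from sktime.classification.deep_learning import CNNClassifier
from sktime.datasets import load_unit_test

X_train, y_train = load_unit_test(split="train")
X_test, _ = load_unit_test(split="test")

clf = CNNClassifier(n_epochs=1)
clf.fit(X_train, y_train)            # fit calls convert_y_to_keras internally

y_pred_before = clf.predict(X_test)

# an unwitting second call with a y containing only a subset of the classes
# refits the label/one-hot encoders, so later predictions may decode differently
clf.convert_y_to_keras(y_train[y_train == y_train[0]])

y_pred_after = clf.predict(X_test)   # behaviour may have silently changed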

Other considerations

  • A deprecation might be sensible rather than making the method private right away.
  • We could consider providing an encode_y function that uses the label encoder, if users want to convert labels the same way (see the sketch after this list).
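A rough sketch of what such a side-effect-free encode_y could look like; the encoder attribute names (label_encoder, onehot_encoder) are assumptions based on the current convert_y_to_keras implementation:

def encode_y(self, y):
    """Encode y with the encoders already fitted during fit, without refitting."""
    y_int = self.label_encoder.transform(y)  # transform only, unlike convert_y_to_keras
    return self.onehot_encoder.transform(y_int.reshape(-1, 1))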

Fixes #5864.

Reference Issues/PRs

What does this implement/fix? Explain your changes.

This fixes a "bug" in the sense that a public method of the deep classifiers has a misleading name and should not be called by unsuspecting users.
The method sounds like it will be side-effect free, but actually changes the state of the estimator substantially.

Does your contribution introduce a new dependency? If yes, which one?

No dependencies added.

What should a reviewer concentrate their feedback on?

Please focus on how this fits into the overall API from a user perspective.
I would be especially interested in whether the core functionality of a classifier (fit, predict, evaluate) remains ergonomic.
Perhaps there were also reasons why convert_y_to_keras was initially public.

Did you add any tests for the change?

No tests were added. As per the discussion in #5864, there is already a test that classifier.predict is deterministic and free from side-effects.

Any other comments?

  • I don't fully understand how users of the classifiers will use the predict methods, as reversing the label/one-hot encoding would be required to get y_pred in a format that matches the training data input. Perhaps we should offer a side-effect-free encode_y method, or simply reverse the label/one-hot encoding inside the predict method. Furthermore, I am not sure that the following works as expected: train with X_train and y_train_multilabel, predict on X_test, score on X_test and y_test_multilabel. I suspect that classifier.predict(X_test) will return one-hot encoded results, while classifier.score(X_test, y_test_multilabel) will receive y in multilabel format.
  • We may want to introduce a deprecation period first, instead of directly making the method private to avoid breaking existing code. To that end, we could have convert_y_to_keras issue a DeprecationWarning and then call _convert_y_to_keras.

PR checklist

For all contributions
  • I've added myself to the list of contributors with any new badges I've earned :-)
    How to: add yourself to the all-contributors file in the sktime root directory (not the CONTRIBUTORS.md). Common badges: code - fixing a bug, or adding code logic. doc - writing or improving documentation or docstrings. bug - reporting or diagnosing a bug (get this plus code if you also fixed the bug in the PR). maintenance - CI, test framework, release.
    See here for full badge reference
  • Optionally, for added estimators: I've added myself to the maintainers tag - do this if you want to become the owner or maintainer of an estimator you added.
    See here for further details on the algorithm maintainer role.
  • The PR title starts with either [ENH], [MNT], [DOC], or [BUG]. [BUG] - bugfix, [MNT] - CI, test framework, [ENH] - adding or improving code, [DOC] - writing or improving documentation or docstrings.
For new estimators
  • I've added the estimator to the API reference - in docs/source/api_reference/taskname.rst, follow the pattern.
  • I've added one or more illustrative usage examples to the docstring, in a pydocstyle compliant Examples section.
  • If the estimator relies on a soft dependency, I've set the python_dependencies tag and ensured
    dependency isolation, see the estimator dependencies guide.

@cedricdonie cedricdonie changed the title Make deep classifier's convert_y_to_keras private [BUG] Make deep classifier's convert_y_to_keras private May 1, 2024
@fkiraly fkiraly added module:classification classification module: time series classification enhancement Adding new functionality labels May 1, 2024
Collaborator

@fkiraly fkiraly left a comment

I don't fully understand how users of the classifiers will use the predict methods, as reversing label/one-hot encoding would be required to get y_pred in a format that matches the training data input.

That needs to be the case indeed, or it would violate the API contract - I am perplexed why the tests did not catch this. Some possible explanations:

  • scenarios covered only the case of two classes. There is not a single scenario that has three (the one that has three was turned off). I am not sure why this is the case; 2 years ago I refactored the tests from an earlier state into the scenarios.
  • some classifiers, including InceptionTime, are on the exception list in tests._config. If an estimator is on this list, no tests from the check_estimator suite are run, i.e., the CI in this PR does not test the InceptionTimeClassifier.

Regarding how to handle: indeed, the one-hot-encoding needs to be inverted. The sklearn one, strangely, does not seem to have the obvious inverse_transform, so we need to do this manually.
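For reference, a minimal sketch of that manual inversion, assuming probs has shape (n_instances, n_classes) and classes_ holds the class labels in encoder order:

import numpy as np

def decode_onehot(probs, classes_):
    """Map one-hot / probability rows back to the original class labels."""
    return np.asarray(classes_)[np.argmax(probs, axis=1)]

# decode_onehot(np.eye(3), ["a", "b", "c"]) -> array(['a', 'b', 'c'], dtype='<U1')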

To that end, we could have convert_y_to_keras issue a DeprecationWarning and then call _convert_y_to_keras.

Agree - probably no one is using it, but better be safe. I would direct it to the private method, and schedule deprecation for 0.31.0 (the earliest release possible given the policy). Target is complete removal.

@fkiraly
Collaborator

fkiraly commented May 2, 2024

Regarding DL classifiers and multiclass: I ran all tests with a scenario of three classes, no DL classifiers are failing.
See #6376

I think this is because the "test all" workflow is missing coverage, see #6315 - this also applies to the "test all" workflow, i.e., no deep learning classifiers are tested when I trigger "test all".

@yarnabrina, how would we test all deep learning estimators? It seems we are missing some serious issues.
How about a "test all DL" workflow?

@yarnabrina
Collaborator

How about a "test all DL" workflow?

We may do that. I also think we should add one to the normal new CI, which will cover DL for pandas v2.

@cedricdonie
Author

cedricdonie commented May 2, 2024

Regarding DL classifiers and multiclass: I ran all tests with a scenario of three classes, no DL classifiers are failing. See #6376

I think this is because the "test all" workflow is missing coverage, see #6315 - this also applies to the "test all" workflow, i.e., no deep learning classifiers are tested when I trigger "test all".

It seems that the tests are correctly passing because InceptionTime does fulfill the API contract. The following is effectively an argmax, returning y_pred as class labels ([class1, class2, class3, class1]) rather than one-hot encoded.

def _predict(self, X, **kwargs):
    probs = self._predict_proba(X, **kwargs)
    rng = check_random_state(self.random_state)
    return np.array(
        [
            self.classes_[int(rng.choice(np.flatnonzero(prob == prob.max())))]
            for prob in probs
        ]
    )

I removed the exclusion of InceptionTime (see patch below) to ensure that a deep classifier is triggered and ran the tests locally via pytest -vvv sktime/classification/tests/test_all_classifiers.py::TestAllClassifiers. I also used git cherry-pick --no-commit efcdcd8bd1208618d1 319578e6729047b 3d75693b2968b6f to get the commits from #6374. All tests pass, even the assertion that y_pred is always one-dimensional.

diff --git a/sktime/classification/deep_learning/base.py b/sktime/classification/deep_learning/base.py
index 4276d0b8..88aa6424 100644
--- a/sktime/classification/deep_learning/base.py
+++ b/sktime/classification/deep_learning/base.py
@@ -72,12 +72,14 @@ class BaseDeepClassifier(BaseClassifier, ABC):
     def _predict(self, X, **kwargs):
         probs = self._predict_proba(X, **kwargs)
         rng = check_random_state(self.random_state)
-        return np.array(
+        ret = np.array(
             [
                 self.classes_[int(rng.choice(np.flatnonzero(prob == prob.max())))]
                 for prob in probs
             ]
         )
+        assert ret.ndim == 1
+        return ret
 
     def _predict_proba(self, X, **kwargs):
         """Find probability estimates for each class for all cases in X.
diff --git a/sktime/classification/tests/test_all_classifiers.py b/sktime/classification/tests/test_all_classifiers.py
index 173edc08..33ef651b 100644
--- a/sktime/classification/tests/test_all_classifiers.py
+++ b/sktime/classification/tests/test_all_classifiers.py
@@ -88,8 +88,8 @@ class TestAllClassifiers(ClassifierFixtureGenerator, QuickTester):
         # check predict
         assert isinstance(y_pred, np.ndarray)
         assert y_pred.shape == (X_new_instances,)
+        assert y_pred.ndim == y_train.ndim
         assert np.all(np.isin(np.unique(y_pred), np.unique(y_train)))
-
         # check predict proba (all classifiers have predict_proba by default)
         y_proba = scenario.run(estimator_instance, method_sequence=["predict_proba"])
         assert isinstance(y_proba, np.ndarray)
diff --git a/sktime/tests/_config.py b/sktime/tests/_config.py
index c2160a9c..63d941a9 100644
--- a/sktime/tests/_config.py
+++ b/sktime/tests/_config.py
@@ -36,7 +36,7 @@ EXCLUDE_ESTIMATORS = [
     "EditDist",
     "CNNClassifier",
     "FCNClassifier",
-    "InceptionTimeClassifier",
+    #"InceptionTimeClassifier",
     "LSTMFCNClassifier",
     "MLPClassifier",
     "MLPRegressor",

Could someone please confirm that this interpretation is correct?

@fkiraly Should we add the assert y_pred.ndim == y_train.ndim test to the classifier tests or is it redundant?

With regard to the next steps, I suggest simply adding the deprecation as discussed and closing the PR.

@fkiraly
Collaborator

fkiraly commented May 2, 2024

Could someone please confirm that this interpretation is correct?

What interpretation - that the tests passing imply that y_pred has dimension 1 (in the single label case)? Yes, I think this is correct - test_classifier_output, line 90.

@fkiraly Should we add the assert y_pred.ndim == y_train.ndim test to the classifier tests or is it redundant?

I am not sure that this would always be true (at least if we want to cover the general contract).

This is because we have added multilabel classification, and for that a 2D numpy or pd.DataFrame y is possible. However, to ensure downwards compatibility, predict always returns a 1D y if the input was (n, 1), i.e., univariate.

It might be true for all current scenarios, but I would be cautious about adding a check that is not robust to adding further, contract-compliant scenarios.
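To illustrate with hypothetical shapes (not an existing test case):

import numpy as np

# a single-column 2D y is a contract-compliant training input ...
y_train = np.array([["a"], ["b"], ["a"]])   # shape (3, 1), ndim == 2
# ... but predict may flatten it back to 1D for downwards compatibility,
# so a `y_pred.ndim == y_train.ndim` check would fail on valid output
y_pred = np.array(["a", "b", "a"])          # shape (3,), ndim == 1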

@fkiraly
Collaborator

fkiraly commented May 2, 2024

With regard to the next steps, I suggest simply adding the deprecation as discussed and closing the PR.

If you mean: merge (i.e., add to the code base; not close = throw away), then, yes, I agree.

@fkiraly
Collaborator

fkiraly commented May 9, 2024

@cedricdonie, is this ready to merge?

fkiraly
fkiraly previously approved these changes May 22, 2024
Collaborator

@fkiraly fkiraly left a comment


I suppose it is? Looks good to me!

@cedricdonie
Author

Hi @fkiraly, the deprecation is still missing. If no deprecation is needed, we can merge this PR.

Sorry for the late reply. I can try to add the deprecation in the next few days.

@fkiraly
Collaborator

fkiraly commented May 23, 2024

Oh, I see - of course.

The next minor release is scheduled for May 26, so if you could add it by then, that would be appreciated.

I would imagine it looks like this

from sktime.utils.warnings import warn

# todo 0.31.0: remove the public convert_y_to_keras
def convert_y_to_keras(self, y):
    warn(
        "warning, convert_y_to_keras of sktime deep learning estimators is deprecated "
        "and will be removed in 0.31.0. "
        "For equivalent behaviour, please use sklearn OneHotEncoder.fit_transform directly.",
        obj=self,
    )
    return self._convert_y_to_keras(y=y)

Deprecate BaseDeepClassifier convert_y_to_keras

`convert_y_to_keras` should not be a public method, as explained in
d1a8029. This change issues a
deprecation warning when the method is called and delegates to the
private method, which is used by all deriving classes. The deprecation
message starts with "WARNING" to match the other deprecation message in
the modified file.
@cedricdonie
Author

Hi @fkiraly, I have added the deprecation warning and all tests are passing on my local machine. Shall we merge the PR?

After testing the warning output, the additional "WARNING" text seems
redundant and is removed. Furthermore, spacing is fixed.
Labels: enhancement (Adding new functionality), module:classification (classification module: time series classification)
Development

Successfully merging this pull request may close these issues.

[BUG] InceptionTime does not support validation data with fewer classes than in the training data
3 participants