[ENH] classification test scenario with three classes #6374

fkiraly · 2024-05-01T21:46:08Z

This adds a classification test scenario with three classes.

Previously, only two-class scenarios were tested.

cedricdonie · 2024-05-02T15:18:56Z

sktime/utils/_testing/scenarios_classification.py

+        "X_univariate": True,
+        "X_unequal_length": False,
+        "is_enabled": True,
+        "n_classes": 3,


Would there be any harm in going to more than three classes? I understand that we are interested in multi-class prediction rather than three-class prediction, so e.g., five classes might catch more errors. Or perhaps this test can be made parametric with anywhere from three to e.g., 15 classes to increase coverage further (at the cost of more runtime)?

fkiraly · 2024-05-02 (edited)

Yes, you are right, of course - as you say, it is runtime that worries me, together with the minimum sample size requirement.

Some classifiers run a grid search internally, with 5-fold cross-validation as the default. So you'd want to see at least 3 instances of each class in the training set, which gets you to n_instances = n_classes * 4 (or better, 5 or 6).
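The sample-size arithmetic above can be sketched as follows. This is only an illustration, assuming stratified folds that spread each class as evenly as possible; the helper names are hypothetical and not part of sktime.

```python
import math


def min_train_per_class(per_class: int, n_folds: int = 5) -> int:
    """Worst-case number of instances of one class left in a training split.

    Assumption: stratified k-fold, so the largest held-out fold takes
    ceil(per_class / n_folds) instances of the class.
    """
    if n_folds < 2:
        raise ValueError("n_folds must be at least 2")
    return per_class - math.ceil(per_class / n_folds)


def min_instances(n_classes: int, min_train: int = 3, n_folds: int = 5) -> int:
    """Smallest balanced n_instances with >= min_train per class in training."""
    per_class = min_train
    while min_train_per_class(per_class, n_folds) < min_train:
        per_class += 1
    return n_classes * per_class


for n_classes in (2, 3, 5, 15):
    print(n_classes, min_instances(n_classes))
```

Under these assumptions, 4 instances per class is the smallest balanced choice that leaves 3 per class in every training split, which matches the n_classes * 4 figure.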

Many classifiers scale between quadratically and cubically in the number of instances, and the scenario is run on each of a large number of classifiers. So going from 3 to about 4 classes roughly doubles the runtime caused by this scenario, I would guess, which already accounts for about 1/3 or 1/4 of the total classifier runtime.
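As a rough check of this estimate: if n_instances grows in proportion to n_classes and fit time scales as n_instances ** p with p between 2 and 3, the relative cost is (new / old) ** p. The function name below is hypothetical, and constant factors and per-classifier differences are ignored.

```python
def runtime_ratio(old_classes: int, new_classes: int, power: float) -> float:
    """Relative runtime, assuming n_instances scales linearly with n_classes."""
    return (new_classes / old_classes) ** power


for new in (4, 5):
    low = runtime_ratio(3, new, 2)   # quadratic scaling
    high = runtime_ratio(3, new, 3)  # cubic scaling
    print(f"3 -> {new} classes: {low:.2f}x to {high:.2f}x")
```

Under these assumptions, moving from 3 to 4 classes costs between about 1.8x and 2.4x, consistent with "doubles"; moving to 5 classes costs between about 2.8x and 4.6x.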

We can of course check empirically how much it really is, if you would like to, I do not mind - though I wonder whether 5 classes give that much more coverage than 3. My gut feeling is that if something breaks with 5, it already breaks with 3, e.g., the one-hot encoder example.

3 class

efcdcd8

fkiraly added the module:classification (classification module: time series classification) and enhancement (Adding new functionality) labels May 1, 2024

fkiraly requested review from achieveordie, benHeid and yarnabrina as code owners May 1, 2024 21:46

fkiraly mentioned this pull request May 1, 2024

[BUG] Make deep classifier's convert_y_to_keras private #6373 (Open, 6 tasks)

Update scenarios_classification.py

319578e

This was referenced May 2, 2024

[BUG] classifiers failing on multiclass scenario due to _get_train_probs #6376 (Open)

[BUG] fix _get_train_probs in some classifiers to accept any input data type #6377 (Open)

Update scenarios_classification.py

3d75693

cedricdonie reviewed May 2, 2024

