Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] all_objects redirects sys.stdout permanently #327

Closed
weenerplasticsgroup opened this issue May 11, 2024 · 7 comments · Fixed by #328
Closed

[BUG] all_objects redirects sys.stdout permanently #327

weenerplasticsgroup opened this issue May 11, 2024 · 7 comments · Fixed by #328
Labels
bug Something isn't working

Comments

@weenerplasticsgroup
Copy link

weenerplasticsgroup commented May 11, 2024

Describe the bug
Sktime changes stdout, which breaks a jupyter notebook

To Reproduce

from sktime.registry import all_estimators
import sys
orig_stdout = sys.stdout
all_estimators(estimator_types="clusterer", filter_tags="capability:unequal_length")
new_stdout = sys.stdout
print(f"sktime changes stdout from {orig_stdout} to {new_stdout}")

sktime changes stdout from <ipykernel.iostream.OutStream object at 0x7f1a0ea77f10> to <_io.TextIOWrapper name='' mode='w' encoding='utf-8'>

Expected behavior
No changes

Versions

Additional context

Please also not that not of them work, see sktime/sktime#6276

from pydoc import locate
from sktime.registry import all_estimators
from sktime.datasets import load_acsf1

# create test data set
RANDOM_STATE= 2
no_of_unknown_clusters = 5
X, _ = load_acsf1(return_type='pd-multiindex')
# remove last 10 rows from the last appliance to simulate unequal time series 
X_mod = X.iloc[:-10]


for model in all_estimators(estimator_types="clusterer", filter_tags="capability:unequal_length"):
    sys.stdout = orig_stdout  # to fix stdout bug
    obj = model[1]
    unequal_clst = locate(".".join([obj.__module__, obj.__name__]))
    try:
        unequal_clst().fit(X=X_mod) 
        print(f"{obj.__name__} did not fail")
    except TypeError:
        print(f"{obj.__name__} failed on typeerror")
    except ValueError as e:
         print(f"{obj.__name__} failed on {e}")
ClustererPipeline failed on typeerror
SklearnClustererPipeline failed on typeerror
TimeSeriesDBSCAN failed on typeerror
TimeSeriesKMeans failed on The data has unequal length series, this clusterer cannot handle unequal length series
TimeSeriesKMeansTslearn failed on The data has unequal length series, this clusterer cannot handle unequal length series
TimeSeriesKMedoids failed on The data has unequal length series, this clusterer cannot handle unequal length series
TimeSeriesKShapes failed on The data has unequal length series, this clusterer cannot handle unequal length series
TimeSeriesKernelKMeans failed on The data has unequal length series, this clusterer cannot handle unequal length series
TimeSeriesLloyds failed on typeerror
@fkiraly
Copy link
Contributor

fkiraly commented May 11, 2024

Thanks for reporting! This might be coming from scikit-base all_objects.

Could you kindly report:

  • your sktime version, use show_versions from sktime
  • if this is still the case if you switch suppress_import_stdout to False?

@fkiraly
Copy link
Contributor

fkiraly commented May 11, 2024

Regarding the "unequal length does not work":

  • you need to pass a dict as filter_tags, i.e., {"capability:unequal_length": True} at the moment. This may be unintuitive, so perhaps we want to allow strings to mean "True".
  • for some estimators, whether it can handle unequal length daata may depend on the values, e.g., DBSCAN needs to be given a distance that can handle unequal length time series.

@AxelJanRousseau
Copy link

I have the same issue: after calling all_estimators, print stops working in jupyter notebooks.
Using suppress_import_stdout=False fixes the issue for me.

For reference, here is my output of show_versions:

System:
python: 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:40:08) [MSC v.1938 64 bit (AMD64)]
executable: c:\Users\user\Miniconda3\envs\timeseries\python.exe
machine: Windows-10-10.0.19045-SP0

Python dependencies:
pip: 24.0
sktime: 0.28.0
sklearn: 1.4.1.post1
skbase: 0.7.5
numpy: 1.26.4
scipy: 1.13.0
pandas: 2.2.1
matplotlib: 3.8.3
joblib: 1.3.2
numba: 0.59.1
statsmodels: 0.14.1
pmdarima: 2.0.4
statsforecast: 1.7.3
tsfresh: None
tslearn: None
torch: 2.2.2
tensorflow: None
tensorflow_probability: None
c:\Users\user\Miniconda3\envs\timeseries\lib\site-packages\statsforecast\core.py:26: TqdmExperimentalWarning: Using tqdm.autonotebook.tqdm in notebook mode. Use tqdm.tqdm instead to force console mode (e.g. in jupyter console)
from tqdm.autonotebook import tqdm
c:\Users\user\Miniconda3\envs\timeseries\lib\site-packages\statsforecast\utils.py:237: FutureWarning: 'M' is deprecated and will be removed in a future version, please use 'ME' instead.
"ds": pd.date_range(start="1949-01-01", periods=len(AirPassengers), freq="M"),

@fkiraly
Copy link
Contributor

fkiraly commented May 14, 2024

For reference, here is my output of show_versions:

Thanks, @AxelJanRousseau.

Just to check, are the last lines also sth you get when using show_versions? I.e., the TqdmExperimentalWarning etc?

@AxelJanRousseau
Copy link

are the last lines also sth you get when using show_versions? I.e., the TqdmExperimentalWarning etc?

Those lines were part of the output from show_versions, and I included them just to make sure.

But I tried again just now in a clean notebook and the warnings disappeared, so it was probably related to the boatload of other imports I had open in my notebook

@fkiraly
Copy link
Contributor

fkiraly commented May 14, 2024

This is an skbase bug, moving to the repository.

@fkiraly fkiraly transferred this issue from sktime/sktime May 14, 2024
@fkiraly
Copy link
Contributor

fkiraly commented May 14, 2024

Attempted fix here:
#328

Review and testing would be appreciated.

@fkiraly fkiraly changed the title [BUG] all_estimators redirect std_out [BUG] all_objects redirects sys.stdout permanently May 14, 2024
fkiraly added a commit that referenced this issue May 14, 2024
This fixes #327, which caused a
permanently muted `stdout` after `all_objects` call.

The bug was due to the `sys.stdout` not being properly reset after
muting, in one of two instances where this was done.

This PR also replaces both instances with a muter context manager - the
implementation of the muting was a bit brittle, relying on manual
try/except and temp stdout handling which is risky.

Includes a test that the `sys.stdout` is indeed unchanged.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants