[ENH] set random seed in TestAllForecasters data generation - potential solution for sporadic failures #6382

Draft · wants to merge 1 commit into base: main
Conversation

@fkiraly (Collaborator) commented on May 3, 2024

@benHeid suggested that sporadic test failures and long test times in #6344 could be related to LU decomposition or similar issues in ARIMA - compare #6201.

This PR aims to help with diagnosis by setting the random seed in data generation in TestAllForecasters. This should greatly reduce the number of different time series occurring and hopefully make the failure behaviour - if impacted - deterministic.
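
For illustration, a minimal sketch of the idea - a fixed seed in the random series generation used by the test scenarios. The helper name, signature, and seed value below are simplified placeholders, not the actual sktime code:

```python
import numpy as np

RANDOM_SEED = 42  # hypothetical fixed seed shared by all generated test series


def _make_test_series(n_timepoints=50, random_state=RANDOM_SEED):
    """Generate a reproducible random-walk series for forecaster tests."""
    rng = np.random.default_rng(random_state)
    return np.cumsum(rng.normal(size=n_timepoints))
```

With the seed fixed, every test run sees the same series, so a failure that depends on the data should either always occur or never occur.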

FYI @yarnabrina

The CI should run all forecaster tests because of the "test class has changed" criterion.

@fkiraly added the module:forecasting, do not merge, and diagnostics labels on May 3, 2024
@fkiraly (Collaborator, Author) commented on May 3, 2024

hm, no failures but long runtimes - StatsForecastAutoTBATS and StatsForecastAutoTheta specifically.

@benHeid (Contributor) commented on May 5, 2024

hm, no failures but long runtimes - StatsForecastAutoTBATS and StatsForecastAutoTheta specifically.

Was this one run or multiple runs?

After taking a look at the runtimes, based on the one action run I observe:

  • macOS is slow on Python 3.8; the other Python versions seem to be much faster.
  • In some tests, I think on Ubuntu, the HFForecaster is quite slow (taking more than 50 secs). Not sure what the reason is. I suppose the download speed should not be the issue since the model used for testing shouldn't be that large.

@fkiraly (Collaborator, Author) commented on May 5, 2024

I suppose the download speed should not be the issue since the model used for testing shouldn't be that large.

Maybe it is latency? Perhaps this is some DDoS protection kicking in, not allowing too many downloads from the same IP address?

This is perhaps related to a new problem I have been seeing: sometimes when I try to access the logs, my virus scanner says the IP has been blacklisted.

Hypothesis:

  • there are many Hugging Face model downloads - each individual fit in the test suite triggers one
  • Hugging Face detects this with a DDoS filter - a false positive, but it causes the IP to be blocked temporarily, or blacklisted in shared IP blacklists used by antivirus providers
  • this causes the model load to fail or hang.

Could it be this, @benHeid, @yarnabrina?

If yes, it may indicate we need to think carefully about testing of Hugging Face based models - we had similar issues with downloads earlier, so we moved them out to a separate "downloads" CI element. Only this time, the downloads are attached to models.

@benHeid (Contributor) commented on May 5, 2024

Hypothesis:

  • there are many Hugging Face model downloads - each individual fit in the test suite triggers one
  • Hugging Face detects this with a DDoS filter - a false positive, but it causes the IP to be blocked temporarily, or blacklisted in shared IP blacklists used by antivirus providers
  • this causes the model load to fail or hang.

Could it be this, @benHeid, @yarnabrina?

We need to test whether there are really that many downloads. As far as I know, Hugging Face caches downloads, so once a model is downloaded it shouldn't be downloaded again.
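
One way to check this locally would be to inspect the Hugging Face cache after a test run - a minimal sketch, assuming huggingface_hub is installed (this is just a diagnostic snippet, not part of the test suite):

```python
from huggingface_hub import scan_cache_dir

# list the repos currently in the local Hugging Face cache
cache_info = scan_cache_dir()
for repo in cache_info.repos:
    print(f"{repo.repo_id}: {repo.nb_files} files, {repo.size_on_disk} bytes")
```

Running the suite once to warm the cache and then re-running it with HF_HUB_OFFLINE=1 set would also show whether the cache is actually being reused or whether repeated downloads happen.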

@benHeid (Contributor) commented on May 5, 2024

Regarding the long runtimes of the auto models from StatsForecast: I suppose there is something strange with Python 3.8 (and Mac). Furthermore, I think this issue is not located in sktime:

Local measurement of the unit test execution time with different initializations of the model (four runs each, in seconds):

  • Python 3.8.19: 237.05, 227.80, 171.04, 161.69
  • Python 3.10.13: 65.56, 56.14, 43.60, 46.76

Measurements of the direct fit time (20 fits on a random time series of length 1000). I executed it multiple times, since the numbers fluctuate a lot:

  • Python 3.8: 24 - 54 sec
  • Python 3.10: 11 - 22 sec
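
For context, a rough sketch of this kind of direct-fit timing, assuming statsforecast is installed and using AutoTheta as a stand-in for the auto models (the data distribution and season_length are arbitrary choices, not the exact benchmark used):

```python
import time

import numpy as np
from statsforecast.models import AutoTheta

rng = np.random.default_rng(0)
y = rng.uniform(low=10.0, high=20.0, size=1000)  # random series of length 1000

start = time.perf_counter()
for _ in range(20):  # 20 fits, as in the measurement above
    AutoTheta(season_length=12).fit(y)
print(f"20 fits took {time.perf_counter() - start:.1f} sec")
```

Running the same script under Python 3.8 and Python 3.10 environments allows the same comparison as above.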
