
[MAJOR ISSUE] Error in the Calculation of Inverse Transformations for model.predict() #48

reinbugnot opened this issue Nov 6, 2023 · 0 comments


Hi Petronio, I'm Rein, a Master of AI student.

I wanted to use the PyFTS module for a specific part of my project.

However, in the FTS class, the predict method ends with the following lines, which apply the inverse of the transformation that was applied to the data during forecasting:

if not self.is_multivariate:
    kw['type'] = type
    ret = self.apply_inverse_transformations(ret, params=[data[self.max_lag - 1:]], **kw)

One thing I noticed here is that the test data itself is fed into the params argument of apply_inverse_transformations(), since, with a max_lag of 1 (first-order differencing), data[self.max_lag - 1:] is equal to data.
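To make the slicing concrete, here is a minimal standalone check (synthetic data, not PyFTS code) showing that with max_lag = 1 the slice passed as params is the entire test series:

```python
import numpy as np

# With first-order differencing, max_lag == 1, so the slice below
# starts at index 0 and covers the whole array.
max_lag = 1
data = np.array([10.0, 12.0, 11.0, 13.0])

param = data[max_lag - 1:]  # data[0:] -> the full test series

assert np.array_equal(param, data)
```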

But then, following the downstream operations, when I looked into the Differential class (/transformations/differential.py), I noticed that the default calculation used to perform the inverse transformation is the following:

if steps_ahead == 1:
    if type == "point":
        inc = [data[t] + param[t] for t in np.arange(0, n)]

where data is the forecasted differences y'(t), param is the test data itself, and inc is the output inverse-transformed forecast.

I would like to ask for clarification regarding the correctness of this logic. The value range of data (the forecasted differences) is very small, so the inverse-transformed forecast is dominated by the value of the test data itself. In fact, if we removed data[t] from [data[t] + param[t] for t in np.arange(0, n)] in the code block above, we would still get a "convincing forecast", because the function is practically returning the ground-truth values as 'forecasted values'.
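To illustrate the dominance effect, here is a self-contained sketch with synthetic data (not PyFTS code) that mimics the inc = data[t] + param[t] step. The names data, param, and inc follow the snippet above; the random series and magnitudes are my own assumptions for the demo:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: `param` plays the role of the ground-truth test
# series, and `data` plays the role of the forecasted first differences
# y'(t), which are small relative to the level of the series.
param = np.cumsum(rng.normal(0.0, 1.0, 200)) + 100.0
data = rng.normal(0.0, 0.1, 200)
n = len(param)

# The inverse step from Differential: inc = data[t] + param[t]
inc = np.array([data[t] + param[t] for t in np.arange(0, n)])

# Dropping the model's output entirely barely changes the result:
# the "forecast" still tracks the ground truth almost perfectly.
inc_without_model = param.copy()
print(np.corrcoef(inc, param)[0, 1])            # essentially 1
print(np.max(np.abs(inc - inc_without_model)))  # bounded by max |y'(t)|
```

Because the differences are tiny compared to the level of the series, the correlation between inc and the ground truth stays near 1 whether or not the model's output is included.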

This figure comes from the A_short_tutorial_on_Fuzzy_Time_Series _ Part_II_(with_an_case_study_on_Solar Energy) notebook that uses the Chen model. As shown, the forecasted data (in green) is able to capture the test data almost exactly because the test data itself was injected into the output forecast logic.
[figure omitted]

Here is my attempt at replicating the logic with my own data, using the same model.
[figure omitted]

Without using the training data for forecasts, and doing a 365-step-ahead forecast using only the starting test data point as an input, I get the following forecast for the same data as above:
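For comparison, a genuinely out-of-sample inverse for first-order differencing only needs the starting level: each step is reconstructed from the previously reconstructed value rather than from the ground truth. This is a hypothetical sketch (the function name and signature are mine, not PyFTS's):

```python
import numpy as np

def inverse_difference_multistep(forecast_diffs, start_value):
    """Reconstruct levels from forecasted first differences using only
    the starting point: y(t) = y(t-1) + y'(t)."""
    levels = [start_value]
    for d in forecast_diffs:
        # Each step builds on the previous *reconstructed* value,
        # never on the held-out test data.
        levels.append(levels[-1] + d)
    return np.array(levels[1:])

diffs = np.array([0.5, -0.2, 0.1])
print(inverse_difference_multistep(diffs, 100.0))
```

A forecast reconstructed this way can drift away from the test series, which is exactly why it is a more honest picture of multi-step accuracy than one that adds each forecasted difference to the ground-truth level.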
[figure omitted]

Is there something I'm missing? I think this bug is critical to the entire forecasting logic of the FTS model.
