Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH], [BUG]: Adds 27 fpp3 datasets; fixes/enhances plot_series(); stl decomposition plot; predict_interval() for polynomial forecaster #6404

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

ericjb
Copy link

@ericjb ericjb commented May 10, 2024

Reference Issues/PRs

Fixes #5895

What does this implement/fix? Explain your changes.

  1. Adds 27 datasets from Hyndman's fpp3. User entry point is sktime.datasets.load_fpp3(). The returned object has appropriate mtype with multiindex as required. The load_fpp3() function could be of interest to the broader Python community, including those not otherwise using sktime. See the accompanying pdf for details.
  2. Adds a new method STLForecaster.plot_components() to plot the STL decomposition
  3. sktime.utils.plot_series() now handles xticks following matplotlib defaults. (Fixes [BUG] Specification of xticks for matplotlib axis returned by plot_series does not work as expected #5895) As a result fig,ax objects returned by plot_series() can be used correctly by matplotlib. See accompanying pdf for details including some before and after plot comparisons.
  4. Implemented predict_interval() method for sktime.forecasting.PolynomialTrendForecaster. Note that the intervals are computed based on Hyndman's formulas in FPP3 (Section 7.9).
  5. Fixed (or enhanced) sktime.utils.plot_series() so that it can plot multiple coverage levels if present in the pred_interval multiindex.

What should a reviewer concentrate their feedback on?

Did you add any tests for the change?

No.

Any other comments?

A reviewer should read the attached pdf that provides details. There is considerable information there, including details about the datasets and code examples for the new functionality (and bug fixes).

For all contributions
  • The PR title starts with either [ENH], [MNT], [DOC], or [BUG]. [BUG] - bugfix, [MNT] - CI, test framework, [ENH] - adding or improving code, [DOC] - writing or improving documentation or docstrings.
    PullRequestNotes.pdf

Eric Berger and others added 7 commits April 4, 2024 15:31
…_components to forecasting/trend/_stl_forecaster.py
Fixed handling of PeriodIndex in predict_interval(). Also changed from Student t distribution to Normal distribution following Hyndman's formulas in Section 7.9 in FPP3.
@fkiraly
Copy link
Collaborator

fkiraly commented May 10, 2024

This is excellent! Lots of useful contributions!

May I suggest to split these up into multiple PR, I would say four or five?

That makes it much easier to review, since this PR touches topics which need different reviewers etc.
(you can see this if you look at the tags I just added. It is almost easier to list the tags I did not add 😁 )

I would suggest to leave this branch open until things are split into multiple PR, to avoid overwriting or deleting things by accident.

@fkiraly fkiraly added module:forecasting forecasting module: forecasting, incl probabilistic and hierarchical forecasting module:transformations transformations module: time series transformation, feature extraction, pre-/post-processing bugfix Fixes a known bug or removes unintended behavior enhancement Adding new functionality module:datasets&loaders data sets and data loaders module:plotting&utilities utilities including plottinng labels May 10, 2024
@ericjb
Copy link
Author

ericjb commented May 10, 2024

I need some guidance on the Git issues related to doing the multiple PRs. i) would these all be relative to the same sktime branch? ii) if so, which branch? and iii) how do I start from that branch? [If the Git issues are a problem maybe it's not so bad to coordinate with the multiple reviewers?]

@fkiraly
Copy link
Collaborator

fkiraly commented May 10, 2024

all PR should have their individual branches, and make PR to main - not another feature branch.

That should mean we have five feature branches, e.g., hyndman-fpp-datasets, stl-plot, etc, each with an associated PR to sktime:main.

Your fork's main now contains all changes, so this might be a bit tricky. Options:

  • create a new fork and add the branches, changes, incrementally
  • create a branch on your fork, e.g., sktime-main which is in sync with sktime main, leave ericjb:main intact. Branch off that branch, ericjb:sktime-main, with 5 branches corresponding to the parts, and PR to sktime:main.

If you are not sure how to do this, you can ask in dev-chat, or attend a meetup (1pm UTC, i.e., in a few minutes today or next week) for screensharing.

@fkiraly
Copy link
Collaborator

fkiraly commented May 10, 2024

PS: I realize it might be slightly painful because everything is on your main now - though it's a very very useful mini-workflow to learn, especially for someone as productive as you who has multiple ideas to work on at the same time! It's the only way to avoid a certain kind of catastrophic interaction from not separating too many innovations at once in the repo. sktime would long be collapsed if the core devs would not be doing that.

Once you know how to do it as part of your dev routine, you'll probably never want to go back to the "mix everything in one big pot" workflow...

@fkiraly
Copy link
Collaborator

fkiraly commented May 10, 2024

I thought of a third option, bit hacky but perhaps easiest to carry out:

  • take your current ericjb:main
  • make a branch for every featuer branch in the end
  • in that branch, copy the files from sktime main over the files you do not want to change, commit
  • make PR from all such branches to sktime

@fkiraly
Copy link
Collaborator

fkiraly commented May 10, 2024

let us know if you would like help with this.

@ericjb
Copy link
Author

ericjb commented May 11, 2024

I need some minor help with git. A few emails with a git expert should help resolve my issues and I will be able to take it from there. Thanks

@fkiraly
Copy link
Collaborator

fkiraly commented May 11, 2024

sure! If you can make it on our discord, you can catch one of the core devs, and I am sure we can sort it in 10-20min of direct exchange.

Discord link: https://discord.com/invite/54ACzaFsn7

@fkiraly
Copy link
Collaborator

fkiraly commented May 11, 2024

Also, related (source: https://xkcd.com/1597/)

git_2x

If that doesn't fix it, git.txt contains the phone number of a friend of mine who understands git. Just wait through a few minutes of 'It's really pretty simple, just think of branches as...' and eventually you'll learn the commands that will fix everything.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix Fixes a known bug or removes unintended behavior enhancement Adding new functionality module:datasets&loaders data sets and data loaders module:forecasting forecasting module: forecasting, incl probabilistic and hierarchical forecasting module:plotting&utilities utilities including plottinng module:transformations transformations module: time series transformation, feature extraction, pre-/post-processing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Specification of xticks for matplotlib axis returned by plot_series does not work as expected
2 participants