[ENH] Adding ADI/CV Feature Extractor #6336

shlok191 · 2024-04-26T05:20:15Z

Reference Issues/PRs

Fixes #6286. See also #6279 for more information about the original request!

What does this implement/fix? Explain your changes.

This PR implements a feature extractor that classifies a time series of demand over time into one of
4 categories (smooth, intermittent, erratic, lumpy), based on the guidelines
detailed in the paper: "The accuracy of Intermittent Demand Estimates" by J. Boylan, A. Syntetos.
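
To make the four categories concrete, here is a minimal sketch of the classification, not the code in this PR. It assumes the commonly quoted cut-offs of 1.32 for ADI and 0.49 for CV², approximates ADI as the number of periods divided by the number of non-zero periods, and computes CV² on the non-zero demands only. The function name, parameter names, and the choice of `<=` at the boundaries are illustrative assumptions.

```python
import numpy as np


def adi_cv2_category(y, adi_threshold=1.32, cv2_threshold=0.49):
    """Hypothetical helper: return (adi, cv2, label) for one demand series."""
    y = np.asarray(y, dtype=float)
    nonzero = y[y != 0]
    if nonzero.size == 0:
        raise ValueError("series contains no non-zero demand")

    # ADI approximated as total periods per period with non-zero demand
    adi = y.size / nonzero.size
    # squared coefficient of variation of the non-zero demand sizes
    cv2 = (nonzero.std() / nonzero.mean()) ** 2

    if adi <= adi_threshold:
        label = "smooth" if cv2 <= cv2_threshold else "erratic"
    else:
        label = "intermittent" if cv2 <= cv2_threshold else "lumpy"
    return adi, cv2, label
```

Low ADI with low CV² gives "smooth", low ADI with high CV² gives "erratic", high ADI with low CV² gives "intermittent", and high ADI with high CV² gives "lumpy".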

Does your contribution introduce a new dependency? If yes, which one?

The PR only requires pandas, which I do not believe will be a new dependency!

What should a reviewer concentrate their feedback on?

This is my first time implementing a transformer, so I would really appreciate any feedback on how I process my input and output types, and on whether the way I calculate the class labels is the way someone with more experience would do it!

Did you add any tests for the change?

Yes, I added 3 test parameters with their own ADI and CV threshold values to test how varying thresholds
can impact classification. I also set some thresholds to 0.0 to see how that might impact the labels given!

Requests for next steps

I am really hoping to test out this estimator locally (I am sure it still has some mistakes right now), so I would really appreciate some advice on test data I could try the estimator on, and on whether there is a recommended approach to testing a new feature.

This would also tie to the remaining items in my PR checklist, and will help me complete this issue!

PR checklist

For all contributions

I've added myself to the list of contributors with any new badges I've earned :-)
Optionally, for added estimators: I've added myself and possibly to the maintainers tag - do this if you want to become the owner or maintainer of an estimator you added.
The PR title starts with either [ENH], [MNT], [DOC], or [BUG]. [BUG] - bugfix, [MNT] - CI, test framework, [ENH] - adding or improving code, [DOC] - writing or improving documentation or docstrings.

For new estimators

I've added the estimator to the API reference - in docs/source/api_reference/taskname.rst, follow the pattern.
I've added one or more illustrative usage examples to the docstring, in a pydocstyle compliant Examples section.
If the estimator relies on a soft dependency, I've set the python_dependencies tag and ensured
dependency isolation, see the estimator dependencies guide.

fkiraly

Really nice! This is almost ready, there are small but blocking issues:

- this is a transformer, so it should be in the transformations module. I would suggest transformations.series.adi_cv as the file to put it in
- for testing, we need to add tests, since the test suite data currently does not expressly generate data in the four classes. It should not be too difficult to make synthetic data that fits any of the four categories. For consistency, I would suggest a new file transformations.series.tests.test_adi_cv.
- I would make features the first parameter, as that is probably the most "interesting" for users
- does the ADI value in your formula have the fraction the wrong way round?
- the changes to the data loaders seem unrelated but potentially valid - can you kindly make these in a different PR, and describe what they are about?

fkiraly · 2024-04-26T10:10:27Z

Just to help concretely with synthetic generation:

- for high CV2, sampling from normal and squaring should work
- for low CV2, constant or almost constant should work
- for high intermittency, set many values to 0
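
The three hints above can be combined into one generator, sketched here under stated assumptions rather than taken from the PR's test file. The helper name `make_demand`, the base level of 10, the noise scale of 0.25, and keeping only every fourth period are all illustrative choices.

```python
import numpy as np


def make_demand(n_periods, high_cv2, intermittent, seed=0):
    """Hypothetical helper: synthetic demand for one of the four categories."""
    rng = np.random.default_rng(seed)
    if high_cv2:
        # squared standard normal draws have mean 1 and variance 2, so CV^2 is about 2
        y = rng.normal(loc=0.0, scale=1.0, size=n_periods) ** 2
    else:
        # almost constant demand: assumed level 10 plus small noise, so CV^2 is near 0
        y = 10.0 + rng.normal(loc=0.0, scale=0.25, size=n_periods)
    if intermittent:
        # keep every 4th period and set all other periods to zero
        y[np.arange(n_periods) % 4 != 0] = 0.0
    return y
```

Calling it with the four combinations of `high_cv2` and `intermittent` gives one series each for smooth, erratic, intermittent, and lumpy demand. Longer series make the sample CV² less likely to land on the wrong side of a threshold.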

shlok191 · 2024-04-28T00:06:02Z

@fkiraly, I just pushed a new commit that addresses the suggestions! I had some points that I wanted to quickly discuss:

- I have moved the transformer to the appropriate location: sktime/transformations/series
- I have also added tests! I have added a file sktime/transformations/series/tests/test_adi_cv.py that generates time series of all 4 types. To keep the variance low, I pass a standard deviation of 0.25 to the np.random.normal() function; I hope that is okay! For high variance, I set the standard deviation to 1.0 and square the generated values, like you suggested
- I have made features the first parameter for the transformer's initialization now. I do agree that it will likely be more interesting for users. 😄
- I have reversed the numerator and denominator for my ADI calculations
- I accidentally reverted the merge from [BUG] Fix tsf data error log and make it more precise #6258. I had made my first commit from VS Code, which did not show me the code quality tests. So, I attempted to revert my commit and redo it from the command prompt, but I accidentally reverted the [BUG] Fix tsf data error log and make it more precise #6258 commit as well. Sorry about that, and I will fix it in the next commit when I add examples and API documentation!

fkiraly

Nice, thanks! Great addition!

- the changes to the datasets module are still present. If git behaviour is confusing, you can copy the files from main over the modified files and commit. PRs are squashed, so the individual commits around those files will not be visible.
- your get_test_params seems non-compliant. If you add a new estimator, please always check that check_estimator runs.

shlok191 · 2024-04-29T07:43:23Z

Thank you! I've made a new commit that adds in the reverted commit. Also, I believe I have fixed the error originating from the get_test_params() function. I really hope that it works now!

fkiraly

Nice!

Some comments:

- please do not delete other people from the "all contributors" file
- you are still making changes to the data loaders file. Kindly move these to a separate PR.
- your test test_adi_cv is not running - I would suggest you try running it locally on your computer before you push
- the failure related to get_params is genuine. Parameters of the estimator should be written to self in __init__ and not changed afterwards.
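
The last point follows the scikit-learn-style convention that `__init__` only stores its arguments. The sketch below illustrates it with a plain Python class, not the sktime base classes; the class name, parameter names, default values, and the simplified `get_params` are hypothetical.

```python
import inspect


class ThresholdLabeller:
    """Hypothetical estimator-like class showing the __init__ convention."""

    def __init__(self, features=None, adi_threshold=1.32, cv2_threshold=0.49):
        # store every argument unchanged under its own name, nothing else
        self.features = features
        self.adi_threshold = adi_threshold
        self.cv2_threshold = cv2_threshold

    def get_params(self):
        # simplified stand-in: read back whatever __init__ was called with
        names = inspect.signature(type(self).__init__).parameters
        return {name: getattr(self, name) for name in names if name != "self"}

    def fit(self, X=None):
        # resolved defaults go into a separate attribute with a trailing underscore,
        # so self.features still holds the value the user passed
        if self.features is None:
            self.features_ = ["adi", "cv2", "class"]
        else:
            self.features_ = list(self.features)
        return self
```

Because `fit` writes to `features_` and leaves `features` alone, `get_params` returns the same values before and after fitting, which is what parameter checks of this kind rely on.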

shlok191 · 2024-05-04T06:49:58Z

@fkiraly, I have some updates!

- Sorry about the changes to the contributors file. I think I'm doing something wrong with how the PR merges with main, which is leading to this issue with certain files. However, I copied all files from main this time. I still see some changes in .all-contributorsrc which I did not make, but I'll try to resolve them if they are still different from main.
- I have worked out the changes in test_adi_cv and I tested it locally before pushing it. The tests seem to be running okay locally at least!
- Yes, I'm starting to work on the get_params issue as well. I'm hoping that once I can get the tests working I'll start making progress on that end
- Could we possibly get on a call sometime to discuss adding test parameters? I'm struggling to think of some good ones, but I think talking this through with you will help me understand how I can find good edge cases! 😃

EDIT: From the CI, it seems that the tests fail sporadically and only pass sometimes... I'll look further into this!

fkiraly · 2024-05-04T09:43:17Z

Great! If you have any question about resolving the get_params issue, feel free to follow up.

shlok191 · 2024-05-06T22:57:09Z

@fkiraly, I believe that my code's working better now!

For my test data generation, I increased the variance associated with lumpy / erratic Series to assist with the test cases. I have also made sure not to modify any of the variables from the __init__ function of the transformer which has helped with the get_test_params related errors.

I'll start work on the final items in the checklists!

fkiraly

This is very nice!

This could be merged, I am just requesting improved documentation.

- docstring should mention the standard thresholds in the preamble.
- docstring should be clearer on the return of transform, e.g., what is in the columns
- the paper should be cited with its full reference in a References section, see other estimators that do that
- the estimator should be added to the API reference, docs/source/api_reference/transformations.rst, I would add it in the "Summarization" section of Series-to-Features transformers

shlok191 · 2024-05-07T00:21:55Z

Got it! I'll make the changes and push them soon! :)

shlok191 · 2024-05-09T02:16:20Z

@fkiraly, I just added in the required changes for the documentation, included the research paper's citation and added the changes to the docstrings like you requested!

Update: I understand that there are some failing checks but I am not sure if they are stemming from the ADICVTransformer. I am not sure of the reason for these...have you seen them before?

fkiraly

> I understand that there are some failing checks but I am not sure if they are stemming from the ADICVTransformer. I am not sure of the reason for these...have you seen them before?

These are probably related to the transformer. What it means is that somewhere in the file you used characters that cannot be parsed by Python.

I suspect the apostrophes in the citation you added, and that removing them will fix the issue.
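
One quick way to track down such characters is to scan the file for anything outside ASCII, for example typographic apostrophes pasted in with a citation. This is a generic sketch, not part of the PR; the helper name is hypothetical, and the thread does not confirm which character caused the failure.

```python
def find_non_ascii(text):
    """Hypothetical helper: list (line, column, character) for non-ASCII characters."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((lineno, col, ch))
    return hits
```

Reading the module with `open(path, encoding="utf-8").read()` and passing the result to this function gives 1-based line and column positions to inspect.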

fkiraly

This fixed it

shlok191 · 2024-05-10T19:47:14Z

@fkiraly, thanks! Closing the PR now.

fkiraly · 2024-05-10T19:49:21Z

closed by mistake, should get merged

fkiraly

Approving again.

Also added a full description in the docstring, and fixed an error with the "minus one" in the computation of the ADI.

shlok191 · 2024-05-18T22:21:58Z

@fkiraly, I have added a very rough first commit for the compositor discussed in #6279. I have implemented the _fit and _predict functions and am going to try and make progress on additional functions.

Could you please let me know if my approach for this problem seems okay? Also, please let me know if you'd like me to work in a new branch. I can try and make a fork from the primitives_transformer branch or possibly make a branch off main again. Thank you!

fkiraly · 2024-05-19T02:19:51Z

> @fkiraly, I have added a very rough first commit for the compositor discussed in #6279. I have implemented the _fit and _predict functions and am going to try and make progress on additional functions.
>
> Could you please let me know if my approach for this problem seems okay? Also, please let me know if you'd like me to work in a new branch. I can try and make a fork from the primitives_transformer branch or possibly make a branch off main again. Thank you!

@shlok191, that's great!

Yes, it would be great if you could work on this on a new branch. To make things easier, I have merged the parts with the ADI/CV transformer.

If you do not update from the branch, you should be able to continue on it.
If you do update, you may have to revert my last commit, to get your changes back.

fkiraly · 2024-05-19T02:22:19Z

Regarding the approach - this is nice, but I was imagining it more general.

That is, I thought it could take any series-to-primitives transformer, including the ADI/CV transformer. The transformer should also be an argument; to start with, assume it produces the categories for the time series in a single column. ADI/CV could be the default, though.
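
As a rough illustration of that more general design, the sketch below passes the categorising step in as an argument and fits one forecaster per category. It uses plain Python rather than the sktime interfaces, and every name in it (`GroupByCategoryForecaster`, `MeanForecaster`, `categorizer`, `forecaster_factory`) is hypothetical.

```python
from collections import defaultdict
from statistics import fmean


class MeanForecaster:
    """Hypothetical stand-in forecaster: predicts the pooled mean of its group."""

    def fit(self, series_list):
        self.level_ = fmean(value for series in series_list for value in series)
        return self

    def predict(self, horizon):
        return [self.level_] * horizon


class GroupByCategoryForecaster:
    """Hypothetical compositor: one forecaster per category of series."""

    def __init__(self, categorizer, forecaster_factory):
        # categorizer maps one series to a label; an ADI/CV rule is one option
        self.categorizer = categorizer
        self.forecaster_factory = forecaster_factory

    def fit(self, panel):
        groups = defaultdict(list)
        self.labels_ = {}
        for name, series in panel.items():
            label = self.categorizer(series)
            self.labels_[name] = label
            groups[label].append(series)
        # fit a fresh forecaster on the series of each category
        self.forecasters_ = {
            label: self.forecaster_factory().fit(members)
            for label, members in groups.items()
        }
        return self

    def predict(self, name, horizon):
        # route the request to the forecaster of the series' category
        return self.forecasters_[self.labels_[name]].predict(horizon)
```

Any function that returns one label per series can be swapped in as `categorizer`, which is the sense in which the categorising transformer is "also an argument".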

shlok191 · 2024-05-19T04:00:11Z

Thank you so much for the merge, I'll make a new branch and copy over any new changes I made 🙂
Also, got it. I can definitely make changes to make this more general. I'll get to the updates!

shlok191 added 3 commits April 25, 2024 13:24

completed code for ADI/CV feature extractor

4d3d89a

Revert "[BUG] Fix tsf data error log and make it more precise (sktime…

84d7a95

…#6258)" This reverts commit c998b3d.

completed code for ADI/CV feature extractor

da23654

shlok191 requested review from achieveordie, benHeid, fkiraly and yarnabrina as code owners April 26, 2024 05:20

VascoSch92 reviewed Apr 26, 2024

sktime/classification/feature_based/_adi_cv_classifier.py

VascoSch92 reviewed Apr 26, 2024

sktime/datasets/_single_problem_loaders.py

fkiraly requested changes Apr 26, 2024

fkiraly reviewed Apr 26, 2024

sktime/classification/feature_based/_adi_cv_classifier.py

fkiraly reviewed Apr 26, 2024

sktime/classification/feature_based/_adi_cv_classifier.py

updated code according to suggestions

cf53140

fkiraly requested changes Apr 28, 2024

added example code and refined code behavior

6194f78

fkiraly requested changes Apr 29, 2024

fixed tests for ADI/CV transformer

ab626f5

shlok191 added 2 commits May 6, 2024 03:26

Refined test data generation

666eaeb

Fixed bug with adi_cv code

584af12

fkiraly added the enhancement and module:transformations labels May 7, 2024

Merge branch 'main' into pr/6336

ca5bcaa

fkiraly requested changes May 7, 2024

shlok191 added 2 commits May 8, 2024 21:13

Improved docstrings and added references

6ddefdd

Merge branch 'primitives_transformer' of github.com:shlok191/sktime i…

7b10dfd

…nto primitives_transformer

fkiraly requested changes May 9, 2024

Removed apostrophe from citation

ce9f106

fkiraly previously approved these changes May 10, 2024

shlok191 closed this May 10, 2024

fkiraly reopened this May 10, 2024

fkiraly added 2 commits May 11, 2024 01:11

docstring, fixes

f85487c

Merge branch 'primitives_transformer' of https://github.com/shlok191/…

6e76978

…sktime into pr/6336

fkiraly dismissed their stale review via 6e76978 May 11, 2024 00:11

fkiraly previously approved these changes May 11, 2024

Update adi_cv.py

ccbab9e

fkiraly dismissed their stale review via ccbab9e May 11, 2024 00:13

fkiraly added 3 commits May 11, 2024 01:15

Update adi_cv.py

7bcb4ee

more test cases

12d2899

fix incorrect check of features

25890cc

fkiraly mentioned this pull request May 14, 2024

[ENH] Composition forecaster that clusters time series and forecasts by cluster #6279

Open

Rough draft of ADI/CV based compositor

e3b7530

Revert "Rough draft of ADI/CV based compositor"

77b51c4

This reverts commit e3b7530.

fkiraly merged commit 76725c5 into sktime:main May 19, 2024
1 of 2 checks passed

shlok191 deleted the primitives_transformer branch May 19, 2024 22:31
