New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENH] Refactor base series annotator #6265
base: main
Are you sure you want to change the base?
Conversation
ClaSP uses `fmt` to change the output format, so it has been kept as a attribute of the `ClaSPSegmentation` class.
sktime/annotation/adapters/_pyod.py
Outdated
""" | ||
|
||
_tags = {"python_dependencies": "pyod"} | ||
|
||
def __init__(self, estimator, fmt="dense", labels="indicator"): | ||
def __init__(self, estimator, learning_type="unsupervised"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't pyod
always unsupervised?
As discussed in the last developer meeting, the approach using tags is showcased here: #6271 This allows to keep all "property"-like attributes in one, inspectable place. |
PS: if you migrate to tags, you need to merge the parts of #6271 as well that add tags to the registry. |
I am currently reviewing and will write a longer review. Regarding the last question, would a solution not listed clear things up: transforming the last interval to |
For that example yes but I think it becomes less clear if we have an index that is not equally spaces. For example, taking the example from before and making it into a proper pandas series we get,
the interval of the final segment could be
For example, I don’t think an interval of
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice!
I think this will be the foundation of finally sorting this module out.
I also like how this removes boilerplate from the concrete classes.
The one worry I have is about removal of arguments and that the changes here are not deprecation safe.
The module is clearly marked as "experimental", so we could do that, I wonder though whether we can do sth, like leaving the old args, and raising warnings that they will disappear, instead.
I see. How about making the last interval run over by the last difference, then? I would also like to summon @VascoSch92 who might have a useful opinion here, given both sequences and sets are involved at the same time - hope you enjoy this 😁 The greater question is coming up with mutually inverse functions from a value representation to a segment representation, as above. This is not possible if the target index is not known (only partial inverses), but the function assume we know |
Just some thoughts from my side: I think the solution of @Alex-JG3 is good. What are you doing here is to represent in an intelligent way the values of a column/serie, i.e., you want to group interval of contiguous indexes where the column/serie is constant. After that if you want to consider closed/open/right-closed/... interval, it is just taste, but it should be written clearly in the documentation or/and in the docstring. The choice of taking left-closed and right-open interval ( for example
|
Not for |
Warn that fmt and labels will be deprecated
No the index is not currently always monotonically increasing but it should be. In fact I think some of the new methods will break if index is not monotonically increasing. We need to enforce this. I have added deprecation warnings in clasp and the pyod annotator. I think those are the only two classes that are affected by deprecation. Currently, the deprecation message feels very brief:
Is this sufficient? Do we want to point a specific version when these arguments will be depracted? |
I would say sth like
using |
I have rewritten the deprecation warning. I have also removed the deprecation warning for If everyone else is happy are we ready to merge? |
Could someone please rerun the CI/CD pipeline? Looks like the docs jobs has failed due to running out of time to build. I don't think this as a result of changes introduced in this PR. |
Sure - rerunning |
Restarted - I also updated the deprecation messages and added release manager notes. Question, it seems we want to remove the |
This reverts commit fdaf7af.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are ready to merge for release with 0.30.0.
@Alex-JG3, it would be great if you could check the updated warnings and release manager comments.
Reference Issues/PRs
See #3214 .
What does this implement/fix? Explain your changes.
Refactors the
BaseSeriesAnnotator
class:fmt
andlabel
attributes.learning_type
andtask
attributes.predict
andtransform
method.predict_points and
predict_segments` methods.predict
andtransform
.Does your contribution introduce a new dependency? If yes, which one?
No.
What should a reviewer concentrate their feedback on?
Did you add any tests for the change?
Any other comments?
PR checklist
For all contributions
How to: add yourself to the all-contributors file in the
sktime
root directory (not theCONTRIBUTORS.md
). Common badges:code
- fixing a bug, or adding code logic.doc
- writing or improving documentation or docstrings.bug
- reporting or diagnosing a bug (get this pluscode
if you also fixed the bug in the PR).maintenance
- CI, test framework, release.See here for full badge reference
maintainers
tag - do this if you want to become the owner or maintainer of an estimator you added.See here for further details on the algorithm maintainer role.
For new estimators
docs/source/api_reference/taskname.rst
, follow the pattern.Examples
section.python_dependencies
tag and ensureddependency isolation, see the estimator dependencies guide.