[ENH] coordination discussion on foundation models, deep learning, backends #6381

Open
fkiraly opened this issue May 3, 2024 · 16 comments
Labels
  • enhancement: Adding new functionality
  • implementing algorithms: Implementing algorithms, estimators, objects native to sktime
  • implementing framework: Implementing or improving framework for learning tasks, e.g., base class functionality
  • interfacing algorithms: Interfacing existing algorithms/estimators from third party packages
  • module:forecasting: forecasting module: forecasting, incl probabilistic and hierarchical forecasting

Comments

@fkiraly
Collaborator

fkiraly commented May 3, 2024

Opening this issue to coordinate the various summer projects in relation to foundation models, deep learning, backends, interfaces.

Below is a list of related umbrella issues and individual issues - for now, focusing on forecasting primarily.

FYI @fnhirwa, @geetu040, @julian-fong, @pranavvp16, @Xinyu-Wu-0000.
FYI mentors @benHeid, @kirilral, @marrov, @onyekaugochukwu, @yarnabrina.

@fkiraly added the labels implementing framework, implementing algorithms, interfacing algorithms, module:forecasting, enhancement on May 3, 2024
@yarnabrina
Collaborator

yarnabrina commented May 3, 2024

Another darts PR: #5043

This one was generic. The regression focused one is #5997 (and related #5447), but these two are not deep learning based.

@fkiraly
Collaborator Author

fkiraly commented May 3, 2024

If we go by the topics proposed in GSoC proposals and previous work, we get, for a start:

  • @fnhirwa -> darts, pytorch, pytorch-forecasting
  • @geetu040 -> pytorch, pytorch-forecasting; hugging face, peft/lora
  • @pranavvp16 -> polars, sklearn/polars, nixtla/polars; hugging face, peft/lora
  • @julian-fong -> pre-trained models, fine-tuning, fastAI like API
  • @Xinyu-Wu-0000 -> global API, pre-training and fine-tuning examples

There are three intersections here:

One possible assignment that avoids conditionalities and duplications for the 1st month would be:

@fkiraly
Collaborator Author

fkiraly commented May 3, 2024

Please add any corrections, suggestions for improvement, comments, etc. If preferences lie elsewhere, we can of course switch things around. This is for discussion until the 1st tech meeting, where we'll plan.

@benHeid
Contributor

benHeid commented May 3, 2024

A possible further work item could be the integration of GluonTS.

@fkiraly
Collaborator Author

fkiraly commented May 3, 2024

Yes - gluonts was actually part of the very original forecaster feature wishlist, here: #220
It's funny to see how long that wishlist was and how almost everything on it is now available - most recently the bagging ensemble which I upgraded due to tsbootstrap (FYI @astrogilda). In a similar vein, the newer "auto-gluon" is a "one size fits all" approach similar to autots.

gluonts has its own data container; there are already some converters in sktime (possibly incomplete): #2860

@fkiraly
Collaborator Author

fkiraly commented May 10, 2024

from May 10 meeting:

@pranavvp16
Contributor

As discussed in today's mentoring meeting with @benHeid, and as mentioned here, I will start working on adding support for the polars scitype in the first month. Commenting here to coordinate with other mentees and mentors. Please feel free to reply if any other mentee is also interested in, or working on, adding polars support.

@julian-fong
Contributor

julian-fong commented May 14, 2024

I am not too familiar with polars or parallel and distributed functionality yet, but I would like the opportunity to learn and contribute to adding polars support.

@fkiraly
Collaborator Author

fkiraly commented May 14, 2024

@pranavvp16, @julian-fong, a high-level outline is in this issue here: #5423 (comment)

There are two things one could work on in parallel:

  • sktime needs mtypes implemented; we could start with Series, then Panel. Since polars has no multi-index (is this correct?), I would suggest the same conventions as in dask for Panel.
  • in skpro, a polars container format is already implemented, although this is not full support, as internally it just converts back and forth.

(and imo these are currently the only two fully parallel items)

The "battle plan" for support is - in both packages - first mtype support, then enable support in a few estimators, see if we can support eager and even lazy.

I think skpro estimators are simpler, so if in parallel to working on sktime mtypes, we try enabling native polars support for a number of estimators, we will learn of the challenges and solutions ahead of time, and at a lower cognitive cost.

What are your thoughts? Any preferences?

@fkiraly
Collaborator Author

fkiraly commented May 14, 2024

Related to polars, there is also this refactor of the datatypes module:

#6033

This would make it easier to add mtypes with soft dependencies, and someone could pick it up and review - or complete - it. I was also going to look at it soon.

@pranavvp16
Contributor

Yes, the plan looks good to me for now, but polars doesn't support an index at all, let alone a multi-index. Also, if I'm not wrong, we have to get this refactor merged before we can start adding polars mtype support in sktime?

@fkiraly
Collaborator Author

fkiraly commented May 14, 2024

> Yes, the plan looks good to me for now, but polars doesn't support an index at all, let alone a multi-index.

That could be handled the same way, no? Have a column called __index or similar. That seems only a minor adaptation to how we have dealt with dask.

> Also, if I'm not wrong, we have to get this refactor merged before we can start adding polars mtype support in sktime?

No, there is no such conditionality, the refactor would just make it more convenient to add new data container types.
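
For illustration, the "index as a column" idea mentioned above could be sketched roughly as follows. This is a minimal sketch using pandas only, not the actual sktime or dask adapter code: the `__index__` prefix and the helper names `flatten_index` and `restore_index` are assumptions made up for this example. The flat frame it produces has no meaningful index, which is the shape an index-free container such as polars could hold.

```python
import pandas as pd

# Hypothetical naming convention for this sketch; not confirmed sktime API.
INDEX_PREFIX = "__index__"


def flatten_index(df: pd.DataFrame) -> pd.DataFrame:
    """Move all index levels into ordinary columns named __index__<level>.

    Unnamed levels are named by their position, so they come back as
    "0", "1", ... rather than None after a round trip.
    """
    out = df.copy()
    names = [
        f"{INDEX_PREFIX}{name if name is not None else i}"
        for i, name in enumerate(out.index.names)
    ]
    out.index = out.index.set_names(names)
    # reset_index puts the (renamed) index levels first, as plain columns
    return out.reset_index()


def restore_index(flat: pd.DataFrame) -> pd.DataFrame:
    """Rebuild the (multi-)index from the __index__<level> columns."""
    index_cols = [c for c in flat.columns if str(c).startswith(INDEX_PREFIX)]
    out = flat.set_index(index_cols)
    # strip the prefix again to recover the original level names
    out.index = out.index.set_names([c[len(INDEX_PREFIX):] for c in index_cols])
    return out


# Panel-like data: two instances, two time points each
panel = pd.DataFrame(
    {"y": [1.0, 2.0, 3.0, 4.0]},
    index=pd.MultiIndex.from_product(
        [["a", "b"], [0, 1]], names=["instance", "time"]
    ),
)
flat = flatten_index(panel)
restored = restore_index(flat)
```

Keeping the index information in clearly prefixed columns means a Series (one index column) and a Panel (two index columns) can share the same convention, and the prefix makes it unlikely to clash with user column names.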

@fnhirwa
Contributor

fnhirwa commented May 15, 2024

As per the last conversation with @fkiraly I'll pick up this [ENH] darts adapter #5043 and liaise with @yarnabrina

Commenting here to coordinate, so it doesn't collide with any other ongoing task.

@fkiraly
Collaborator Author

fkiraly commented May 15, 2024

Re polars, to get back to coordinating tasks, the current discussion sounds like:

Does this make sense, and does this align with your preferences?

@fkiraly
Collaborator Author

fkiraly commented May 15, 2024

> As per the last conversation with @fkiraly I'll pick up this [ENH] darts adapter #5043 and liaise with @yarnabrina

Excellent - the issue is #1624, could you kindly comment there so I can assign you?

@fkiraly
Collaborator Author

fkiraly commented May 18, 2024

@julian-fong, polars issue in skpro: sktime/skpro#342
