GSoC 2020

Markus Löning edited this page Mar 21, 2021 · 1 revision

Students, we want you!

sktime will be applying as a mentoring organization for Google Summer of Code 2020.

This is our Ideas Page. Join the sktime team for a summer full of coding, learning and fun. Be part of our diverse community and join our efforts to advance machine learning and time series analysis capabilities!

We explicitly encourage female students to apply.

Why sktime?

Time series data is ubiquitous in many applications. Examples include sensor readings from industrial processes, spectroscopy wavelength data from chemical samples, or bedside monitor data from patients. Developing advanced time series analysis capabilities for researchers and practitioners is one of the major challenges of contemporary machine learning.

sktime is a new Python toolbox for machine learning with time series and, to the best of our knowledge, the first unified toolbox for time series. Our ambition is to provide for time series what scikit-learn provides for tabular data. This involves extending scikit-learn to the different time series learning tasks, such as time series classification, clustering, forecasting and anomaly detection. To find out more, check out our paper published at the Workshop on Systems for ML at NeurIPS 2019.

How to apply?

  1. Read our how to get started guide,
  2. Try to solve one of the entrance tasks via a PR on GitHub. We will give preference to students who have at least tried to solve one of these tasks,
  3. Contact us informally to discuss applying, or apply by sending your CV and cover letter to info@sktime.org.

What we are looking for

We're actively looking for contributors and your help is extremely welcome. Therefore, if

  • you are interested in time series, machine learning (ML), statistics, API design and software architecture,
  • you like coding in Python,
  • you are familiar with the basic data science ecosystem in Python, including numpy, pandas and scikit-learn,
  • you enjoy working with a vibrant team of experienced ML scientists and software engineers,
  • you always wanted to join an open-source community,

then GSoC with sktime is for you! You'll spend the summer working with our enthusiastic and open-minded team of developers who are creating one of the first comprehensive time series ML toolboxes out there.

What we expect

GSoC is a marathon, not a sprint, and we expect good performance over the whole project. This means that you are in daily contact with your mentors and the wider community and that you work full-time on the project.

In addition to your individual project work, you will be required to:

  • peer-review a fellow student's work in the middle and at the end of GSoC,
  • write weekly blog posts about your contribution and a final summary post at the end of the project,
  • have a good time web-socializing with the other students.

Finally, our goal, apart from improving sktime, is to onboard new long-term developers and we would really like you to stay around after GSoC.

Projects

Please find below a list of topics to help you get started, but don't hesitate to propose your own topic to work on.

| Title | Mentors | Short Description | Difficulty | What you need to know |
|---|---|---|---|---|
| Time series classification | @TonyBagnall | | beginner | classification with scikit-learn |
| Time series regression | | refactoring classification | beginner | |
| Time series clustering | | time series distances, kernels, 2nd degree transformers | beginner | |
| Forecasting | @mloning | model selection, composition, reduction | medium | |
| Develop a framework | @fkiraly | | hard | |

More project details will follow soon. In the meantime, check out our development roadmap and good first issues!

Mentors

| Name | GitHub | Website |
|---|---|---|
| Markus Löning | @mloning | |
| Tony Bagnall | @TonyBagnall | website |
| Jason Lines | @jasonlines | |
| Aaron Bostrom | | |
| Franz Király | @fkiraly | website |
| George Oastler | @goastler | |

More details on mentors will follow shortly.