PyCon CZ

PyCon CZ 23
15–17 September
Prague

sktime – python toolbox for time series: pipelines and benchmarking

A workshop with Franz Kiraly & Benedikt Heidrich

Sunday 17 September 14:00 (3 hours)
Room 343

Sktime is a widely used, scikit-learn compatible library for learning with time series. It is easily extensible by anyone and interoperable with the pydata/numfocus stack. This tutorial introduces advanced forecasting techniques using sktime. Its objectives are to:

  • Learn how to create forecasting pipelines that integrate forecasters and feature extraction methods.
  • Explore more sophisticated and advanced forecasting scenarios and realise them using graphical pipelines and auto-ML.
  • Learn how to evaluate the performance of forecasters.
  • Learn how benchmarks can be created, similar to M4/M5 competitions.

Additionally, this tutorial introduces reproducibility tools, such as auditable storage of model blueprints and fitted models, and includes a methodological primer.

It is structured to cover the following topics:

Basic forecasting pipelines

  • transformations of endogenous and exogenous data
  • common feature sets: lags, window summaries, dates, holidays
  • feature union and subsetting
  • combination with tuning and reduction
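
The lag and window-summary feature sets listed above can be illustrated without any library. The following is a minimal sketch in plain Python, not the sktime API: the helper name `make_lag_window_features` and its parameters are hypothetical, chosen only for illustration. For each time point it collects lagged values and the mean of the preceding window, and pairs them with the value to be predicted. This tabular form is what reduction to a tabular regressor relies on.

```python
from statistics import mean


def make_lag_window_features(y, lags=(1, 2), window=3):
    """Build (features, target) rows from a univariate series.

    Hypothetical helper for illustration only; sktime ships its own
    transformers for this purpose.
    """
    start = max(max(lags), window)  # first index with a full history
    rows = []
    for t in range(start, len(y)):
        lag_values = [y[t - k] for k in lags]   # y[t-1], y[t-2], ...
        window_mean = mean(y[t - window:t])     # mean of the previous `window` values
        rows.append((lag_values + [window_mean], y[t]))
    return rows


series = [10, 12, 14, 16, 18, 20]
for features, target in make_lag_window_features(series):
    print(features, "->", target)
```

Only past values enter each feature row, so a regressor trained on these rows never sees the value it is asked to predict.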

Advanced forecasting pipelines

  • multiplexing, auto-ML
  • graphical pipelines
  • pipeline diagnostics
  • persisting pipelines
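
As a rough illustration of what persisting a model involves, the sketch below stores a toy forecaster as JSON, keeping the hyperparameters (the blueprint) apart from the fitted state. It is a library-free example under stated assumptions: the class `MeanForecaster` and its `to_json` / `from_json` methods are hypothetical and are not part of sktime, which has its own persistence mechanisms.

```python
import json


class MeanForecaster:
    """Toy forecaster predicting the mean of the last `window` values.

    Hypothetical class for illustration only.
    """

    def __init__(self, window=3):
        self.window = window   # hyperparameter: part of the blueprint
        self.level_ = None     # fitted state: set by fit()

    def fit(self, y):
        tail = y[-self.window:]
        self.level_ = sum(tail) / len(tail)
        return self

    def predict(self, horizon):
        return [self.level_] * horizon

    def to_json(self):
        # Keep the blueprint and the fitted state under separate keys.
        return json.dumps({
            "params": {"window": self.window},
            "state": {"level_": self.level_},
        })

    @classmethod
    def from_json(cls, text):
        payload = json.loads(text)
        model = cls(**payload["params"])          # rebuild from the blueprint
        model.level_ = payload["state"]["level_"]  # restore the fitted state
        return model


fitted = MeanForecaster(window=2).fit([1.0, 2.0, 4.0])
restored = MeanForecaster.from_json(fitted.to_json())
print(restored.predict(2))
```

Separating the two parts means the blueprint alone is enough to refit the same model on new data, while the full payload reproduces the fitted model's forecasts.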

Performance evaluation

  • metrics for point forecasts
  • metrics for probabilistic forecasts
  • using metrics for multi-instance, hierarchical data
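
To make the distinction between point and probabilistic metrics concrete, here is a minimal plain-Python sketch of one metric of each kind: the mean absolute percentage error for point forecasts and the pinball (quantile) loss for quantile forecasts. The function names and the sample numbers are illustrative assumptions; sktime provides its own metric classes, which are not used here.

```python
def mean_absolute_percentage_error(y_true, y_pred):
    """Average of |error| / |actual|; assumes no actual value is zero."""
    ratios = [abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)]
    return sum(ratios) / len(ratios)


def pinball_loss(y_true, y_quantile, alpha):
    """Pinball loss of a forecast for the `alpha` quantile (0 < alpha < 1)."""
    total = 0.0
    for t, q in zip(y_true, y_quantile):
        diff = t - q
        # Under-forecasts are weighted by alpha, over-forecasts by (1 - alpha).
        total += alpha * diff if diff >= 0 else (alpha - 1) * diff
    return total / len(y_true)


actual = [100, 200, 400]
forecast = [110, 180, 400]

print(mean_absolute_percentage_error(actual, forecast))
print(pinball_loss(actual, forecast, alpha=0.9))
```

With `alpha=0.9`, forecasting too low costs nine times as much as forecasting too high by the same amount, which is what pushes a good forecast towards the 90% quantile.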

Full benchmarks

  • single dataset and multiple dataset benchmarks
  • data sets, data set collections
  • serializing model blueprints and fitted models
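
The idea of a multiple-dataset benchmark can be sketched as a loop over datasets and forecasters with a fixed temporal split and a shared metric. The example below is a plain-Python illustration, not the sktime benchmarking module: `run_benchmark`, the two naive forecasters, and the toy datasets are all hypothetical.

```python
def naive_last(train, horizon):
    """Repeat the last observed value."""
    return [train[-1]] * horizon


def naive_mean(train, horizon):
    """Repeat the mean of the training series."""
    level = sum(train) / len(train)
    return [level] * horizon


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def run_benchmark(datasets, forecasters, horizon=2):
    """Score every forecaster on every dataset with one temporal split."""
    results = {}
    for data_name, series in datasets.items():
        # The last `horizon` points are held out; order in time is preserved.
        train, test = series[:-horizon], series[-horizon:]
        for model_name, forecaster in forecasters.items():
            forecast = forecaster(train, horizon)
            results[(data_name, model_name)] = mean_absolute_error(test, forecast)
    return results


datasets = {"trend": [1, 2, 3, 4, 5, 6], "flat": [5, 5, 5, 5, 5, 5]}
forecasters = {"last": naive_last, "mean": naive_mean}

results = run_benchmark(datasets, forecasters)
for (data_name, model_name), score in sorted(results.items()):
    print(data_name, model_name, score)
```

Keying the results by dataset and model keeps every score traceable to the exact combination that produced it, which is the property a reproducible benchmark needs.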

The tutorial sheds light on certain experimental aspects of the benchmarking module and the graphical pipeline, highlighting opportunities for contributions and improvements. Developed collaboratively by an inclusive community, sktime aims to foster ecosystem integration within a neutral and charitable sphere.

The tutorial welcomes engagement, encouraging contributions from individuals across the world.

Requirements

Bring your own laptop.

The tutorial notebooks can be run on Binder.

Alternatively, participants can run the notebooks from a clone of the tutorial repository on their local laptop.

What do you need to know to enjoy this workshop

Python level

Medium knowledge: You use frameworks and third-party libraries.

About the topic

No previous knowledge of the topic is required; basic concepts will be explained.

Benedikt Heidrich

I completed a Master of Science degree in informatics at the Karlsruhe Institute of Technology in 2019. I am working towards a PhD in informatics, which I will finish this year. My research focuses on using deep generative models in energy systems and on coping with concept drift in energy time series forecasting.

Additionally, I investigate how a general pipeline architecture should be designed for time series analysis tasks.
